N1QL query for GROUP BY and latest document when compared with a date, ordered descending

hari · June 1, 2020, 8:10am

Hi,

I am trying to write a N1QL query that groups documents by two fields and, for each group, fetches the latest document on or before a given date (that is, the first one when ordered by date descending).

Ex: The documents look like this:
{
_Id = 1
"Key1": "7777",
"Key2": "abcdef",
"DateTime": "2015-03-27T00:00:00Z",
"Date 2": "2020-03-27T00:00:00Z",
"Other Field 1": "Other Field 11",
"Other Field 2": "Other Field 21",
…
}
{
_Id = 2
"Key1": "7777",
"Key2": "abcdef",
"DateTime": "2016-03-27T00:00:00Z",
"Date 2": "2020-03-28T00:00:00Z",
"Other Field 1": "Other Field 12",
"Other Field 2": "Other Field 22",
…
}
{
_Id = 3
"Key1": "7777",
"Key2": "abcdef",
"DateTime": "2017-03-27T00:00:00Z",
"Date 2": "2020-03-29T00:00:00Z",
"Other Field 1": "Other Field 13",
"Other Field 2": "Other Field 23",
…
}

{
_Id = 4
"Key1": "7777",
"Key2": "ghijkl",
"DateTime": "2015-03-27T00:00:00Z",
"Date 2": "2020-03-27T00:00:00Z",
"Other Field 1": "Other Field 14",
"Other Field 2": "Other Field 24",
…
}
{
_Id = 5
"Key1": "7777",
"Key2": "ghijkl",
"DateTime": "2016-03-27T00:00:00Z",
"Date 2": "2020-03-27T00:00:00Z",
"Other Field 1": "Other Field 15",
"Other Field 2": "Other Field 25",
…
}

Similarly, there are documents with Key1 "8888":

{
_Id = 6
"Key1": "8888",
"Key2": "abcdef",
"DateTime": "2015-03-27T00:00:00Z",
"Date 2": "2020-03-27T00:00:00Z",
"Other Field 1": "Other Field 11",
"Other Field 2": "Other Field 21",
…
}
{
_Id = 7
"Key1": "8888",
"Key2": "abcdef",
"DateTime": "2016-03-27T00:00:00Z",
"Date 2": "2020-03-28T00:00:00Z",
"Other Field 1": "Other Field 12",
"Other Field 2": "Other Field 22",
…
}
{
_Id = 8
"Key1": "8888",
"Key2": "abcdef",
"DateTime": "2017-03-27T00:00:00Z",
"Date 2": "2020-03-29T00:00:00Z",
"Other Field 1": "Other Field 13",
"Other Field 2": "Other Field 23",
…
}

{
_Id = 9
"Key1": "8888",
"Key2": "ghijkl",
"DateTime": "2015-03-27T00:00:00Z",
"Date 2": "2020-03-27T00:00:00Z",
"Other Field 1": "Other Field 14",
"Other Field 2": "Other Field 24",
…
}
{
_Id = 10
"Key1": "8888",
"Key2": "ghijkl",
"DateTime": "2016-03-27T00:00:00Z",
"Date 2": "2020-03-27T00:00:00Z",
"Other Field 1": "Other Field 15",
"Other Field 2": "Other Field 25",
…
}

When queried with a list of Key1 values, a list of Key2 values, and a particular date, the query should return, for each Key1 and Key2 combination, the top document when ordered by DateTime descending. The result should contain only the documents with _Id 2 and 7.

I need to select only Key1, Key2, Date2, and OtherField1, not everything else from the documents. I have come up with something like this:

select p.jp.Key1, p.jp.Key2, p.jp.Date2, p.jp.OtherField1 from (
SELECT max([a.DateTime, a])[1] as jp
from bucket as a
join bucket as b on KEYS META(a).id
WHERE a.Key1 in ["7777","8888"]
AND a.Key2 in ["abcdef","ghijkl"]
and a.DateTime <= "2016-03-28"
group by a.Key1,a.Key2) p;

Is there a better way to achieve this? Is the query correct in the first place?

I also tried something like the query below, but it doesn't seem to work.

SELECT a.key1, a.key2, cmbyid[0].Date2 as Date2, cmbyid[0].OtherField1 as OtherField1,
from bucket as a
join bucket as b on KEYS META(a).id
WHERE a.Key1 in ["7777","8888"]
AND a.Key2 in ["abcdef","ghijkl"]
and a.DateTime <= "2016-03-28"
group by a.key1, a.key2
LETTING cmbyid = ARRAY_AGG(a);

Thanks.

Please also let me know whether this can be achieved using Spring JPA.

vsr1 · June 1, 2020, 1:42pm

CREATE INDEX ix1 ON bucket(Key1, DateTime, Key2, Date2, OtherField1);
SELECT a.Key1, a.Key2, MAX([a.DateTime, {a.Date2, a.OtherField1}])[1].*
FROM bucket AS a
WHERE a.Key1 IN ["7777","8888"]
AND a.DateTime <= "2016-03-28"
GROUP by a.Key1, a.Key2;
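
For readers who want to see what MAX([a.DateTime, {...}])[1] computes for each group, here is a minimal Python sketch of the same idea. It is an emulation, not Couchbase code: the DOCS list is a hand-copied subset of the sample documents in this thread, "id" is a stand-in for META(a).id, and the function name latest_per_group is made up for illustration. It assumes the cutoff is passed as an ISO-8601 string so that plain string comparison orders the dates correctly.

```python
# Hypothetical in-memory subset of the sample documents from this thread.
# "id" stands in for the document key; field names follow the queries.
DOCS = [
    {"id": 1, "Key1": "7777", "Key2": "abcdef", "DateTime": "2015-03-27T00:00:00Z", "OtherField1": "Other Field 11"},
    {"id": 2, "Key1": "7777", "Key2": "abcdef", "DateTime": "2016-03-27T00:00:00Z", "OtherField1": "Other Field 12"},
    {"id": 3, "Key1": "7777", "Key2": "abcdef", "DateTime": "2017-03-27T00:00:00Z", "OtherField1": "Other Field 13"},
    {"id": 4, "Key1": "7777", "Key2": "ghijkl", "DateTime": "2015-03-27T00:00:00Z", "OtherField1": "Other Field 14"},
    {"id": 5, "Key1": "7777", "Key2": "ghijkl", "DateTime": "2016-03-27T00:00:00Z", "OtherField1": "Other Field 15"},
    {"id": 7, "Key1": "8888", "Key2": "abcdef", "DateTime": "2016-03-27T00:00:00Z", "OtherField1": "Other Field 12"},
    {"id": 8, "Key1": "8888", "Key2": "abcdef", "DateTime": "2017-03-27T00:00:00Z", "OtherField1": "Other Field 13"},
]


def latest_per_group(docs, key1_values, cutoff):
    """Return {(Key1, Key2): doc} holding the doc with the greatest DateTime <= cutoff."""
    best = {}
    for doc in docs:
        # Emulates the WHERE clause: Key1 IN [...] AND DateTime <= cutoff.
        if doc["Key1"] not in key1_values or doc["DateTime"] > cutoff:
            continue
        group = (doc["Key1"], doc["Key2"])  # emulates GROUP BY Key1, Key2
        # Emulates MAX([DateTime, payload])[1]: keep the payload with the largest DateTime.
        # On an exact DateTime tie this sketch simply keeps the first document seen.
        if group not in best or doc["DateTime"] > best[group]["DateTime"]:
            best[group] = doc
    return best


result = latest_per_group(DOCS, {"7777", "8888"}, "2016-03-28")
for group, doc in sorted(result.items()):
    print(group, doc["id"], doc["OtherField1"])
```

Because the filter here is on Key1 only, every Key2 value found for those Key1 values gets its own group in the result.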

hari · June 1, 2020, 4:00pm

Thank you for the quick reply @vsr1

hari · June 2, 2020, 10:46pm

Hi @vsr1,
This may be a silly question.
Can we include a map of values in the query, so that the map's keys are matched against a field in the query and the corresponding map values are included in the result?
Ex: In general, we can use a static constant in a SELECT query like this:

SELECT p._class, p.Key1, p.Key2, p.DateTime, p.Date2, "test" from bucket p where p._class="com.xyz.abc" and p.Key1 in ["7777","8888"]
The result will look like:
[
{
"$1": "test",
"_class": "com.xyz.abc",
"Key1": "7777",
"Key2": "ghijkl",
"DateTime": "2016-03-27T00:00:00Z",
"Date 2": "2020-03-27T00:00:00Z"
},
{
"$1": "test",
"_class": "com.xyz.abc",
"Key1": "8888",
"Key2": "ghijkl",
"DateTime": "2016-03-27T00:00:00Z",
"Date 2": "2020-03-27T00:00:00Z"
}
]

I am looking for something like:
SELECT p._class, p.Key1, p.Key2, p.DateTime, p.Date2, ["7777":"A1", "8888":"B2"] from bucket p where p._class="com.xyz.abc" and p.Key1 in ["7777","8888"] GROUP by p.Key1, p.Key2 …;

When retrieving the results, the values from the given input map ["7777":"A1", "8888":"B2"] should be appended to the result documents whose p.Key1 matches a map key. The new field should be named something like "Key3".

[
{
"Key3": "A1",
"_class": "com.xyz.abc",
"Key1": "7777",
"Key2": "ghijkl",
"DateTime": "2016-03-27T00:00:00Z",
"Date 2": "2020-03-27T00:00:00Z"
},
{
"Key3": "B2",
"_class": "com.xyz.abc",
"Key1": "8888",
"Key2": "ghijkl",
"DateTime": "2016-03-27T00:00:00Z",
"Date 2": "2020-03-27T00:00:00Z"
}
]

Thank you.

vsr1 · June 2, 2020, 11:03pm

SELECT p.Key1, 
       p.Key2,
       {"7777":"A1", "8888":"B2"}.[p.Key1] AS key3
FROM `bucket`  AS p 
WHERE  p._class="com.xyz.abc" AND  p.Key1 IN ["7777","8888"] 
GROUP BY p.Key1, p.Key2;


{"7777":"A1", "8888":"B2"} can be a constant, a named parameter, or an expression (in the case of an expression, GROUP BY restrictions apply).
p.Key1 must evaluate to a string.
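
As a rough illustration of what the {"7777":"A1", "8888":"B2"}.[p.Key1] lookup does per row, here is a small Python sketch. It is only an emulation under assumptions: the function name add_key3 and the sample rows are made up, and leaving the field out when there is no matching key is meant to mirror N1QL omitting MISSING values from the result, which is worth verifying on your own cluster.

```python
def add_key3(row, mapping):
    """Return a copy of row with key3 set from mapping[row["Key1"]] when a match exists."""
    out = dict(row)
    key = row.get("Key1")
    # The lookup key has to be a string, just as the N1QL field-name expression must be.
    if isinstance(key, str) and key in mapping:
        out["key3"] = mapping[key]
    return out


mapping = {"7777": "A1", "8888": "B2"}
rows = [
    {"Key1": "7777", "Key2": "ghijkl"},
    {"Key1": "8888", "Key2": "ghijkl"},
    {"Key1": "9999", "Key2": "ghijkl"},  # no entry in the map, so no key3 is added
]
for row in rows:
    print(add_key3(row, mapping))
```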

hari · June 2, 2020, 11:28pm

Thanks a lot @vsr1 !!

hari · May 20, 2021, 7:10am

Hi @vsr1 ,
Could you please help me write an extended version of the query above?

Currently the MAX function is applied to only one field, a.DateTime, but I want to apply it to two fields (a.DateTime and a.Date2).
Can we use MAX([a.DateTime, a.Date2, {a.OtherField1}])[2].* ? How does the MAX function behave when used this way?
If the query is used as it is, I get the exception below:
org.springframework.data.couchbase.core.CouchbaseQueryExecutionException: Unable to retrieve enough metadata for N1QL to entity mapping, have you selected _ID and _CAS
so I tried it the following way. Is there any other way to fetch _ID and _CAS with just one MAX function?

SELECT a.Key1, a.Key2, MAX([a.DateTime, {a.Date2, a.OtherField1}])[1].*,
(MAX([a.DateTime, {META(a).id} ])[1].id) AS _ID,
(MAX([a.DateTime, {META(a).cas} ])[1].cas) AS _CAS
FROM bucket AS a
WHERE a.Key1 IN ["7777","8888"]
AND a.DateTime <= "2016-03-28"
GROUP by a.Key1, a.Key2;

How can we apply aliasing or another operation, such as UPPER(a.OtherField1), after the MAX operation?

Could you please help me in resolving these issues?

Thanks,
Hari

vsr1 · May 20, 2021, 1:55pm

MAX([a.DateTime, a.Date2, {a.OtherField1}])[2].*
MAX is evaluated on a.DateTime first. If there are duplicate a.DateTime values, the MAX of a.Date2 is used to resolve them.

SELECT a.Key1, a.Key2, MAX([a.DateTime, {a.Date2, "_ID": META(a).id, "_CAS": META(a).cas, "ofiled":a.OtherField1}])[1].*
FROM bucket AS a
WHERE a.Key1 IN ["7777","8888"] AND a.DateTime <= "2016-03-28"
GROUP by a.Key1, a.Key2;
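
The tie-breaking rule for MAX([a.DateTime, a.Date2, {...}])[2] can be pictured as an element-by-element comparison of the arrays. The Python sketch below emulates that rule; the three rows are made-up values for illustration, not rows taken from the bucket, and only the first two array elements are used for ordering here.

```python
# Each row mimics [a.DateTime, a.Date2, {payload}] for one document in a single group.
rows = [
    ["2016-03-27T00:00:00Z", "2020-03-27T00:00:00Z", {"OtherField1": "Other Field 15"}],
    ["2016-03-27T00:00:00Z", "2020-03-28T00:00:00Z", {"OtherField1": "Other Field 12"}],
    ["2015-03-27T00:00:00Z", "2020-03-29T00:00:00Z", {"OtherField1": "Other Field 11"}],
]

# Compare on DateTime first; Date2 only matters when the DateTime values are equal.
winner = max(rows, key=lambda r: (r[0], r[1]))

# Index 2 is the payload object, which is what [2].* would expand in the query.
payload = winner[2]
print(payload)
```

The third row has the largest Date2 but loses because its DateTime is smaller, while the first two rows tie on DateTime and are separated by Date2.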

hari · May 20, 2021, 5:08pm

Awesome…!! Thank you @vsr1.
BTW, if possible, could you please point me to good material or a tutorial, with examples, for learning more about writing N1QL queries?
