GROUP BY usage question

I am using Version: 4.5.0-2601 Enterprise Edition (build-2601) ,
and my sample JSON data as following:

{"id":"p1","type":"parent","name":"parent1","local":"cn"};
{"id":"p2","type":"parent","name":"parent2","local":"cn"};
{"id":"p3","type":"parent","name":"parent3","local":"cn"};
{"id":"c1","type":"child","name":"child1","local":"cn","parent_id":"p1",tag":"tag1"};
{"id":"c2","type":"child","name":"child2","local":"cn","parent_id":"p1",tag":"tag1"};
{"id":"c3","type":"child","name":"child3","local":"cn","parent_id":"p1",tag":"tag2"};
{"id":"c4","type":"child","name":"child4","local":"cn","parent_id":"p2",tag":"tag1"};
{"id":"c5","type":"child","name":"child5","local":"cn","parent_id":"p2",tag":"tag2"};

I want to get the following data by local == "cn" :

[
{"id":"p1","type":"parent","name":"parent1","local":"cn","count_tag1":2,"count_tag2":1},
{"id":"p2","type":"parent","name":"parent2","local":"cn","count_tag1":1,"count_tag2":1},
{"id":"p2","type":"parent","name":"parent3","local":"cn","count_tag1":0,"count_tag2":0}
]

which
count_tag1 match type == "child" AND local == "cn" and tag == "tag1"
count_tag2 match type == "child" AND local == "cn" and tag == "tag2"

I can get the result by group by 3 times ,but it toke long time,is there a better way to get the data?
PS: I try to using CASE WHEN but failed in group by clause.

My N1QL as following:

SELECT a.id,a.type,a.name,a.local,count(a.count_tag1) AS count_tag1,count(a.count_tag2) AS count_tag2
FROM
(
SELECT meta(p).id,a.type,a.name,a.local,count(meta(c).id) AS count_tag1
FROM mydb c JOIN mydb p ON KEYS c.parent_id
WHERE c.type == "child" AND p.type == "parent" AND c.local == "cn"
AND c.tag == "tag1"
GROUP BY meta(p).id,a.type,a.name,a.local

UNION ALL

SELECT meta(p).id,a.type,a.name,a.local,count(meta(c).id) AS count_tag2
FROM mydb c JOIN mydb p ON KEYS c.parent_id
WHERE c.type == "child" AND p.type == "parent" AND c.local == "cn"
AND c.tag == "tag2"
GROUP BY meta(p).id,a.type,a.name,a.local
) a
GROUP BY a.id,a.type,a.name,a.local

but it took long time (even I have index on type and local)

A couple of suggestions.

You cannot use “a” inside the subquery. It does not exist at that point.

You don’t have to JOIN to parent until after you compute the GROUP BY. You can compute the GROUP BY in a subquery, and then JOIN with parent afterwards. If you can avoid the JOIN altogether, even better.

Your index needs to be on type, local, and tag. Put the more selective keys first in the index definition.

sorry,The N1QL should looks like this:

SELECT a.id,a.type,a.name,a.local,count(a.count_tag1) AS count_tag1,count(a.count_tag2) AS count_tag2
FROM
(
SELECT meta(p).id,p.type,p.name,p.local,count(meta(c).id) AS count_tag1
FROM mydb c JOIN mydb p ON KEYS c.parent_id
WHERE c.type == "child" AND p.type == "parent" AND c.local == "cn"
AND c.tag == "tag1"
GROUP BY meta(p).id,p.type,p.name,p.local

UNION ALL

SELECT meta(p).id,p.type,p.name,p.local,count(meta(c).id) AS count_tag2
FROM mydb c JOIN mydb p ON KEYS c.parent_id
WHERE c.type == "child" AND p.type == "parent" AND c.local == "cn"
AND c.tag == "tag2"
GROUP BY meta(p).id,p.type,p.name,p.local
) a
GROUP BY a.id,a.type,a.name,a.local

and yes, your suggestions are very useful.
I change my N1QL only using one group by,and create correct index, now it return data within 100ms.
Following is my N1QL

SELECT a.id,a.type,a.name,a.local,count(a.tag1_id) AS count_tag1,count(a.tag2_id) AS count_tag2
FROM
(
SELECT meta(p).id,p.type,p.name,p.local,meta(c).id AS tag1_id
FROM mydb c JOIN mydb p ON KEYS c.parent_id
WHERE c.type == "child" AND p.type == "parent" AND c.local == "cn"
AND c.tag == "tag1"
GROUP BY meta(p).id,p.type,p.name,p.local

UNION ALL

SELECT meta(p).id,p.type,p.name,p.local,meta(c).id AS tag2_id
FROM mydb c JOIN mydb p ON KEYS c.parent_id
WHERE c.type == "child" AND p.type == "parent" AND c.local == "cn"
AND c.tag == "tag2"
GROUP BY meta(p).id,p.type,p.name,p.local
) a
GROUP BY a.id,a.type,a.name,a.local

Thank you very much.

1 Like