Join condition : array onto multiple recordset

selfy · April 12, 2023, 9:56am

Given two different set of records:

{ “shopid”:“1”, “shopcodes”:[“a”,“c”] }
{ “shopid”:“2”, “shopcodes”:[“b”] }

{ “shopcode”:“a”, shopcodedesciption:“first code” }
{ “shopcode”:“b”, shopcodedesciption:“second code” }
{ “shopcode”:“c”, shopcodedesciption:“third code” }

Is it possible to join the first set with the second set to yield a unified/nested result set like the one below:

{ “shopid”:“1”, “shopcodes”:[“a”,“c”], shopcodedescription:[“first code”,“third code”] }
{ “shopid”:“2”, “shopcodes”:[“b”], shopcodedescription:[“second code”] }

For every shop code in a record, nest all of its corresponding shopcodedescription.

dh · April 12, 2023, 10:21am

You can achieve this by UNNESTing, joining then aggregating the results (not necessarily the best way, but a way):

CREATE INDEX ixt1 ON default(shopid);
CREATE INDEX ixt2 ON default(shopcode);

SELECT t.shopid, ARRAY_AGG(t.sc) shopcodes, ARRAY_AGG(t.shopdescription) shopdescription
FROM (
      SELECT d.shopid, sc, d2.shopdescription
      FROM   default d UNNEST shopcodes as sc
            ,default d2
      WHERE d.shopid IS VALUED
      AND d2.shopcode = sc
) t
GROUP BY t.shopid
;

It is uncommon to want correlated arrays like this; usually it would be preferred to have a single array of codes & descriptions in an object binding them together.

SELECT t.shopid, ARRAY_AGG({"shopcode":t.sc,"shopdescription":t.shopdescription}) shops
FROM (
      SELECT d.shopid, sc, d2.shopdescription
      FROM   default d UNNEST shopcodes as sc
            ,default d2
      WHERE d.shopid IS VALUED
      AND d2.shopcode = sc
) t
GROUP BY t.shopid
;

HTH.

vsr1 · April 17, 2023, 1:01am

SELECT d.shopid, 
       d.shopcodes,
       (WITH shopcodes AS (d.shopcodes)
        SELECT RAW d1.shopcodedesciption
        FROM default d1
        WHERE d1.shopcode IN shopcodes) AS shopcodedescription
FROM default AS d
WHERE d.shopid IS VALUED;

OR

SELECT d.shopid, 
       d.shopcodes,
       (
        SELECT RAW d1.shopcodedesciption
        FROM d.shopcodes AS sc JOIN default d1
        ON d1.shopcode = sc ) AS shopcodedescription
FROM default AS d
WHERE d.shopid IS VALUED;

selfy · April 17, 2023, 5:53am

@dh I get an error at the comma in both of the queries.

code 3000… syntax error - at ,

EDIT:
I think this is because I am not yet on couchbase version 7.1 where comma separated joins are available? Reference 1, Reference 2.
Is there there a way to rewrite this query for CB version 6.6.x?

EDIT 2:
I tried to do an ANSI join according to the examples in the documentation

SELECT *
      FROM   demo d UNNEST shopcodes as sc
             JOIN demo d2 ON true

But that gives me the following error:

No index available for ANSI join term d2

What index would I need to create in addition to what I have based on below?

@vsr1

The second query returns an error, probably since I am on CB version 6.6.x

Unable to run subquery - cause: FROM in correlated subquery must have USE KEYS clause: FROM demo.

The first query seems to return incorrect data
I created a new bucket demo and…

INSERT INTO demo VALUES 
("A1",{ "shopid":"1", "shopcodes":["a","c"] }), 
("A2",{ "shopid":"2", "shopcodes":["b"] }), 
("A12",{ "shopcode":"a", "shopcodedesciption":"first code" }), 
("A13",{ "shopcode":"b", "shopcodedesciption":"second code" }), 
("A11",{ "shopcode":"c", "shopcodedesciption":"third code" })

CREATE INDEX ixt1 ON demo(shopid);
CREATE INDEX ixt2 ON demo(shopcode);
create index ix1 on demo(id);
CREATE PRIMARY INDEX idx_demo_primary ON `demo` USING GSI;

SELECT d.shopid, 
       d.shopcodes,
       (WITH shopcodes AS (d.shopcodes)
			SELECT RAW d1.shopcodedesciption
			FROM demo d1
			WHERE d1.shopcode IN shopcodes) AS shopcodedescription
FROM demo AS d
WHERE d.shopid IS VALUED;

The query results in an incorrect output like so:

shopcodedescription	shopcodes	shopid
["third code","first code"]	["a","c"]	"1"
["third code","first code"]	["b"]	"2"

The second record is incorrect in matching the shopcodes to shopcodedescription.
Looks like a duplicate of first record result?

dh · April 17, 2023, 8:28am

@selfy , yes you can change the comma-join to an ANSI join:

SELECT t.shopid, ARRAY_AGG(t.sc) shopcodes, ARRAY_AGG(t.shopdescription) shopdescription
FROM (
      SELECT d.shopid, sc, d2.shopdescription
      FROM   default d UNNEST shopcodes as sc
            JOIN default d2 ON d2.shopcode = sc
      WHERE d.shopid IS VALUED
) t
GROUP BY t.shopid
;

(The previously suggested indices should be sufficient for this.)
HTH.

vsr1 · April 17, 2023, 1:54pm

If shopid list small use this. Otherwise you may have to use JOIN

WITH shopObj AS (OBJECT v.shopcode:v.shopcodedesciption
                 FOR v IN (SELECT d1.shopcode, d1.shopcodedesciption
                           FROM default AS d1
                           WHERE d1.shopcode IS VALUED)
                 END)
SELECT d.shopid,
       d.shopcodes,
       ARRAY shopObj.[v] FOR v IN d.shopcodes END AS shopcodedescription
FROM default AS d
WHERE d.shopid IS VALUED;

system · July 16, 2023, 1:54pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Joining items in two arrays into one result SQL++	3	2325	January 15, 2019
Multiple arrays of objects to nest/join data with SQL++	4	2752	May 2, 2017
N1ql join on value is in one of multiple arrays SQL++ query , n1ql	3	624	June 13, 2023
Merging all arrays in N1QL via set intersection? SQL++ n1ql	1	1246	December 15, 2016
N1QL Trying to join multiple sub records SQL++ n1ql	3	1937	January 5, 2016

Join condition : array onto multiple recordset

Related topics