N1Ql | Use Array in where clause | Indexing Required

We have two sets of documents (pages and instances). There is one - many relationship between a page and instances. The pageId will be common for a given page and its instances.

There is another attribute “externalID” which is available on both the page and instance documents and it will be used to query the documents. The externalID will be an array and the request will also be an array of IDs.

Sample docs

PAGE
{
“id”: 1
“pageId”: “/page1”,
“externalId”: [ “cricket”,“tennis” ]

}

INSTANCE1

{
id: 12,
pageId: “/page1”,
effectiveTime:“1617778597591”,
expiryTime:“1617778597596”,
“externalId”: [ “cricket”,“tennis” ]
}

INSTANCE2

{
id: 34,
pageId: “/page1”,
effectiveTime:“1617778597561”,
expiryTime:“1617778597569”,
“externalId”: [ “cricket”,“tennis” ]
}

The query works fine but it is taking more than 500ms for execution. We expect the query execution time to come down below 40ms. We have indexes in place but with the introduction of the array, the old indexes are not picked.

Please give some leads to create an array index to bring down the execution time.

Working Query

WITH results AS (
SELECT ARRAY_CONCAT(ARRAY v1 FOR v1 IN pages WHEN v1.groupExternalId WITHIN instances END, ARRAY v FOR v IN instances WHEN v.groupExternalId WITHIN pages END) AS items
LET instances = (
SELECT MAX([ instance.effectiveTime, { instance , ‘_ID’: META(instance).id, ‘_CAS’:META(instance).cas, ‘groupExternalId’: externalId }])[1].*
FROM bucket_name AS instance
WHERE instance.stack = FALSE
AND instance.type = ‘inst’
AND ANY val IN instance.externalId SATISFIES val IN [“tennis”] END
AND instance.cId = “aqwert1234569”
AND ( (instance.effectiveTime <= 1623661183709
AND 1623661183709 <= instance.expiryTime)
OR (instance.effectiveTime <= 1623661183709
AND instance.expiryTime IS NULL )
OR (instance.past = TRUE
AND instance.expiryTime < 1623661183709) )
GROUP BY instance.externalId),
pages = (
SELECT page,
externalId AS groupExternalId
FROM bucket_name AS page
WHERE page.type = ‘page’
AND page.cId = “aqwert1234569”
AND ANY val IN page.externalId SATISFIES val IN [“tennis”] END))
SELECT ARRAY_AGG(item.instance)[0] AS instance,
ARRAY_AGG(item.page)[0] AS page,
ARRAY_AGG(item._ID)[0] AS _ID,
ARRAY_AGG(item._CAS)[0] AS _CAS
FROM results AS r1
UNNEST r1.items AS item
GROUP BY item.groupExternalId

You should take individual subqueries and use https://index-advisor.couchbase.com/indexadvisor/#1
Not sure what you are doing externalId is ARRAY searching one element present and doing GROUP BY on whole array.
In Array position also important [ “cricket”,“tennis” ] is not same as [ “tennis” , “cricket”]

SELECT MAX([ instance.effectiveTime, { instance , '_ID': META(instance).id, '_CAS':META(instance).cas, page}])[1].*
FROM bucket_name AS instance
JOIN bucket_name AS page ON (instance.pageId = page.pageId
                             AND page.type = "page"
                             AND page.cId = "aqwert1234569"
                             AND ANY val IN page.externalId SATISFIES val IN ["tennis"] END)
WHERE instance.stack = FALSE
      AND instance.type = 'inst'
      AND ANY val IN instance.externalId SATISFIES val IN ["tennis"] END
      AND instance.cId = "aqwert1234569"
       AND IFNULL(instance.expiryTime, 0) <= 1623661183709
      AND (instance.effectiveTime <= 1623661183709 OR instance.past = TRUE)
GROUP BY instance.externalId)

CREATE INDEX ix1 ON bucket_name(cId, DISTINCT externalId, IFNULL(expiryTime,0), effectiveTime, past) WHERE type = "inst";
CREATE INDEX ix2 ON bucket_name(cId, DISTINCT externalId) WHERE type = "page";