Index fetching when value is covered

I am covering an object of which includes a list of tokens in a document.

Example:

{
“eC”: “N”,
“token”: [
“582868”
],
}

the majority of our use cases will only have one token, but we may have a few docs that consist of multiple(2).

in our index we are covering the field as such:
CREATE INDEX IDX_Dt ON BN(((distinct (token)))PARTITION BY hash(token) WITH { “defer_build”:true, “num_replica”:1, “num_partition”:32 }

in our query when I am trying to query over this data, for example grouping on token, the server is performing a fetch. the information does not seem to be in the index node. Is their a way to better define this index, or does the structure need to change for a better performing query? we are iterating over the data to get a result.

this is what the query looks like:

SELECT ARRAY tkn FOR tkn IN t.token WHEN tkn IN [ “cdafac6a” ] END AS acctToken, t COUNT(1) AS cnts
FROM BN AS t
WHERE ANY tkn in t.token SATISFIES tkn IN [“cdafac6a”] END
GROUP BY t.token;

This is not covered . Index has flattened array each as separate item.

As you mentioned you have only 1 or 2 in array
Use the following index, which makes it covered. Array stored as last key in the index as scalar make it cover.

CREATE INDEX  `IDX_Dt`  ON  `BN` (((distinct ( `token` )), `token`)
PARTITION BY hash( `token` ) WITH { “defer_build”:true, “num_replica”:1, “num_partition”:32 }

If you want count the each value how many times present across the documents you can do the following.

CREATE INDEX  ix1  ON  `BN` ( ALL token);
SELECT tk AS token, COUNT(1) AS cnts
FROM BN AS t
UNNEST t.token AS tk
WHERE tk IN ["cdafac6a", "xyz"]
GROUP BY tk;

Thanks for your response. That does seem to be much better in performance. But by what you wrote

"As you mentioned you have only 1 or 2 in array
Use the following index, which makes it covered. Array stored as last key in the index as scalar make it cover. "

CREATE INDEX  `IDX_Dt`  ON  `BN` (((distinct ( `token` )), `token`)
PARTITION BY hash( `token` ) WITH { “defer_build”:true, “num_replica”:1, “num_partition”:32 }

It seems like even if I have more than 2 elements because i am using the array as the last key even if I were to have a 3rd or 4th element or more in the array it would perform in the same manner and the index would be the same? Am I understanding correctly?

Number of elements doesn’t matter. The reason i mentioned Assume you have 10000 elements in array, storing such big array as part of index can make index ballon. You need to measure performance covers vs non-cover due to index record size

1 Like

Yup that is what we are in the process of doing. Thanks for the clarification and always timely responses! :+1::+1: