Bad Performance for IN in array indexes - multiple ANY = is fast

We are having some very bad performance with EVERY lookups in arrays. Multiple ANY clauses seem to work.

Here’s an index:

CREATE INDEX `array_index_idx` 
ON `default`(DISTINCT ARRAY t.tags FOR t IN tags END)
WHERE (meta().`id` LIKE "array%" and (`tags` is valued) and (0 < array_length(`tags`))

This query is very slow:

SELECT meta().id FROM default USE INDEX (`array_index_idx`) 
WHERE (meta().`id` LIKE "array%" and (`tags` is valued) and (0 < array_length(`tags`)) 
AND EVERY tag IN t.tags SATISFIES tag IN ["tagOne", "tagTwo"]  END

Yet this is very fast:

SELECT meta().id FROM default USE INDEX (`array_index_idx`) 
WHERE (meta().`id` LIKE "array%" and (`tags` is valued) and (0 < array_length(`tags`)) 
AND ANY tag IN t.tags SATISFIES tag = "tagOne" END
AND ANY tag IN t.tags SATISFIES tag = "tagTwo" END

These are functionally the same, but IN seems to perform very badly. In fact, providing a list of one also performs poorly:

SELECT meta().id FROM default USE INDEX (`array_index_idx`) 
WHERE (meta().`id` LIKE "array%" and (`tags` is valued) and (0 < array_length(`tags`)) 
AND ANY tag IN t.tags SATISFIES tag IN ["tagOne"] END

Finally, it also performs poorly if we invert the tags:

SELECT meta().id FROM default USE INDEX (`array_index_idx`) 
WHERE (meta().`id` LIKE "array%" and (`tags` is valued) and (0 < array_length(`tags`)) 
AND ANY/EVERY tag IN ["tagOne"] SATISFIES tag = t.tags END

@unhuman ,

  1. Checkout t in index/query correctness? Is this typo of alias?

  2. EVERY will not use ARRAY index. It might be using different index that is why it is slow.

  3. You have ARRAY_LENGTH(tags) > 0, You can use ANY AND EVERY. Which uses array index with fetch.

  4. Are you required tags must contain only two elements of “tagOne”, “tagTwo” then only EVERY will be true. See following explanation.
    EVERY tag IN t.tags SATISFIES tag IN [“tagOne”, “tagTwo”] END

  5. If you required tags array contain both elements then following is right condition.
    AND ANY tag IN t.tags SATISFIES tag = “tagOne” END AND ANY tag IN t.tags SATISFIES tag = “tagTwo” END

    The following should use array index also cover.

CREATE INDEX `array_index_idx` ON `default`(DISTINCT tags) WHERE META().id LIKE "array%";

SELECT RAW META().id FROM default WHERE META().id LIKE "array%" AND ANY tag IN tags SATISFIES tag = "tagOne" END
INTERSECT
SELECT RAW META().id FROM default WHERE META().id LIKE "array%" AND ANY tag IN tags SATISFIES tag = "tagTwo" END;

Range predicates (ANY or SOME, EVERY, and ANY AND EVERY or SOME AND EVERY) allow you to test a boolean condition over the elements or attributes of a collection or object(s). They each evaluate to a boolean value.

ANY or SOME is TRUE if the collection is non-empty and at least one element matches.

EVERY is TRUE if the collection is empty, or if the collection is non-empty and every element matches.

ANY AND EVERY or SOME AND EVERY is TRUE if the collection is non-empty and every element matches.

Thanks - we’ll look into this.

I probably made some mistakes while removing any proprietary info…