Indexing an array field using IN vs ANY vs CONTAINS

andrea · December 12, 2025, 10:26am

Hello,

I’m working on optimizing some queries with new indexes and ran into a question regarding array searches.
In many parts of our system, when we need to check whether a value exists inside an array field (for example, searching for a string within an array), we typically use the simple syntax:

$value IN `fieldname`

This works correctly, but when I started creating indexes that include array fields using the syntax:

DISTINCT `fieldname`

I noticed that the Index Advisor does not show these indexes being used for queries that rely on the IN operator. This made me wonder: Is the IN syntax efficient when the element being searched is dynamic, or should we prefer using functions like CONTAINS() or the more explicit:

ANY v IN fieldName SATISFIES v = $value END

My current understanding is that the ANY … SATISFIES form is more index-friendly because it maps directly to array index entries and it should be recognized by the index, whereas IN may not leverage these indexes when the search value is dynamic.

For reference, here’s an example query I’m trying to optimize. Currently, it falls back to using a generic index:

SELECT DISTINCT N1, N2, ARRAY_AGG(N3)
        FROM bucket
        WHERE Field1 = 'fixed value'
        AND Field2 = $value2
        AND ($value3 IN ArrayField3
                OR $value4 IN ArrayField4
                OR $value5 IN $ArrayField5)
        AND Field6 = false
        GROUP BY N1, N2

vsr1 · December 15, 2025, 7:02pm

If index key is filedName is scalar most likely you must search value using equal (OR multiple values)or range predicate

filedName IN [1,2,3] same as (fieldName = 1 OR fieldName = 2 OR filedName = 3)

The index will chosen

If filedName is ARRAY or ARRAY of objects indexing as scalar will not choose index. As Index has whole value and you are looking part of that. In that case ARRAY indexing comes

ARRAY indexing will match with ANY clause in query or UNEST Array Indexing | Couchbase Docs

Index DISTINCT `fieldname`. This allows index create separate entry for each value of ARRAY and able to scan

Query:

ANY v IN fieldName SATISFIES v = $value END

ANY v IN fieldName SATISFIES v IN [$value1, $value2] END

CREATE INDEX ix3 ON bucket(Field1, Filed2, Filed6, DISTINCT ARRAY v3 FOR v3 IN ArrayField3 END);
CREATE INDEX ix4 ON bucket(Field1, Filed2, Filed6, DISTINCT ARRAY v4 FOR v4 IN ArrayField4 END);
CREATE INDEX ix5 ON bucket(Field1, Filed2, Filed6, DISTINCT ARRAY v5 FOR v5 IN ArrayField5 END);

SELECT N1, N2, ARRAY_AGG(N3)
        FROM bucket
        WHERE Field1 = 'fixed value'
        AND Field2 = $value2
        AND (ANY v3 IN  ArrayField3 SATISFIES v3 = $value3
                OR ANY v4 IN  ArrayField4 SATISFIES v4 = $value4
                OR ANY v5 IN  ArrayField5 SATISFIES v5 = $value5)
        AND Field6 = false
        GROUP BY N1, N2

NOTE: Index can have 1 Array index Field, otherwise due to each array filed as separate can result in cartesian product and index can be huge.

andrea · December 16, 2025, 9:35am

Hello @vsr1 , and thank you for your response.

As I suspected, in order to force the use of an array index, I need to use the ANY … SATISFIES syntax. If I use the CONTAINS function maybe it uses the array index too, but I need to test it because I don’t remember.

In our system, there are many fields that are arrays of strings whose values can change over time. Whenever we need to search for an element within one of these arrays, we currently use the syntax:
WHERE value IN ArrayFieldName

If I understand correctly, the IN clause is internally translated into a series of OR conditions. Is that correct?

If we continue using the IN condition in these queries, is there a way to make them faster by creating some kind of index? Or is the only viable solution to update all queries to use the ANY … SATISFIES syntax and then create array indexes on those fields?

vsr1 · December 16, 2025, 12:15pm

CONTAINS() will not use array index

IN clause leftside value and right side field will not use array index. Only option is ANY clause to make it faster

andrea · December 16, 2025, 1:36pm

Hello @vsr1,

1)Sorry, what I mean was the ARRAY_CONTAINS function of the arrays, not the CONTAINS string function. I don’t know if using the ARRAY_CONTAINS is the same as IN or ANY SATISFIES.

2)So I can assume that the IN condition doesn’t use any type of index, not only the array index. The array condition (IN) is always checked after the fetch of the document and it is quiet fast for arrays of max 100-200 elements.

3)If I use something like WHERE ‘value’ IN [Field1, Field2, Field3] I can assume that it is the same as

(value = Field1 OR value = Field2 OR value = Field3) since the Field1/2/3 are string fields.

Thanks for the support. I swear that these are the last questions xD

vsr1 · December 16, 2025, 2:13pm

ARRAY_CONATINS() functionally same as IN or ANY but one can’t index because ARRAY_CONTAINS() function has constant you looking, if index with constant there will be no use of index because it can work only one value. Any such thing post index scan/fetch applied.
IN can be used without any index, constant on left side it will be applied post index scan/fetch. It can use index if left side in scalar field and right side constant, If field is array it must create array index and use ANY clause in query
yes (any type of value). IN is preferred (if left side is field and right side constants) vs OR IN List Handling Improvements in Couchbase Server 6.5 - The Couchbase Blog

vsr1 · December 16, 2025, 2:31pm

If you are looking one value in multiple fields with OR

one can use few techniques

CREATE INDEX ix1 ON default ( DISTINCT ARRAY v FOR v IN [Field1, Field2, Field3] END);

SELECT * 
FROM default d
WHERE ANY v IN [Field1, Field2, Field3] SATISFIES v IN [val1, val2] END;

If Field1,Filed2,FIeld3 you can use

CREATE INDEX ix1 ON default ( DISTINCT ARRAY v FOR v IN ARRAY_CONCAT(Field1, Field2, Field3) END);

SELECT * 
FROM default d
WHERE ANY v IN ARRAY_CONCAT(Field1, Field2, Field3) SATISFIES v IN [val1, val2] END;

andrea · December 16, 2025, 5:15pm

Hello @vsr1 , sorry to bother you again, but I have a strange behaviour in couchbase and I don’t know how to solve it and why it gives me this problem. I have this query

SELECT *
FROM bucket
WHERE Type = 'test'
AND Project = 'test'
AND ( any v in ArrayField1 satisfies v = 'abb' end
OR any v in ArrayField2 satisfies v = 'abb' end)
and IsDeleted = false

I tried to create the indexes for various queries without problem, but with this one I get a weird error. I tried to use both my indexes or the indexes suggested by the index advisor, but the query keeps using the old index. Also if I try to use the use index() syntax I can’t force to use these new indexes. For reference these are my new indexes

CREATE INDEX adv_1 ON `bucket`(DISTINCT ARRAY `v` FOR `v` IN `ArrayField1` END,`Project`) WHERE `Type` = 'test' AND `IsDeleted` = false

CREATE INDEX adv_2 ON `bucket`(DISTINCT ARRAY `v` FOR `v` IN `ArrayField2` END,`Project`) WHERE `Type` = 'test' AND `IsDeleted` = false

But it still use this index

CREATE INDEX adv_3 ON `bucket`(`Type`,`Project`,`IsDeleted`,`ID`)

The fun side is that if I only use one of the condition in OR it uses the correct index. And I tried with other similar queries and I don’t have the same problem, but uses the correct indexes.

For example this query use the correct index adv_1

SELECT *
FROM bucket
WHERE Type = 'test'
AND Project = 'test'
AND any v in ArrayField1 satisfies v = 'abb' end
and IsDeleted = false

Thanks for the support

vsr1 · December 16, 2025, 5:36pm

Move Project leading (as it is common) in adv_1, adv_2

Also try

FROM bucket USE INDEX(adv_1, adv_2)

If looking both arrays v = “abb”

CREATE INDEX ix1 ON `bucket`( Project, DISTINCT ARRAY `v` FOR `v` IN  ARRAY_CONCAT(IFMISSINGORNULL(`ArrayField1`,[]), IFMISSINGORNULL(`ArrayField2`,[]))  END,`Project`) WHERE `Type` = 'test' AND `IsDeleted` = false;

SELECT *
FROM bucket
WHERE Type = 'test'
AND Project = 'test'
AND any v in ARRAY_CONCAT(IFMISSINGORNULL(`ArrayField1`,[]), IFMISSINGORNULL(`ArrayField2`,[])) satisfies v = 'abb' end
and IsDeleted = false

system · March 16, 2026, 5:36pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Which one is bettwer, WHERE ANY IN or just IN? Couchbase Server n1ql	1	1716	June 3, 2016
Bad Performance for IN in array indexes - multiple ANY = is fast SQL++	2	755	March 8, 2022
How to create index for "IN" statement SQL++	2	716	April 12, 2021
How to properly index array field? Couchbase Server n1ql , index	3	2167	June 21, 2017
How to correctly create array index SQL++ index	1	4918	July 1, 2018

Indexing an array field using IN vs ANY vs CONTAINS

Related topics