I’m running into an issue where a query uses the expected index in my local Couchbase cluster, but on the production cluster (same version and configs) the optimizer doesn’t pick that index.
Here’s a simplified version of the query:
WITH matchingSegmentIds AS (
SELECT RAW META(cs).id
FROM mybucket.myscope.collection-segment cs
WHERE cs.field1 = “XXX”
AND cs.field2 = “YYY”
)
SELECT x.collection_id
FROM mybucket.myscope.collection x
WHERE ANY segment IN x.collection-segments
SATISFIES segment.collection-segment-id IN matchingSegmentIds
END;
Index created on collection-segment:
CREATE INDEX idx_collection_segment_route
ON mybucket.myscope.collection-segment
(field1, field2);
In local, the query plan shows that the index is being used correctly.
But in production, the same query doesn’t use this index and falls back to a less efficient plan.
Index definition is identical.
Buckets/scopes/collections have the same names and structure.
Has anyone experienced this difference in index selection between local and production?
What are some possible reasons the optimizer might skip the index in one environment but not the other?
thanks for the reply, BTW would differences in couchbase version between the two environments also have no effect, or could that cause behavior changes?
production cb version - “7.1.0-N1QL”,
local cb version - “7.6.7-N1QL”
I’m not sure but would it be the reason for the difference between the query planning and index selection
NOTE: If I run the query separately, the index is used. The index is not used only when the query is inside a subquery.
SELECT RAW META(cs).id
FROM mybucket.myscope.collection-segment cs
WHERE cs.field1 = “XXX”
AND cs.field2 = “YYY”
;
Use Index Advisor in the Query Workbench to see what it advises.
btw - it’s not possible to compare “a simplified version of the query” nor “if I run the query separately”. They are completely different queries and will have different query plans. You the exact query.
I believe there was a change in 7.6 where the cost-based optimiser would start automatically collecting statistics. So perhaps you are getting CBO on your local 7.6, while statistic-less prod is falling back to the rule-based optimiser and making a less optimal index choice?
Thank you, that makes perfect sense. Just to help my understanding, could you elaborate on why the rule-based optimizer was likely choosing a worse index without these statistics? Was it missing the cardinality of the fields?