The following n1ql command runs successfully on the travel sample bucket to create the partial index:
CREATE INDEX travel_info ON ``travel-sample`` (name, id, icao, iata, type) WHERE type = 'airline'
But when I run the following query under EXPLAIN, the “travel_info” partial index I created above is ignored and the “def_type” index is used instead.
SELECT * FROM ``travel-sample`` USE INDEX (travel_info) WHERE icao="MLA" AND type="airline"
The documentation where I obtained the above CREATE INDEX example states:
This creates an index only on documents that have type='airline' . The queries must include the filter type='airline' in addition to other filters for this index to qualify.
According to the following linked thread, it appears that if all documents within a bucket do not have the attribute set that the partial index is filtered on, in this case the “type” attribute, then the partial index will be ignored by queries.
I assumed that may be the case with the travel sample bucket, so I created my own bucket with two test documents and then created the partial index and then ran the query, which did then use the partial index that I had created.
The documentation that I cited in the previous post may want to not use the travel sample bucket for the partial index example because it appears to not work on that sample bucket.
Query predicate must include (or subset of ) Index where condition.
AND
query must have predicate on leading index key (The reason indexer will not index the document when leading index key is MISSING so that index will not be big). Without that leading index key as predicate query must qualify the row . Due to that leading index key must be present in the query predicate.
If it is okay based on your data you can try the following.
SELECT * FROM `travel-sample`
USE INDEX (travel_info) WHERE icao="MLA" AND type="airline" AND name IS NOT MISSING AND id IS NOT MISSING.
Thanks, that worked! I understand that the query predicate must contain the predicate of the target partial-index, but I am guessing that “AND name IS NOT MISSING AND id IS NOT MISSING” is appended to ensure that the query is indeed a subset of the index?
Is there something special about the attributes “id” and “name” or as just a rule of thumb when a partial-index is not being selected to just use any suitable attribute with “IS NOT MISSING” until the partial-index is picked up by the query?
This may be a gray area that may not be known without looking at the source code, but I will take a closer look at the reference pdf you provided. Thanks again, I really appreciate the help!
Query will not add because the results will change (if there is missing entry query will skip, that may not intended behavior). To Pick the index you need only leading index key (name IS NOT MISSING).
The reason added id IS NOT MISSING is you have predicate icao=“MLA” (this is 3rd key in the index). Query sends continuous leading predicate to indexer as Spans in the EXPLAIN. By adding id predicate “MLA” pushed to indexer and indexer can filter out early otherwise query needs filter them later.
Understand IndexScan in the book covers how predicate to transformed to indexscan Spans.
Monitoring section gives timings/row in/out counts each operator.
Example: "doc1" {"a":1, "b": 10}
"doc2" {"b": 5}
"doc3" {"a":3, "b": 100}
CREATE INDEX ix1 ON default(a); This index has entries for "doc1", "doc3". "doc2" is not in the index a IS MISSING
CREATE INDEX ix2 ON default(a) WHERE b < 20; This index has entries for "doc1", "doc2" is not in the index a IS MISSING. "doc3" is not in the index because b is not less than 20.
SELECT a,b FROM default; must give all 3 rows, both indexes doesn't qualify because it did not have all info.
SELECT a,b FROM default WHERE b <= 10; it should give "doc1", "doc2". both indexes doesn't qualify because it did not have all info.
SELECT a,b FROM default WHERE b <= 10 AND a IS NOT NULL; should only give "doc1". It can use both indexes.
SELECT a,b FROM default WHERE a >= 0; it should give "doc1", "doc3". ix1 has only have all info and qualifies.
Thanks, this clarifies it for me: (1) the column order of partial indexes is important and effects the subsequent SELECT queries predicate structures, and (2) the more columns specified for regular/partial indexes means additional requirements for documents to meet in order to be included in those indexes.
Knowing these basic behaviors of how indexes work makes learning couchbase much easier going forward!