In plans.zip,
"condition": "(((1 < (`d`.`a`)) or (((`d`.`a`) = 1) and (1 <= (`d`.`b`)))) and (((`d`.`a`) < 2) or (((`d`.`a`) = 2) and ((`d`.`b`) <= 1))))"
It looks like the query would have been a query on the range [1,1] - [2,1], which is…
... WHERE (a > 1 OR (a=1 and b >=1)) AND (a < 2 OR a=2 and b<=1) order by meta().id
And your question is about the range scan :
...
"range": [
{
"high": "2",
"inclusion": 0, // 0 -> exclusive
"index_key": "`a`",
"low": "1"
}
]
Low should be [1,1]. In case there are ten million documents with a = 1, it’ll be super inefficient?
The part you are missing is that inclusion:0 is exclusive. This range scan is for the middle part, (where ‘a’ is between the top and the bottom exclusive) means that b does not have to be considered. (unfortunately for this example due to our choice the range 1 - 2 for a, it will always be empty. If we had chosen 1- 5 , that range would be low=1, high=5, exclusive - and would include all the 2,* through 4,* documents - which is correct and minimal.) We still need the 1.* documents where b >= 1 and the 2.* documents where b <=1.
Let’s look at the ranges in the other IndexScans
The first range is for a=1 (i.e. 1.*) and the second range is for b >= 1). This will get the documents on the bottom end where a matches our bottom (a=1), which means the predicate then depends only on b.
"index": "adv_a_b",
...
"spans": [
{
"exact": true,
"range": [
{
"high": "1",
"inclusion": 3, // exact match
"index_key": "`a`",
"low": "1"
},
{
"inclusion": 1, // inclusive
"index_key": "`b`",
"low": "1"
}
]
}
]
The first range is for a=2 (i.e. 2.*) and the second range is for b <= 1). This will get the documents on the bottom end where a matches our bottom (a=1), which means the predicate then depends only on b.
"index": "adv_a_b",
...
"spans": [
{
"exact": true,
"range": [
{
"high": "2",
"inclusion": 3, // exact match
"index_key": "`a`",
"low": "2"
},
{
"high": "1",
"inclusion": 2, // inclusive
"index_key": "`b`",
"low": "null"
}
]
}
]
I cannot speak to the efficiency of the last IndexScans. They might suffer from what you mentioned, depending on the data.
Low should be [1,1]. In case there are ten million documents with a = 1, it’ll be super inefficient?
Since you have this data, and have the queries you can test it to see which approach works better.
Also - regarding needing the index to be ascending or descending because the application needs the results in order - only ORDER BY guarantees the order. Having the indexes ordered does not guarantee the results being ordered (and given the query plan in plans.zip is a union of three indexScans, I would not expect them to be in order). Using the array index - I would expect them to be in order, but that is not guaranteed.