N1QL query optimization considering ORDER BY

I am trying to somehow optimize following query

SELECT *
FROM `bucket`
WHERE external = true AND -createdAt >= 1 AND -createdAt < 2287821078001 AND account IS NOT NULL AND eventType IS NOT NULL
ORDER BY external, -createdAt, account, eventType
LIMIT 100 OFFSET 50000

So I’ve created following index:

CREATE INDEX `externa-index` ON `bucket` (external, -createdAt, account, eventType) WHERE external = true;

In the bucket is in total 2M documents.

  • external = true (boolean) (does not have it every document ~1M)
  • createdAt is a timestamp (have it every document)
  • account is a String (does not have it every document)
  • eventType is a String (have it every document)

Documents is always needed order by createdAt DESC (thats the reason why there is an index over -createdAt)

The issue is that the query obviously uses specified index. But it returns incorrect result set and it is because of:

-createdAt >= 1 AND -createdAt < 2287821078001 condition. How does it works when there is - suffix? Why it is not working as I expect? I’ve not found any doc related to this.

Thanks for the feedback.

Explain

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "IndexScan",
              "index": "ext-index",
              "index_id": "3aca320aa7e5df0d",
              "keyspace": "bucket",
              "namespace": "default",
              "spans": [
                {
                  "Range": {
                    "High": [
                      "true",
                      "null"
                    ],
                    "Inclusion": 1,
                    "Low": [
                      "true"
                    ]
                  }
                }
              ],
              "using": "gsi"
            },
            {
              "#operator": "Parallel",
              "maxParallelism": 1,
              "~child": {
                "#operator": "Sequence",
                "~children": [
                  {
                    "#operator": "Fetch",
                    "keyspace": "bucket",
                    "namespace": "default"
                  },
                  {
                    "#operator": "Filter",
                    "condition": "((((((`bucket`.`external`) = true) and (2287821078001 <= (-(`bucket`.`createdAt`)))) and ((-(`bucket`.`createdAt`)) < 1)) and ((`bucket`.`account`) is not null)) and ((`bucket`.`eventType`) is not null))"
                  },
                  {
                    "#operator": "InitialProject",
                    "result_terms": [
                      {
                        "expr": "self",
                        "star": true
                      }
                    ]
                  }
                ]
              }
            }
          ]
        },
        {
          "#operator": "Offset",
          "expr": "50000"
        },
        {
          "#operator": "Limit",
          "expr": "100"
        },
        {
          "#operator": "FinalProject"
        }
      ]
    },
    "text": "SELECT *\nFROM `bucket`\nWHERE external = true AND -createdAt >= 2287821078001 AND -createdAt < 1 AND account IS NOT NULL AND eventType IS NOT NULL\nORDER BY external, -createdAt, account, eventType\nLIMIT 100 OFFSET 50000"
  }
]

Hi @petojurkovic,

What version of Couchbase are you using?

Two things:

  1. The index order exploitation has improved with 5.0. The query exploits the index defined as is in 5.0 (Try out the March DP).
  2. You can now (5.0 DP) create index with ASC and DESC attribute on each key. ORDER BY will exploit it when there is a match.

@petojurkovic,

First of all, thank you for your feedback!

You can download developer builds of Couchbase Server from the downloads page (www.couchbase.com/downloads), and then click of “developer” tab. The latest Monthly build posted is this months developer build.

Thank you and regards,
Don Pinto, Couchbase Server PM

CC: @prasad, @keshav_m

Hi,

I am using Couchbase 4.5 Community - the latest stable version. Unfortunately I can not use even 4.6 DP, we can’t push Development preview to the production. I have to optimize it for Couchbase 4.5

@don

@petojurkovic, It looks like you ext-index index and query using that index. You can drop that index, if not able to drop specify the USE INDEX clause.

The following is right index and query,
CREATE INDEX externa-index ON bucket (-createdAt, account, eventType) WHERE external = true;

SELECT * FROM `bucket` 
 WHERE external = true AND -createdAt >= 1 
      AND -createdAt < 2287821078001 
      AND  account IS NOT NULL AND eventType IS NOT NULL 
       ORDER BY -createdAt, account, eventType
         LIMIT 100 OFFSET 50000;

Above query has huge offset rewriting as follows will run faster.

SELECT sd.* FROM (SELECT RAW META().id FROM `bucket` WHERE external = true AND -createdAt >= 1 AND -createdAt < 2287821078001 AND
               account IS NOT NULL AND eventType IS NOT NULL ORDER BY -createdAt, account, eventType
               LIMIT 100 OFFSET 50000) q1 JOIN `bucket` sd ON KEYS q1 ORDER BY sd.external, -sd.createdAt, sd.account, sd.eventType ;

Also note that when you negate the attribute index you need to change predicate.
i.e x > 1 AND x <= 10 IS SAME AS -x >= -10 AND -x < -1

1 Like

@vsr1 Thanks for your advices. The offset rewriting is actually quite issue. The second query which you provided is actually quite interesting, but I can’t use it. I have to work on whole, ordered result set. It would give me just ordered result of the first select. Is there any way how to optimize this kind of queries?

@petojurkovic, The subquery uses covered index order and produces LIMIT meta().id after skipping OFFSET, Outside query fetches LIMIT documents and reapply order. I don’t think there is issue. Can u check your predicates Is this need to be -createdAt <= -1 AND -createdAt > -2287821078001.