ORDER BY is very slow even when index is hit and all fields are covered

Hi!

I’m using Couchbase 4.5 Community Edition.

My bucket has about 1m items and I need to get documents with range filter and sort.

I created index:

CREATE INDEX my_index ON my-bucket(param1, param2, param3);

Query is:

SELECT meta(my-bucket).id FROM my-bucketWHERE param1="certain_val" AND param2 > 9500 ORDER BY param3 ASC LIMIT 20;

Eplain query gives me:

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "IndexScan",
              "covers": [
                "cover ((`my-bucket`.`param1`))",
                "cover ((`my-bucket`.`param2`))",
                "cover ((`my-bucket`.`param3`))",
              ],
              "filter_covers": {
                "cover ((`my-bucket`.`param1`))": "certain_val"
              },
              "index": "my_index",
              "index_id": "f034eb9852aefe99",
              "keyspace": "my-bucket",
              "namespace": "default",
              "spans": [
                {
                  "Range": {
                    "High": [
                      "successor(\"certain_val\")"
                    ],
                    "Inclusion": 1,
                    "Low": [
                      "\"certain_val\""
                    ]
                  }
                }
              ],
              "using": "gsi"
            },
            {
              "#operator": "Parallel",
              "~child": {
                "#operator": "Sequence",
                "~children": [
                  {
                    "#operator": "Filter",
                    "condition": "(((cover ((`my-bucket`.`param1`)) = \"certain_val\") and (9500 < cover ((`my-bucket`.`param2`))))"
                  },
                  {
                    "#operator": "InitialProject",
                    "result_terms": [
                      {
                        "expr": "cover ((meta(`my-bucket`).`id`))"
                      }
                    ]
                  }
                ]
              }
            }
          ]
        },
        {
          "#operator": "Order",
          "limit": "20",
          "sort_terms": [
            {
              "expr": "cover ((`my-bucket`.`param3`))"
            }
          ]
        },
        {
          "#operator": "Limit",
          "expr": "20"
        },
        {
          "#operator": "FinalProject"
        }
      ]
    },
    "text": "SELECT meta(`my-bucket`).id FROM `my-bucket`WHERE param1=\"certain_val\" AND param2 > 9500 ORDER BY param3 ASC LIMIT 20;"
  }
]

Still, it takes 37 seconds to perform the query even though index is hit and fields are covered.

I noticed that the speed of the query seems largely dependent on the number of documents left after the filtering. However that’s beyond our control… now I have no idea how to improve query performance. Any insights on potential improvements?

The same query seems to be much faster in Solr (< 0.5s). So the last resort is that we’ll need to use a connector to sync the data with a solr cluster. But it would be much trouble and we’d like to avoid it if possible.

any help is much appreciated.

Thanks in advance.

QUERY ORDER BY is not following index order, Therefore it needs to produces all qualified documents and sort .

Try following and see if it improves anything

CREATE INDEX my_index ON my-bucket(param1, param3, param2);
SELECT meta(my-bucket).id FROM my-bucketWHERE param1=“certain_val” AND param2 > 9500  AND param3 IS NOT NULL ORDER BY param3 ASC LIMIT 20;