N1ql does not use index for "order by"

Hello,

I am aware that “order by desc” is not that optimized at this point, but it looks like “order by asc” is not working well for me either.

I have the following document format.

{
    "client_id": "value",
    "form" : "value",
    "username": "value",
    "deleted_at": null,
    "published": true,
    "meta": {
          "updated_at": "value"
     }
}

query.

select * from bucket where client_id='value' and form='value' and username='value' and deleted_at is null and published = true order by meta.updated_at asc limit 30 offset 0

Index

CREATE INDEX `idx_by_username` ON `bucket`(client_id, form, username, deleted_at, published, STR_TO_MILLIS(meta.updatedAt))

explains shows that it uses client_id, form, username, deleted_at, and published_at correctly. I can confirm it by running the query without “order by” and it is under 30ms. When I add “order by asc”, it takes 300ms.

I’ve also tried to create a separate index just for “meta.updated_at” field, but it did not make any changes.

Am I doing something wrong?

Thanks!
Moon

Use Couchbase 4.5.1. Add meta.updated_at IS NOT NULL to your WHERE clause. Post the new query and timings here.

1 Like

I will try that out today and report back!

Thank you :slight_smile:

1 Like

Hello @geraldss

I just tested it out and EXPLAIN shows that my new query with meta.updated_at is not null uses the correct index as expected. However, it is slower now. My guess is that it has to check meta.updated_at null now.

I’ve also noticed that index does not get used if I convert meta.updated_at to str_to_millis. I was trying to use a workaround for “order by desc” that you added in other thread. My index def. is

create index `idx_meta_updated_at` on bucket(-str_to_millis(meta.updated_at))

My goal is to improve “order by”. When I omit “order by desc|asc”, I get 27ms. When I include “order by”…I get 270ms+. That’s 10 times slower and the response time grows as I work with more data.

Please post the full query for the index with -str_to_millis.

Please also post the EXPLAIN for this.

Query can only uses Index order when ORDER BY expressions matched exactly with leading index keys from left to right.

Most of your query predicates are equal you can try this.
For Ascending:

CREATE INDEX idx_by_username ON bucket(client_id, form, username, deleted_at, published, meta.updatedAt);
select * from bucket where client_id=‘value’ and form=‘value’ and username=‘value’ and deleted_at is null and published = true order by client_id, from, username, delted_at, published, meta.updated_at asc limit 30 offset 0;

For descending

CREATE INDEX idx_by_username ON bucket(client_id, form, username, deleted_at, published, -STR_TO_MILLIS(meta.updatedAt))

select * from bucket where client_id=‘value’ and form=‘value’ and username=‘value’ and deleted_at is null and published = true order by client_id, from, username, delted_at, published, -STR_TO_MILLIS(meta.updatedAt) limit 30 offset 0;

1 Like

Thank you both @geraldss and @vsr1

I will post the full query and try suggestions from vsr1 as soon as possible.

Hi @geraldss

It works now :slight_smile: Explains shows meta.updatedAt index being used.

This is index

CREATE INDEX `idx_by_username` ON `bucket`(client_id, form, username, deleted_at, published, -STR_TO_MILLIS(meta.updatedAt))

and this is query that DOES NOT work

select * from bucket where client_id='clientId' and form='form' and username='username' and deleted_at ='deleted_at' and published=true and meta.updatedAt is not null order by meta.updatedAt desc limit 30

This works

select * from bucket where client_id='clientId' and form='form' and username='username' and deleted_at ='deleted_at' and published=true and STR_TO_MILLIS(meta.updatedAt) is not null order by meta.updatedAt limit 30

Without using index for meta.updatedAt, query response time is between 300ms and 400ms.
With meta.updatedAt index, query response time is between 150ms and 200ms.
Without “order by”, query response time is 20ms and 30ms.

It looks like order by increases response time by almost 10x whether it is asc or desc. Am I correct on this?

Try this.

CREATE INDEX `idx_by_username` ON `bucket`( -STR_TO_MILLIS(meta.updatedAt), client_id, form, username, deleted_at, published );

select *
from bucket
where client_id='clientId' and form='form' and username='username' and deleted_at ='deleted_at' and published=true and -STR_TO_MILLIS(meta.updatedAt) is not null
order by -STR_TO_MILLIS( meta.updatedAt )
limit 30;
2 Likes

!!! that did it!!!

it’s now between 20ms and 35ms. That’s more than 15 times faster than what we currently have in production.

Thank you so much!

1 Like

Hi @geraldss

Just created this new index on production and tested it out. Explain shows that everything looks okay, but then the query takes 27+ seconds.

This is index def:

CREATE INDEX `idx_artist_creatives` ON `bucket`((-STR_TO_MILLIS((`meta`.`updatedAt`))),`clientId`,`form`,`username`,(`values`.`deleted_at`),(`values`.`published`)) WHERE (`form` = "creative")

This is EXPLAIN output:

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "IndexScan",
              "index": "index",
              "index_id": "b59e718566a05c53",
              "keyspace": "bucket",
              "namespace": "default",
              "spans": [
                {
                  "Range": {
                    "Inclusion": 0,
                    "Low": [
                      "null",
                      "\"clientId\"",
                      "\"creative\"",
                      "\"username\"",
                      "null",
                      "true"
                    ]
                  }
                }
              ],
              "using": "gsi"
            },
            {
              "#operator": "Fetch",
              "keyspace": "bucket",
              "namespace": "default"
            },
            {
              "#operator": "Parallel",
              "maxParallelism": 1,
              "~child": {
                "#operator": "Sequence",
                "~children": [
                  {
                    "#operator": "Filter",
                    "condition": "(((((((`bucket`.`clientId`) = \"clientId\") and ((`bucket`.`form`) = \"creative\")) and ((`bucket`.`username`) = \"username\")) and (((`bucket`.`values`).`deleted_at`) is null)) and (((`bucket`.`values`).`published`) = true)) and ((-str_to_millis(((`bucket`.`meta`).`updatedAt`))) is not null))"
                  },
                  {
                    "#operator": "InitialProject",
                    "result_terms": [
                      {
                        "expr": "self",
                        "star": true
                      }
                    ]
                  }
                ]
              }
            }
          ]
        },
        {
          "#operator": "Offset",
          "expr": "0"
        },
        {
          "#operator": "Limit",
          "expr": "30"
        },
        {
          "#operator": "FinalProject"
        }
      ]
    },
    "text": "iety6'\n                and form = 'creative'\n                and username = 'username'\n                and `values`.deleted_at is null\n                and `values`.published = true\n                and -STR_TO_MILLIS(meta.updatedAt) is not null\n                  order by -STR_TO_MILLIS(meta.updatedAt)  limit 30 offset 0"
  }
]

The query works, it just takes 27+ seconds unlike my development machine.

Do you have any clues?

Please post the query.

This is query

    select * from bucket
        where
        clientId = 'clientId'
        and form = 'creative'
        and username = 'username'
        and `values`.deleted_at is null
        and `values`.published = true
        and -STR_TO_MILLIS(meta.updatedAt) is not null
          order by -STR_TO_MILLIS(meta.updatedAt)  limit 30 offset 0

This query is using a different index. Can you add

USE INDEX ( `idx_artist_creatives` )

I’m sorry, that is my mistake.

It’s actually using the correct index. I renamed index right after I made the EXPLAIN output. It’s using the correct index.

I’ve tried USE INDEX, but it did not work either.

I’ve double checked it again locally just to make sure and it works just fine.

Only difference is that the prod has way more data than my local.

Ok, a few changes.

(1) Remove form from the index keys, because it is already defined in the index WHERE clause.

(2) Change SELECT * to SELECT specific fields, so you can use a covering index.

(3) If you are using cbq shell, \SET -pretty = false

DISREGARD PLEASE . I will make another reply.

DISREGARD PLEASE . I will make another reply.

I’ve removed ‘form’ from the index.

This is my new index definition.

CREATE INDEX `idx_creatives_by_artist_test` ON `catalog`((-str_to_millis((`meta`.`updatedAt`))),`clientId`,`username`,(`values`.`deleted_at`),(`values`.`published`)) WHERE (`form` = "creative")

This is EXPLAIN out:

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "IndexScan",
              "covers": [
                "cover ((-str_to_millis(((`bucket1`.`meta`).`updatedAt`))))",
                "cover ((`bucket1`.`clientId`))",
                "cover ((`bucket1`.`username`))",
                "cover (((`bucket1`.`values`).`deleted_at`))",
                "cover (((`bucket1`.`values`).`published`))",
                "cover ((meta(`bucket1`).`id`))"
              ],
              "filter_covers": {
                "cover ((`bucket1`.`form`))": "creative"
              },
              "index": "idx_creatives_by_artist_test",
              "index_id": "8430fa9158884cc8",
              "keyspace": "catalog",
              "namespace": "default",
              "spans": [
                {
                  "Range": {
                    "Inclusion": 1,
                    "Low": [
                      "true",
                      "\"clientId\"",
                      "\"username\"",
                      "null",
                      "true"
                    ]
                  }
                }
              ],
              "using": "gsi"
            },
            {
              "#operator": "Parallel",
              "maxParallelism": 1,
              "~child": {
                "#operator": "Sequence",
                "~children": [
                  {
                    "#operator": "Filter",
                    "condition": "((((((cover ((`bucket1`.`clientId`)) = \"clientId\") and (cover ((`bucket1`.`form`)) = \"creative\")) and (cover ((`bucket1`.`username`)) = \"username\")) and (cover (((`bucket1`.`values`).`deleted_at`)) is null)) and (cover (((`bucket1`.`values`).`published`)) = true)) and cover ((-str_to_millis(((`bucket1`.`meta`).`updatedAt`)))))"
                  },
                  {
                    "#operator": "InitialProject",
                    "result_terms": [
                      {
                        "expr": "cover ((`bucket1`.`form`))"
                      }
                    ]
                  }
                ]
              }
            }
          ]
        },
        {
          "#operator": "Offset",
          "expr": "0"
        },
        {
          "#operator": "Limit",
          "expr": "30"
        },
        {
          "#operator": "FinalProject"
        }
      ]
    },
    "text": "select `bucket1`.`form` from catalog bucket1 \nwhere \nbucket1.clientId = 'clientId' \nand bucket1.form = 'creative' \nand bucket1.username = 'username' \nand `values`.deleted_at is null\nand `values`.published = true\nand -STR_TO_MILLIS(`meta`.`updatedAt`)\norder by -STR_TO_MILLIS(`meta`.`updatedAt`)\nlimit 30 offset 0"
  }
]

Query response time is between 900ms and 1.2 second.

Thank you for the help :slight_smile: