N1QL slow compare to MySQL

Hello,

N1QL seems slow, so i compared the same behavior on MySQL.
1 000 000 rows like that :

uri	        partOfTown	partOfNafClassification
===========================================================
15878ab8fa733a	ville-90	eco/NafClassification/1509C
15878ab8fa7e43	ville-1         eco/NafClassification/1635C
15878ab8fa84ae	ville-98	eco/NafClassification/1959C 

Query :

    select uri, partOfNafClassification, partOfTown FROM data WHERE partOfTown = 'ville-33'
        and partOfNafClassification = 'eco/NafClassification/1000C' limit 1

Result in 3ms.

On CouchBase :
1 000 000 rows like that :

    {
          "uri": "entreprise-158788affabad4",
          "partOfTown": "ville-42",
          "partOfNafClassification": "eco/NafClassification/1073C"
     } 

Query :

    select uri, partOfNafClassification, partOfTown FROM benchmark  USE INDEX (idxmulti) WHERE partOfTown = 'ville-33'
    and partOfNafClassification = 'eco/NafClassification/1000C'
    limit 1

Result in 100ms.

Explain of Query :

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "IndexScan",
              "index": "idxmulti",
              "index_id": "fdd3118d11ad0b9b",
              "keyspace": "benchmark",
              "namespace": "default",
              "spans": [
                {
                  "Range": {
                    "High": [
                      "\"eco/NafClassification/1000C\"",
                      "\"eco/NafClassification/1000C\""
                    ],
                    "Inclusion": 3,
                    "Low": [
                      "\"eco/NafClassification/1000C\"",
                      "\"eco/NafClassification/1000C\""
                    ]
                  }
                }
              ],
              "using": "gsi"
            },
            {
              "#operator": "Fetch",
              "keyspace": "benchmark",
              "namespace": "default"
            },
            {
              "#operator": "Parallel",
              "~child": {
                "#operator": "Sequence",
                "~children": [
                  {
                    "#operator": "Filter",
                    "condition": "(((`benchmark`.`partOfTown`) = \"ville-33\") and ((`benchmark`.`partOfNafClassification`) = \"eco/NafClassification/1000C\"))"
                  },
                  {
                    "#operator": "InitialProject",
                    "result_terms": [
                      {
                        "expr": "(`benchmark`.`uri`)"
                      },
                      {
                        "expr": "(`benchmark`.`partOfNafClassification`)"
                      },
                      {
                        "expr": "(`benchmark`.`partOfTown`)"
                      }
                    ]
                  },
                  {
                    "#operator": "FinalProject"
                  }
                ]
              }
            }
          ]
        },
        {
          "#operator": "Limit",
          "expr": "1"
        }
      ]
    },
    "text": "select uri, partOfNafClassification, partOfTown FROM benchmark  USE INDEX (idxmulti) WHERE partOfTown = 'ville-33'\nand partOfNafClassification = 'eco/NafClassification/1000C'\nlimit 1"
  }
]

In your SQL query you do select uri, partOfNafClassification, partOfTown in the N1ql query select * - does that make a difference? Can you try the same for N1QL? Also, if you ever need to read those fields you could also try putting it in a covered index so it doesn’t need to fetch the docs.

I’m sure @geraldss has some other tricks in his hat to speed this up.

About * that was a mistake. I fixed and have the same time.
If i remove uri, i got the same time, and partOfNafClassification and partOfTown are in multi covering index.

CREATE INDEX `idxmulti` ON `benchmark`(`partOfNafClassification`,`partOfNafClassification`)

If I’m not mistaken for a covering index to work all the fields you access in the select need to be part of the covering index, so uri, partOfNafClassification, partOfTown, not just the fields from the WHERE clause (since otherwise N1QL needs to go look up the other fields from the docs, which means loading the doc)

Like i said, without uri i got the same time.

Which server version and index type are you using?

4.5.1-2844 Enterprise Edition (build-2844)

I don’t know what you mean by index type.

I have this :
Definition: CREATE INDEX idxmulti ON benchmark(partOfNafClassification,partOfNafClassification)

Oh sorry I meant MOI vs GSI

I used GSI index :
CREATE INDEX idxmulti on benchmark (partOfNafClassification,partOfNafClassification) USING GSI;

Did you pick Standard Global Secondary Indexes or Memory-Optimized Global Secondary Indexes for the cluster?

Standard, i assume i should test moi :wink:

Yes definitely! And also please make sure with MOI for best perf that you have a covered index, so your select fields need to be part of it (you can test with and without cover I guess). So in the benchmark(…) index definition all your fields show up. I think you can also see it in the explain plan that there is no fetch step, but I’m not 100% certain on that one. @geraldss can help us here if there are still perf issues when you use MOI

Here is some more info on it https://developer.couchbase.com/documentation/server/current/indexes/covering-indexes.html

Which also contains the hint on this:

Attention: MISSING items are not indexed by indexers. To take advantage of covering indexes and for the index to qualify, a query needs to exclude documents where the index key expression evaluates to MISSING. For example, index index1 defined below covers the following query.
CREATE INDEX index1 ON bucket(attribute1) WHERE attribute2 = “value”;
SELECT attribute1 FROM bucket WHERE attribute2="value AND attribute1 IS NOT MISSING;

Same time with MOI.

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "IndexScan",
              "index": "idxmulti",
              "index_id": "3772e25f0c062dbb",
              "keyspace": "benchmark",
              "namespace": "default",
              "spans": [
                {
                  "Range": {
                    "High": [
                      "\"eco/NafClassification/1000C\"",
                      "\"eco/NafClassification/1000C\""
                    ],
                    "Inclusion": 3,
                    "Low": [
                      "\"eco/NafClassification/1000C\"",
                      "\"eco/NafClassification/1000C\""
                    ]
                  }
                }
              ],
              "using": "gsi"
            },
            {
              "#operator": "Fetch",
              "keyspace": "benchmark",
              "namespace": "default"
            },
            {
              "#operator": "Parallel",
              "~child": {
                "#operator": "Sequence",
                "~children": [
                  {
                    "#operator": "Filter",
                    "condition": "(((`benchmark`.`partOfTown`) = \"ville-33\") and ((`benchmark`.`partOfNafClassification`) = \"eco/NafClassification/1000C\"))"
                  },
                  {
                    "#operator": "InitialProject",
                    "result_terms": [
                      {
                        "expr": "(`benchmark`.`partOfNafClassification`)"
                      },
                      {
                        "expr": "(`benchmark`.`partOfTown`)"
                      }
                    ]
                  },
                  {
                    "#operator": "FinalProject"
                  }
                ]
              }
            }
          ]
        },
        {
          "#operator": "Limit",
          "expr": "1"
        }
      ]
    },
    "text": "select partOfNafClassification, partOfTown FROM benchmark  USE INDEX (idxmulti) WHERE partOfTown = 'ville-33'\nand partOfNafClassification = 'eco/NafClassification/1000C'\nlimit 1"
  }
]

And with the index :
CREATE INDEX idxmulti on benchmark (partOfNafClassification,partOfNafClassification) USING GSI;

You are still not using a covered index. can you try

CREATE INDEX idxmulti on benchmark (partOfTown,partOfNafClassification) USING GSI;

and

select partOfNafClassification, partOfTown FROM benchmark  USE INDEX (idxmulti) WHERE partOfTown = 'ville-33' and partOfNafClassification = 'eco/NafClassification/1000C' limit 1

and send over the explain plan for this as well

Better but not enough : 12ms

[
  {
    "plan": {
      "#operator": "Sequence",
      "~children": [
        {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "IndexScan",
              "covers": [
                "cover ((`benchmark`.`partOfTown`))",
                "cover ((`benchmark`.`partOfNafClassification`))",
                "cover ((meta(`benchmark`).`id`))"
              ],
              "index": "idxmulti",
              "index_id": "70e2397553e65771",
              "keyspace": "benchmark",
              "limit": "1",
              "namespace": "default",
              "spans": [
                {
                  "Range": {
                    "High": [
                      "\"ville-3603\"",
                      "\"eco/NafClassification/1000C\""
                    ],
                    "Inclusion": 3,
                    "Low": [
                      "\"ville-3603\"",
                      "\"eco/NafClassification/1000C\""
                    ]
                  }
                }
              ],
              "using": "gsi"
            },
            {
              "#operator": "Parallel",
              "~child": {
                "#operator": "Sequence",
                "~children": [
                  {
                    "#operator": "Filter",
                    "condition": "((cover ((`benchmark`.`partOfTown`)) = \"ville-3603\") and (cover ((`benchmark`.`partOfNafClassification`)) = \"eco/NafClassification/1000C\"))"
                  },
                  {
                    "#operator": "InitialProject",
                    "result_terms": [
                      {
                        "expr": "cover ((`benchmark`.`partOfNafClassification`))"
                      },
                      {
                        "expr": "cover ((`benchmark`.`partOfTown`))"
                      }
                    ]
                  },
                  {
                    "#operator": "FinalProject"
                  }
                ]
              }
            }
          ]
        },
        {
          "#operator": "Limit",
          "expr": "1"
        }
      ]
    },
    "text": "select partOfNafClassification, partOfTown FROM benchmark  USE INDEX (idxmulti) WHERE partOfTown = 'ville-3603'\nand partOfNafClassification = 'eco/NafClassification/1000C'\nlimit 1"
  }
]

Okay so at this point I exhausted my N1QL-fu and we need to wait for @geraldss to see if we can get it further.

Use cbq shell.

\set -pretty=false

SELECT RAW …

cc @vsr1 @keshav_m

cbq> select partOfNafClassification, partOfTown FROM benchmark USE INDEX (idxmulti) WHERE partOfTown = 'ville-3603’
and partOfNafClassification = 'eco/NafClassification/1000C’
limit 1; > >
{
“requestID”: “0de6be00-dc1e-4075-bb1f-336db14d7746”,
“signature”: {
“partOfNafClassification”: “json”,
“partOfTown”: “json”
},
“results”: [
],
“status”: “success”,
“metrics”: {
“elapsedTime”: “107.718672ms”,
“executionTime”: “107.619958ms”,
“resultCount”: 0,
“resultSize”: 0
}
}

Can you try SELECT vs SELECT RAW?

Don’t work with more than one column :

cbq> select RAW partOfNafClassification, partOfTown FROM benchmark USE INDEX (idxmulti) WHERE partOfTown = 'ville-33’
and partOfNafClassification = 'eco/NafClassification/1000C’
limit 1; > >
{
“requestID”: “4cfff77b-2926-47db-9686-f4ede91fc2c5”,
“errors”: [
{
“code”: 3000,
“msg”: “syntax error - at ,”
}
],
“status”: “fatal”,
“metrics”: {
“elapsedTime”: “382.725µs”,
“executionTime”: “351.664µs”,
“resultCount”: 0,
“resultSize”: 0,
“errorCount”: 1
}
}