Query performance issue while parameterizing

The lastest Enterprise Edition server version is 7.2.3. The latest Community Edition server version is 7.2.2

I think this is the key

with $lastType=“”, the predicate type >= $lastType is always true (everything is >= the empty string).

I’m out of ideas. Open a ticket at https://issues.couchbase.com/

Thanks Michael. I opened the ticket : Loading...

MB-60217

1 Like

First $lastType is unknown during plan phase, planner should generate such a way that it will work for all possible values.
Above predicate is complex it has constants and dynamic value as previous statement it will generate generic plan that can work all values.

  1. First OR, Second OR is same. Because META().id is string and can’t be empty string. So (SUBSTR(META(sg).id,0) > “”) always true.

  2. First OR, Second OR is always false because sg.type will never be “”, it is AND with (sg.type IN [“location”]) these are contradictory predicates.

  3. So try with sg.type IN [“location”] AND sg.type > $lastType
    In this case Optimizer deiced to use one of the value for index scan and apply other post indescan. Depends on what value is used how many false positives it can impact query latencies. This has been addressed by MB-52873
    OR sg.type IN [“location”] AND sg.type >= GRATEST(“location”, SUCCESSOR($lastType))

Use one of the following alternatives.

Drive from application
Use $lastType = in $documentTypes and when query returns less than your LIMIT move to next Type and set all others to “”

WHERE sg.`type` = $lastType
      AND ((IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") = $lastLocation AND SUBSTR(META(sg).`id`,0) > $lastKey)
          OR  (IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") > $lastLocation))

OR

WITH newTypes AS (ARRAY v FOR v IN $documentTypes SATISFIES WHEN v > $lastType END)
SELECT  sg.`type`, IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") AS `location`, meta(sg).`id`
FROM `sync_gateway_sw1` AS sg
WHERE (((sg.`type` = $lastType) AND (IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") = $lastLocation) AND (SUBSTR(META(sg).`id`,0) > $lastKey))
       OR ((sg.`type` = $lastType) AND (IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") > $lastLocation))
       OR sg.`type` IN newTypes)
         AND (NOT IFMISSINGORNULL(sg.`softDelete`, FALSE))
         AND (NOT IFMISSINGORNULL(sg.`_deleted`, FALSE))
         AND (META(sg).`id` NOT LIKE "_sync:%")
ORDER BY sg.`type`, IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff"), SUBSTR(META(sg).`id`,0)
LIMIT 100;

OR

Application can remove the type processed from $documentTypes (i.e. when lastType changed remove from the named parameter) and use following

SELECT  sg.`type`, IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") AS `location`, meta(sg).`id`
FROM `sync_gateway_sw1` AS sg
WHERE (((sg.`type` = $lastType) AND (IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") = $lastLocation) AND (SUBSTR(META(sg).`id`,0) > $lastKey))
       OR ((sg.`type` = $lastType) AND (IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") > $lastLocation))
       OR sg.`type` IN $documentTypes)
         AND (NOT IFMISSINGORNULL(sg.`softDelete`, FALSE))
         AND (NOT IFMISSINGORNULL(sg.`_deleted`, FALSE))
         AND (META(sg).`id` NOT LIKE "_sync:%")
ORDER BY sg.`type`, IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff"), SUBSTR(META(sg).`id`,0)
LIMIT 100;

This is duplicate of MB-52873

Hi,

Few questions:

  1. Is it possible to say exactly what is happening in Couchbase (perhaps in the Index server) during this 25-second period? Is it performing a full index scan?

  2. Why don’t I see this 25-second delay in my second iteration onwards? The query doesn’t change in the second iteration, therefore the query plan should be the same. The second iteration returns in less than 100ms.

Thanks for workaround, but I’m not sure if I can apply that in all our scenarios.

            "namespace": "default",
            "spans": [
              {   
                "range": [
                  {   
                    "high": "\"location\"",
                    "inclusion": 2,
                    "low": "$lastType"
                  } 

— vs —

           "limit": "10",
           "namespace": "default",
            "spans": [
              {   
                "exact": true,
                "range": [
                  {   
                    "high": "\"location\"",
                    "inclusion": 3,
                    "low": "\"location\""
                  } 

Above this clearly explains, when used named parameter used it starts with named parameter(“”) and stops at location. I.e. it means index scans doing thousands and millions of thems vs start the grater of $lastType, “location” due to multiple ANDs of same predicate.
You should have known that when you look profile timings and look #itemsOut/#itemsIn

Ok that makes sense.

  • When query is parameterized lower range is set to “low”: “$lastType” by the Query planner and the index scanner starts the scan from empty string “”, which means it scans every document in front of location type to reach the first location document in the index.

  • But when the query is un-parameterized lower range is set as “low”: “"location"”, therefore index scanner directly seeks to the first location document.

Just curious – can’t the Query Planner be a bit smarter when parameterized? Since my parameterized query has sg.type IN [“location”] , can’t it set “low”: “location” ?

       WHERE (sg.`type` IN ["location"])
         AND (((sg.`type` = "") AND (IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") = "") AND (SUBSTR(META(sg).`id`,0) > ""))
          OR  ((sg.`type` = "") AND (IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") > ""))
          OR   (sg.`type` > $lastType))

This has been addressed in future release by MB-52873

As i mentioned remove use the work around.

try this see it uses “location”

 WHERE (((sg.`type` = "") AND (IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") = "") AND (SUBSTR(META(sg).`id`,0) > ""))
          OR  ((sg.`type` = "") AND (IFMISSINGORNULL(sg.`locationRoot`, sg.`locationId`, sg.`location`, "\uefff") > ""))
          OR   (sg.`type` > $lastType))
 AND  (sg.`type` IN ["location"])
1 Like

I cannot access this ticket. Could you please tell me the release date?

You should be able to access now. It will be next major release. Date TBD

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.