Query takes much longer in 6.0.4 as compared to 6.5.1

Hello,

We have a query which runs very slowly on 6.0.4 but very fast (as expected) on 6.5.1. Data is same. Cluster size is also not an issue. So I checked the explain plan and I saw some minor differences. I thought that may be the reason, so using hints, I managed to get the same plan. With the same plan too, there is a huge difference.

Slow query takes 1 minute (60000 ms).
Fast query takes 400 ms.

So I compared the text plan and this is what I see:
Fast query:
“~children”: [
{
#operator”: “NestedLoopJoin”,
#stats”: {
#itemsIn”: 52,
#itemsOut”: 52,
#phaseSwitches”: 523,
“execTime”: “2.852362ms”,
“kernTime”: “278.503176ms”

Slow query:
“~children”: [
{
#operator”: “NestedLoopJoin”,
#stats”: {
#itemsIn”: 52,
#itemsOut”: 52,
#phaseSwitches”: 107955,
“execTime”: “6.19948121s”,
“kernTime”: “4.576653445s”

As we can see, for the same operator, same step, same in/out items, 6.0.4 takes 6s execTime & 4ms kernTime. Whereas 6.5.1 took only 2 ms & 278 ms. On further checking, slow query does huge amount of get operations on the bucket (14k gets/sec for a minute).

Is this a bug? Or some enhancement in 6.5.1? Can I use a hint or something to make it go faster?

Thanks

Please post both the EXPLAINs . query, index definitions.

Hi,

Here is the index & query we are using. Also I have attached(Slow Query.zip) Explain Plan from both the clusters

Index:
CREATE INDEX ind_config_type ON config(__t)
CREATE INDEX ind_config_type_2 ON config(__t,nwid,networkid,hnwid,zcode,cosid)

N1QL:
SELECT DISTINCT P.*,
TONUMBER(P.networkid) AS networkid,
Q.networkname,
REPLACE(CONCAT(R.cosname," (",TOSTRING(R.cosid),")"),‘Default (0)’, ‘ZONEDEFAULT’) AS cosname,
CONCAT(S.zname," (",TOSTRING(S.zcode),")") AS zonename
FROM config P
INNER JOIN config Q ON TONUMBER(P.networkid) = Q.networkid
INNER JOIN config R ON P.cosid = R.cosid
OR P.cosid = P.zcode
INNER JOIN config S ON P.zcode = S.zcode
INNER JOIN config T ON P.cosid = T.cosid
AND P.zcode = T.zcode
AND P.hnwid = T.hnwid
WHERE P.__t=“ntr-ota-odreq”
AND Q.__t=“irdb-net”
AND R.__t=“ntr-cosm”
AND S.__t=“ntr-zonem”
AND T.__t=“ntr-zonecosm”
AND (P.cosid = R.cosid
OR ( P.cosid = P.zcode
AND R.cosid = 0 ))
ORDER BY P.sched_id
Slow Query.zip (6.7 KB)

Try the following query and indexes. Check on condition on R (in ON clause and WHERE clause and fix it).
As you are doing inner JOIN try change JOIN orders and check #itemsIn, #itemsOut each operator.

CREATE INDEX ix1 ON config(__t, networkid,hnwid,zcode,cosid); – P, Q
CREATE INDEX ix2 ON config(__t, cosid); – R
CREATE INDEX ix2 ON config(__t, zcode, cosid, hnwid); --, T S

SELECT DISTINCT P.*,
TONUMBER(P.networkid) AS networkid,
Q.networkname,
REPLACE(CONCAT(R.cosname," (",TOSTRING(R.cosid),")"),‘Default (0)’, ‘ZONEDEFAULT’) AS cosname,
CONCAT(S.zname," (",TOSTRING(S.zcode),")") AS zonename
FROM config P
INNER JOIN config Q ON TONUMBER(P.networkid) = Q.networkid
INNER JOIN config T ON P.cosid = T.cosid AND P.zcode = T.zcode AND P.hnwid = T.hnwid
INNER JOIN config R ON (P.cosid = R.cosid OR P.cosid = P.zcode)
INNER JOIN config S ON P.zcode = S.zcode
WHERE P.__t="ntr-ota-odreq"
AND Q.__t="irdb-net"
AND R.__t="ntr-cosm"
AND S.__t="ntr-zonem"
AND T.__t="ntr-zonecosm"
AND (P.cosid = R.cosid OR ( P.cosid = P.zcode AND R.cosid = 0 ))
ORDER BY P.sched_id;

Thanks vsr1.

Just for the sake of it, I completely removed R and it still takes 100x in 6.0.4 when compared to 6.5.1!

The concern really is, why is it working fine in 6.5.1 but not in 6.0.4? Indices, Data & Query is same. Resources are also same, infact better. The challenge is: prod is running 6.0.4 and Dev is running 6.5.1.

If we change the query, then there are other queries as well; we will need to test ,and may be modify, all of those.

As I shared previously, I compared the plans, and for an operator, even though the #itemsIn & #itemsOut are same, still the phaseswitches is very high in the slower plan and kernTime as well execTime is also very high in the slow plan.

Is there a bug in 6.0.4 which is fixed in 6.5.1 or is there some enhancement in 6.5.1 due to which it runs fine in 6.5.1?

Thanks

I recommended new query based on predicates so that it eliminates early (JOIN order is always LEFT to RIGHT).
If you can run the original query with new indexes. Did u follow my indexes. or https://index-advisor.couchbase.com/indexadvisor/#1

6.5.1 has more features and that why it performs. It can push more predicates to indexer and produce less document keys.
You saying every thing same but i see differently. Look the following in 6.5.1 On Q it could push the network_id (not on 6.0.4 because your index has network_id as 3rd key and you never had predicate on 2nd key on Q).

6.0.4

    \{\
                          "#operator": "Fetch",\
                          "#stats": \{\
                            "#itemsIn": 53768,\
                            "#itemsOut": 53768,\
                            "#phaseSwitches": 221988,\
                            "execTime": "354.71363ms",\
                            "kernTime": "363.547762ms",\
                            "servTime": "9.593390719s"\
                          \},\
                          "as": "Q",\
                          "keyspace": "testbucket",\
                          "namespace": "default",\
                          "nested_loop": true\
                        \}\
                      ]\
                    \},\

6.5.1

\{\
                      "#operator": "Fetch",\
                      "#stats": \{\
                        "#itemsIn": 52,\
                        "#itemsOut": 52,\
                        "#phaseSwitches": 468,\
                        "execTime": "762.019\'b5s",\
                        "kernTime": "184.772215ms",\
                        "servTime": "84.982187ms"\
                      \},\
                      "as": "Q",\
                      "keyspace": "config",\
                      "namespace": "default",\
                      "nested_loop": true\
                    \}\
                  ]\
                \},\

Thank you. You are right, but that is later in the plan. I compared both plans line-by-line and was referring to below from the same plans which appears before the part that you quoted:

6.0.4:
“~children”: [
{
#operator”: “NestedLoopJoin”,
#stats”: {
#itemsIn”: 52,
#itemsOut”: 52,
#phaseSwitches”: 107955,
“execTime”: “6.19948121s”,
“kernTime”: “4.576653445s”
},\

6.5.1
“~children”: [
{
#operator”: “NestedLoopJoin”,
#stats”: {
#itemsIn”: 52,
#itemsOut”: 52,
#phaseSwitches”: 523,
“execTime”: “2.852362ms”,
“kernTime”: “278.503176ms”
},\

I am not a pro at N1QL tuning yet, but I felt that the differences started from this point. In the above, #itemsIn & #itemsOut are same (52), still time is different. Can you help me understand what does it mean?

Thanks again.

To add: the query is generated by a software and so query tuning is not easy. There can be many dynamic queries like these.

I tried the USE HASH (BUILD) hint and its much almost as fast as 6.5.1 But as mentioned above, changing the query is difficult.

Can you help me understand which part of the plan indicates that the predicate is pushed down to the indexer?

Thanks

6.5.1

[\
                        \{\
                          "#operator": "IndexScan3",\
                          "#stats": \{\
                            "#itemsOut": 52,\
                            "#phaseSwitches": 364,\
                            "execTime": "886.955\'b5s",\
                            "kernTime": "84.697\'b5s",\
                            "servTime": "183.47457ms"\
                          \},\
                          "as": "Q",\
                          "index": "ind_config_type_2",\
                          "index_id": "e5c3c185a2019489",\
                          "index_projection": \{\
                            "primary_key": true\
                          \},\
                          "keyspace": "config",\
                          "namespace": "default",\
                          "nested_loop": true,\
                          "spans": [\
                            \{\
                              "exact": true,\
                              "range": [\
                                \{\
                                  "high": "\\"irdb-net\\"",\
                                  "inclusion": 3,\
                                  "low": "\\"irdb-net\\""\
                                \},\
                                \{\
                                  "inclusion": 0\
                                \},\
                                \{\
                                  "high": "to_number((`P`.`networkid`))",\
                                  "inclusion": 3,\
                                  "low": "to_number((`P`.`networkid`))"\
                                \}\
                              ]\
                            \}\
                          ],\
                          "using": "gsi"\
                        \},\
                        \{\
                          "#operator": "Fetch",\
                          "#stats": \{\
                            "#itemsIn": 52,\
                            "#itemsOut": 52,\
                            "#phaseSwitches": 468,\
                            "execTime": "762.019\'b5s",\
                            "kernTime": "184.772215ms",\
                            "servTime": "84.982187ms"\
                          \},\
                          "as": "Q",\
                          "keyspace": "config",\
                          "namespace": "default",\
                          "nested_loop": true\
                        \}\
                      ]\
                    \},\
                    "#time_normal": "00:00.002",\
                    "#time_absolute": 0.0028523

6.0.4

     "~children": [\
                \{\
                  "#operator": "IndexScan3",\
                  "#stats": \{\
                    "#itemsOut": 53768,\
                    "#phaseSwitches": 215228,\
                    "execTime": "203.551266ms",\
                    "kernTime": "4.609975073s",\
                    "servTime": "344.788861ms"\
                  \},\
                  "as": "Q",\
                  "index": "ind_config_type_2",\
                  "index_id": "9f06d91cff395b78",\
                  "index_projection": \{\
                    "primary_key": true\
                  \},\
                  "keyspace": "testbucket",\
                  "namespace": "default",\
                  "nested_loop": true,\
                  "spans": [\
                    \{\
                      "exact": true,\
                      "range": [\
                        \{\
                          "high": "\\"irdb-net\\"",\
                          "inclusion": 3,\
                          "low": "\\"irdb-net\\""\
                        \}\
                      ]\
                    \}\
                  ],\
                  "using": "gsi"\
                \},\
                \{\
                  "#operator": "Fetch",\
                  "#stats": \{\
                    "#itemsIn": 53768,\
                    "#itemsOut": 53768,\
                    "#phaseSwitches": 221988,\
                    "execTime": "354.71363ms",\
                    "kernTime": "363.547762ms",\
                    "servTime": "9.593390719s"\
                  \},\
                  "as": "Q",\
                  "keyspace": "testbucket",\
                  "namespace": "default",\
                  "nested_loop": true\
                \}\
              ]\
            \}

Thank you vsr1. I could spot the additional predicate being pushed to indexer in the 6.5.1 plan.

Also, the 3 indices that you suggested do help in 6.0.4 to get the results in millisec. FYI - index-advisor does not suggest the same 3 indices. It suggests 4 other indices and those are different from what you suggested. And those do not help as much as the ones that you suggested.

Believe, the join is evaluated left-to-right. Does the predicate order have to match the index key order or that does matter? for e.g. if index has col1, col2, col3 and the predicate is
WHERE col2=‘x’ and col1=‘y’ and col3=‘z’, then does it matter?

Predicate order doesn’t matter. Index advisor advises based on newest release, Most cases works on older releases, some cases may need tuning.

Thanks vsr1. Appreciate all your help.