Index scan timeout on query

Hello,

I am having an issue trying to figure out why my query is timing out. Here is the query:

SELECT COUNT(*)
FROM `sync_gateway` AS docs
WHERE owner IN (
    SELECT RAW owner FROM `sync_gateway` as owners WHERE type="userprofile_private" and emailConfirmed=true and lastSignInDate <= "2018-10-05T00:00:00Z"
)

I have an index on each of the fields in the where clause as well as a composite index on all of the fields in the where clause (added whilst trying to figure out why it was timing it).

The Explain looks like this:

{
  "plan": {
    "#operator": "Sequence",
    "~children": [
      {
        "#operator": "IndexScan3",
        "as": "docs",
        "covers": [
          "cover ((`docs`.`owner`))",
          "cover ((meta(`docs`).`id`))"
        ],
        "index": "owner-id",
        "index_id": "a387bda3ca329527",
        "index_projection": {
          "entry_keys": [
            0
          ]
        },
        "keyspace": "sync_gateway",
        "namespace": "default",
        "spans": [
          {
            "range": [
              {
                "inclusion": 0,
                "low": "null"
              }
            ]
          }
        ],
        "using": "gsi"
      },
      {
        "#operator": "Parallel",
        "~child": {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "Filter",
              "condition": "(cover ((`docs`.`owner`)) in (select raw (`owners`.`owner`) from `sync_gateway` as `owners` where ((((`owners`.`type`) = \"userprofile_private\") and ((`owners`.`emailConfirmed`) = true)) and ((`owners`.`lastSignInDate`) <= \"2018-10-05T00:00:00Z\"))))"
            },
            {
              "#operator": "InitialGroup",
              "aggregates": [
                "count(*)"
              ],
              "group_keys": []
            }
          ]
        }
      },
      {
        "#operator": "IntermediateGroup",
        "aggregates": [
          "count(*)"
        ],
        "group_keys": []
      },
      {
        "#operator": "FinalGroup",
        "aggregates": [
          "count(*)"
        ],
        "group_keys": []
      },
      {
        "#operator": "Parallel",
        "~child": {
          "#operator": "Sequence",
          "~children": [
            {
              "#operator": "InitialProject",
              "result_terms": [
                {
                  "expr": "count(*)"
                }
              ]
            },
            {
              "#operator": "FinalProject"
            }
          ]
        }
      }
    ]
  },
  "text": "SELECT COUNT(*)\nFROM `sync_gateway` AS docs\nWHERE owner IN (\n    SELECT RAW owner FROM `sync_gateway` as owners WHERE type=\"userprofile_private\" and emailConfirmed=true and lastSignInDate <= \"2018-10-05T00:00:00Z\"\n)"
}

I have also tried to do this query as a self join rather than a WHERE IN and I get the same result, the error is as follows:

[
  {
    "code": 5000,
    "msg": " Index scan timed out - cause:  Index scan timed out",
    "query_from_user": "SELECT COUNT(*)\nFROM `sync_gateway` AS docs\nWHERE owner IN (\n    SELECT RAW owner FROM `sync_gateway` as owners WHERE type=\"userprofile_private\" and emailConfirmed=true and lastSignInDate <= \"2018-10-05T00:00:00Z\"\n)"
  }
]

Our CB has approx. 28m documents, 5 nodes, 50GB RAM dedicated to this bucket (6GB dedicated to the index service) run CB 6.0.0 CE.

Any ideas or pointers?

You are doing whole scan of the bucket. What exactly you are trying to do.

My goal is to get a count of the documents where the owner attribute of the document matches those in the where clause, i.e. users who have not signed in within the last 6 months but have a confirmed email address

SELECT RAW owner FROM sync_gateway as owners WHERE type=“userprofile_private” and emailConfirmed=true and lastSignInDate <= “2018-10-05T00:00:00Z”
If this is small number
do 2 queries and pass the subquery output as query parameter and run the parent query has adhoc=true
Otherwise it needs to scan whole index.

It is not a small number unfortunately. Is there another way to achieve this?

In that case plan is optimal only option you have increase index timeout. which is 2min at present. cc @deepkaran.salooja

@vsr1 @deepkaran.salooja how can I increase index scan timeout on Couchbase version 6 ?
Thanks in advance

This is a repeat of the question here:

yes @biozal just to tag people that maybe have the answer.
Can you help me find a solution?

@amarino , you can use the below settings change to increase the scan timeout

curl -X POST -u user:pass http://index_node_ip:9102/settings --data ‘{“indexer.settings.scan_timeout”: <timeout_in_milliseconds>}’