MultiChangesFeed got error reading changes feed: Error while building channel feed

Sync Gateway 2.8
Couchbase Server 6.6.0 CE

For about a month now, we have been noticing timeouts from the Sync Gateway Admin API _changes (GET) endpoint, like these:

2021-05-11T13:42:24.007Z [WRN] c:#1936 MultiChangesFeed got error reading changes feed: Error while building channel feed -- db.(*Database).SimpleMultiChangesFeed.func1() at changes.go:608
2021-05-11T13:43:39.008Z [WRN] c:#1939 Error retrieving changes for channel "provider49d49831f74b41a5b94912aa9eb09a42_tickets_with_logs": Timeout performing Query 
-- db.(*Database).changesFeed.func1() at changes.go:210
2021-05-11T13:43:39.008Z [WRN] c:#1939 MultiChangesFeed got error reading changes feed: Error while building channel feed -- db.(*Database).SimpleMultiChangesFeed.func1() at changes.go:608
2021-05-11T13:45:18.615Z [WRN] c:#1941 Error retrieving changes for channel "provider49d49831f74b41a5b94912aa9eb09a42_tickets_with_logs": Timeout performing Query 
-- db.(*Database).changesFeed.func1() at changes.go:210
2021-05-11T13:45:18.615Z [WRN] c:#1941 MultiChangesFeed got error reading changes feed: Error while building channel feed -- db.(*Database).SimpleMultiChangesFeed.func1() at changes.go:608
2021-05-11T13:45:37.861Z [WRN] c:#1942 Error retrieving changes for channel "providerf421f7ccc3c946789bbc39e75402b6ac_scans": Timeout performing Query -- db.(*Database).changesFeed.func1() at changes.go:210

Sometimes the requests work and sometimes they don't. I could not find any information about this error message. Do you have any advice?

Thanks in advance

It’s the channel query that Sync Gateway runs on Couchbase Server that is timing out (as you can probably guess from the log).

Since you’re requesting changes through the Admin API, are you specifying any channel filters or limits to help reduce the amount of work the query has to do?

Thanks for your reply.

Yes, we specify a channel filter, a limit of 1000 results, and a “since” value, e.g.:

GET /sync/_changes?limit=1000&active_only=true&include_docs=true&filter=sync_gateway%2Fbychannel&channels=provider05678808112f418eb97a551ad8614c89_scans&since=82679467
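
For reference, the same request issued with curl against the Admin API would look something like this (a sketch only; it assumes the default Admin port 4985, a database named sync, and that it is run from a host with access to the Admin interface):

# Hypothetical example: query the _changes feed for a single channel, limited to
# 1000 results and resuming from a known sequence. Adjust host, channel and since.
curl -s "http://localhost:4985/sync/_changes?limit=1000&active_only=true&include_docs=true&filter=sync_gateway%2Fbychannel&channels=provider05678808112f418eb97a551ad8614c89_scans&since=82679467"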

OK cool, that’s probably the best case in terms of limiting the query and utilising the channel index.

You may find some more information by looking at the query monitor page in Couchbase Server whilst the queries are running. I think from there you can run EXPLAIN to understand how the query is being executed and what’s slowing it down.

https://docs.couchbase.com/server/current/tools/query-monitoring.html

https://docs.couchbase.com/server/current/tools/query-workbench.html#query-plans

Oh, that would be cool to do, but it seems I don’t have access to the query monitor page (although the docs say it should be available with CE 6.5+).

Is there another way to monitor queries in the community edition?

I’ve just found out that our docs are wrong regarding the query monitor; apologies for the incorrect advice.

https://issues.couchbase.com/browse/DOC-7978

There isn’t another way to use the query monitor, but you can use EXPLAIN to see what the query is doing. Below is an example of the channel query that would be run for your changes request above. If you paste it into the Query Workbench and click Explain, it will describe what is happening to fulfil that query.

With xattrs (enable_shared_bucket_access=true):

SELECT [op.name, LEAST(meta(`sync`).xattrs._sync.sequence, op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)][1] AS seq,
       [op.name, LEAST(meta(`sync`).xattrs._sync.sequence, op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)][2] AS rRev,
       [op.name, LEAST(meta(`sync`).xattrs._sync.sequence, op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)][3] AS rDel,
       meta(`sync`).xattrs._sync.rev AS rev,
       meta(`sync`).xattrs._sync.flags AS flags,
       META(`sync`).id AS id
FROM `sync` USE INDEX (sg_channels_x1)
UNNEST OBJECT_PAIRS(meta(`sync`).xattrs._sync.channels) AS op
WHERE ([op.name, LEAST(meta(`sync`).xattrs._sync.sequence, op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)]
         BETWEEN ['provider05678808112f418eb97a551ad8614c89_scans', 82679467]
             AND ['provider05678808112f418eb97a551ad8614c89_scans', 82679967])
  AND (meta(`sync`).xattrs._sync.flags IS MISSING OR BITTEST(meta(`sync`).xattrs._sync.flags,1) = false)
ORDER BY [op.name, LEAST(meta(`sync`).xattrs._sync.sequence, op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)]
LIMIT 1000

Without xattrs (enable_shared_bucket_access=false):

SELECT [op.name, LEAST(`sync`._sync.sequence, op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)][1] AS seq,
       [op.name, LEAST(`sync`._sync.sequence, op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)][2] AS rRev,
       [op.name, LEAST(`sync`._sync.sequence, op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)][3] AS rDel,
       `sync`._sync.rev AS rev,
       `sync`._sync.flags AS flags,
       META(`sync`).id AS id
FROM `sync` USE INDEX (sg_channels_1)
UNNEST OBJECT_PAIRS(`sync`._sync.channels) AS op
WHERE ([op.name, LEAST(`sync`._sync.sequence, op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)]
         BETWEEN ['provider05678808112f418eb97a551ad8614c89_scans', 82679468]
             AND ['provider05678808112f418eb97a551ad8614c89_scans', 82679968])
  AND (`sync`._sync.flags IS MISSING OR BITTEST(`sync`._sync.flags,1) = false)
ORDER BY [op.name, LEAST(`sync`._sync.sequence, op.val.seq), IFMISSING(op.val.rev,null), IFMISSING(op.val.del,null)]
LIMIT 1000
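
If you'd rather not use the Workbench, the same plan can be fetched from the Query service REST API by prefixing the statement with EXPLAIN. A rough sketch (the host, port 8093 and credentials are placeholders; paste the full channel query from above in place of the shortened statement):

# Hypothetical example: ask the Query service to EXPLAIN the channel query
# rather than execute it. Substitute your own credentials, host and statement.
curl -s -u Administrator:password http://localhost:8093/query/service \
  --data-urlencode 'statement=EXPLAIN SELECT META(`sync`).id FROM `sync` USE INDEX (sg_channels_x1) LIMIT 1000'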

Thank you for the sample. I could also find the query with:
SELECT * FROM system:completed_requests
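
To narrow that output down to just the channel queries, a filter along these lines should work (a sketch using the standard system:completed_requests fields):

SELECT requestTime, elapsedTime, serviceTime, resultCount, statement
FROM system:completed_requests
WHERE statement LIKE '%sg_channels%'
ORDER BY requestTime DESC
LIMIT 20;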

However, everything seems OK in the console: the execution time is quite fast and the index used is sg_channels_x1.


Yet the timeouts on the Sync Gateway side are still there :confused:
Do you have any other clues?

Does the query being run through the UI have the request_plus consistency level set? That’s what Sync Gateway uses for its channel queries to ensure there’s no stale data returned.

https://docs.couchbase.com/server/current/learn/services-and-indexes/indexes/index-replication.html#index-consistency
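
If you want to reproduce what Sync Gateway does from outside the Workbench, the consistency level can be passed explicitly to the Query service REST API, for example (a sketch; host, port and credentials are placeholders, and the statement should be the full channel query from earlier):

# Hypothetical example: run the channel query with the same request_plus consistency
# Sync Gateway uses, so the index must catch up with recent mutations before answering.
curl -s -u Administrator:password http://localhost:8093/query/service \
  --data-urlencode 'scan_consistency=request_plus' \
  --data-urlencode 'statement=SELECT META(`sync`).id FROM `sync` USE INDEX (sg_channels_x1) LIMIT 1000'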

[Screenshot from 2021-05-18 11:46]

The scan consistency was set to not_bounded. It’s still fast when using request_plus.
Our workaround for now is setting the result limit to 500 instead of 1000.

I can’t think of a reason why the query would be so much slower when issued via SG, beyond normal network constraints between SG and Couchbase Server. I suppose you could try issuing the query from the server SG is running on to rule that out?

There’s also a config option to increase the timeout value for queries: view_query_timeout_secs

https://docs.couchbase.com/sync-gateway/current/configuration-properties.html#databases-this_db-view_query_timeout_secs
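
For reference, a minimal sketch of where that setting sits in the Sync Gateway config (the database name and the 300-second value here are placeholders, not recommendations):

{
  "databases": {
    "sync": {
      "view_query_timeout_secs": 300
    }
  }
}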