Sync Gateway attempts to add missing document in changes feed on startup

Hi all,

We are currently running a cluster with 2 Sync Gateway 1.2 instances backed by 5 Couchbase servers running 4.0.0-4051 Community Edition (build-4051).
When we (re)start a Sync Gateway instance we notice a large number of the following warnings in the log:

couchbase-sync-gateway.log: 2016-09-21T18:21:52.458Z WARNING: Changes feed: error getting doc "6446818e-b08d-4198-9908-895b915032d5"/"16-7709e0d829a0c1c63e4a077ce6a1a6dc": 404 missing -- db.(*Database).addDocToChangeEntry() at changes.go:81

We looked into these documents but they cannot be found (they might have been deleted earlier, as our users frequently create very short-lived documents).

Is this something we should be worried about? Can we do something to prevent these warnings in the future?

(Note: We already tried all the usual compact and purge cycles. Please let me know if there is any additional information I can provide to shed more light on this issue.)

This warning will occur when trying to retrieve a revision body that no longer exists while processing a changes request. We’d need additional context to understand exactly why you’re hitting this error, though.

A few additional questions:

  1. When you say the documents cannot be found - do you mean the document with that key, or the specific doc/revision pair? In the case of deleted documents, I’d still expect the document to exist (the tombstone revision)
  2. I’m not sure what’s meant by ‘the usual compact and purge cycles’ - can you elaborate?
  3. If this happens reliably on startup, it may be useful to start a Sync Gateway node with full logging enabled to get some context around the warning. If that’s possible, you could link the relevant log excerpt in a gist for review.

Hi Adam,

Thank you for your swift response.

In answer to your questions:

  1. I tried retrieving the document through the Sync Gateway admin REST API using the following calls:

curl http://localhost:4985/mydatabase/docID
curl http://localhost:4985/mydatabase/docID?rev=revID

Both of these calls produced the following response:

{"error":"not_found","reason":"missing"}

It seems the tombstone is missing as well, as it clearly indicates “not found” and not “deleted: true”.

  2. By compact I mean the auto-compaction which Couchbase runs when a certain threshold is exceeded. We also triggered the manual compact a few times.
    Aside from that we also tried the “unsafe purge” command on Couchbase in an attempt to free up disk space of removed attachments as described in this thread:
    Running out of disk space - #5 by marvinwright

  3. I have tried to reproduce it on our staging cluster but unfortunately the warnings don’t occur there.
    I will try to get our ops team to add another Sync Gateway instance to our production cluster with full logging enabled, but I won’t have data from this until sometime next week.
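
For reference, the lookup from item 1 could be scripted so that a missing document is told apart from a tombstone. This is only a rough sketch: the helper names `fetch_doc` and `classify_lookup` are made up for illustration, the base URL is the one from the curl calls above, and the handling of a `"deleted"` reason or a `_deleted` flag is an assumption about how a tombstone would be reported, not something confirmed in this thread.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def fetch_doc(base_url, doc_id, rev=None):
    """GET a document (optionally a specific revision); return (status, body)."""
    url = f"{base_url}/{urllib.parse.quote(doc_id, safe='')}"
    if rev:
        url += "?" + urllib.parse.urlencode({"rev": rev})
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the JSON body is still readable.
        return err.code, err.read().decode("utf-8")


def classify_lookup(status, body):
    """Map a lookup response to 'found', 'deleted', 'missing' or 'other'."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None
    if not isinstance(payload, dict):
        payload = {}
    if status == 200:
        # Assumption: a tombstone fetched by revision carries "_deleted": true.
        return "deleted" if payload.get("_deleted") is True else "found"
    if status == 404 and payload.get("error") == "not_found":
        reason = payload.get("reason")
        # "missing" is the response seen above; "deleted" is assumed for tombstones.
        if reason in ("missing", "deleted"):
            return reason
    return "other"


# Example usage against the admin port from the curl calls above:
# status, body = fetch_doc("http://localhost:4985/mydatabase", "docID", rev="revID")
# print(classify_lookup(status, body))
```

Keeping the classification separate from the HTTP call makes it possible to run it over saved responses as well as live lookups.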

I know this might be hard to answer, but do you have any gut feeling about how worried we should be? Is this probably nothing, or is it more likely to be a ticking time bomb?

Hi Bart,

In general this doesn’t seem like a significant issue - it looks like clients are attempting to request an obsolete revision that’s no longer available on the server.

However, I’d still like to understand what might be going on here - particularly that there’s a spike in activity on Sync Gateway startup. Some additional questions:

  1. Are you using bucket shadowing?
  2. Are you using SG-to-SG replication (via a replication defined in the SG config)?
  3. If the answer to both of the above is ‘no’ - what type of client activity would be hitting the SG node on startup? Couchbase Lite replication? Other REST API activity?
  4. Is there any other information appearing in the production SG log on startup (aside from the warnings posted above)?

Hello Adam

Glad to hear we shouldn’t be too worried about this.

  1. We are not using Bucket Shadowing.

  2. We do not use SG-to-SG replication. All replication is handled by the backend Couchbase servers.

  3. We have around 200 mobile clients accessing the Sync Gateway through Couchbase Lite. Aside from that we also have some backend services which use the Sync Gateway REST API directly.

  4. After some more digging we noticed that this also sometimes occurs when the Sync Gateway has already been running for a while.
    Right before the burst of warnings we see a backend service requesting the _changes feed filtered by a channel:

    GET /myBucket/_changes?filter=sync_gateway/bychannel&include_docs=true&feed=continuous&channels=outbox&heartbeat=24000&since=1
    Immediately after this call we see around 200 missing doc warnings as listed above.

Could this be caused by the unsafePurgeBucket unsafely removing documents?
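
To see how many distinct documents and revisions are behind a burst of warnings, the log lines could be tallied with something like the following. It is a minimal sketch that assumes the warning format shown in the first post; the names `missing_revisions` and `summarize` are hypothetical helpers, not part of Sync Gateway.

```python
import re
from collections import Counter

# Matches the doc ID, revision ID and status code in a warning such as:
#   ... WARNING: Changes feed: error getting doc "<docID>"/"<revID>": 404 missing -- ...
WARNING_RE = re.compile(
    r'Changes feed: error getting doc "([^"]+)"/"([^"]+)": (\d+)'
)


def missing_revisions(lines):
    """Return (doc_id, rev_id) pairs for every 404 changes-feed warning."""
    pairs = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match and match.group(3) == "404":
            pairs.append((match.group(1), match.group(2)))
    return pairs


def summarize(pairs):
    """Count total warnings, distinct documents and distinct doc/revision pairs."""
    per_doc = Counter(doc_id for doc_id, _ in pairs)
    return {
        "warnings": len(pairs),
        "unique_docs": len(per_doc),
        "unique_revisions": len(set(pairs)),
    }


# Example usage (the log path is a placeholder):
# with open("couchbase-sync-gateway.log") as log:
#     print(summarize(missing_revisions(log)))
```

If the same doc/revision pairs show up on every request with since=1, that would suggest a fixed set of missing revisions rather than ongoing loss.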