We’re running Couchbase Server 5.1.1, and we’ve found that some items are not getting replicated to one of our other clusters. I would appreciate any assistance the community could provide on how to troubleshoot/resolve this issue.
Here’s my testing methodology:
- Upsert new items until the unique count of
- Wait for XDCR to replicate the items (this usually takes only about 30 seconds)
- Get those same items from the remote cluster (bucket.get) and note the failures
- Generate a report consisting of source vbucket_uuid, number of items found, number of items missing
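For anyone who wants to reproduce this, here is a minimal sketch of how such a per-vBucket report can be built. It rests on several assumptions that are not taken from my actual test script: the bucket has the default 1024 vBuckets, keys map to vBuckets through the CRC32-based hash that Couchbase clients are generally understood to use, and `exists_on_remote` is a hypothetical callable that would wrap `bucket.get` against the remote cluster and return `False` when the key is not found. It also groups by vBucket ID rather than by `vbucket_uuid`.

```python
import zlib
from collections import defaultdict

NUM_VBUCKETS = 1024  # assumption: default vBucket count for the bucket


def vbucket_id(key, num_vbuckets=NUM_VBUCKETS):
    """Map a document key to a vBucket ID (assumed CRC32-based client hashing)."""
    crc = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return ((crc >> 16) & 0x7FFF) % num_vbuckets


def missing_report(keys, exists_on_remote):
    """Count found and missing items per vBucket.

    exists_on_remote is a hypothetical callable: key -> bool. In a real test
    it would wrap a get against the remote cluster and catch "not found".
    """
    report = defaultdict(lambda: {"found": 0, "missing": 0})
    for key in keys:
        status = "found" if exists_on_remote(key) else "missing"
        report[vbucket_id(key)][status] += 1
    return dict(report)


def summarize(report):
    """Split vBuckets into fully, partially, and not-at-all replicated."""
    summary = {"full": [], "partial": [], "none": []}
    for vb, counts in sorted(report.items()):
        if counts["missing"] == 0:
            summary["full"].append(vb)
        elif counts["found"] == 0:
            summary["none"].append(vb)
        else:
            summary["partial"].append(vb)
    return summary
```

If every vBucket lands in either `full` or `none` and nothing lands in `partial`, that matches the all-or-nothing pattern described below.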
What we see is that 164 vBuckets have not replicated ANY of their items to the remote cluster, while all the other vBuckets have replicated ALL of their items. I’ve tried waiting several days, and the results are the same.
We’ve tried several means to resolve this:
- Pause/resume replication
- Delete/recreate replication
- Flush the remote bucket
- Delete/recreate the remote bucket
- Replicate from a different cluster (source->intermediate->remote)
To better explain that last one: we have three clusters, with items replicating from A->B and A->C. I stopped A->B and created a new replication from C->B, resulting in an A->C->B chain. A->C remained successful, but C->B still had missing items.
In each of these tests, we also see that items created before the test are missing. For example, we had a “pre-flush” set of test items that was only partially replicated after we re-enabled replication.