After digging Missed document after sync with sync-gateway I found out that problem in query that executes sync-gateway:
SELECT META(`test`).xattrs._sync.sequence AS seq,
META(`test`).xattrs._sync.rev AS rev,
META(`test`).xattrs._sync.flags AS flags,
META(`test`).id AS id
FROM `test` USE INDEX (sg_allDocs_x1)
WHERE META(`test`).xattrs._sync.sequence >= 1
AND META().id NOT LIKE '\\_sync:%'
AND (META(`test`).xattrs._sync.flags IS MISSING
OR BITTEST(META(`test`).xattrs._sync.flags,1) = FALSE)
ORDER BY META(`test`).xattrs._sync.sequence
On one server A it gives X rows, and on server B it gives Y rows.
Where Y > X. And database on server B is just restored dump from server A.
The most strange thing that if I remove this condition:
).xattrs._sync.sequence >= 1 AND from query, the X become equals Y,
but all “_sync.sequence” in query result are not null and “>= 1”.
So couchbase on server A is broken. Any idea what is broken?
My first idea that is index is broken, but removing “USE INDEX (sg_allDocs_x1)”
still gives wrong result with “META(
test).xattrs._sync.sequence >= 1”.
Even if you remove index it might have used same index or different.
- Try run via Query work bench and check plan Text tab #itemsIn,#ItemsOut each operator and see where miss match (or post both of them here)
- Try different index (primary index and see, or drop and recreate index and see)
- Only project RAW META().id and ORDER BY META().id and see which document is missing and take it from there where it is missing. i.e. skip index and try with USE KEYS.
- As this sync gateway index check attributes of index WITH clause make sure is retain_deleted_xattr attribute is set both cases. Not sure how this works from restored from backup
I created sg_allDocs_x2 index with expression from sg_allDocs_x1, and query with sg_allDocs_x2
index works just fine. I can not see any pattern for missed entries. Among the missed data there are with xattrs._sync.sequence = 180 and xattrs._sync.sequence = 20012, where max xattrs._sync.sequence ~26000.
So dropping and recreation of sg_* indexes should solve the current problem.
But how can I prevent this problem in the future? In web interface this index marked as green and it is not in updating phase, so at some point I got corrupted index, but no indication that I have the problem. Are any hypothesis why this happens and how can I prevent this?
I use couchbase 6.6.0 from rpm on centos 7.9 if it is important.
As you still have sg_allDocs_x1, See any pending mutations etc. cc @amit.kulkarni
No pending mutation in sg_AllDocs_x1.
This is something really strange. It is difficult to comment on the possibility of index corruption without more information. Requesting for cbcollect logs.