Diff between 2 buckets

Is there any tool I can use to check the differences between two buckets that hold similar documents? For example, I have 2 buckets with 1 million documents each; 95% of the documents are equal, and the other 5% are newer, missing, or different. Is there something “native” (no development needed) that I can use to see and analyze that 5% of changes?


@drjunior,
If the Query Service (N1QL) is enabled, try the following. The projections can be customized to suit your requirements.

(SELECT META(b1).id, b1.* FROM bucket1 b1 EXCEPT SELECT META(b2).id, b2.* FROM bucket2 b2) UNION (SELECT META(b2).id, b2.* FROM bucket2 b2 EXCEPT SELECT META(b1).id, b1.* FROM bucket1 b1) ;
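
To make the intent of that query concrete, here is a rough Python sketch of the same comparison done in memory. It is only an illustration: the `diff_buckets` function and the two sample dictionaries are hypothetical stand-ins for documents keyed by ID, not part of any Couchbase SDK. Unlike the N1QL query, which returns a changed document once from each side, this sketch sorts IDs into three groups.

```python
def diff_buckets(bucket1, bucket2):
    """Classify document IDs by comparing two {doc_id: document} mappings."""
    ids1, ids2 = set(bucket1), set(bucket2)
    return {
        # IDs present in the first mapping only
        "only_in_bucket1": sorted(ids1 - ids2),
        # IDs present in the second mapping only
        "only_in_bucket2": sorted(ids2 - ids1),
        # IDs present in both mappings whose contents differ
        "changed": sorted(k for k in ids1 & ids2 if bucket1[k] != bucket2[k]),
    }


# Hypothetical sample data standing in for two buckets.
bucket1 = {
    "doc1": {"name": "alice", "age": 30},
    "doc2": {"name": "bob", "age": 41},
    "doc3": {"name": "carol", "age": 25},
}
bucket2 = {
    "doc1": {"name": "alice", "age": 30},
    "doc2": {"name": "bob", "age": 42},
    "doc4": {"name": "dave", "age": 37},
}

result = diff_buckets(bucket1, bucket2)
for group, ids in result.items():
    print(group, ids)
```

For buckets with a million documents each, running the comparison in the query service as shown above avoids pulling every document to the client.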

Which version of Couchbase are you using?

Also, what do you mean by “95% of the documents are equal”? Do you mean that the contents of those 95% are identical? Or that the schemas of the documents are identical, but the individual values are different?

In Couchbase 4.5 EE and above (but not in CE), there is a N1QL command called INFER which randomly samples a set of documents from a bucket and tells you how many distinct document schemas it finds. You can also run INFER on all the documents in a bucket, if you want to know exactly how many documents match each schema.

See https://developer.couchbase.com/documentation/server/current/n1ql/n1ql-language-reference/infer.html for more details about INFER.
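
As a rough illustration of what “distinct document schemas” means here, the Python sketch below groups documents by their top-level field names and value types. This is not how INFER is implemented; `schema_signature`, `count_schemas`, and the sample documents are hypothetical, and nested fields are ignored for brevity.

```python
from collections import Counter


def schema_signature(doc):
    """Return a hashable signature of top-level field names and value types."""
    return tuple(sorted((field, type(value).__name__) for field, value in doc.items()))


def count_schemas(docs):
    """Count how many documents share each schema signature."""
    return Counter(schema_signature(doc) for doc in docs)


# Hypothetical sample documents.
docs = [
    {"name": "alice", "age": 30},
    {"name": "bob", "age": 41},
    {"name": "carol", "email": "carol@example.com"},
]

counts = count_schemas(docs)
for signature, n in counts.items():
    print(n, signature)
```

Two documents with the same signature can still hold different values, which is why the distinction between identical contents and identical schemas matters for the question above.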

Hi @Eben, I’m using Couchbase Server 4.6.0-3453

I’ll try to use that operation.