I have a requirement to replicate data between different clusters, but I need more advanced filtering than filtering on the document ID, which is the only option available in Couchbase 4.
Basically, whether a document needs to be replicated depends not only on the document ID itself, but on other information specific to our application. We may also want to strip out part of the document and merge the rest back into the other cluster, so it is more like a partial replication…
Our current approach is to use the Kafka connector to read the cluster's replication stream and put logic around reading from / writing to Kafka to handle our filtering / partial replication. But this means we have to handle concurrent modifications ourselves, and also avoid feedback loops by adding metadata to our documents, like a logical clock, … We feel that this information already exists somewhere in the document metadata that Couchbase holds to manage its own replication protocol…
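To make the question concrete, here is a minimal sketch of the consumer-side logic we are wrapping around the Kafka connector. All names here (`CLUSTER_ID`, `_repl_meta`, the `region` and `secret` fields) are illustrative assumptions from our application, not part of any Couchbase or Kafka API:

```python
# Sketch of our app-side filtering / partial replication / loop-avoidance
# logic. Field names are illustrative, not from any Couchbase API.

CLUSTER_ID = "cluster-a"  # assumed identifier for the local cluster

def should_replicate(doc: dict) -> bool:
    """App-specific filter: replicate only documents flagged for sharing
    (our filtering criterion, not just the document ID)."""
    return doc.get("region") == "shared"

def strip_for_replication(doc: dict) -> dict:
    """Partial replication: drop fields the remote cluster must not see."""
    return {k: v for k, v in doc.items() if k != "secret"}

def is_feedback(doc: dict) -> bool:
    """Skip mutations that originated from another cluster's replication,
    using metadata we are forced to embed in the document body."""
    meta = doc.get("_repl_meta", {})
    return meta.get("origin") not in (None, CLUSTER_ID)

def tag_outgoing(doc: dict, clock: int) -> dict:
    """Prepare a document for replication: strip fields, then attach our
    own origin marker and logical clock as in-document metadata."""
    out = strip_for_replication(doc)
    out["_repl_meta"] = {"origin": CLUSTER_ID, "clock": clock}
    return out
```

The `_repl_meta` wrapper is exactly the part we would like to get rid of: it pollutes the document body with bookkeeping that, as far as we can tell, Couchbase already tracks internally for XDCR.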
What would you recommend? Could we somehow plug into the XDCR protocol by just extending its logic for filtering / stripping information from documents, and let Couchbase handle the nitty-gritty of cross-cluster replication? Is there a way to access a document's metadata and extend it with our own, so we do not have to wrap our original document in one that carries metadata for our replication?
Any ideas or proposals?