Couchbase Upgrade, XDCR, and vBucket UUID

Couchbase upgrades are performed periodically by our admins. For us, Couchbase is a source system: we stream from it with Kafka Connect by subscribing to the DCP channel.
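
For context, our source connector configuration looks roughly like the sketch below (hostnames, credentials, and topic names are placeholders, and the property keys follow the Couchbase Kafka source connector documentation, so verify them against the connector version you run):

```python
# Hypothetical config for the Couchbase Kafka source connector, expressed as the
# JSON payload we POST to the Kafka Connect REST API (POST /connectors).
source_connector = {
    "name": "couchbase-source",
    "config": {
        "connector.class": "com.couchbase.connect.kafka.CouchbaseSourceConnector",
        "couchbase.seed.nodes": "cb.example.com",   # single seed hostname (the DNS SRV name)
        "couchbase.bucket": "my-bucket",
        "couchbase.username": "kafka-connect",
        "couchbase.password": "********",
        "couchbase.topic": "couchbase-events",
        # Resume from the offsets saved in the Connect offsets topic
        # (vBucket UUID + sequence number), falling back to the beginning.
        "couchbase.stream.from": "SAVED_OFFSET_OR_BEGINNING",
    },
}
```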

During the upgrade process:

  1. A new cluster with new IPs and hostnames is built
  2. XDCR is performed to replicate data from old to the new cluster
  3. The DNS SRV record is updated to point at the new cluster (see the lookup sketch below)
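
As a sanity check after step 3, we can confirm the SRV record now points at the new nodes before reconnecting anything. A minimal lookup sketch, assuming dnspython and a placeholder service name:

```python
import dns.resolver  # pip install dnspython

# Placeholder name: Couchbase SDKs look up _couchbases._tcp.<host> for TLS
# (or _couchbase._tcp.<host> without TLS).
answers = dns.resolver.resolve("_couchbases._tcp.cb.example.com", "SRV")
for rr in answers:
    print(rr.priority, rr.weight, rr.port, rr.target)
```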

Our Kafka Connect process depends on the vBucket UUID and sequence number to decide where to resume streaming from. Does XDCR replication preserve the vBucket UUIDs on the target, or can they change when data is replicated to a new cluster?

If I understand you correctly, the “upgrade” is really a replacement: you build a new cluster, XDCR the old cluster’s data to it, and then point the Kafka connector that used to subscribe to the old cluster at the new one. Please let me know if this is correct.

In either cluster, the VBUUID of a vBucket is generated randomly, and VBUUID information is not synchronized between source and target clusters.
When a bucket is created, each vBucket is assigned a VBUUID at creation time; this is true for both the source bucket and the target bucket. Thus, when you XDCR from the old cluster to the new cluster, even though the data on the new cluster is the same, the VBUUID of each vBucket holding that data will be different.

When you then reconnect the Kafka connector to the “new” target bucket, it will see a completely new set of VBUUIDs, unrelated to those on the source.
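
If you want to see what the connector has saved, one way (a sketch, assuming the default offset.storage.topic name of connect-offsets and a local broker) is to read the Connect offsets topic directly; the stored source offset is what carries the vBucket UUID and sequence number the connector tries to resume from:

```python
import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "connect-offsets",                  # the worker's offset.storage.topic
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=10_000,
)
for record in consumer:
    # Key = [connector name, source partition]; value = the saved source offset.
    key = json.loads(record.key.decode("utf-8"))
    value = None if record.value is None else json.loads(record.value.decode("utf-8"))
    print(key, value)
```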

Thanks for your response. Your understanding is correct. We will keep this in mind during our next upgrade and clear the Kafka Connect offsets, config, and status topics. Basically, we will have to flush out the connector state if the VBUUIDs change.
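
For reference, a rough sketch of the flush we have in mind, assuming the default internal topic names (we would check offset.storage.topic, config.storage.topic, and status.storage.topic in the worker config, and stop the Connect workers first):

```python
from kafka.admin import KafkaAdminClient  # pip install kafka-python

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")
# Deleting these wipes all connector offsets, configs, and statuses on this
# Connect cluster, not just the Couchbase connector's, so use with care.
admin.delete_topics(["connect-offsets", "connect-configs", "connect-status"])
admin.close()
```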

Hello. We are performing the same steps as mentioned in the original post.

When we started the Kafka connector again against the new Couchbase cluster, we came across 1024 messages like the one below:

"Received rollback for vbucket [vBucket_ID] to seqno 0 ; requested start offset was: partitionUuid = <PARTITION_UUID>, seqno = <SEQ_NO>, snapshot = [X], collectionManifestUid = 0" 

Does this mean the Kafka connector will re-process all of the data from Couchbase again?

Since the nodes have changed, it is better to drop the connector, restart the Kafka Connect service, and create a new connector that streams from the beginning; otherwise you might see dropped messages or data loss.
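
A rough sketch of dropping and recreating the connector through the Connect REST API (the endpoint, names, and credentials are placeholders, and the property keys should be verified against your connector version):

```python
import requests  # pip install requests

CONNECT_URL = "http://localhost:8083"   # hypothetical Connect REST endpoint
NAME = "couchbase-source"

# Drop the old connector so its saved offsets (tied to the old cluster's
# vBucket UUIDs) are no longer consulted.
requests.delete(f"{CONNECT_URL}/connectors/{NAME}").raise_for_status()

# Recreate it against the new cluster and force a re-stream from the start.
new_connector = {
    "name": NAME,
    "config": {
        "connector.class": "com.couchbase.connect.kafka.CouchbaseSourceConnector",
        "couchbase.seed.nodes": "new-cb.example.com",
        "couchbase.bucket": "my-bucket",
        "couchbase.username": "kafka-connect",
        "couchbase.password": "********",
        "couchbase.topic": "couchbase-events",
        "couchbase.stream.from": "BEGINNING",   # ignore saved offsets, start over
    },
}
requests.post(f"{CONNECT_URL}/connectors", json=new_connector).raise_for_status()
```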

We would like to avoid streaming from the “beginning”, since that would mean re-processing many documents that were already processed before the cluster change.

Would it be better to allow writes to the new Couchbase cluster only AFTER dropping the old Kafka connector and creating a new Kafka connector that streams from “now”?

If you are not worried about the intermediate changes, then you can stream from “now”. But any new documents created after the migration and before you create the new connector with “stream from now” might be lost; such a document will only flow if it is changed again in the future. Basically, if your consumer use case is analytics, you might see data loss in your reports; if your use case is activations and you don’t care about the history of changes, you are good.
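
If you go that route, the only change from the recreate sketch above is the stream-from value, and the new connector should be created before writes to the new cluster are enabled:

```python
# Variation on the earlier recreate sketch, with the same property-name
# assumptions: start from each vBucket's current state instead of seqno 0.
config = {
    "connector.class": "com.couchbase.connect.kafka.CouchbaseSourceConnector",
    "couchbase.seed.nodes": "new-cb.example.com",
    "couchbase.bucket": "my-bucket",
    "couchbase.stream.from": "NOW",   # no history; current state only
}
```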

Thank you for your input. We only want the latest version of a document, so it sounds like streaming from “now” after creating the new cluster (before allowing writes) should work.