"bySeqno" is not increasing

Our Kafka connector logged warnings that look like this:

"Received rollback for vbucket [vBucket_ID] to seqno 0 ; requested start offset was: partitionUuid = <PARTITION_UUID>, seqno = <SEQ_NO>, snapshot = [X], collectionManifestUid = 0" 

The warning appeared for all 1024 of our partitions after we performed a cluster rotation, which gave every vBucket a different UUID. This is expected because the UUID changed.

However, we noticed that over time the “bySeqno” value is not consistently increasing on any partition. At one instant the value may be 4000, the next 8000, then 20000, and later 4000 again. Should the “bySeqno” always be increasing over time (assuming the vBucket UUID stays the same and document changes occur all the time)?

Below is an example for one partition (from the offsets):

At time 01:00, the offset for vBucket 781 may look like this:

"[\"nosql_cb3_data1\",{\"bucket\":\"data1\",\"partition\":\"781\"}]" : "{\"vbuuid\":21044398436119,\"collectionsManifestUid\":0,\"snapshotEndSeqno\":3266822,\"bySeqno\":20480,\"snapshotStartSeqno\":0}"

At time 01:05, the offset for the same vBucket may look like this:

"[\"nosql_cb3_data1\",{\"bucket\":\"data1\",\"partition\":\"781\"}]" : "{\"vbuuid\":21044398436119,\"collectionsManifestUid\":0,\"snapshotEndSeqno\":3278968,\"bySeqno\":4096,\"snapshotStartSeqno\":0}"

Above, we see that the “snapshotEndSeqno” value increased over time while the “bySeqno” value decreased. We did not observe any error or warning logs during this time (apart from the rollback warning, which occurred once per partition). Is it normal for offsets to behave like this?
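The comparison between the two samples can be scripted. This is a minimal sketch, assuming the offset values are available as the escaped JSON strings shown above; the helper names `parse_offset` and `seqno_went_backwards` are hypothetical and not part of the connector.

```python
import json

# Offset values for vBucket 781, copied from the two samples above.
AT_0100 = r'"{\"vbuuid\":21044398436119,\"collectionsManifestUid\":0,\"snapshotEndSeqno\":3266822,\"bySeqno\":20480,\"snapshotStartSeqno\":0}"'
AT_0105 = r'"{\"vbuuid\":21044398436119,\"collectionsManifestUid\":0,\"snapshotEndSeqno\":3278968,\"bySeqno\":4096,\"snapshotStartSeqno\":0}"'


def parse_offset(raw_value: str) -> dict:
    """Decode one offset value: a JSON string that itself contains JSON."""
    inner_json = json.loads(raw_value)   # strip the outer quoting/escaping
    return json.loads(inner_json)        # parse the actual offset fields


def seqno_went_backwards(earlier: dict, later: dict) -> bool:
    """True when bySeqno decreased although the vBucket UUID is unchanged."""
    same_history = earlier["vbuuid"] == later["vbuuid"]
    return same_history and later["bySeqno"] < earlier["bySeqno"]


earlier = parse_offset(AT_0100)
later = parse_offset(AT_0105)
print(seqno_went_backwards(earlier, later))  # True: 4096 < 20480, same vbuuid
```

A decrease that coincides with a changed `vbuuid` is deliberately not flagged, since that is the rollback case we already expect.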

Hi @cbwill !

Should the “bySeqno” always be increasing over time (assuming the vBucket UUID stays the same and document changes occur all the time)?

That is indeed suspicious. The seqno goes backwards when there’s a rollback, but that’s the only case I can think of where that’s the expected behavior. (I hesitate to make a strong claim – I may be overlooking something.)

I would check for other running connector instances with the same name using the same Kafka Broker. (These could be from someone running the connector in standalone mode, or from a different Kafka Connect cluster.) If I didn’t find any, I would check again. If I still didn’t find any, I would rename my connector instance to avoid all possibility of offset collision.

Out of curiosity, how are you accessing/viewing the offsets?

After we let the process run overnight, the seqno now seems to be increasing without resetting. The seqno for every partition is currently above 30,000, so I don’t think any were reset during the night.

Out of curiosity, how are you accessing/viewing the offsets?

We provided our own implementation of org.apache.kafka.connect.storage.MemoryOffsetBackingStore, which saves the offsets in a human-readable format at a fixed interval (e.g., every 2 minutes).

Going back to the sequence numbers in the offsets: suppose the Kafka connector is processing a big snapshot, say with snapshotStartSeqno=0 and snapshotEndSeqno=30000000 for all partitions. Will Kafka Connect pick up new mutations only AFTER the entire snapshot is finished (seqno 0 to 30000000)? Or will the big snapshot and the new data be processed at the same time, in the same batch?

We did find it weird that the snapshotEndSeqno was increasing while Kafka connect was still processing a big snapshot. I am not sure whether this is intended, because I assumed the snapshotEndSeqno should change only after Kafka Connect has finished processing the big snapshot (which it had not).

We did find it weird that the snapshotEndSeqno was increasing while Kafka connect was still processing a big snapshot

Yes, weird. Snapshots are immutable – at least that’s my understanding.
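If snapshots really are immutable, a saved offset should keep the same snapshot bounds until bySeqno reaches snapshotEndSeqno. Here is a minimal sketch of that check, assuming the offsets have already been decoded into dictionaries with the field names from the samples earlier in the thread. The function name `snapshot_changed_midway` is hypothetical, and the rule it encodes is the assumption stated above, not a documented guarantee.

```python
# Offsets for vBucket 781 at 01:00 and 01:05, taken from the samples in this thread.
EARLIER = {"vbuuid": 21044398436119, "snapshotStartSeqno": 0,
           "snapshotEndSeqno": 3266822, "bySeqno": 20480}
LATER = {"vbuuid": 21044398436119, "snapshotStartSeqno": 0,
         "snapshotEndSeqno": 3278968, "bySeqno": 4096}


def snapshot_changed_midway(earlier: dict, later: dict) -> bool:
    """True when the snapshot bounds changed before the earlier snapshot was fully consumed."""
    if earlier["vbuuid"] != later["vbuuid"]:
        return False  # different history (e.g. after a rollback); not comparable
    # The earlier snapshot is unfinished while bySeqno is below its end.
    unfinished = earlier["bySeqno"] < earlier["snapshotEndSeqno"]
    bounds_changed = (
        earlier["snapshotStartSeqno"] != later["snapshotStartSeqno"]
        or earlier["snapshotEndSeqno"] != later["snapshotEndSeqno"]
    )
    return unfinished and bounds_changed


print(snapshot_changed_midway(EARLIER, LATER))  # True: end moved from 3266822 to 3278968 at bySeqno 20480
```

Running this over each pair of consecutive saved offsets would show whether the changing snapshotEndSeqno is a one-off or happens on every save.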

You might try enabling document lifecycle logging and watching the Kafka Connect worker log for CONVERTED_TO_KAFKA_RECORD milestones. It would be interesting to know if the source offsets there have the same weird behavior.

https://docs.couchbase.com/kafka-connector/current/source-configuration-options.html#couchbase.log.document.lifecycle
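Once lifecycle logging is enabled, the worker log can be filtered for that milestone. This is a minimal sketch that assumes only that each relevant log line contains the literal text CONVERTED_TO_KAFKA_RECORD. It makes no assumption about the rest of the line format, and the function names and log path are hypothetical.

```python
from typing import Iterable, Iterator

MILESTONE = "CONVERTED_TO_KAFKA_RECORD"


def milestone_lines(lines: Iterable[str], milestone: str = MILESTONE) -> Iterator[str]:
    """Yield only the log lines that mention the given lifecycle milestone."""
    for line in lines:
        if milestone in line:
            yield line.rstrip("\n")


def print_milestones(log_path: str) -> None:
    """Print matching lines from a worker log file (path supplied by the caller)."""
    with open(log_path, encoding="utf-8") as log:
        for line in milestone_lines(log):
            print(line)


# Example call, with a hypothetical path:
# print_milestones("/path/to/connect-worker.log")
```

Comparing the source offsets in the matching lines for a single partition against the saved offsets would show whether the two disagree.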