I am seeing some unexpected behaviour with the kafka-connector and am wondering if someone can shed some light on it.
I am using the default of ZooKeeper for state storage, and I can see that state is being stored and retrieved/re-initiated correctly. However, upon restarting the connector I see a subset of events get replayed.
I managed to track a document to a specific partition and can see that a change to the document produces the expected increase in the sequence number stored in ZooKeeper, and I can see that partition, with the correct sequence number, being loaded when the bridge restarts. However, that key was replayed and sent to Kafka again on every subsequent restart. After leaving it overnight, a restart no longer triggers the replay.
It therefore appears that some subset of DCPEvents always gets replayed. Perhaps there is a minimum number of changes, or everything within a given timespan is re-triggered when a subscription restarts? I don't know whether this is expected behaviour; perhaps document changes are replayed until they have been flushed to disk, to avoid missing them?
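To make the flush hypothesis concrete, here is a small toy model of what I suspect is happening. This is purely illustrative: the class and method names are made up and are not connector or DCP APIs. The idea is that on reconnect the server streams from the last *persisted* sequence number rather than the last one the client saw, so mutations not yet flushed to disk get replayed.

```python
# Toy model of the flush hypothesis. All names here are illustrative,
# not actual kafka-connector or DCP APIs.

class Partition:
    def __init__(self):
        self.mutations = []        # (seqno, key) pairs in arrival order
        self.persisted_seqno = 0   # highest seqno flushed to disk

    def mutate(self, seqno, key):
        self.mutations.append((seqno, key))

    def flush(self):
        # Simulate the server persisting everything received so far.
        if self.mutations:
            self.persisted_seqno = self.mutations[-1][0]

    def stream_from(self, client_seqno):
        # Hypothesised server behaviour on reconnect: start from whichever
        # is lower, the client's saved seqno or the last persisted seqno,
        # so unpersisted mutations are always resent.
        start = min(client_seqno, self.persisted_seqno)
        return [m for m in self.mutations if m[0] > start]

p = Partition()
p.mutate(1, "doc-a")
p.flush()                  # doc-a is now on disk
p.mutate(2, "doc-b")       # doc-b is in memory only

# The client saw everything up to seqno 2 and saved that in ZooKeeper,
# yet doc-b is replayed on every restart...
print(p.stream_from(client_seqno=2))   # [(2, 'doc-b')]

p.flush()                  # ...until it is eventually flushed overnight
print(p.stream_from(client_seqno=2))   # []
```

This toy model reproduces the symptom I am seeing (the same key replayed on every restart, then no replay after leaving it overnight), but I don't know whether it reflects the actual server behaviour.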
Does anyone have any idea what might be happening here?
I am using kafka-connector version 2.0.0 with core-io 1.2.6, and am seeing this on a single-node cluster running Couchbase 4.1.0-5005 Enterprise Edition (build-5005).