Hi Team,
I’m using the Couchbase Kafka Source Connector to ingest data from a bucket into Kafka.
- I started with "couchbase.stream.from": "BEGINNING", which successfully ingested the full backfill of documents.
- After that, I keep seeing a large number of records on the Kafka topic, but I can’t clearly tell which ones are new live changes versus replay/backfill.
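For context, the connector config for this kind of setup looks roughly like the sketch below. Only the couchbase.stream.from setting is quoted from my actual setup. The connector name, host, bucket, credentials, and topic are placeholders, and the other property names assume the 4.x line of the connector, so they may differ for other versions.

```json
{
  "name": "couchbase-source-example",
  "config": {
    "connector.class": "com.couchbase.connect.kafka.CouchbaseSourceConnector",
    "couchbase.seed.nodes": "<couchbase-host>",
    "couchbase.bucket": "<bucket-name>",
    "couchbase.username": "<username>",
    "couchbase.password": "<password>",
    "couchbase.topic": "<topic-name>",
    "couchbase.stream.from": "BEGINNING"
  }
}
```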
My goal is:
- Ingest all existing documents once (backfill).
- Then continue streaming only new changes (inserts/updates/deletes) going forward.
My questions are:
- What is the correct/best practice approach to separate backfill and live streaming?
- How can I reliably detect that the backfill phase is complete, so I know when to stop the first connector (e.g., using cbstats dcp or other Couchbase metrics)?
Any examples or recommended configurations would be appreciated!