A few questions about the `couchbase.stream.from` field in `kafka-connector`

Hi all :wave:,

I am looking for additional information about the `couchbase.stream.from` field in the connector configuration, as documented under "Source Configuration Options" in the Couchbase Docs.

Could you please clarify what `BEGINNING` means and how it operates?

In particular, I would like answers to the following:

  • Does that mean it will read the entire configured bucket and ingest it into the Kafka topic?
  • How does it control the volume? That is, what happens if the Kafka topic cannot hold the entire bucket? Is this what `couchbase.flow.control.buffer` controls? How do we avoid impacting Couchbase performance during such loads?

Additionally, if possible, I would like to hear your thoughts on bulk loads. In short, I need to ingest an entire bucket (100M+ records) from Couchbase into another datastore (Elasticsearch) while also processing and ingesting incoming real-time traffic.

Currently, we’re considering two separate workflows: one for the initial load and another for real-time traffic (using Kafka Connect). I am curious whether Kafka Connect could handle both without two separate processes, given that `couchbase.stream.from` exists in the connector.
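
For context, here is a minimal sketch of the single-connector setup I have in mind. The connector name, host, credentials, bucket, and topic are placeholders, the buffer value is only illustrative, and I have not confirmed that this behaves the way I hope:

```properties
# Hypothetical source connector config; all names and values below are placeholders.
name=couchbase-source-bulk-and-live
connector.class=com.couchbase.connect.kafka.CouchbaseSourceConnector
tasks.max=2

couchbase.seed.nodes=couchbase.example.internal
couchbase.username=connector_user
couchbase.password=changeme
couchbase.bucket=my-bucket
couchbase.topic=my-bucket-changes

# The setting this question is about: does BEGINNING replay the whole bucket?
couchbase.stream.from=BEGINNING

# Is this what limits how much data is in flight during the initial load?
couchbase.flow.control.buffer=16m
```

The idea would be for the same connector to keep running after the initial backlog is drained and then carry the real-time changes, with a separate sink moving the topic into Elasticsearch.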

Any help would be highly appreciated.
