Couchbase connector streams events even when Kafka connectors are down

Hi Team,

I am using the Couchbase Kafka connector 4.1.13 with Kafka 3.2.2.

Connector config API: GET /connectors/connector-name
Task config API: GET /connectors/connector-name/tasks

There is an issue (or deliberate behaviour) in Kafka Connect: when a new connector is created, the connector config and the task configs are the same, but when we update the connector config using PUT /connectors/connector-name/config, the task configs do not get synced with the updated connector config. When we checked with the Confluent Kafka team, we learned that task configs are only regenerated either when the tasks.max value changes or when the connector is first created.
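For illustration, this is roughly how the mismatch can be observed (a sketch only; the worker address localhost:8083, the connector name, and the connector-config.json file are placeholders, and jq is used just for readability):

# Update the connector config (PUT takes only the config map, not the full connector object)
curl -s -X PUT -H "Content-Type: application/json" \
  --data @connector-config.json \
  http://localhost:8083/connectors/connector-name/config

# Compare the connector's config ...
curl -s http://localhost:8083/connectors/connector-name | jq '.config'

# ... with the configs the tasks are actually running
curl -s http://localhost:8083/connectors/connector-name/tasks | jq '.[].config'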

To resolve this issue (syncing the connector config with the task configs), I tried deleting the connector and then recreating it, and all of the connector's tasks were then updated correctly from the connector config. The problem is that while the connectors are down (the pod itself is not down), the Couchbase connector's DCP client still streams data from Couchbase, but nothing is processed because the connectors are down. Ideally it should not stream data from Couchbase while the connectors are down; as it stands, those events are lost by the time the connector comes back up.
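For reference, the delete-and-recreate sequence looks roughly like this (same placeholder address as above; connector-config.json is assumed to contain the full name/config payload shown later in this thread):

# Delete the connector; its tasks stop, but saved source offsets remain in the offsets topic
curl -s -X DELETE http://localhost:8083/connectors/connector-name

# Recreate it; fresh task configs are generated from the posted connector config
curl -s -X POST -H "Content-Type: application/json" \
  --data @connector-config.json \
  http://localhost:8083/connectors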

Is this expected behaviour from the Couchbase connector?

Hi @shyampa,

How do you know the connector is streaming data from Couchbase while the connector is down? (That doesn’t make sense to me. Please help us understand what you think is happening.)

Can you share your connector config, please? (Without passwords or other sensitive info, of course).

Thanks,
David

Hi @david.nault ,

We have enabled document lifecycle logging (couchbase.log.document.lifecycle), and lifecycle events are showing up in the pod log.

Steps:
1] Deleted the connector using the Connect API: DELETE /connectors/connector-name
2] Monitored the pod log and saw the connector deletion entries
3] Created/updated Couchbase documents (to generate create events)
4] Checked the pod logs
5] Checked the messages in the topic: no new messages for the events created in step 3
6] Recreated the connector using POST /connectors
7] Monitored the pod logs: no lifecycle events (COMMITTED_TO_TOPIC) found, only the connector creation logs
8] Checked the Kafka messages: no messages found and no offset update (the check we run is sketched after this list)
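The topic check in steps 5 and 8 is roughly this (a sketch; the bootstrap address kafka:9092 is a placeholder for our broker service):

# Read the connector's target topic from the beginning and look for messages
# produced for the documents changed in step 3
kafka-console-consumer.sh \
  --bootstrap-server kafka:9092 \
  --topic Data_ConsistencyV2_Subscribe \
  --from-beginning \
  --timeout-ms 10000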

connector config:

{
  "name": "data-consistency-179e25212a6a0106b817b69c30462e9a",
  "config": {
    "connector.class": "com.couchbase.connect.kafka.CouchbaseSourceConnector",
    "couchbase.persistence.polling.interval": "0",
    "tasks.max": "2",
    "couchbase.trust.store.path": "/run/secrets/keystores/truststore/pki-common-truststore.jks",
    "couchbase.black.hole.topic": "Couchbase_Kafka_Connector_Black_Hole",
    "couchbase.log.document.lifecycle": "true",
    "couchbase.seed.nodes": "cluster-srv",
    "couchbase.source.handler": "com.amdocs.digital.ms.data.consistency.plugin.implementation.ResourceHandlerV2",
    "couchbase.enable.tls": "true",
    "couchbase.collections": "fnd.data-consistency",
    "couchbase.bucket": "com.amdocs.digital.ms.data.consistency",
    "couchbase.enable.dcp.trace": "true",
    "couchbase.stream.from": "SAVED_OFFSET_OR_NOW",
    "couchbase.username": "data-consistency.shyampa-ed-dev.svc.cluster.local",
    "amdocs.app.name": "data-consistency-cbkc-plugin",
    "name": "data-consistency-179e25212a6a0106b817b69c30462e9a",
    "couchbase.password": "${yaml:/run/secrets/csb-secrets/csb-data-consistency-couchbase-nts.yaml:com.amdocs.platform.servicebroker.couchbase.couchbase-nts.password}",
    "couchbase.trust.store.password": "${env:TRUSTSTORE_PASSWORD}",
    "couchbase.connector.name.in.offsets": "true",
    "couchbase.topic": "Data_ConsistencyV2_Subscribe"
  }
}

Hi @shyampa,

The Couchbase Technical Support Engineer handling your support ticket has been consulting with me on this issue. I encourage you to continue working through the official support channel. Any support given here on the forum is on a “best effort” basis, meaning we might not be able to respond quickly.

When you continue your conversation with the support engineer, it would be helpful to clarify some things about your environment:

  • I’m assuming the pod you’re talking about is running the Kafka Connect Distributed Worker process. Is that correct?
  • Where is the Kafka broker running? Same pod, or different pod?
  • Is it possible a different instance of the Couchbase connector is running on the same pod, and the messages were logged by that other instance? Please double-check.
  • Is there a reason you are not using SAVED_OFFSET_OR_BEGINNING? Using SAVED_OFFSET_OR_NOW means you’ll miss any document changes in a partition that happened while the connector is offline, unless the connector previously saved an offset for that partition. (NOTE: Each partition’s offset is stored separately).

If you’re changing the connector name each time you deploy it, then it won’t see the old saved offsets, and it will start streaming from “NOW”, and you’ll miss old changes. Is that what’s happening?
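One rough way to check which offsets are actually saved is to dump the worker's offsets topic (a sketch; "connect-offsets" is just a conventional value of offset.storage.topic and kafka:9092 is a placeholder, so substitute your own names). The record keys include the connector name, which is why a connector redeployed under a new name won't find its old offsets:

# Dump the Connect offsets topic, printing keys; each key embeds the connector name,
# so entries saved under an old connector name are invisible to a renamed connector
kafka-console-consumer.sh \
  --bootstrap-server kafka:9092 \
  --topic connect-offsets \
  --from-beginning \
  --property print.key=true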

Please let the support engineer know the answers, and we’ll continue handling the case via Couchbase Technical Support.

Thanks,
David


Hi @david.nault ,

Thanks for the support. Yes, the Couchbase technical team is working on this defect.

Answers to your queries:
1] Yes, the connector is running in distributed mode
2] The Kafka broker is running in a different pod
3] We currently deploy one service per pod, so there is only one connector instance on that pod; I verified with a GET /connectors API call, and no other running connector was found
4] I don't think SAVED_OFFSET_OR_BEGINNING is a good option for us.
We don't want to publish events for all of the documents when there are no saved offsets. Our customers have millions of documents, and to reach the current mutation Kafka Connect would also have to process all of the previous mutations, which is not something we want: it consumes resources, adds delays, etc.
The appropriate choice for us is SAVED_OFFSET_OR_NOW. We anticipate that, since we have offsets saved prior to executing DELETE /connectors/connector-name, the connector should resume from the saved offsets once it is recreated with POST /connectors. How is it different from SAVED_OFFSET_OR_BEGINNING, apart from the fact that SAVED_OFFSET_OR_BEGINNING starts from the oldest mutation, when they both save offsets in the Kafka Connect internal topics?

If there’s a saved offset for each partition, then both behave the same.