Couchbase structured streaming job skipped a few records while streaming

deepthi26 · May 18, 2025, 8:35am

I have a structured streaming job that streams from Couchbase, with the persistence polling interval set to 100ms. I observed a strange case where a single batch carried a huge load, the Spark job went to a second attempt, and it missed processing a few records in that batch. How can this happen? I am maintaining a checkpoint folder in HDFS.

mreiche · May 18, 2025, 8:37pm

Whenever I see a report of documents missing from a query, I recall that queries on indexes only find what is indexed. If there are documents that have not yet been indexed, they will not be found by the query. I wonder if that is what you are observing? It would be useful to have more details. Also, please open a ticket with customer support.
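
One way to gather those details might be to reconcile the document ids expected from the bucket against the ids the job actually wrote, so the skipped records can be named exactly. The following is only a minimal sketch: the function name `find_missing_ids` and the sample ids are hypothetical, and it assumes both id lists can be exported from the source and the sink by whatever means are available.

```python
from typing import Iterable, List


def find_missing_ids(expected_ids: Iterable[str], processed_ids: Iterable[str]) -> List[str]:
    """Return the ids present in the source but absent from the sink, sorted."""
    return sorted(set(expected_ids) - set(processed_ids))


# Hypothetical sample data. In practice `expected` would come from the bucket
# and `processed` from wherever the streaming job writes its output.
expected = ["doc::1", "doc::2", "doc::3", "doc::4"]
processed = ["doc::1", "doc::3"]

missing = find_missing_ids(expected, processed)
print(missing)
```

A list of the missing ids, together with the time they were written, would help tell apart records that were never delivered to the batch from records that were dropped when the batch was retried.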
