Consume items as a stream for a classic GetAll

Hello,
Just to validate: is there really no way to retrieve all elements as a stream?
I’m trying to retrieve all the documents of a collection as fast as possible, but despite using IAsyncEnumerable, it looks like all the items are retrieved before the first one is sent.
My request is a simple “SELECT s.* from MyCollection s”.

This is a pretty complicated topic, so please bear with me as I try to cover all the bases.

First, I wouldn’t recommend using a query to stream all the documents unless the collection is very small. It simply won’t scale well, as the query node will be unavailable for any other queries for the lifetime of the stream. Queries just aren’t designed for that kind of use case.

That said, you should be able to get a stream of results using IAsyncEnumerable. Possible reasons you might not get a stream are:

  1. The result set is small. The server is sending the data across as whole network packets, and probably does some batch collection, so it may simply have all the data available before it’s ready to send anything. It must also pass through .NET networking layers which may group data packets as well.
  2. The query isn’t streamable. I don’t think this is the case with your example, though. However, I’m not privy to when the server may or may not stream results, so I can’t say for certain.
  3. You are using a custom JSON deserializer that doesn’t support streaming. The built-in Newtonsoft.Json and System.Text.Json deserializers both support streaming.
  4. Of course, the fourth possibility is a bug in the streaming design. I believe we have it right, but I can’t guarantee it.
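One way to tell these cases apart is to measure how long the first item takes to arrive compared with the whole result. The following is a minimal, SDK-independent sketch: `StreamProbe`, `MeasureAsync`, and `SimulatedRows` are hypothetical names made up for illustration, and the simulated source merely stands in for a real query result.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class StreamProbe
{
    // Enumerates the source and records when the first item arrived
    // and when enumeration finished.
    public static async Task<(TimeSpan First, TimeSpan Total, int Count)> MeasureAsync<T>(
        IAsyncEnumerable<T> source)
    {
        var clock = Stopwatch.StartNew();
        var first = TimeSpan.Zero;
        var count = 0;

        await foreach (var _ in source)
        {
            if (count == 0)
            {
                first = clock.Elapsed; // time to first item
            }
            count++;
        }

        return (first, clock.Elapsed, count);
    }

    // Hypothetical stand-in for a streaming result: one item every delayMs.
    public static async IAsyncEnumerable<int> SimulatedRows(int rows, int delayMs)
    {
        for (var i = 0; i < rows; i++)
        {
            await Task.Delay(delayMs);
            yield return i;
        }
    }

    public static async Task Main()
    {
        var (first, total, count) = await MeasureAsync(SimulatedRows(rows: 5, delayMs: 50));
        Console.WriteLine(
            $"{count} rows, first after {first.TotalMilliseconds:F0} ms, " +
            $"all after {total.TotalMilliseconds:F0} ms");
    }
}
```

If the time to the first item is close to the total time when you pass in your real result enumeration, the data is probably being buffered somewhere along the way. With a result set as small as the one described in reason 1, the two numbers may be too close to say much either way.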

Now, let’s talk about other approaches to dealing with all the documents in a collection. I can’t say which approach is best for you, as I don’t know your specific use case, but I can offer some options.

First, you can use the Eventing service. It can process all the documents in a collection through a JavaScript function. It’s great for monitoring all mutations, much like a relational SQL trigger, but can also be used to process all documents in a bucket. If you need those documents to go outside JavaScript, you can use a cURL function to send them out in an HTTP request.

The second option is the Kafka connector. It similarly can subscribe to a collection and send documents forward to Kafka. I’ve also heard rumors of people plugging into the Java source of the connector to do other things.

Finally, the most advanced (and riskiest) option is to work directly with the Couchbase DCP protocol. This is the underlying protocol behind Eventing, the Kafka connector, and much more. It isn’t directly supported by the .NET SDK, and I’m not aware of any unofficial .NET implementation either. There is a Java variant, though: the Couchbase Java DCP Client (couchbase/java-dcp-client on GitHub).

@sarnold -

On top of what @btburnett3 said in his post, we do have a new feature in the works called RangeScan and it essentially does exactly what you are asking for. It won’t be GA for some time (likely 3.5.X), but there is a developer preview in later SDK versions:

var scan = coll.ScanAsync(
       new RangeScan(ScanTerm.Inclusive("key"), ScanTerm.Inclusive("key9999")),
       new ScanOptions().Timeout(TimeSpan.FromSeconds(2000)).IdsOnly(false));

// ScanAsync yields results asynchronously, so enumerate with await foreach
await foreach (var item in scan)
{
       //do something
}

Note that the API is volatile, meaning it is expected to change and it’s not QE tested.

Hi, first of all, thanks for this thorough response.
In my case, some contributors update my collection and I need to send the updated documents to my consumers. Unfortunately, I don’t think I can use the JS feature for this case.
For information, I have to retrieve something like 160 documents of 25 kB each.
Maybe I could flag updated documents so that I retrieve only those, but because several pods consume them, I think it will be complicated.

Thanks! I think this is exactly the feature I’m looking for.
I’m impatiently waiting for it now :)

Hello,

I want to try the preview version but can’t find the source.
How can I get this preview version to test it?

Thanks!

@sarnold

It’s actually baked into the latest SDK release. It’s just in preview and subject to change in subsequent releases or may have bugs.

Hello,

I have implemented the call in my code, but it doesn’t work.
It throws an exception saying that server version 7.5 is required, but I can only upgrade to 7.2 for the moment.

@sarnold -

KvRangeScan is an in-development feature for later server versions (7.5 and greater).

Jeff
