About Optimising Document Fetch From Couchbase Using Spring Couchbase Repository

Hey people, I am using Couchbase as my database in a Spring Boot application, and I am using the Spring Data Couchbase repository findAllById method to fetch documents.
In a single call I have to fetch around 50 documents, each with roughly 50 key-value pairs.

I have configured the KV_POOL thread count to 8, since I have 8 cores, but I see Couchbase taking a long time to respond. Could my requests be getting queued on the KV I/O threads?

I also read that deserialising can take time, so I tried migrating to a raw transcoder, but that did not help either.
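For reference, switching a get to the raw transcoder in the Couchbase Java SDK looks roughly like this (a sketch: `collection` and `id` are assumed to come from the surrounding application):

```java
import java.nio.charset.StandardCharsets;
import com.couchbase.client.java.Collection;
import com.couchbase.client.java.codec.RawJsonTranscoder;
import com.couchbase.client.java.kv.GetOptions;
import com.couchbase.client.java.kv.GetResult;

class RawTranscoderExample {
    // Sketch: fetch raw JSON bytes, skipping the SDK's default JSON decoding,
    // so the application decides when (and whether) to deserialize.
    String fetchRaw(Collection collection, String id) {
        GetResult result = collection.get(id,
                GetOptions.getOptions().transcoder(RawJsonTranscoder.INSTANCE));
        return new String(result.contentAsBytes(), StandardCharsets.UTF_8);
    }
}
```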

At the Couchbase server end every resource looks healthy. I feel I am missing something at the application level when interacting with Couchbase, which is why I am not seeing good performance.

For reference, this is how I am accessing the documents:

List<HPTPEntity> hptpEntities = Flux.fromIterable(hptpDocumentIds)
    .flatMap(id -> collection.reactive().get(id)
        .publishOn(Schedulers.boundedElastic()) // shifts decoding & mapping off Netty I/O threads
        .map(result -> Tuples.of(id, result))
        .onErrorResume(DocumentNotFoundException.class, ex -> Mono.empty()))
    .map(result -> {
        HPTPEntity httpEntity = stockJsonFormatter.deserializeData(
            new String(result.getT2().contentAsBytes(), StandardCharsets.UTF_8),
            HPTPEntity.class);
        httpEntity.setDocumentId(result.getT1());
        return httpEntity;
    }) // safe here
    .collectList()
    .block();

What is your latency to the cluster, e.g. the time to do a ping or similar? That is often the dominating performance consideration. And what latencies are you seeing in your application?
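One way to get that number from inside the application is the SDK's ping diagnostic (a sketch, assuming an already-connected `Cluster` instance):

```java
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.diagnostics.PingResult;

class PingLatencyExample {
    // Sketch: print per-endpoint round-trip latency as measured by the SDK.
    void printLatencies(Cluster cluster) {
        PingResult ping = cluster.ping();
        ping.endpoints().forEach((service, reports) ->
                reports.forEach(report ->
                        System.out.println(service + " -> " + report.latency())));
    }
}
```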

Totally agree with @graham.pople. I would suggest measuring the client-server round-trip time with pillowfight, or GitHub - mikereiche/loaddriver, and using that as a baseline. If everything is as it should be, throughput will be limited by network bandwidth.

Spring Data Couchbase has its own de/serialization that cannot be bypassed. It will be significantly slower than using the Couchbase Java SDK directly. The good news is that the Couchbase Java SDK can be called directly from the couchbaseClientFactory of a Spring Data Couchbase repository.
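For example, the underlying SDK objects can be obtained from the CouchbaseClientFactory bean (a sketch; the constructor injection and the use of the default collection are illustrative assumptions):

```java
import org.springframework.data.couchbase.CouchbaseClientFactory;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.Collection;

class DirectSdkAccessExample {
    private final CouchbaseClientFactory clientFactory;

    DirectSdkAccessExample(CouchbaseClientFactory clientFactory) {
        this.clientFactory = clientFactory; // injected by Spring in practice
    }

    void demo() {
        Cluster cluster = clientFactory.getCluster();             // full SDK cluster
        Collection collection = clientFactory.getDefaultCollection();
        // collection.get(...) / collection.reactive().get(...) now use the
        // SDK's own transcoding rather than Spring Data's converter.
    }
}
```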

collection.reactive().get(id).

That isn’t findAllById of a Spring Data Couchbase repository.

stockJsonFormatter.deserializeData

And it looks like the data is indeed being deserialized.

Just as a rough point of reference - running loaddriver with both the client and the (single-node) server on my MacBook Pro I get the result below. I don’t know the size of your documents (messages), I used 1024. You’ll need to look in the code to verify how avg is computed, but it looks to me like with one thread, each batch is executed in avg x batchsize microseconds. For this particular run 474 x 50 = 23700 microseconds which is 23.7 milliseconds.

% java -jar target/loaddriver-0.0.1-SNAPSHOT.jar --url couchbase://localhost --username Administrator --password password --bucket travel-sample --runseconds 10 --threads 1 --batchsize 50 --messagesize 1024

count: 1050200, requests/second: 105020, max: 74813, avg: 474, rq/s/thread: 105020, nthreads: 1, nRequestsPerSecond: 0, kvEventLoopThreadCount: 0, runSeconds: 10, timeoutUs: 2500000, thresholdUs: 500000, gcIntervalMs: 0, nKvConnections: 2, messageSize: 1024, schedulerThreadCount: 0, batchSize: 50, execution: reactive, transcoder: json, virtualThreads: false, cbUrl: couchbase://localhost, bucketname: travel-sample, asObject: true, sameId: false, shareCluster: true, operationType: get, durability: NONE

So @mreiche, one more thing: I checked the round trip and it is fairly low. How can I minimise the deserialisation time? I have kept deserialisation on a separate thread, so it should be fine.
I also wanted to ask: I currently have the Couchbase I/O thread count set to 8. Is that too many, given that I also have 8 cores? How many I/O threads does Couchbase recommend?

You could remove the deserialization completely to see if it is indeed the deserialization that is taking the time. If it is, you could try to reduce the deserialization time by trying a different serializer. Or delay deserialization by programming more “reactively”. For instance, if these 50 documents are to be displayed 5 per page, it is only necessary to deserialize the first 5 before displaying the first page.
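To illustrate the “delay deserialization” idea with plain Java streams (the deserialize method below is a hypothetical stand-in for something like stockJsonFormatter.deserializeData): streams are pull-based, so a limit(5) after map means only the five documents needed for the first page are ever deserialized.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class LazyDeserializeExample {
    static final AtomicInteger deserializeCalls = new AtomicInteger();

    // Hypothetical stand-in for a real JSON deserializer.
    static String deserialize(String raw) {
        deserializeCalls.incrementAndGet();
        return raw.toUpperCase();
    }

    public static void main(String[] args) {
        List<String> rawDocs = List.of("doc1", "doc2", "doc3", "doc4", "doc5",
                                       "doc6", "doc7", "doc8", "doc9", "doc10");

        // Streams are lazy: limit(5) short-circuits, so deserialize() runs
        // only for the five documents shown on the first page.
        List<String> firstPage = rawDocs.stream()
                .map(LazyDeserializeExample::deserialize)
                .limit(5)
                .collect(Collectors.toList());

        System.out.println(firstPage);              // first page only
        System.out.println(deserializeCalls.get()); // 5, not 10
    }
}
```

The same short-circuiting applies in Reactor: a `.take(5)` downstream of the deserializing `.map` cancels the rest of the work.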

If only parts of the documents are needed, it is possible to fetch only those parts of the document.
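Fetching only parts of a document is done with the sub-document API; roughly like this (a sketch: `collection`, `id`, and the field paths "name"/"price" are made up for illustration):

```java
import java.util.Arrays;
import com.couchbase.client.java.Collection;
import com.couchbase.client.java.kv.LookupInResult;
import static com.couchbase.client.java.kv.LookupInSpec.get;

class SubdocExample {
    // Sketch: retrieve just two fields instead of the whole 50-field document,
    // reducing both network payload and deserialization work.
    void fetchFields(Collection collection, String id) {
        LookupInResult result = collection.lookupIn(id,
                Arrays.asList(get("name"), get("price")));
        String name = result.contentAs(0, String.class);
        double price = result.contentAs(1, Double.class);
        System.out.println(name + ": " + price);
    }
}
```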

If you posted the measurements you are getting, along with the size and number of the documents, it might be easier to figure out what can be done.

@vaibhav123 what is ‘fairly less’ in cold hard numbers?

If it is still in the multiple milliseconds range then I would suggest focussing on reducing this as the priority, before starting to look at deserialization optimisations and particularly IO threads. Network latency will generally be the dominating factor, and there are techniques such as VPC peering you can use to minimise it.

I’m not saying deserialization is irrelevant - but personally I’d be looking at it only after getting the network as fast as possible. Not least because there may not be much you can do about it - things have to be deserialized.

One other thing you can experiment with is to increase the numKvConnections SDK parameter, which will allow more concurrent connections to KV. KV connections are already multiplexed, but it might help regardless.
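For reference, numKvConnections (and the event-loop thread count mentioned earlier) are set on the ClusterEnvironment. A sketch; the values 4 and 8 are arbitrary examples, and the exact builder methods may differ slightly between SDK 3.x versions:

```java
import com.couchbase.client.core.env.IoConfig;
import com.couchbase.client.core.env.IoEnvironment;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.ClusterOptions;
import com.couchbase.client.java.env.ClusterEnvironment;

class TunedConnectExample {
    Cluster connect() {
        // Sketch: more KV sockets per node, and an explicit event-loop size.
        ClusterEnvironment env = ClusterEnvironment.builder()
                .ioConfig(IoConfig.numKvConnections(4))               // KV sockets per node
                .ioEnvironment(IoEnvironment.eventLoopThreadCount(8)) // Netty I/O threads
                .build();
        return Cluster.connect("couchbase://localhost",
                ClusterOptions.clusterOptions("Administrator", "password")
                        .environment(env));
    }
}
```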