How to do getMulti on Couchbase SDK 3.0

I am getting an error that getMulti is not defined on the defaultCollection object.
On the other hand, with collection.get(key), the key has to be a string or a buffer.

Where is getMulti?

Also, is there an API reference? All I can find are the few examples in getting started…


Ok, looking at the source, it appears that there isn’t any getMulti option. Wondering why, I took a look at the source for 2.5, and apparently getMulti is just a sequence of singular gets, so basically just sugar, and I can perhaps see why it wasn’t carried over to 3.0.

I find this mind-blowing, actually. I’d always assumed getMulti is better than SELECT * FROM app USE KEYS, but apparently not. Presumably the query engine will do a better job of parallelizing the document fetching than I can in my Node environment.

Am I off?

Hey @naftali,

The Node.js SDK is optimized to pipeline and batch individual operations together internally. When you perform a large number of operations simultaneously, they are all batched into a single group of operations that is sent to the server to be executed. We take advantage of this internal behaviour in our getMulti implementation, which is why you see in 2.x we didn’t need to do anything special, and also why we dropped it from the 3.x SDKs.
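As a rough illustration, here is a minimal SDK 3.0-style sketch that leans on that batching; the connection details, bucket name and keys are placeholders, not taken from this thread:

const couchbase = require('couchbase')

async function main() {
  // Placeholder connection details; substitute your own.
  const cluster = new couchbase.Cluster('couchbase://localhost', {
    username: 'Administrator',
    password: 'password',
  })
  const collection = cluster.bucket('app').defaultCollection()

  // Issue the gets without awaiting each one individually; the SDK
  // pipelines them into the same batch of network operations.
  const keys = ['somekey1', 'somekey2', 'somekey3']
  const results = await Promise.all(keys.map((key) => collection.get(key)))
  results.forEach((res, i) => console.log(keys[i], res.content))
}

main().catch(console.error)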

Cheers, Brett


@brett19

thank you for that.

2 questions:

So practically speaking, instead of a getMulti I just do something like Promise.all([get(key), …])?

Also, could you help me understand what sending a batch of key-value ops to the server precisely means?

I had always thought that the query engine has the intelligence to deal with batches, but that the data nodes are dumb key-value lookup machines.

I guess that’s too simple a model?

When you say the SDK sends the batches to the server, do you mean that the SDK can send a batch of keys that belong to a particular data node, and the data node has the intelligence to return those docs in unison?

Or do we have to involve the query node?

Hey @naftali,

Your example of using Promise.all is dead on. For instance, in the following example, both calls would take pretty much an identical amount of time, as each essentially executes only one network call:

// A single get:
var result = await collection.get('somekey1')

// Three gets issued together; the SDK batches them into a single
// group of network operations, so this takes roughly as long as
// the single get above:
var results = await Promise.all([
  collection.get('somekey1'),
  collection.get('somekey2'),
  collection.get('somekey3')
])

Your model of the server’s performance is actually backwards from reality. When you use N1QL to perform your operation, the query service then has to go out to the KV nodes to execute those operations anyway, which adds an extra round trip to your request, and N1QL cannot batch any better than the SDKs (which makes sense since they actually use the SDKs). If you already know which keys you want to access, performing those operations directly via KV will be vastly faster than performing them via N1QL.
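To make the comparison concrete, here is a rough sketch of the two paths; the bucket name, the keys array, and the cluster/collection objects are assumed to already exist:

// Direct KV: the SDK batches these gets and talks straight to the data nodes.
const kvResults = await Promise.all(keys.map((key) => collection.get(key)))

// N1QL: the query service parses the statement, then fetches the same
// documents from the KV nodes itself, adding an extra hop.
const queryResult = await cluster.query(
  'SELECT META(d).id AS id, d.* FROM `app` AS d USE KEYS $keys',
  { parameters: { keys } }
)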

Cheers, Brett


“which makes sense since they actually use the SDKs”

Solid gold. Thank you so much

Hello @brett19, thank you for all your help.

I’m trying to put together a demo for some colleagues to illustrate, among other things, the upshot of this discussion, so I wrote the attached script to demo the timing differences between the query approach and the direct KV approach.

I’m getting strange results: the performance is practically identical, with the timings within single-digit milliseconds of each other, and which approach comes in first is completely hit or miss.

The topology is a straightforward 1 query node, 1 data node deployment, with both nodes on separate AWS spot instances.

The only explanation I can conjure is that since I’m testing over FiOS (pinging the query node or data node takes 10-20 ms), the gain from skipping the extra network hop is counterbalanced by some overhead in the Node SDK’s batching process versus the C/Go/whatever form of the SDK that the query engine uses. But I have also tried with large document payloads, which slows the timings down across the test runs (and should exacerbate the cost of the extra network hop), yet I’m getting the same roughly equivalent performance statistics as above.

Can you take a look at my small script and maybe shed a bit of light on where my approach went wrong?

demo.zip (908 Bytes)
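For context, a minimal sketch of the kind of timing comparison described; this is illustrative only, not the attached script, and the connection details and keys are placeholders:

const couchbase = require('couchbase')

async function main() {
  // Placeholder connection details and keys; not the contents of demo.zip.
  const cluster = new couchbase.Cluster('couchbase://query-node-host', {
    username: 'Administrator',
    password: 'password',
  })
  const collection = cluster.bucket('app').defaultCollection()
  const keys = ['doc::1', 'doc::2', 'doc::3']

  // Direct KV fetch, batched by the SDK.
  let start = process.hrtime.bigint()
  await Promise.all(keys.map((key) => collection.get(key)))
  console.log('kv   :', Number(process.hrtime.bigint() - start) / 1e6, 'ms')

  // Same keys fetched through the query service.
  start = process.hrtime.bigint()
  await cluster.query('SELECT d.* FROM `app` AS d USE KEYS $keys', {
    parameters: { keys },
  })
  console.log('query:', Number(process.hrtime.bigint() - start) / 1e6, 'ms')
}

main().catch(console.error)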

Hey @naftali,

You are correct that over a WAN to the server, the performance difference between KV and N1QL would be negligible, as the N1QL <-> KV hop is <1 ms whereas your WAN is 10-20 ms. The performance impact is effectively hidden in the jitter of the WAN network.

Cheers, Brett


Wow, awesome. I had no idea sub-ms latency was even possible, but I guess this has to do with both instances being in the same AWS region.

Hi @brett19 ,

I wonder how SDK 2 handled this getMulti? Because with a Promise.all, if we have hundreds of keys it will be a pain for the JS main thread, and we get delays.

A basic get run in isolation gives one response time, but if we do a

Promise.all([
  bucket.get("samekey"),
  bucket.get("samekey"),
  // ...
])

we get a much bigger time.

Hi @laurentiustroia,
When processing hundreds of keys, I sometimes use a Promise.allLimit wrapper that limits the number of concurrent requests. Here are some examples: https://gist.github.com/jcouyang/632709f30e12a7879a73e9e132c0d56b
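If you prefer to hand-roll it, here is a minimal sketch of the same idea, processing the keys in fixed-size chunks so only a limited number of gets are in flight at once (the function name and default chunk size are just illustrative):

// `collection` is assumed to be an SDK 3.x Collection instance.
async function getInChunks(collection, keys, limit = 100) {
  const results = []
  for (let i = 0; i < keys.length; i += limit) {
    const chunk = keys.slice(i, i + limit)
    // Each chunk is still batched by the SDK into a single group of operations.
    results.push(...(await Promise.all(chunk.map((key) => collection.get(key)))))
  }
  return results
}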

Hi @brett19,
It’s really new to me that when you perform a large number of operations simultaneously, these are all batched together into a single group of operations that is sent to the server to be executed.
Can you please share a reference link to any documentation where this is mentioned?