How does the getBulk API work?
Hi,
In the Java Couchbase library
there is an API called getBulk, which fetches multiple values for multiple keys.
However, I was wondering how it is implemented.
I am pretty sure it does not call the get API multiple times...
Given that I am using client-side Moxi, can someone tell me roughly how it works?
My understanding is that when the client is given a set of keys, Moxi routes the client to the master server of one of those keys. The client then receives the values of all the keys from that server, whether or not it is the master server for each of them. I will draw a rough diagram.
Replica is set to 3
Cluster 1 Cluster2 Cluster3 Cluster4 Cluster5 Cluster6 Cluster7 Cluster8
Key A KeyA KeyB KeyB KeyA KeyB
If the keys are stored like that, how does the client actually work when I call getBulk(KeyA, KeyB)?
https://github.com/membase/memcached/blob/engine/include/memcached/proto...
This is the file describing the binary protocol used for CRUD operations on keys in Couchbase. There is no single operation that lets you specify a list of keys. Instead, clients usually write several "quiet" operations to the socket, followed by a NOOP operation, to pipeline them.
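To make the "quiet operations followed by NOOP" idea concrete, here is a minimal sketch of how such a pipelined request could be laid out in one buffer. It is not taken from any client; the class and method names are made up for illustration, and the vbucket id is left at 0 as a placeholder (a real Couchbase client would fill in the vbucket for each key). The header layout and the opcodes (GETKQ = 0x0d, NOOP = 0x0a) follow the memcached binary protocol.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any Couchbase or memcached client library.
final class QuietGetPipeline {
    static final byte MAGIC_REQUEST = (byte) 0x80;
    static final byte OP_NOOP = 0x0a;
    static final byte OP_GETKQ = 0x0d; // quiet get; a hit response echoes the key
    static final int HEADER_LEN = 24;

    // Writes one 24-byte request header followed by the key (no extras, no value).
    static void putRequest(ByteBuffer buf, byte opcode, byte[] key, int opaque) {
        buf.put(MAGIC_REQUEST);
        buf.put(opcode);
        buf.putShort((short) key.length); // key length
        buf.put((byte) 0);                // extras length
        buf.put((byte) 0);                // data type
        buf.putShort((short) 0);          // vbucket id: placeholder (assumption)
        buf.putInt(key.length);           // total body length = extras + key + value
        buf.putInt(opaque);               // echoed back by the server
        buf.putLong(0L);                  // CAS
        buf.put(key);
    }

    // Packs one GETKQ per key plus a trailing NOOP into a single buffer.
    static ByteBuffer build(List<String> keys) {
        byte[][] raw = new byte[keys.size()][];
        int size = HEADER_LEN; // room for the trailing NOOP
        for (int i = 0; i < raw.length; i++) {
            raw[i] = keys.get(i).getBytes(StandardCharsets.UTF_8);
            size += HEADER_LEN + raw[i].length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size); // big-endian by default (network order)
        for (int i = 0; i < raw.length; i++) {
            putRequest(buf, OP_GETKQ, raw[i], i);
        }
        putRequest(buf, OP_NOOP, new byte[0], raw.length);
        buf.flip(); // ready to be written to the socket in one go
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = build(Arrays.asList("KeyA", "KeyB"));
        System.out.println(buf.remaining() + " bytes ready for a single write");
    }
}
```

The whole buffer can go out in one write, and because the server always answers the NOOP, the client knows where the batch ends even if every key was a miss.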
https://github.com/couchbase/libcouchbase/blob/master/src/get.c#L107-L136
I think this is almost what you need, but I cannot give you a link into the Java client right now.
If you take a look at the getBulk API in the Java library,
it says: "Using the bulk methods is more efficient than multiple single requests as the operation can be conducted in a single network call."
It actually says that it is one single network call.
That is why I asked this question.
Link to the Java library's API documentation for getBulk:
http://www.couchbase.com/docs/couchbase-sdk-java-1.0/couchbase-sdk-java-...
Actually, you cannot get several keys in a single network call, because your keys could live on different nodes. All the clients give you an efficient abstraction by using the GETQ command and buffering requests. For example, libcouchbase buffers the requests and scatters them to different sockets (when there are multiple nodes in the cluster), then runs the event loop, which picks the data up from the binary buffers and writes it out to the network. Here we can of course say that there was a single network send operation per server. I guess the Java client does something similar. Separate network calls would add a barrier after each operation, to make sure the previous response has been received.
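As a rough illustration of the scatter step, here is a small sketch that groups keys by the node that owns them, so that each node gets exactly one batch. The `locator` function is a stand-in I made up: a real client would derive the node from the vbucket map (Couchbase buckets) or a ketama ring (plain memcached). The node names and the key-to-node assignment in `main` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not the actual client code.
final class KeyGrouping {
    // Buckets the keys by owning node; each bucket becomes one pipelined send.
    static Map<String, List<String>> groupByNode(Collection<String> keys,
                                                 Function<String, String> locator) {
        Map<String, List<String>> byNode = new LinkedHashMap<>();
        for (String key : keys) {
            byNode.computeIfAbsent(locator.apply(key), node -> new ArrayList<>()).add(key);
        }
        return byNode;
    }

    public static void main(String[] args) {
        // Assumed ownership table standing in for the real locator.
        Map<String, String> master = new LinkedHashMap<>();
        master.put("KeyA", "node1");
        master.put("KeyB", "node3");

        Map<String, List<String>> batches =
                groupByNode(Arrays.asList("KeyA", "KeyB"), master::get);
        batches.forEach((node, keys) -> System.out.println(node + " <- " + keys));
    }
}
```

With two keys owned by two different nodes you end up with two batches, so two sends, one per server.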
Consider the following links to the Java client implementation:
1) Constructing the requests: https://github.com/dustin/java-memcached-client/blob/master/src/main/jav...
Here the client groups the keys by node using the locator algorithm (vbucket distribution for Couchbase buckets, ketama distribution for plain memcached).
2) Building the multi-get: https://github.com/dustin/java-memcached-client/blob/master/src/main/jav...
Here the client constructs a binary buffer from the given key list (from step 1). Note that these are separate memcached commands with a NOOP at the end, but they are sent in bulk later.
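For completeness, the receiving side of such a batch can be sketched like this. It is a simplified illustration, not the client's actual code: it assumes the whole response stream is already in one buffer and treats values as UTF-8 strings, whereas a real client reads from the socket incrementally and decodes values with a transcoder. The class name and the `putResponse` helper (which only fabricates server bytes for the demo) are made up. The point it shows is that quiet gets send nothing back for a miss, so the NOOP response is what marks the end of the batch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any client library.
final class BulkResponseReader {
    static final byte MAGIC_RESPONSE = (byte) 0x81;
    static final byte OP_NOOP = 0x0a;
    static final byte OP_GETKQ = 0x0d;
    static final int HEADER_LEN = 24;

    // Collects hits until the NOOP response arrives; missing keys simply never show up.
    static Map<String, String> read(ByteBuffer in) {
        Map<String, String> hits = new LinkedHashMap<>();
        while (in.remaining() >= HEADER_LEN) {
            byte magic = in.get();
            byte opcode = in.get();
            int keyLen = in.getShort() & 0xffff;
            int extrasLen = in.get() & 0xff;
            in.get();                          // data type (unused here)
            int status = in.getShort() & 0xffff;
            int bodyLen = in.getInt();
            in.getInt();                       // opaque (unused here)
            in.getLong();                      // CAS (unused here)
            if (magic != MAGIC_RESPONSE) {
                throw new IllegalStateException("not a response packet");
            }
            byte[] body = new byte[bodyLen];
            in.get(body);
            if (opcode == OP_NOOP) {
                return hits;                   // end-of-batch marker
            }
            if (opcode == OP_GETKQ && status == 0) {
                // body = extras (flags) + key + value
                String key = new String(body, extrasLen, keyLen, StandardCharsets.UTF_8);
                String value = new String(body, extrasLen + keyLen,
                        bodyLen - extrasLen - keyLen, StandardCharsets.UTF_8);
                hits.put(key, value);
            }
        }
        throw new IllegalStateException("stream ended before the NOOP response");
    }

    // Demo only: fabricates a response packet the way a server would send it.
    static void putResponse(ByteBuffer buf, byte opcode, String key, String value) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        int extras = (opcode == OP_NOOP) ? 0 : 4; // get responses carry 4 bytes of flags
        buf.put(MAGIC_RESPONSE);
        buf.put(opcode);
        buf.putShort((short) k.length);
        buf.put((byte) extras);
        buf.put((byte) 0);
        buf.putShort((short) 0);                  // status 0 = no error
        buf.putInt(extras + k.length + v.length);
        buf.putInt(0);
        buf.putLong(0L);
        if (extras > 0) {
            buf.putInt(0);                        // flags
        }
        buf.put(k);
        buf.put(v);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(256);
        putResponse(buf, OP_GETKQ, "KeyA", "valueA"); // KeyB is a miss: nothing is sent for it
        putResponse(buf, OP_NOOP, "", "");
        buf.flip();
        System.out.println(read(buf));
    }
}
```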
Thanks! Now everything seems really clear to me!
Hmm, the spacing was ignored for some reason.
I will explain again.
Replica: 3
Cluster: 8
Key A is stored in Cluster 1, 2, 6
Key B is stored in Cluster 3, 4, 7
If the keys are stored like that, how does the client actually work when I call getBulk(KeyA, KeyB)?