Node.js based client library for Couchbase Server

Hi folks!

I would like to make an open source database driver for Couchbase Server and so far I have found these resources:

http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-client-deve...
http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-client-deve...
http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-client-deve...

If I understand it correctly, the client library would need to speak the memcached binary protocol and also support the HTTP REST-based protocol for vBucket configuration and view querying. Should I start with a basic implementation of the memcached binary protocol and then look at vBucket hashing and authentication, or is there a better recommended approach?

Thanks for your time and help.

1 Answer

I would start by implementing the vBucket hashing and authentication first, but you can approach the client from either direction. If you decide to do the vBucket hashing stuff first then you will want your client to send memcached protocol messages to port 11210 (not 11211). As for getting the bucket configuration stuff you will want to connect to port 8091.

If you go the other route and want to start getting the memcached protocol working then you will want to go through port 11211. This is because there is a proxy server that listens on 11211 and will route memcached protocol messages to the correct server. The downside to doing things this way is that this adds some extra latency to your requests.

Other than that, I want to mention that in our other clients we only use the CouchDB HTTP protocol to take advantage of views. We do this because the memcached protocol is much faster than the HTTP protocol.
It's great to hear that you will be creating a Couchbase node.js client. Definitely let us know if you have any other questions.

So the starting point for me should be to first study this [1] document about vBuckets, then proceed with the implementation of the vBucket-specific REST/JSON parts of the Couchbase [2] manual, and then, once I know which server to talk to, implement SASL [3] authentication? Sorry if this is not the correct order of doing things, but I'm a little bit confused by the order of topics in the manual, where authentication is mentioned in the last part.

[1]: http://dustin.github.com/2010/06/29/memcached-vbuckets.html
[2]: http://docs.couchbase.org/couchbase-manual-2.0/couchbase-client-developm...
[3]: http://docs.couchbase.org/couchbase-manual-2.0/couchbase-client-developm...

Yo Yojimbo87,

This is awesome!

It's not really all that fancy, but I'd started putting together a node.js client side implementation of the REST bootstrapping to get a configuration, needed to then do hashing and work with server changes. I'd imagine it's nowhere near useful yet, but it gave me a chance to start working with node.js and Couchbase.

There is some existing code out there which implements memcached binary protocol, so I was hoping to leverage that underneath.

I think the steps you lay out are good, but I wouldn't get too hung up on the vbuckets if you don't understand them at first. It's important material, but it can be a bit involved. I'd instead read it over then look at the REST/JSON parts.

SASL authentication is very straightforward. It's really just specific operations that need to be done before allowing other operations to flow through a given connection. It is through SASL auth that a bucket is identified on the server. No SASL auth means use the default bucket.
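
As a rough illustration of the "specific operations" mentioned above, here is a minimal sketch of a SASL Auth request in the memcached binary protocol (opcode 0x21) using the PLAIN mechanism. Using the bucket name as the SASL user name is an assumption drawn from the explanation above, and the function name and the example credentials are made up for illustration.

```javascript
// Sketch only: builds a binary protocol "SASL Auth" request with the PLAIN
// mechanism. buildSaslPlainAuth is a hypothetical helper, not part of any
// existing library. Assumption: the bucket name doubles as the SASL user name.
const MAGIC_REQUEST = 0x80;
const OPCODE_SASL_AUTH = 0x21;
const HEADER_LENGTH = 24;

function buildSaslPlainAuth(bucketName, password, opaque = 0) {
  // The mechanism name travels in the key field.
  const key = Buffer.from('PLAIN', 'ascii');
  // PLAIN payload: empty authorization id, NUL, user name, NUL, password.
  const value = Buffer.from('\0' + bucketName + '\0' + password, 'utf8');

  const header = Buffer.alloc(HEADER_LENGTH); // zero-filled
  header.writeUInt8(MAGIC_REQUEST, 0);
  header.writeUInt8(OPCODE_SASL_AUTH, 1);
  header.writeUInt16BE(key.length, 2);                // key length
  // Bytes 4..7 (extras length, data type, vBucket id) stay 0.
  header.writeUInt32BE(key.length + value.length, 8); // total body length
  header.writeUInt32BE(opaque >>> 0, 12);             // echoed back by the server
  // Bytes 16..23 (CAS) stay 0.
  return Buffer.concat([header, key, value]);
}

// Example with made-up credentials.
const authPacket = buildSaslPlainAuth('mybucket', 'secret', 1);
console.log(authPacket.length);
```

The packet would be written to the socket before any other operation on that connection; the server's response status then tells the client whether the bucket was selected.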

You can see my REST bootstrap here:
https://github.com/couchbaselabs/node-experiment

I'd love to collaborate with you on this! My node skills are nascent, but I know Javascript pretty well and am intimately familiar with the rest of Couchbase.

- Matt

I would implement some of the memcached protocol before implementing SASL since it isn't necessary to create a connection with Couchbase (Just create a non-SASL bucket). The reason for this is that you will probably want to test that your client is able to get and set keys from Couchbase. There's a decent amount of work before you get to the SASL stuff so when you start getting to that point post another question on the forums and I will walk you through what you need to do.

Let me know if you have any other questions.

Agreed with that, but I'd say it a different way.

It's probably best to focus on getting the cluster configuration from REST, then doing simple operations against the default bucket. Those simple ops should be get/set (over binary memcached protocol) and view access (over HTTP). After that's done, it's not much harder to start adding SASL auth for the connections, additional operations, etc.
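
To make the "simple ops" concrete, here is a minimal sketch of how get and set requests could be encoded for the binary memcached protocol. The 24-byte header layout and the 8 bytes of set extras (flags, expiry) follow the binary protocol as I understand it; the helper names are hypothetical, and sending the packets over a socket is left out.

```javascript
// Sketch only: encodes GET (0x00) and SET (0x01) requests.
// buildRequest, buildGet and buildSet are hypothetical helper names.
const HEADER_LENGTH = 24;
const MAGIC_REQUEST = 0x80;
const OPCODE_GET = 0x00;
const OPCODE_SET = 0x01;

function buildRequest(opcode, key, extras, value, vbucketId, opaque) {
  const keyBuf = Buffer.from(key, 'utf8');
  const header = Buffer.alloc(HEADER_LENGTH); // zero-filled
  header.writeUInt8(MAGIC_REQUEST, 0);
  header.writeUInt8(opcode, 1);
  header.writeUInt16BE(keyBuf.length, 2);  // key length
  header.writeUInt8(extras.length, 4);     // extras length
  // Byte 5 (data type) stays 0.
  header.writeUInt16BE(vbucketId, 6);      // vBucket id (0 when going through the proxy port)
  header.writeUInt32BE(extras.length + keyBuf.length + value.length, 8);
  header.writeUInt32BE(opaque, 12);
  // Bytes 16..23 (CAS) stay 0.
  // Body order is extras, then key, then value.
  return Buffer.concat([header, extras, keyBuf, value]);
}

function buildGet(key, vbucketId = 0, opaque = 0) {
  return buildRequest(OPCODE_GET, key, Buffer.alloc(0), Buffer.alloc(0), vbucketId, opaque);
}

function buildSet(key, value, { flags = 0, expiry = 0, vbucketId = 0, opaque = 0 } = {}) {
  const extras = Buffer.alloc(8);
  extras.writeUInt32BE(flags, 0);
  extras.writeUInt32BE(expiry, 4);
  return buildRequest(OPCODE_SET, key, extras, Buffer.from(value, 'utf8'), vbucketId, opaque);
}
```

A prototype could write `buildSet('greeting', 'hello')` to a connection on the default bucket and then read back the 24-byte response header before adding anything more elaborate.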

Note that there is some good memcached node.js code out there already. A listing with some good nuggets is here:

https://github.com/joyent/node/wiki/modules

Note this one in particular: https://github.com/billywhizz/node-memcache-parser

Thank you guys for the clarification, it's a great help for me. I did a little "research" on the existing node.js modules that support the memcached binary protocol and CouchDB REST views:

https://github.com/3rd-Eden/node-memcached (3rd-Eden is a well known guy from the node.js community)
https://github.com/elbart/node-memcache
https://github.com/cloudhead/cradle (I'm using cradle in my node.js projects to work with CouchDB)
https://github.com/dscape/nano (lightweight client with straightforward API for CouchDB which could be a good option for working with views)

So I will try to get my head around REST stuff from the manual first and then look at the memcached binary protocol in order to get a bigger picture and then start some simple prototyping.

I finally got to do some work on the driver. A lot of it should be simplified by the recent node.js 0.6.x branch, which now supports reading various data types directly from a buffer [1]. I have several questions:
1. I installed Couchbase server 2.0 on a VM and can get a JSON response similar to those in the manual [2]. If I understand it correctly, this JSON will be used to calculate the vBucket ID, which would be encoded into future requests sent through port 11210. However, I would like to skip this encoding part for now and first implement the binary stuff (I guess through port 11211, which should send requests to their correct destination without the need to calculate vBucket stuff from the JSON). Are my assumptions correct?
2. JSON response which I was able to get from the server (through localhost:8091/pools/default/bucketsStreaming/default) is like this:
{"name":"default","bucketType":"membase","authType":"sasl","saslPassword":"","proxyPort":0,"uri":"/pools/default/buckets/default","streamingUri":"/pools/default/bucketsStreaming/default","flushCacheUri":"/pools/default/buckets/default/controller/doFlush","nodes":[{"couchApiBase":"http://127.0.0.1:5984/default","replication":0.0,"clusterMembership":"active","status":"healthy","thisNode":true,"hostname":"127.0.0.1:8091","clusterCompatibility":1,"version":"2.0.0r-1-ge4c8742","os":"i686-pc-linux-gnu","ports":{"proxy":11211,"direct":11210}}],"stats":{"uri":"/pools/default/buckets/default/stats","directoryURI":"/pools/default/buckets/default/statsDirectory","nodeStatsListURI":"/pools/default/buckets/default/nodes"},"nodeLocator":"vbucket","vBucketServerMap":{"hashAlgorithm":"CRC","numReplicas":1,"serverList":["127.0.0.1:11210"],"vBucketMap":[[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1],[0,-1]]},"bucketCapabilitiesVer":"sync-1.0","bucketCapabilities":["touch","sync","couchapi"]}
What I want to ask is: why is bucketType set to membase when the server itself was set to couchbase (I also checked it through the admin interface)?

[1]: http://nodejs.org/docs/v0.6.1/api/buffers.html
[2]: http://docs.couchbase.org/couchbase-manual-2.0/couchbase-client-developm...
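
For illustration, the vBucketServerMap in the response above could be used roughly as follows to pick a vBucket ID and a server for a key. This is a sketch under assumptions: the mapping step (CRC-32 of the key, shifted right by 16 bits, masked to 15 bits, then reduced to the number of vBuckets) is my understanding of what other clients do for hashAlgorithm "CRC" and should be checked against a reference implementation. The function names are hypothetical.

```javascript
// Table-driven CRC-32 (reflected polynomial 0xEDB88320).
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
  }
  CRC_TABLE[n] = c >>> 0;
}

function crc32(buf) {
  let crc = 0xFFFFFFFF;
  for (const byte of buf) {
    crc = CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >>> 8);
  }
  return (crc ^ 0xFFFFFFFF) >>> 0;
}

// ASSUMPTION: the key is hashed as (crc32 >> 16) & 0x7fff and then masked
// to the vBucket count, which must be a power of two for the mask to work.
function vbucketIdForKey(key, numVBuckets) {
  const hash = (crc32(Buffer.from(key, 'utf8')) >>> 16) & 0x7FFF;
  return hash & (numVBuckets - 1);
}

// Looks up the master server for a key; the first entry of each vBucketMap
// row is treated as the index of the master in serverList (-1 means none).
function serverForKey(key, vBucketServerMap) {
  const { serverList, vBucketMap } = vBucketServerMap;
  const vbucketId = vbucketIdForKey(key, vBucketMap.length);
  const masterIndex = vBucketMap[vbucketId][0];
  return { vbucketId, server: masterIndex >= 0 ? serverList[masterIndex] : null };
}

// Usage with a map shaped like the one in the response above.
const exampleMap = {
  hashAlgorithm: 'CRC',
  numReplicas: 1,
  serverList: ['127.0.0.1:11210'],
  vBucketMap: Array.from({ length: 16 }, () => [0, -1]),
};
console.log(serverForKey('some-key', exampleMap));
```

With a single node every vBucket maps to the same server, so the result only becomes interesting once the cluster has more than one entry in serverList.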

1. Your assumptions are correct. Just make sure when you set things up you are connecting to the default bucket. This will be the easiest case to start with (Later you can add SASL support).

2. The bucketType being Membase is something that is left over from the changes that have been made in the product lately. As long as the bucketType isn't memcached you should be ok.

Thanks Mike for your response. Here is another question:

3. Let's say I have this scenario: I send two commands (I'll call them A and B) to the Couchbase server, where command A takes 10 seconds to execute and command B takes only 1 second. Do I receive the result of command B first (because it was processed faster than command A), or are my commands processed and returned in the exact order in which they were received?

You should plan for them to possibly come out of order. If they happen to use the same connection, they would be in order, but there is no guarantee of that. Especially consider the situation where command A has to do a disk fetch and command B is just a memory operation (from the server perspective).

Mike is right here. Technically speaking, the problem is that the way we did bucketType doesn't address the way we've extended the system. We probably should have had something more like bucketCapabilities which could include multiple answers, like Membase, Couchbase (which extends from Membase), etc.

What we're doing at the moment in other client libraries is discerning capabilities from whether or not a couchAPI URI is present in the JSON response. Lack of a URI there should reliably fail, but you can just make it fail immediately (don't even try) from the client if it's not implemented. This way it can work with both Membase 1.x and Couchbase 2.x.
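
A minimal sketch of that capability check, assuming the per-node couchApiBase field seen in the JSON response earlier in this thread is the "couchAPI URI" being referred to. The function names are hypothetical.

```javascript
// Collects the view (CouchDB API) base URLs advertised by the nodes, if any.
function viewBaseUrls(bucketConfig) {
  const nodes = Array.isArray(bucketConfig.nodes) ? bucketConfig.nodes : [];
  return nodes
    .map((node) => node.couchApiBase)
    .filter((base) => typeof base === 'string' && base.length > 0);
}

// True when at least one node advertises a couchApiBase.
function supportsViews(bucketConfig) {
  return viewBaseUrls(bucketConfig).length > 0;
}

// Fails immediately on the client side instead of attempting an HTTP call
// that cannot succeed (for example against a Membase 1.x cluster).
function requireViewBaseUrl(bucketConfig) {
  const urls = viewBaseUrls(bucketConfig);
  if (urls.length === 0) {
    throw new Error('This bucket does not advertise a couchApiBase; views are unavailable');
  }
  return urls[0];
}

// Usage with trimmed-down configs.
const couchbase2 = { nodes: [{ hostname: '127.0.0.1:8091', couchApiBase: 'http://127.0.0.1:5984/default' }] };
const membase1 = { nodes: [{ hostname: '127.0.0.1:8091' }] };
console.log(supportsViews(couchbase2), supportsViews(membase1));
```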

And how do the existing client libraries deal with commands processed "out of order"? For example, when I send two request packets with the same command type, how do I know which of the received response packets belongs to which of my requests? (Possibly the opaque field in the header can be used to match a response to its request?)

I'm sorry for the exceptionally long delay.

Most clients either use callbacks or some other mechanism to allow a queue of operations to be processed and then returned or they use a connection per caller from a connection pool. Note that the server side will always process operations in order.

I have to apologize for the delay in response. I was out of town when you last replied, and I'd not looked back to the thread until recently.

It's ok, I understand that you guys are now busy with the world tour and other stuff.

I was also using a queue-with-callbacks mechanism when I was prototyping a binary driver for OrientDB, since it was guaranteed that the server processed commands in order. However, could the opaque field in the request/response headers be used to identify the operation for the client?

Yes, the opaque field can be used for that exactly.

You will receive the responses in the same order as your requests, and each response is guaranteed to have the same opaque field as in the request.
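
A minimal sketch of matching responses to requests via the opaque field. It assumes the 24-byte binary protocol header with the opaque value at byte offset 12 and the response status at offset 6; the class and method names are hypothetical.

```javascript
// Tracks callbacks by opaque value so that a response can be paired with
// the request that produced it, regardless of arrival order.
class PendingRequests {
  constructor() {
    this.nextOpaque = 0;
    this.callbacks = new Map();
  }

  // Registers a callback and returns the opaque value to put in the request header.
  register(callback) {
    const opaque = this.nextOpaque;
    this.nextOpaque = (this.nextOpaque + 1) >>> 0; // wrap around at 32 bits
    this.callbacks.set(opaque, callback);
    return opaque;
  }

  // Dispatches one response header; returns false if the opaque is unknown.
  handleResponseHeader(header) {
    if (header.length < 24 || header.readUInt8(0) !== 0x81) {
      throw new Error('Not a binary protocol response header');
    }
    const status = header.readUInt16BE(6);
    const opaque = header.readUInt32BE(12);
    const callback = this.callbacks.get(opaque);
    if (!callback) return false;
    this.callbacks.delete(opaque);
    callback(status, header);
    return true;
  }
}
```

The value returned by `register` would be written into bytes 12..15 of the outgoing request header, and `handleResponseHeader` would be called once per response read from the socket.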

Hi guys,

Although this thread hasn't been active for a while, it looks like the best place to get the attention of anyone who knows anything about working with Couchbase from Node.js.
I'm about to migrate an existing Node.js app from Redis to Couchbase. It's pre-production (launch slated for late October), so I'm using Couchbase 2.0 developer preview. I only currently need CRUD operations and compare-and-set, so I'm leaning towards using a memcached client like https://github.com/3rd-Eden/node-memcached, and switching to a full-feature Couchbase client once one is available.
However, if there is a production-ready client that is easy enough to use and is expected to be officially supported by Couchbase in the future, I would like to start using it right away and skip the extra step. Is there such a client you could recommend? I'm aware of https://github.com/Wizcorp/node-couchbase and of https://github.com/trondn/couchnode, but it doesn't look like either of them would be very easy to use.

Many many thanks,
Near Privman

https://github.com/couchbase/couchnode is the node.js client. It is still a work in progress.

If you have ideas on how it may be improved, let us know (preferably in #libcouchbase on freenode, or on the mailing list; couchbase-at-googlegroups . com).

If you look through the test files in the couchnode repository, you'll see some example cases. The documentation in the README.md file may be dated.

There has been a lot of work recently on the couchnode client, which is based on libcouchbase. If you have some input on what makes the client difficult to use, and maybe some ideas for how to enhance it, please join #libcouchbase or post your answers here (or preferably on couchbase at googlegroups.com) so we can improve it.

Thanks for the swift response mnunberg. The difficulty I'm referring to is mainly the difficulty to install on my dev machine (Windows :( ). Obviously, if I can't do that it would be hard to evaluate how user-friendly the rest of the process will be...

I've seen Node modules which rely on waf before. Without exception, I passed on all of them, since getting any of them to work on Windows would probably be very tricky. Have a look at this discussion from February: https://groups.google.com/forum/?fromgroups#!topic/nodejs/tUNGgXMQDHo

It looks like the general consensus is that building (and statically linking) native code should be the package developer's responsibility, so that the package consumer can simply `npm install`. There are also some approaches suggested there that may help you do that, and I would specifically look at the node-bindings tool mentioned and linked there.

Otherwise, I would also update the readme file. A relatively small effort there can make the lib much more usable. Hope this helps.

I'm on a rather tight schedule with my own work at the moment, so I'm going to move ahead with 3rd-eden's memcached client for now. I hope I'll be able to take another stab at using the new client in a few weeks, and then I'll hop on your mailing list to contribute my feedback.

Many thanks!
Near