libcouchbase 2.4.0 is here. It brings major architectural improvements and several new features over previous versions.

This post was originally written for the 2.4.0 DP1 version; it has been updated to reflect the differences between the developer preview and the current GA release (2.4.0).

Direct download links: http://packages.couchbase.com/clients/c/index.html

API Documentation: http://docs.couchbase.com/sdk-api/couchbase-c-client-2.4.0/

Internal Improvements

Packet Structures

Codenamed packet-ng, this version of libcouchbase started out as an attempt to refactor packet handling in such a way that packets were considered first class objects. The request packet is the core currency of the library as it binds the user requested cookie together with the server reply.

In 2.4, request packets are encapsulated in the mc_PACKET structure which contains information about the cookie, the buffers for the packet itself, and the state of the packet (i.e. received, flushed, retried, errored, pending). The packet structure comes along with the mcreq module which provides a unified API for allocating, freeing, analyzing, and rescheduling packets to individual servers.

Packet Queues and Buffers

Packets are now inserted into a queue (an mc_PIPELINE structure), which records the ordering of the packets as a linked list. Packets are added to the queue in the order they are scheduled.
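To make the queueing idea concrete, here is a minimal sketch of a FIFO packet queue built on an intrusive linked list. The names packet, pipeline, pipeline_push and pipeline_pop, as well as the flag values, are hypothetical and chosen for illustration; they are not the actual mc_PACKET or mc_PIPELINE definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags mirroring some of the packet states named above. */
enum {
    PKT_FLUSHED  = 1 << 0,
    PKT_RECEIVED = 1 << 1,
    PKT_RETRIED  = 1 << 2
};

typedef struct packet {
    struct packet *next;  /* packet scheduled right after this one */
    const void *cookie;   /* user pointer handed back with the reply */
    unsigned flags;       /* combination of the PKT_* flags */
} packet;

typedef struct pipeline {
    packet *head;         /* oldest scheduled packet */
    packet *tail;         /* most recently scheduled packet */
    size_t count;
} pipeline;

/* Append a packet, preserving scheduling order. */
static void pipeline_push(pipeline *pl, packet *pkt)
{
    assert(pl != NULL && pkt != NULL);
    pkt->next = NULL;
    if (pl->tail) {
        pl->tail->next = pkt;
    } else {
        pl->head = pkt;
    }
    pl->tail = pkt;
    pl->count++;
}

/* Remove and return the oldest packet, or NULL if the queue is empty. */
static packet *pipeline_pop(pipeline *pl)
{
    packet *pkt = pl->head;
    if (!pkt) {
        return NULL;
    }
    pl->head = pkt->next;
    if (!pl->head) {
        pl->tail = NULL;
    }
    pkt->next = NULL;
    pl->count--;
    return pkt;
}
```

Because the link lives inside the packet itself, queueing a packet needs no extra allocation, which is one common reason to prefer an intrusive list for this kind of structure.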

Since I/O efficiency is better with contiguous buffers, the mc_PACKET structure itself does not contain the buffer within its own object, but rather a special pointer to a region within a contiguous buffer managed by a special in-order contiguous allocator. This allows packets to live as “independent” objects while having their actual network data be tightly packed in sequence. Like the network buffers themselves, each packet object is also allocated using a separate instance of this allocator.

The allocator lives in the netbuf system which also contains structures and routines for efficiently handling buffer fragments and properly preparing them for being sent to the network, while handling conditions such as partial sends.
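The in-order contiguous allocation idea can be sketched roughly as follows. This is a simplified, single-block illustration under my own assumptions (fixed block size, strictly in-order release); the names arena, arena_reserve and arena_release_oldest are hypothetical and do not come from the netbuf module.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 4096

typedef struct arena {
    unsigned char buf[ARENA_SIZE];
    size_t start;  /* offset of the oldest byte still in use */
    size_t end;    /* offset one past the newest byte in use */
} arena;

/* Reserve n bytes directly after the previous reservation, so that
 * consecutive packets end up adjacent in memory.
 * Returns NULL when the block has no room left. */
static void *arena_reserve(arena *a, size_t n)
{
    void *p;
    assert(a != NULL);
    if (n > ARENA_SIZE - a->end) {
        return NULL;
    }
    p = a->buf + a->end;
    a->end += n;
    return p;
}

/* Release the n oldest bytes (for example once they have been sent).
 * When nothing is in use any more the block is rewound for reuse. */
static void arena_release_oldest(arena *a, size_t n)
{
    assert(a != NULL && n <= a->end - a->start);
    a->start += n;
    if (a->start == a->end) {
        a->start = a->end = 0;
    }
}
```

Since regions are handed out back to back, a run of scheduled packets can be written to the socket with a single contiguous send instead of one call per packet.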

I/O Improvements

The I/O system has been refactored and modularized within the lcbio module (src/lcbio). The notable addition is that of the lcbio_CTX structure which contains efficient and unified routines for socket reads, writes, and error handling, abstracting the underlying I/O model (e.g. completion-based like IOCP or libuv; or event-based like select or libevent) from its API.

Robustness during Configuration Changes and Failures

Configuration changes and failures are now handled gracefully. When a new configuration is received and the related server object (mc_SERVER) needs to change positions, its TCP connection is kept intact, and its queue is traversed for any commands (packets) which are no longer mapped to it. For each of those commands, the mc_PACKET structure is duplicated and placed in the proper queue, while any underlying send buffer is still sent out to the network and its response ignored. This allows us to keep the TCP stream intact and simply swallow the related (and anticipated) error response coming from the server.
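The relocation step can be pictured with the toy model below. It assumes each packet records a vBucket number and that the new configuration is reduced to a simple vBucket-to-server index table; pkt, server_queue, relocate_packets and the fixed-size arrays are all hypothetical simplifications, not the library's data structures.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PKTS 8

typedef struct pkt {
    int vbucket;       /* partition the command's key maps to */
    int ignore_reply;  /* set once the command has been handed elsewhere */
} pkt;

typedef struct server_queue {
    pkt pkts[MAX_PKTS];
    size_t n;
} server_queue;

/* vbmap[vb] is the index of the server owning vBucket vb under the
 * new configuration. Returns how many packets were relocated. */
static size_t relocate_packets(server_queue *queues, size_t self,
                               const int *vbmap)
{
    size_t moved = 0, i, n = queues[self].n;
    for (i = 0; i < n; i++) {
        pkt *p = &queues[self].pkts[i];
        int owner = vbmap[p->vbucket];
        server_queue *dst;
        if (p->ignore_reply || owner == (int)self) {
            continue;  /* already handed off, or still mapped here */
        }
        dst = &queues[owner];
        assert(dst->n < MAX_PKTS);
        dst->pkts[dst->n++] = *p;  /* the duplicate goes to the new owner */
        p->ignore_reply = 1;       /* the original stays queued; its reply is dropped */
        moved++;
    }
    return moved;
}
```

Note that the original entry is never removed from the old queue: it remains there so that the bytes already committed to the TCP stream still have a matching slot for the server's error reply.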

If a TCP connection is suddenly broken and no new configuration has arrived, the related packets may be placed inside a retry queue or immediately failed. Which commands are retried and which commands are failed can be configured by the user.

Behavior while operating under a degraded cluster has also been improved. Now operations which are routed to missing nodes are placed in the queue and the library will transparently issue a configuration request to the cluster, thus allowing the application to retry the item without performing extra steps.

More Tests

Because this version of the library has been refactored to modularize as many systems as possible, each module is simpler to test: its behavior is better defined and it has fewer dependencies. Many new tests have been added dedicated to buffer management, packet handling, and raw I/O handling. None of these tests use the CouchbaseMock server or any external resources; they are entirely self-contained and deterministic.

New API Documentation

API documentation is now generated via Doxygen. Doxygen is an open source, cross-platform documentation generator which generates API documentation from source code comments. This keeps our API documentation up to date: as long as a new API is added with comments, it will appear in the API documentation, and if an older API is removed, it will disappear from it.

Additionally, we've formally added interface attributes to all of our APIs to help you determine the stability and roadmap for a particular API call. This allows us to clearly convey whether a specific interface is experimental (or volatile), or whether it may be used in production code with confidence that it will not be modified or removed in later versions.

New Features

SSL Support for Couchbase Enterprise 3.0

Version 2.4 contains support (via OpenSSL) for communicating with the server using the SSL protocol. SSL support is implemented entirely in one of the layers inside lcbio and thus resides underneath lcbio_CTX. As such, SSL support is virtually transparent to most systems in the library. By default the library will still connect in a non-encrypted mode (though your SASL password will still be encrypted if possible).

Connection String Support

Also new is support for a new way of specifying how to connect to the cluster. As more and more connection options are added to the library, it became necessary to provide a uniform format for users to declare how and what they want to use when connecting to the cluster. Brett Lawson proposed a new URI-like format which allows one to specify connection options in a clear, concise, and unambiguous format. A URI format also makes it possible to keep these settings in a configuration file as a single string (so you don't have to manually parse multiple settings and then match them to the appropriate struct fields).

Since libcouchbase is mostly used as a core layer of higher level libraries (such as Python, Node.JS and Ruby), exposing a string connection option makes it easy for all these languages to share a common interface and a common codebase when specifying how to connect to the cluster.

As a demonstration, a connection string like couchbase://foo.com,bar.com,baz.com/mybucket?operation_timeout=5000000&detailed_errcodes=true will use foo.com, bar.com, and baz.com as nodes to connect to the bucket mybucket, applying an operation timeout of 5 seconds and enabling detailed error codes (another new feature in the library).
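To show how such a string breaks down into its parts, here is a small stand-alone sketch that splits it into a host list, a bucket name, and a raw option string. It is an illustration only and not the library's own parser; connspec, copy_part, connspec_parse and the buffer sizes are hypothetical, and only the couchbase:// scheme is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct connspec {
    char hosts[256];    /* comma-separated host list */
    char bucket[64];    /* empty if no bucket was given */
    char options[256];  /* raw "key=value&key=value" text, may be empty */
} connspec;

/* Copy n bytes into a NUL-terminated buffer; -1 if it does not fit. */
static int copy_part(char *dst, size_t ndst, const char *src, size_t n)
{
    if (n >= ndst) {
        return -1;
    }
    memcpy(dst, src, n);
    dst[n] = '\0';
    return 0;
}

/* Returns 0 on success, -1 on a missing scheme or an oversized part. */
static int connspec_parse(const char *s, connspec *out)
{
    static const char scheme[] = "couchbase://";
    const size_t nscheme = sizeof scheme - 1;
    const char *hosts, *end, *qmark, *path_end, *slash, *hosts_end;

    assert(s != NULL && out != NULL);
    if (strncmp(s, scheme, nscheme) != 0) {
        return -1;
    }
    hosts = s + nscheme;
    end = hosts + strlen(hosts);
    qmark = strchr(hosts, '?');
    path_end = qmark ? qmark : end;
    /* Only a '/' before the query string separates hosts from bucket. */
    slash = memchr(hosts, '/', (size_t)(path_end - hosts));
    hosts_end = slash ? slash : path_end;

    if (copy_part(out->hosts, sizeof out->hosts, hosts,
                  (size_t)(hosts_end - hosts)) != 0 ||
        copy_part(out->bucket, sizeof out->bucket,
                  slash ? slash + 1 : path_end,
                  slash ? (size_t)(path_end - slash - 1) : 0) != 0 ||
        copy_part(out->options, sizeof out->options,
                  qmark ? qmark + 1 : end,
                  qmark ? (size_t)(end - qmark - 1) : 0) != 0) {
        return -1;
    }
    return 0;
}
```

Applied to the example string above, this yields the host list foo.com,bar.com,baz.com, the bucket mybucket, and the option text operation_timeout=5000000&detailed_errcodes=true. The timeout value is given in microseconds, which is why 5000000 corresponds to 5 seconds.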

New Common Operation API

A new set of (volatile) request APIs was added in this version to form the basis of the APIs of the next major release of the library. These APIs operate on a single command at a time and follow an enter/leave pattern, where a user "enters" a scheduling context, schedules a bunch of commands, and then "leaves". In contrast to the 2.x APIs, where each command would implicitly schedule a flush to the network, these new APIs will only schedule a flush when "leaving" their current context. This allows efficient construction of multiple batched operations without having to allocate an array of command structures to do so.

Additionally, the new request and response APIs allow code reuse in common operations by making a common ABI for the request and response structures. In this new API all request structures are layout-compatible with the lcb_CMDBASE structure and all response structures are layout-compatible with the lcb_RESPBASE structure. Likewise, the new callback mechanism allows a single callback to handle more than one kind of operation: the callback is passed an integer constant indicating the type of operation, and an lcb_RESPBASE structure which may be cast to the appropriate operation-specific response structure if needed.

#include <libcouchbase/couchbase.h>
#include <libcouchbase/api3.h> /* declares the volatile *3 API in 2.4 */
#include <stdio.h>
#include <string.h>

/* One callback serves both GET and STORE; cbtype says which one fired. */
static void
op_callback(lcb_t instance, int cbtype, const lcb_RESPBASE *resp)
{
    (void)instance;
    fprintf(stderr, "Got result for key %.*s with code 0x%x\n",
            (int)resp->nkey, (const char *)resp->key, resp->rc);
    if (resp->rc != LCB_SUCCESS) {
        return;
    }

    if (cbtype == LCB_CALLBACK_GET) {
        /* Safe because lcb_RESPGET is layout-compatible with lcb_RESPBASE */
        const lcb_RESPGET *gresp = (const lcb_RESPGET *)resp;
        fprintf(stderr, "RETRIEVED ITEM\n");
        fprintf(stderr, "VALUE: %.*s\nFLAGS: 0x%x\nCAS=0x%llx\n",
                (int)gresp->nvalue, (const char *)gresp->value,
                gresp->itmflags, (unsigned long long)gresp->cas);
    } else if (cbtype == LCB_CALLBACK_STORE) {
        fprintf(stderr, "STORED ITEM\n");
        fprintf(stderr, "CAS: 0x%llx\n", (unsigned long long)resp->cas);
    }
}

int main(void)
{
    lcb_t instance;
    unsigned ii;
    struct lcb_create_st cropts = { 0 };
    lcb_error_t err;

    cropts.version = 3; /* select the v3 (connection string) options */
    cropts.v.v3.connstr = "couchbase://10.0.0.99/default";
    err = lcb_create(&instance, &cropts);
    if (err != LCB_SUCCESS) {
        fprintf(stderr, "Couldn't create instance: 0x%x\n", err);
        return 1;
    }
    lcb_connect(instance);
    lcb_wait(instance);
    lcb_install_callback3(instance, LCB_CALLBACK_GET, op_callback);
    lcb_install_callback3(instance, LCB_CALLBACK_STORE, op_callback);

    /* Enter a scheduling context; nothing is flushed until we leave it */
    lcb_sched_enter(instance);
    for (ii = 0; ii < 10; ii++) {
        lcb_CMDSTORE store_cmd = { 0 };
        lcb_CMDGET get_cmd = { 0 };
        char buf[1024];
        size_t nbuf;

        sprintf(buf, "Key_%u", ii);
        nbuf = strlen(buf);

        LCB_CMD_SET_KEY(&store_cmd, buf, nbuf);
        LCB_CMD_SET_VALUE(&store_cmd, "Value", strlen("Value"));
        store_cmd.operation = LCB_SET;
        err = lcb_store3(instance, NULL, &store_cmd);
        if (err != LCB_SUCCESS) {
            break;
        }
        LCB_CMD_SET_KEY(&get_cmd, buf, nbuf);
        err = lcb_get3(instance, NULL, &get_cmd);
        if (err != LCB_SUCCESS) {
            break;
        }
    }
    if (err == LCB_SUCCESS) {
        /* Flush the whole batch and wait for the responses */
        lcb_sched_leave(instance);
        lcb_wait(instance);
    } else {
        /* Discard everything scheduled in this context */
        lcb_sched_fail(instance);
    }
    lcb_destroy(instance);
    return 0;
}

Efficient Payload Handling

Several features have been introduced into the 2.4 client to allow for more efficient payload handling between the application and the library. By default the library will copy the value for a set operation so as to not require the passed-in value buffer to remain in valid memory for the duration of the entire operation. Likewise for get responses, the application must copy the value buffer if it wishes for the data to persist outside of the callback.

Experimental support has been added that enables you to tell the library not to copy the value buffer for set operations, via the lcb_VALBUF structure (provided as a field within the lcb_CMDSTORE structure).

Likewise, an additional field exists within the lcb_RESPGET structure called bufh. This field contains an opaque pointer to a buffer handle. This buffer handle can be set to persist outside of the callback, allowing the response data to remain valid until the buffer handle itself is released. Internally this uses reference counting.
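The general reference-counting pattern behind such a handle looks roughly like the sketch below. It illustrates the technique only: bufhandle and its functions are hypothetical names, and the real handle behind bufh is opaque and managed through the library's own calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct bufhandle {
    unsigned refcount;  /* number of owners keeping the data alive */
    size_t nbytes;
    char *bytes;
} bufhandle;

/* Create a handle holding a copy of data, owned by one reference. */
static bufhandle *bufhandle_new(const void *data, size_t n)
{
    bufhandle *h = malloc(sizeof *h);
    if (!h) {
        return NULL;
    }
    h->bytes = malloc(n ? n : 1);
    if (!h->bytes) {
        free(h);
        return NULL;
    }
    memcpy(h->bytes, data, n);
    h->nbytes = n;
    h->refcount = 1;
    return h;
}

/* Take an extra reference, e.g. to keep the data past a callback. */
static void bufhandle_ref(bufhandle *h)
{
    assert(h != NULL);
    h->refcount++;
}

/* Drop one reference. Returns 1 if this was the last one and the
 * memory was freed, 0 if other owners remain. */
static int bufhandle_unref(bufhandle *h)
{
    assert(h != NULL && h->refcount > 0);
    if (--h->refcount > 0) {
        return 0;
    }
    free(h->bytes);
    free(h);
    return 1;
}
```

In this model the library would hold one reference for the duration of the callback; an application that takes its own reference inside the callback keeps the bytes valid until it releases that reference later.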

Raw Packet Dispatching

You may now give libcouchbase raw memcached packets to dispatch to a server and receive a raw memcached packet in reply. This allows lower level access to packet functionality and allows you to build a proxy server. The feature is implemented in such a way that the response buffers are not copied over to the callback and may be kept alive outside the callback, so that you do not need to copy over GET responses into a temporary buffer for processing. Likewise the request packet itself can also optionally not be copied, but have a callback invoked when it is no longer needed by the library.
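For readers unfamiliar with what a raw memcached packet looks like, the following sketch encodes a binary-protocol GET request: a 24-byte header followed by the key. It deliberately does not use any libcouchbase call (the dispatch function itself is not named in this post); mc_encode_get and the constant names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MC_HDR_SIZE  24
#define MC_MAGIC_REQ 0x80
#define MC_OP_GET    0x00

/* Write a GET request for key into out. Multi-byte header fields are
 * big-endian. Returns the packet length, or 0 if out is too small. */
static size_t mc_encode_get(uint8_t *out, size_t nout, const char *key,
                            uint16_t nkey, uint16_t vbucket, uint32_t opaque)
{
    size_t total = MC_HDR_SIZE + (size_t)nkey;
    assert(out != NULL && key != NULL);
    if (nout < total) {
        return 0;
    }
    memset(out, 0, MC_HDR_SIZE);
    out[0] = MC_MAGIC_REQ;               /* magic: request */
    out[1] = MC_OP_GET;                  /* opcode */
    out[2] = (uint8_t)(nkey >> 8);       /* key length */
    out[3] = (uint8_t)(nkey & 0xff);
    /* out[4] extras length and out[5] data type stay 0 for GET */
    out[6] = (uint8_t)(vbucket >> 8);    /* vBucket id */
    out[7] = (uint8_t)(vbucket & 0xff);
    /* total body length = extras + key + value; only the key here */
    out[10] = (uint8_t)(nkey >> 8);
    out[11] = (uint8_t)(nkey & 0xff);
    out[12] = (uint8_t)(opaque >> 24);   /* opaque, echoed in the reply */
    out[13] = (uint8_t)(opaque >> 16);
    out[14] = (uint8_t)(opaque >> 8);
    out[15] = (uint8_t)(opaque & 0xff);
    /* out[16..23] CAS stays 0 for GET */
    memcpy(out + MC_HDR_SIZE, key, nkey);
    return total;
}
```

A proxy built on raw dispatching would typically receive buffers in this shape from its own clients and forward them unchanged, using the opaque field to match each reply to its request.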

New Cluster Configuration APIs

A new callback has been added to the library notifying the user whether the initial bootstrap has succeeded or failed. This was previously done using the error callback (lcb_set_error_callback()) and the configuration callback (lcb_set_configuration_callback()), where the error callback would be invoked upon an initial error, and the configuration callback invoked when a new configuration was received. The error callback, however, would also be invoked each time a specific node failed, making clients fail prematurely if multiple nodes were passed and only the first one in the list failed. The new bootstrap callback is invoked only once, during the initial creation, with a definite error code indicating either bootstrap success or failure. Non-asynchronous clients can simply call lcb_get_bootstrap_status() and need not rely on a callback:

#include <stdio.h>
#include <libcouchbase/couchbase.h>
#include <libcouchbase/api3.h>

#if !I_AM_BLOCKING
/* Invoked exactly once, when the initial bootstrap succeeds or fails */
static void bootstrap_callback(lcb_t instance, lcb_error_t err) {
    if (err != LCB_SUCCESS) {
        printf("Couldn't bootstrap: %s\n", lcb_strerror(instance, err));
    } else {
        lcb_CMDGET gcmd = { 0 };
        LCB_CMD_SET_KEY(&gcmd, "foo", 3);
        lcb_sched_enter(instance);
        lcb_get3(instance, NULL, &gcmd);
        lcb_sched_leave(instance);
    }
}
#endif

int main(void)
{
    lcb_t instance;
    struct lcb_create_st cropt = { 0 };
    lcb_error_t err;

    cropt.version = 3;
    cropt.v.v3.connstr = "couchbase://cbnode1,cbnode2/mybucket";
    err = lcb_create(&instance, &cropt);
    if (err != LCB_SUCCESS) {
        /* handle error */
        return 1;
    }
#if I_AM_BLOCKING
    lcb_connect(instance);
    lcb_wait(instance);
    if ((err = lcb_get_bootstrap_status(instance)) != LCB_SUCCESS) {
        printf("Failed to bootstrap: %s\n", lcb_strerror(instance, err));
        lcb_destroy(instance);
        return 1;
    }
    /* do commands */
#else /* I AM ASYNC */
    lcb_set_bootstrap_callback(instance, bootstrap_callback);
    lcb_connect(instance);
    lcb_wait(instance); /* or return to your own event loop */
#endif
    lcb_destroy(instance);
    return 0;
}

Additionally, an lcb_refresh_config() function has been added to make the client request a new configuration from the cluster. This is useful to "force" a reconfiguration in cases where many timeouts are being encountered, or to enforce a customized refresh policy within the application.

Finally, the vbucket API has been exposed, allowing inspection of the current configuration being used by the library. The new API is located in libcouchbase/vbucket.h (inside the headers directory). 

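#include <libcouchbase/vbucket.h>

lcbvb_CONFIG *config;
lcb_error_t err;

/* Fetch a pointer to the configuration the instance is currently using */
err = lcb_cntl(instance, LCB_CNTL_GET, LCB_CNTL_VBCONFIG, &config);
if (err != LCB_SUCCESS) {
    fprintf(stderr, "Couldn't get config: %s\n", lcb_strerror(instance, err));
} else {
    printf("Revision of current config is %d\n", lcbvb_get_revision(config));
    printf("Cluster has %u servers\n", lcbvb_get_nservers(config));
}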
You may also use the revision to determine if the client has received a new configuration.

Author

Posted by Mark Nunberg, Software Engineer, Couchbase

Mark Nunberg is a software engineer working at Couchbase. He maintains the C client library (libcouchbase) as well as the Python client. He also developed the Perl client (for use at his previous company) - which initially led him to working at Couchbase. Prior to joining Couchbase, he worked on distributed and high performance routing systems at an eCommerce analytics firm. Mark studied Linguistics at the Hebrew University of Jerusalem.

3 Comments

  1. Aliaksey Kandratsenka June 20, 2014 at 3:05 am

    Looks good. I want to try rewriting moxi/mc-loader with it. Not sure about whether I'll find time however.

    1. I'd certainly recommend trying to redo moxi with the new features. I've actually written a small moxi clone in C++ called "epoxy": https://github.com/mnunberg/ep
