
Under normal operation the cluster should be in a “stable” state and the configuration shouldn't change. If your application uses short-lived connections, or a lot of connections from the same machine, it is constantly asking the server for its current configuration. These are HTTP streaming requests which, in some releases, the ns_server component of the Couchbase Server software doesn't implement very efficiently (they are slow to set up and consume a fair amount of resources in ns_server).

To mitigate this problem and be more efficient, the idea is to keep a local _cache_ in the file system on the client side, so that clients don't talk to ns_server unless there is an actual change in the cluster topology.

h1. Create instance

A new instance is created by calling lcb_create_compat with the type set to LCB_CACHED_CONFIG:

lcb_create_compat(LCB_CACHED_CONFIG, &specific, &instance, io);

The specific argument is a new structure that looks like:

  struct lcb_cached_config_st \{
       struct lcb_create_st createopt;
       const char \*cachefile;
       const char \*lockfile;
  };

The cachefile is the path (relative or absolute) of the file containing the cache. If set to a non-null value, the lockfile specifies the file used to synchronize updates to the cachefile (we don't want _all_ instances to try to update the cache at the same time). If no lockfile is specified, “.lock” is appended to the cachefile path.
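Putting the pieces together, creating a cached-config instance could look like the sketch below. The connection options and file paths are illustrative examples, not prescribed values, and the snippet assumes libcouchbase headers providing lcb_create_compat and the structures above:

  \#include <libcouchbase/couchbase.h>
  \#include <string.h>

  struct lcb_cached_config_st specific;
  lcb_t instance;
  lcb_error_t err;

  memset(&specific, 0, sizeof(specific));
  /* Fill in the normal create options (host, bucket, credentials, ...) */
  specific.createopt.v.v0.host = "example.com:8091";  /* example host */
  specific.createopt.v.v0.bucket = "default";
  /* Where to store the cached configuration */
  specific.cachefile = "/var/tmp/couchbase.cache";
  specific.lockfile = NULL;  /* defaults to "/var/tmp/couchbase.cache.lock" */

  err = lcb_create_compat(LCB_CACHED_CONFIG, &specific, &instance, NULL);

On the first run the cachefile does not exist, so the client performs a normal bootstrap and writes the configuration out; subsequent instances pointing at the same cachefile can skip the HTTP streaming request entirely.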

h1. Cleanup

Given that the cache is intended to be accessed by multiple processes at the same time, the user is responsible for deleting the cachefile and the lockfile when they are no longer in use.
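A minimal cleanup sketch (the paths are examples; use whatever you passed in lcb_cached_config_st):

  \#include <stdio.h>

  int main(void) \{
      const char *cachefile = "/tmp/example-couchbase.cache";
      const char *lockfile  = "/tmp/example-couchbase.cache.lock";

      /* remove() quietly fails if the files never existed, which is fine here */
      remove(cachefile);
      remove(lockfile);
      puts("cleaned");
      return 0;
  }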

h1. Implementation

We need to extend the internal instance metadata so that _if_ we get a “not my vbucket” response we know that we're running on a “cached” configuration and should fall back to the full bootstrap logic.
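In pseudo-C, the check amounts to a single flag on the instance. The names below are hypothetical illustrations of the idea, not libcouchbase's actual internals:

  \#include <stdio.h>

  /* Hypothetical metadata flag: set when the vbucket map was loaded
   * from the cachefile rather than fetched from ns_server. */
  struct instance \{
      int config_from_cache;
  };

  /* A NOT_MY_VBUCKET response against a cached config means the cached
   * map is stale, so a full bootstrap from ns_server is required. */
  static int needs_full_bootstrap(const struct instance *inst) \{
      return inst->config_from_cache;
  }

  int main(void) \{
      struct instance inst = \{ 1 };
      printf("%d\n", needs_full_bootstrap(&inst));
      return 0;
  }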

h2. Updating the cache

To avoid a “burst” of clients updating the cache when the topology changes, each client first tries to create the lockfile. If that _fails_ because the file already exists, the client checks the age of the lockfile (to work around stale locks). If the lockfile is older than 2 seconds the client will go ahead and try to update the configuration anyway (and remove the lockfile when it is done). In this situation you +MAY+ still get a burst of connect attempts.
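A self-contained POSIX sketch of this locking scheme, assuming the exclusive-create/stale-age approach described above (function names and paths are illustrative):

  \#include <fcntl.h>
  \#include <stdio.h>
  \#include <sys/stat.h>
  \#include <time.h>
  \#include <unistd.h>

  \#define STALE_SECS 2  /* a lock older than this is considered stale */

  /* Try to acquire the lock by creating the lockfile exclusively.
   * Returns 1 if the caller may update the cache, 0 otherwise. */
  static int try_lock(const char *lockfile) \{
      int fd = open(lockfile, O_CREAT | O_EXCL | O_WRONLY, 0644);
      if (fd >= 0) \{
          close(fd);
          return 1;
      }
      /* Lock exists: steal it only if it looks stale */
      struct stat st;
      if (stat(lockfile, &st) == 0 && time(NULL) - st.st_mtime > STALE_SECS) \{
          return 1;  /* proceed anyway; remove the lockfile when done */
      }
      return 0;
  }

  int main(void) \{
      const char *lock = "/tmp/demo.cache.lock";
      unlink(lock);  /* start clean for the demo */
      printf("first: %d\n", try_lock(lock));   /* creates the lockfile */
      printf("second: %d\n", try_lock(lock));  /* fresh lock: blocked */
      unlink(lock);
      return 0;
  }

Note that open with O_CREAT | O_EXCL is atomic on a local file system, which is what makes the “first creator wins” behaviour safe across processes.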

The entire JSON for the current configuration is dumped in the cache file.

If the lockfile exists and is “new”, the client will “busy”-wait (just a really short sleep) and re-check for the existence of the file (unless the platform supports monitoring a file for changes).
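The waiting side could be sketched like this portable polling loop; the sleep interval, deadline, and names are illustrative assumptions, and a real implementation would prefer inotify/kqueue where available:

  \#include <stdio.h>
  \#include <sys/stat.h>
  \#include <unistd.h>

  /* Poll until the lockfile disappears (the updater removed it), sleeping
   * briefly between checks, with a deadline so a crashed updater cannot
   * hang us forever. Returns 0 on success, -1 on timeout. */
  static int wait_for_unlock(const char *lockfile, int max_ms) \{
      struct stat st;
      int waited = 0;
      while (stat(lockfile, &st) == 0) \{
          if (waited >= max_ms)
              return -1;
          usleep(5000);  /* 5 ms "busy" wait */
          waited += 5;
      }
      return 0;
  }

  int main(void) \{
      const char *lock = "/tmp/no-such.lock";
      unlink(lock);  /* ensure the file is absent for the demo */
      printf("%d\n", wait_for_unlock(lock, 100));  /* returns immediately */
      return 0;
  }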