For simplicity, this section completely ignores membase multi-tenancy (what we have historically called a "bucket": a "virtual membase instance" inside a single membase cluster). Buckets and vBuckets are unrelated concepts and should not be confused. For the purposes of this section, a bucket can simply be viewed as synonymous with "a membase cluster."
A vBucket is defined as the "owner" of a subset of the key space of a membase cluster.
Every key "belongs" to a vBucket. A mapping function is used to calculate the vBucket in which a given key belongs. In membase, that mapping function is a hashing function that takes a key as input and outputs a vBucket identifier. Once the vBucket identifier has been computed, a table is consulted to look up the server that "hosts" that vBucket. The table contains one row per vBucket, pairing the vBucket with its hosting server. A server appearing in this table can be (and usually is) responsible for multiple vBuckets.
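The two-step mapping described above can be sketched as follows. This is a minimal illustration, not membase's actual implementation: the hash function, the vBucket count of 16, and the server names are all assumptions made for the example.

```python
import hashlib

NUM_VBUCKETS = 16  # configurable size of the hash output space (illustrative)

# vBucket -> hosting server table: one row per vBucket. A server usually
# hosts several vBuckets, as seen in the repetition below.
vbucket_map = ["ServerA", "ServerB", "ServerC", "ServerA",
               "ServerB", "ServerC", "ServerA", "ServerB",
               "ServerC", "ServerA", "ServerB", "ServerC",
               "ServerA", "ServerB", "ServerC", "ServerA"]

def vbucket_id(key: str) -> int:
    """Step 1: hash the key into the vBucket identifier space."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_VBUCKETS

def server_for(key: str) -> str:
    """Step 2: consult the table to find the server hosting the vBucket."""
    return vbucket_map[vbucket_id(key)]
```

Note that resizing the output space (changing NUM_VBUCKETS) would also require resizing vbucket_map, since the table has exactly one row per vBucket.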
The hashing function used by membase to map keys to vBuckets is configurable - both the hashing algorithm and the output space (i.e. the total number of vBuckets output by the function). Naturally, if the number of vBuckets in the output space of the hash function is changed, then the table which maps vBuckets to Servers must be resized.
The vBucket mechanism provides a layer of indirection between the hashing algorithm and the server responsible for a given key. This indirection is useful in managing the orderly transition from one cluster configuration to another, whether the transition was planned (e.g. adding new servers to a cluster) or unexpected (e.g. a server failure).
The diagram below shows how Key-Server mapping works when using the vBucket construct. There are 3 servers in the cluster. A client wants to look up (get) the value of KEY. The client first hashes the key to calculate the vBucket which owns KEY. Assume Hash(KEY) = vB8. The client then consults the vBucket-server mapping table and determines that Server C hosts vB8. The get operation is sent to Server C.
After some period of time, there is a need to add a server to the cluster (e.g. to sustain performance in the face of growing application use). The administrator adds Server D to the cluster, and the vBucket map is updated as follows. [Note: the vBucket-server map is updated by an internal Membase algorithm, and that updated table is transmitted by Membase to all cluster participants - servers and proxies.]
After the addition, a client once again wants to look up (get) the value of KEY. Because the hashing algorithm in this case has not changed, Hash(KEY) = vB8 as before. The client examines the vBucket-server mapping table and determines that Server D now owns vB8. The get operation is sent to Server D.
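The reconfiguration above can be reduced to a one-line table change. This minimal sketch (server and vBucket names follow the example; the single-row table is an assumption for brevity) shows that the hash is untouched and only the vB8 row of the vBucket-server table is rewritten:

```python
# The hash function is unchanged, so Hash(KEY) still yields vB8;
# only the table row for vB8 changes during reconfiguration.
vbucket_map = {"vB8": "ServerC"}      # table row before Server D joins

owner_before = vbucket_map["vB8"]     # lookups for KEY go to Server C

vbucket_map["vB8"] = "ServerD"        # internal algorithm updates the map
owner_after = vbucket_map["vB8"]      # the same key now routes to Server D
```

This is the indirection benefit in miniature: clients never need a new hash function when the cluster changes shape, only a new copy of the table.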
Membase Commandment 1 requires membase to be a drop-in replacement for an existing memcached server, while adding persistence, replication, failover and dynamic cluster reconfiguration. Existing applications will almost certainly (99.999%) be using an old memcached client to communicate with an OTC memcached cluster. This client will probably be using a consistent hashing algorithm to map keys directly to servers.
To make this work, a proxy capability is required. Over the longer run, having a client that can implement the vBucket concept directly will remove the need for a proxy (though a proxy will continue to be desirable in some environments). We do not expect vBucket-aware clients to emerge quickly.
Membase TCP ports
Membase can listen for data operations on two configurable ports: 11210 and 11211 (these are the default port numbers in membase). Both ports are "memcapable," supporting the memcached ASCII and binary protocols.
Port 11211 is the port on which an embedded proxy listens (the standard memcached port). It can receive, and successfully process, data operations for keys that are owned by vBuckets not hosted by this server. The proxy will forward the request to the right server then return the result to the client.
Port 11210 is the port on which the membase data operation server listens. It will reject data operations for keys owned by vBuckets not hosted by this server. To do this, a key-vBucket hash must be performed on all requests. The vBucket is then compared with the list of vBuckets hosted by this server.
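The ownership check on port 11210 can be sketched as below. This is an illustrative model, not membase's server code: the hash function, the hosted-vBucket set, and the exception name are all assumptions made for the example.

```python
import hashlib

NUM_VBUCKETS = 16             # illustrative hash output space
HOSTED_VBUCKETS = {2, 5, 8}   # vBuckets this server happens to host

class NotMyVBucketError(Exception):
    """Request targeted a vBucket this server does not host."""

def vbucket_id(key: str) -> int:
    # Key -> vBucket hash, performed on every incoming request.
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_VBUCKETS

def handle_get(key: str, store: dict):
    vb = vbucket_id(key)
    if vb not in HOSTED_VBUCKETS:
        # Unlike the proxy port, the data port rejects rather than forwards.
        raise NotMyVBucketError(vb)
    return store.get(key)
```

The contrast with port 11211 is the branch taken on a miss: the proxy port forwards the operation to the correct server, while the data port refuses it.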
The first deployment option is to communicate with the embedded proxy in Membase over port 11211. Supporting this deployment option is our highest priority. It allows the customer to install membase and begin using it with an existing application, via an OTC memcached client, without also installing another piece of proxy software. The downside to this approach is performance. We must do everything practical to minimize latency and throughput degradation.
In this deployment option (shown in detail below), compared with an OTC memcached deployment, the worst case performs server mapping twice (e.g. ketama hashing against a server list on the OTC client, then vBucket hashing and server mapping on the proxy) and introduces an additional round-trip network hop.
Assume there is an existing application, with an OTC memcached client, with a server list of 3 servers (Servers A, B and C). Membase is installed in place of the memcached server software on each of these 3 servers.
As shown in the figure above, when the application wants to Get(KEY), it calls a function in the OTC client library. The client library hashes the key and, based on its server list and hashing function, is directed to Server C. The get operation is sent to Server C, port 11211. When it arrives at membase (on the proxy port), the key is hashed again to determine its vBucket and server mapping. This time, the result is Server A. The proxy contacts Server A on port 11210, performs the get operation, and returns the result to the client.
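The double mapping in this flow can be modeled as two independent functions over the same key. This is a hedged sketch: neither function is real ketama nor membase's actual vBucket hash; both are stand-ins chosen only to show that the two schemes can disagree, which is what forces the extra hop.

```python
import hashlib

servers = ["ServerA", "ServerB", "ServerC"]

def otc_client_server(key: str) -> str:
    # Stand-in for the OTC client's consistent hashing over its server list.
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")
    return servers[h % len(servers)]

def vbucket_server(key: str) -> str:
    # Stand-in for the proxy's key -> vBucket -> server mapping.
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[4:8], "big")
    vbucket_map = {i: servers[i % len(servers)] for i in range(16)}
    return vbucket_map[h % 16]

first_hop = otc_client_server("KEY")   # where the client sends the get
final_owner = vbucket_server("KEY")    # where the proxy re-maps it
# When first_hop != final_owner, the embedded proxy must forward the
# operation over the network - the worst-case extra hop described above.
```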
The second option is to deploy a standalone proxy, which behaves in substantially the same way as the embedded proxy but can potentially eliminate a network hop. A standalone proxy deployed on a client machine may also provide valuable services, such as connection pooling. The diagram below shows the flow with a standalone proxy installed on the application server.
The memcached OTC client is configured to have just one server in its server list (localhost), so all operations are forwarded to localhost:11211 - a port serviced by the proxy. The proxy hashes the key to a vbucket, looks up the host server in the vBucket table, and then sends the operation to the appropriate membase server (Server A in this case) on port 11210.
In the final case, no proxy is installed anywhere in the data flow. The client has been updated and performs server selection directly via the vBucket mechanism. In addition, these clients could send additional information using a modified on-the-wire membase protocol, for example to explicitly encode the destination vBucket. This data could be used to optimize system performance.
See also vBuckets for an in-depth description.