From a client perspective, membase speaks the memcached protocol, which is well understood by many, if not most, application developers. The difference, of course, is that membase adds persistence and replication while still allowing for memcached-like speed.
Individual membase nodes are clustered together. Within a cluster, data is automatically replicated between nodes. Nodes can be added and removed without interrupting access to data within the cluster.
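As a rough illustration of what "speaking memcached protocol" means on the wire, the sketch below encodes and decodes the memcached text protocol's set and get messages. The function names are made up for this example, and nothing here is specific to membase; it only assumes the standard memcached text protocol framing.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a text-protocol 'set' request: a header line, then the data block."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Encode a text-protocol 'get' request for a single key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw: bytes) -> dict:
    """Decode a complete 'get' response (VALUE blocks followed by END) into {key: data}."""
    items = {}
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)
        line = raw[pos:eol]
        if line == b"END":
            return items
        # Header line looks like: VALUE <key> <flags> <bytes>
        _, key, _flags, nbytes = line.split(b" ")[:4]
        start = eol + 2
        end = start + int(nbytes)
        items[key.decode("ascii")] = raw[start:end]
        pos = end + 2  # skip the CRLF that terminates the data block


request = build_set("greeting", b"hello")
reply = parse_get_response(b"VALUE greeting 0 5\r\nhello\r\nEND\r\n")
```

Any existing memcached client library already does this framing, which is why such clients can talk to membase unchanged.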
All clusters start with a single node, typically one installed from a package. Either through the Web UI or from the REST interface, membase allows one or more nodes to be added to the cluster. When a node is added to the cluster, it does not immediately start performing data operations. This allows the user to make one or more changes to the cluster before initiating a rebalance. During a rebalance, the data and the replicas of that data, contained in sub-partitions of the cluster called vbuckets, are redistributed throughout the cluster. By design, a given vbucket is only active in one place within the cluster at a given point in time. Because of this, membase is always consistent for any given item.
Data is moved between nodes, both when rebalancing and replicating, using a set of managed vbucketmigrator processes in the cluster. These processes use a new protocol called TAP. TAP is generic in nature, though, and it has very clear use cases outside replication or migration of data. Incidentally, TAP doesn't actually stand for anything. The name came about when thinking about how to "tap into" a membase node. This could be thought of along the lines of a 'wiretap' or tapping into a keg.
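To make the rebalance idea concrete, here is a minimal sketch that assigns each vbucket one active node plus replicas, and reports which vbuckets would change their active node when a node is added. The round-robin placement and the function names are assumptions chosen for illustration; this is not the placement algorithm membase itself uses.

```python
def rebalance(num_vbuckets, nodes, num_replicas=1):
    """Return {vbucket_id: [active_node, replica_node, ...]}.

    Illustrative round-robin placement only (an assumption, not membase's
    real algorithm). Each vbucket has exactly one active node.
    """
    if num_replicas >= len(nodes):
        raise ValueError("need more nodes than replicas")
    vbucket_map = {}
    for vb in range(num_vbuckets):
        # First entry is the active copy; the rest are replicas on other nodes.
        vbucket_map[vb] = [nodes[(vb + i) % len(nodes)]
                           for i in range(num_replicas + 1)]
    return vbucket_map


def active_moves(old_map, new_map):
    """List the vbuckets whose active node differs between two maps."""
    return [vb for vb in sorted(old_map) if old_map[vb][0] != new_map[vb][0]]


before = rebalance(8, ["a", "b"])
after = rebalance(8, ["a", "b", "c"])   # node "c" joins, then a rebalance runs
moved = active_moves(before, after)
```

The point of the sketch is the invariant rather than the placement: at every moment each vbucket has a single active location, so there is only ever one authoritative copy of an item.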
Cluster replication is asynchronous by default, but the design allows for synchronous replication. The benefit of asynchronous replication is that membase has speeds similar to memcached in the default case, at the cost of a data safety risk during the short interval before an item has been replicated.
Cluster coordination and communication are handled by the ns_server Erlang process. Generally, users of membase need not be aware of the details of how ns_server performs its tasks, as interfacing with the cluster is done with the aforementioned membase REST interface. As part of keeping the system simple, all nodes of the cluster expose the state of the cluster.
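Because every node exposes the cluster state, a client can ask any one of them. The sketch below assumes the REST interface listens on port 8091, serves a JSON document at /pools/default, and lists nodes with "hostname" and "status" fields; treat all of these as assumptions to verify against your installation, which may also require credentials.

```python
import json
import urllib.request


def pool_url(host: str, port: int = 8091) -> str:
    """Build the (assumed) URL of the cluster state document on one node."""
    return f"http://{host}:{port}/pools/default"


def fetch_pool(host: str, timeout: float = 5.0) -> str:
    """Fetch the raw JSON cluster state from any single node."""
    with urllib.request.urlopen(pool_url(host), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def node_status(pool_json: str) -> dict:
    """Map each node's hostname to its reported status (field names assumed)."""
    doc = json.loads(pool_json)
    return {n["hostname"]: n["status"] for n in doc.get("nodes", [])}


# Hypothetical payload, shaped like the assumed response, for offline use.
sample = '{"nodes": [{"hostname": "10.0.0.1:8091", "status": "healthy"},' \
         ' {"hostname": "10.0.0.2:8091", "status": "unhealthy"}]}'
statuses = node_status(sample)
```

Against a live cluster, `node_status(fetch_pool("some-node"))` would give the same view regardless of which node is queried.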
Generally speaking, membase is memory oriented, meaning that it is designed around the working set being resident in memory, as is the case with most highly interactive web applications. However, the set of data in memory at any given point in time is only the hot data. Data is persisted to disk by membase asynchronously, based on rules in the system.
From a developer perspective, it is useful to know how all of the components of membase come together. A membase node consists of the following components:
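Asynchronous persistence can be pictured as a write-behind queue: a write is acknowledged once it is in memory, and a separate step drains dirty items to disk later. The class below is a toy model under that assumption; the names are invented, the "disk" is just a dictionary, and the real rules membase applies are not modeled.

```python
from collections import OrderedDict


class WriteBehindStore:
    """Toy write-behind store: memory first, disk later (illustrative only)."""

    def __init__(self):
        self.memory = {}            # hot data served to clients
        self.dirty = OrderedDict()  # keys awaiting persistence, oldest first
        self.disk = {}              # stand-in for the on-disk store

    def set(self, key, value):
        """Acknowledge as soon as the item is in memory and queued."""
        self.memory[key] = value
        self.dirty[key] = None      # repeated writes to a key are queued once
        return "STORED"

    def get(self, key):
        return self.memory.get(key)

    def flush(self, max_items=None):
        """Persist queued items; returns how many were written."""
        written = 0
        while self.dirty and (max_items is None or written < max_items):
            key, _ = self.dirty.popitem(last=False)
            self.disk[key] = self.memory[key]
            written += 1
        return written


store = WriteBehindStore()
store.set("a", 1)
store.set("a", 2)   # overwrites before the flusher runs
store.set("b", 3)
pending_before = len(store.dirty)
flushed = store.flush()
```

In this model the client never waits on the disk, which is where the memcached-like latency comes from; the trade-off is that an item is not yet durable until a flush has covered it.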
ns_server
This is the main process that runs on each node. As it says in its source repository summary, it is the supervisor. One of these runs on each node and spawns processes, which in turn spawn more processes.
menelaus
Menelaus is really two components, both of which are part of the ns_server repository. The main focus of menelaus is providing the RESTful interface for working with a cluster. Built atop that RESTful interface is a very rich, complex jQuery-based application which makes REST calls to the server.
memcached
Though membase is different from memcached, it does leverage the core of memcached. The core includes networking and protocol handling.
The bulk of membase is implemented in two components:
membase engine (ep-engine)
This is loaded through the memcached core and the bucket_engine. This core component provides persistence in an asynchronous fashion and implements the TAP protocol.
bucket engine
The bucket engine provides a way of loading instances of engines under a single memcached process. This is how membase provides multitenancy.
vbucketmigrator
Effectively a TAP client. Depending on how ns_server starts one or more vbucketmigrator processes, data is either replicated or transferred from one node to another.
Moxi
A memcached proxy, moxi "speaks" vbucket hashing (implemented in libvbucket) and can talk to the REST interface to get cluster state and configuration, ensuring that clients are always routed to the appropriate place for a given vbucket.
Across multiple cloud instances, VMs or physical servers, all of these components come together to become a membase cluster.
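The routing that moxi performs on behalf of clients can be sketched as a two-step lookup: hash the key to a vbucket, then look up which server currently holds the active copy of that vbucket. The sketch assumes a CRC32-based hash reduced modulo the vbucket count, and a map laid out as a list of server indexes per vbucket with the active copy first; the exact hash used by libvbucket and the exact configuration format may differ, and the server names are hypothetical.

```python
import zlib


def vbucket_for_key(key: bytes, num_vbuckets: int) -> int:
    """Hash a key to a vbucket id (assumed CRC32 modulo the vbucket count)."""
    return zlib.crc32(key) % num_vbuckets


def server_for_key(key: bytes, vbucket_map, servers) -> str:
    """Return the server holding the active copy of the key's vbucket.

    vbucket_map[vb] is a list of indexes into `servers`; entry 0 is
    the active copy and the remaining entries are replicas (assumed layout).
    """
    vb = vbucket_for_key(key, len(vbucket_map))
    return servers[vbucket_map[vb][0]]


servers = ["node-a", "node-b"]                 # hypothetical node names
vbucket_map = [[0, 1], [1, 0], [0, 1], [1, 0]]  # 4 vbuckets, 1 replica each
target = server_for_key(b"user:42", vbucket_map, servers)
```

After a rebalance only the map changes, not the hash, so a key keeps its vbucket id while the vbucket itself may move to a different server. That is why a proxy that refreshes the map from the REST interface can keep routing correctly without any client changes.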