Make sure you have enough nodes (and the right configuration) in your cluster to keep your data safe. There are two areas to keep in mind: how you distribute data across nodes and how many replicas you store across your cluster.
Basically, more nodes are better than fewer. If you only have two nodes, your data will be split evenly across them. This means that half of your dataset will be "impacted" if one node goes away. On the other hand, with ten nodes, only 10% of the dataset will be "impacted" if one goes away. Even with automatic failover, there will still be some period of time when data is unavailable if nodes fail. Having more nodes reduces how much of the dataset is affected during that period.
After a failover, the remaining nodes in the cluster need to take on extra load. The question is: how heavy is that extra load, and are you prepared for it? Again, with only two nodes, each one needs to be ready to handle the entire load. With ten, the failed node's share is spread across the nine survivors, so each node only needs to take on roughly an extra tenth of its normal workload should one fail.
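The arithmetic above can be sketched in a few lines. This is an illustrative calculation only, assuming data and load are spread evenly across nodes; the function name `failure_impact` is hypothetical and is not part of Couchbase Server.

```python
from fractions import Fraction


def failure_impact(nodes: int) -> tuple[Fraction, Fraction]:
    """Estimate the effect of losing one node out of `nodes`.

    Assumes (for illustration) that data and load are evenly distributed.
    Returns (share of the dataset impacted,
             relative increase in load on each surviving node).
    """
    if nodes < 2:
        raise ValueError("need at least two nodes to survive a failure")
    # The failed node held 1/nodes of the active data.
    impacted = Fraction(1, nodes)
    # Each survivor goes from 1/nodes to 1/(nodes - 1) of the total load,
    # which is an increase of 1/(nodes - 1) relative to its normal load.
    extra_load = Fraction(1, nodes - 1)
    return impacted, extra_load


for n in (2, 3, 10):
    impacted, extra = failure_impact(n)
    print(f"{n} nodes: {float(impacted):.0%} of data impacted, "
          f"{float(extra):.0%} more load per surviving node")
```

With two nodes the survivor's load doubles (it handles the entire load); with ten nodes each survivor takes on one ninth more, which is roughly a tenth.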
While two nodes do provide a minimal level of redundancy, we recommend that you always use at least three nodes.
Couchbase Server allows you to configure up to three replicas (creating four copies of the dataset). In the event of a failure, you can only fail over (either manually or automatically) as many nodes as you have replicas. For example:
In a five-node cluster with one replica, if one node goes down, you can fail it over. If a second node goes down, you no longer have enough replica copies to fail over to and will have to go through a slower process to recover.
In a five-node cluster with two replicas, if one node goes down, you can fail it over. If a second node goes down, you can fail it over as well. Should a third one go down, you no longer have replicas to fail over to.
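The rule behind these examples can be written as a small check. This is a minimal sketch, assuming no rebalance has recreated replica copies between failures; `can_fail_over` is a hypothetical helper for illustration, not a Couchbase Server API.

```python
def can_fail_over(replicas: int, nodes_already_failed_over: int) -> bool:
    """Return True if one more failed node can still be failed over.

    Illustrative only: assumes replica copies have not been recreated by a
    rebalance since the earlier failovers.
    """
    if not 0 <= replicas <= 3:
        raise ValueError("replicas must be between 0 and 3")
    if nodes_already_failed_over < 0:
        raise ValueError("nodes_already_failed_over cannot be negative")
    # You can fail over only as many nodes as you have replicas.
    return nodes_already_failed_over < replicas


# One replica: the first failure can be failed over, the second cannot.
print(can_fail_over(replicas=1, nodes_already_failed_over=0))  # True
print(can_fail_over(replicas=1, nodes_already_failed_over=1))  # False

# Two replicas: the first two failures can be failed over, the third cannot.
print(can_fail_over(replicas=2, nodes_already_failed_over=1))  # True
print(can_fail_over(replicas=2, nodes_already_failed_over=2))  # False
```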
After a node goes down and is failed over, try to replace that node as soon as possible and rebalance. The rebalance will recreate the replica copies (if you still have enough nodes to do so).
As a rule of thumb, we recommend that you configure the following:
One replica for up to five nodes
One or two replicas for five to ten nodes
One, two, or three replicas for over ten nodes
While there may be variations to this, there are diminishing returns from having more replicas in smaller clusters.
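The rule of thumb above could be encoded as a simple lookup. This is only a sketch: the function name `recommended_replicas` is hypothetical, and because the listed ranges overlap at exactly five nodes, the code assumes that five nodes falls into the first tier.

```python
def recommended_replicas(nodes: int) -> list[int]:
    """Return the replica counts suggested by the rule of thumb.

    Assumption for illustration: a cluster of exactly five nodes is treated
    as "up to five nodes" (one replica).
    """
    if nodes < 2:
        raise ValueError("a replica needs at least two nodes")
    if nodes <= 5:
        return [1]
    if nodes <= 10:
        return [1, 2]
    return [1, 2, 3]


for n in (3, 5, 8, 12):
    print(n, "nodes ->", recommended_replicas(n))
```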
In general, Couchbase Server has very low hardware requirements and is designed to be run on commodity or virtualized systems. However, as a rough guide to the primary concerns for your servers, here is what we recommend:
RAM: This is your primary consideration. We use RAM to store active items, and that is the key reason Couchbase Server has such low latency.
CPU: Couchbase Server has very low CPU requirements. The server is multi-threaded and therefore benefits from a multi-core system. We recommend machines with at least four or eight physical cores.
Disk: By decoupling the RAM from the I/O layer, Couchbase Server can support low-performance disks better than other databases. As a best practice, we recommend that you use separate devices for the server installation, data directories, and index directories.
Known working configurations include SAN, SAS, SATA, SSD, and EBS, with the following recommendations:
SSDs have been shown to provide a great performance boost both in terms of draining the write queue and also in restoring data from disk (either on cold-boot or for purposes of rebalancing).
RAID generally provides better throughput and reliability.
Striping across EBS volumes (in Amazon EC2) has been shown to increase throughput.
Network: Most configurations will work with Gigabit Ethernet interfaces. Faster solutions such as 10 Gigabit Ethernet and InfiniBand will provide spare capacity.
Due to the unreliability and general lack of consistent I/O performance in cloud environments, we highly recommend lowering the per-node RAM footprint and increasing the number of nodes. This gives better disk throughput and improves rebalancing, since each node has to store (and therefore transmit) less data. Distributing the data further also lessens the impact of losing a single node (which could be fairly common).
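To see why adding nodes reduces what each node must store and transmit, consider the rough estimate below. It is a sketch under stated assumptions: data is spread evenly, every replica is a full copy of the dataset, and metadata and other overhead are ignored. The function name `data_per_node_gb` is hypothetical.

```python
def data_per_node_gb(dataset_gb: float, replicas: int, nodes: int) -> float:
    """Estimate the data each node holds (active plus replica copies).

    Illustrative only: assumes an even distribution and ignores overhead.
    """
    if not 0 <= replicas <= 3:
        raise ValueError("replicas must be between 0 and 3")
    if nodes < replicas + 1:
        raise ValueError("need at least replicas + 1 nodes")
    copies = replicas + 1  # the active copy plus each replica
    return dataset_gb * copies / nodes


# The same 100 GB dataset with one replica, on a small and a larger cluster.
print(data_per_node_gb(100, replicas=1, nodes=4))   # 50.0
print(data_per_node_gb(100, replicas=1, nodes=10))  # 20.0
```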
Read about best practices with the cloud in Section 4.6, “Using Couchbase in the Cloud”.