Couchbase Server Manual 2.0

4.2.1. RAM Sizing

RAM is usually the most critical sizing parameter. It's also the one that can have the biggest impact on performance and stability.

4.2.1.1. Working Set

Before deciding how much memory the cluster needs, you should understand the concept of a 'working set'. The working set at any point in time is the data that your application is actively using. Ideally, your entire working set should live in memory.

4.2.1.2. Memory quota

It is very important that a Couchbase cluster is sized in accordance with the working set size and total data you expect.

The goal is to size the RAM available to Couchbase so that all your document IDs, the document ID metadata, and the working set values fit into memory in your cluster, just below the point at which Couchbase will start evicting values to disk (the High Water Mark).

How much memory and disk space per node you will need depends on several different variables, defined below.

Note

Calculations are per bucket

The calculations below are per bucket and need to be summed across all buckets. If all your buckets have the same configuration, you can treat your total data as a single bucket; there is no per-bucket overhead to consider.

Table 4.1. Deployment — Sizing — Input Variables

Variable                  Description
documents_num             The total number of documents you expect in your dataset
ID_size                   The average size of document IDs, in bytes
value_size                The average size of values, in bytes
number_of_replicas        The number of copies of the original data you want to keep
working_set_percentage    The percentage of your data you want in memory
per_node_ram_quota        How much RAM can be assigned to Couchbase on each node

The following items are used to calculate the memory required and are treated as constants.

Table 4.2. Deployment — Sizing — Constants

Constant                                         Description
Metadata per document (metadata_per_document)    The space that Couchbase needs to keep metadata for each document. Prior to 2.0.2, it is 64 bytes. As of Couchbase 2.0.2, metadata uses 56 bytes of memory. All the metadata for documents needs to live in memory while a node is running and serving data.
SSD or spinning                                  The type of storage. SSDs give better I/O performance than spinning disks.
headroom [a]                                     Typically 25% (0.25) for SSDs and 30% (0.30) for spinning (traditional) hard disks, because SSDs are faster than spinning disks.
High Water Mark (high_water_mark)                By default, set at 70% of the memory allocated to the node.

[a] The headroom is the additional overhead required by the cluster to store metadata about the information being stored. This requires approximately 25-30% more space than the raw RAM requirements for your dataset.


This is a rough guideline to size your cluster:

Variable                      Calculation
no_of_copies                  1 + number_of_replicas
total_metadata [a]            (documents_num) * (metadata_per_document + ID_size) * (no_of_copies)
total_dataset                 (documents_num) * (value_size) * (no_of_copies)
working_set                   total_dataset * (working_set_percentage)
Cluster RAM quota required    (total_metadata + working_set) * (1 + headroom) / (high_water_mark)
number of nodes               Cluster RAM quota required / per_node_ram_quota, rounded up

[a] The metadata for all documents needs to live in memory.

Note

You will need at least (number_of_replicas + 1) nodes, irrespective of your data size.
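The formulas and the minimum-node rule above can be sketched in Python as follows. This is a minimal sketch, not an official Couchbase tool: the names `BucketSizing`, `cluster_ram_quota`, and `nodes_required` are hypothetical, the defaults use the 2.0.2 metadata size (56 bytes), the SSD headroom, and the default high water mark from the tables above, and 1 GB is assumed to mean 10**9 bytes.

```python
import math
from dataclasses import dataclass


@dataclass
class BucketSizing:
    """Input variables for one bucket (hypothetical container type)."""
    documents_num: int             # total number of documents
    id_size: int                   # average document ID size, in bytes
    value_size: int                # average value size, in bytes
    number_of_replicas: int        # replica copies kept besides the original
    working_set_percentage: float  # fraction of data kept in RAM, e.g. 0.2


def cluster_ram_quota(bucket, metadata_per_document=56, headroom=0.25,
                      high_water_mark=0.70):
    """Return the cluster RAM quota, in bytes, required for one bucket."""
    no_of_copies = 1 + bucket.number_of_replicas
    total_metadata = (bucket.documents_num
                      * (metadata_per_document + bucket.id_size)
                      * no_of_copies)
    total_dataset = bucket.documents_num * bucket.value_size * no_of_copies
    working_set = total_dataset * bucket.working_set_percentage
    return (total_metadata + working_set) * (1 + headroom) / high_water_mark


def nodes_required(buckets, per_node_ram_quota, **constants):
    """Sum the quota over all buckets and convert it to a node count."""
    total_quota = sum(cluster_ram_quota(b, **constants) for b in buckets)
    by_ram = math.ceil(total_quota / per_node_ram_quota)
    # At least (replicas + 1) nodes are needed regardless of data size.
    by_replicas = max(b.number_of_replicas for b in buckets) + 1
    return max(by_ram, by_replicas)


if __name__ == "__main__":
    bucket = BucketSizing(documents_num=1_000_000, id_size=100,
                          value_size=10_000, number_of_replicas=1,
                          working_set_percentage=0.2)
    # Assumes a 6 GB per-node quota with 1 GB = 10**9 bytes.
    print(nodes_required([bucket], per_node_ram_quota=6 * 10**9))  # prints 2
```

Passing several `BucketSizing` objects sums the per-bucket quotas before dividing by the per-node quota, which follows the note that the calculations are per bucket. Constants such as `headroom=0.30` for spinning disks can be overridden through keyword arguments.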

Example sizing calculation

Table 4.3. Deployment — Sizing — Input Variables

Input Variable            Value
documents_num             1,000,000
ID_size                   100 bytes
value_size                10,000 bytes
number_of_replicas        1
working_set_percentage    20%

Table 4.4. Deployment — Sizing — Constants

Constant                  Value
Type of storage           SSD
headroom                  25%
metadata_per_document     120 bytes (example value, higher than the 56 to 64 bytes in Table 4.2)
high_water_mark           70%

Table 4.5. Deployment — Sizing — Variable Calculations

Variable                      Calculation
no_of_copies                  = 2 [a]
total_metadata                = 1,000,000 * (100 + 120) * (2) = 440,000,000
total_dataset                 = 1,000,000 * (10,000) * (2) = 20,000,000,000
working_set                   = 20,000,000,000 * (0.2) = 4,000,000,000
Cluster RAM quota required    = (440,000,000 + 4,000,000,000) * (1 + 0.25) / (0.7) ≈ 7,928,571,429 bytes

[a] 1 for original and 1 for replica


For example, if you have 8 GB machines and you want to use 6 GB of each for Couchbase:

number of nodes =
    Cluster RAM quota required / per_node_ram_quota =
    7.9 GB / 6 GB ≈ 1.3, rounded up to 2 nodes
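To check the arithmetic, the example can be reproduced step by step in a few lines of Python. This is only a sketch: the variable names mirror the tables, the inputs are the values from Tables 4.3 and 4.4, and 1 GB is assumed to mean 10**9 bytes.

```python
import math

# Inputs from Tables 4.3 and 4.4 (sizes in bytes).
documents_num = 1_000_000
id_size = 100
value_size = 10_000
number_of_replicas = 1
working_set_percentage = 0.20
headroom = 0.25                  # SSD
metadata_per_document = 120      # value used in Table 4.4
high_water_mark = 0.70
per_node_ram_quota = 6 * 10**9   # 6 GB, assuming 1 GB = 10**9 bytes

# Derived values, following the calculation table row by row.
no_of_copies = 1 + number_of_replicas
total_metadata = documents_num * (metadata_per_document + id_size) * no_of_copies
total_dataset = documents_num * value_size * no_of_copies
working_set = total_dataset * working_set_percentage
cluster_quota = (total_metadata + working_set) * (1 + headroom) / high_water_mark

# A fractional node count is rounded up to the next whole node.
nodes = math.ceil(cluster_quota / per_node_ram_quota)

print(f"total_metadata = {total_metadata:,}")
print(f"total_dataset  = {total_dataset:,}")
print(f"working_set    = {working_set:,.0f}")
print(f"cluster quota  = {cluster_quota:,.0f}")
print(f"nodes          = {nodes}")
```

The cluster quota comes out at roughly 7.93 billion bytes, and dividing by the 6 GB per-node quota gives about 1.3, which rounds up to 2 nodes.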

Note

RAM quota

You will not be able to allocate all of your machine's RAM to per_node_ram_quota, as there may be other programs running on the machine.