For performance, Couchbase Server prefers to store data in RAM and serve clients from it. However, keeping an entire dataset in RAM is not always possible or desirable. What an application actually needs is for its 'working set', the data it is actively using, to be held in RAM and immediately available for low-latency responses.
Couchbase Server stores data on disk, in addition to keeping as much data as possible in RAM as part of the caching layer used to improve performance. Disk persistence allows for easier backup/restore operations, and allows datasets to grow larger than the built-in caching layer.
Couchbase automatically moves data between RAM and disk (asynchronously in the background) in order to keep regularly used information in memory, and less frequently used data on disk. Couchbase constantly monitors the information accessed by clients, keeping the active data within the caching layer.
The process of removing data from the caching layer to make way for actively used information is called ejection. It is controlled automatically through thresholds set on each configured bucket in your Couchbase Server cluster.
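Threshold-driven ejection can be illustrated with a small model. This is a minimal sketch, not the server's implementation: the names `CachingLayer`, `high_water_mark` and `low_water_mark` are hypothetical, sizes are counted in bytes, and the least-recently-used selection is an assumption made for the example. The server's own selection policy may differ.

```python
from collections import OrderedDict


class CachingLayer:
    """Toy model of a bucket's caching layer with threshold-driven ejection."""

    def __init__(self, high_water_mark, low_water_mark):
        # Hypothetical thresholds, in bytes. Ejection starts above the high
        # mark and stops once usage is at or below the low mark.
        self.high_water_mark = high_water_mark
        self.low_water_mark = low_water_mark
        self.values = OrderedDict()  # doc_id -> bytes, least recently used first
        self.used = 0

    def touch(self, doc_id, value):
        """Store or refresh a value and mark it as most recently used."""
        if doc_id in self.values:
            self.used -= len(self.values.pop(doc_id))
        self.values[doc_id] = value
        self.used += len(value)
        if self.used > self.high_water_mark:
            self.eject()

    def eject(self):
        """Drop least recently used values until usage reaches the low mark."""
        ejected = []
        while self.used > self.low_water_mark and self.values:
            doc_id, value = self.values.popitem(last=False)
            self.used -= len(value)
            ejected.append(doc_id)
        return ejected


cache = CachingLayer(high_water_mark=30, low_water_mark=20)
for key in ("a", "b", "c", "d"):
    cache.touch(key, b"0123456789")  # 10 bytes each
```

In this run the fourth write pushes usage past the high mark, so the two oldest values are ejected and only `c` and `d` remain resident.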
Because some data may exist only on disk, the server must be able to tell whether a requested document ID exists without first reading from disk. Couchbase Server achieves this using metadata structures. The metadata holds information about each document stored in the database and is held in RAM. This means that the server can always return a 'document ID not found' response for an invalid document ID. For a valid ID, it returns the data either immediately, if the item is in RAM, or after a delay while the item is read from disk (or once a timeout has been reached).
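The lookup path described above can be sketched as follows. This is an illustrative model under stated assumptions: the `Bucket` class, its `store` and `get` methods, and the status strings are all hypothetical, and plain dictionaries stand in for RAM and disk.

```python
class Bucket:
    """Toy model: metadata is always in RAM, values may live only on disk."""

    def __init__(self):
        self.metadata = {}  # doc_id -> metadata, always held in RAM
        self.ram = {}       # doc_id -> value, the resident subset
        self.disk = {}      # doc_id -> value, stands in for persisted data

    def store(self, doc_id, value, resident=True):
        """Record metadata, persist the value, and optionally keep it in RAM."""
        self.metadata[doc_id] = {"size": len(value)}
        self.disk[doc_id] = value
        if resident:
            self.ram[doc_id] = value

    def get(self, doc_id):
        """Return (source, value); unknown IDs never touch the disk."""
        if doc_id not in self.metadata:
            return ("not_found", None)      # answered from RAM metadata alone
        if doc_id in self.ram:
            return ("ram", self.ram[doc_id])  # immediate response
        value = self.disk[doc_id]           # slower path: read from disk
        self.ram[doc_id] = value            # item becomes resident again
        return ("disk", value)


bucket = Bucket()
bucket.store("hot", b"1")
bucket.store("cold", b"2", resident=False)
```

Here a request for an unknown ID is rejected using the metadata alone, while a request for `cold` is served from disk the first time and from RAM afterwards.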
The process of moving information to disk is asynchronous. Data is written to disk and ejected from memory in the background while the server continues to service active requests. During sustained periods of heavy writes, clients may be notified that the server is temporarily out of memory until enough items have been ejected from memory.
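The interaction between a write queue and a temporary out-of-memory response can be sketched as below. This is a simplified, single-threaded model and not the server's behavior in detail: `WriteBuffer`, `TemporaryOutOfMemory`, `quota` and `set_with_retry` are hypothetical names, and the explicit `flush()` call stands in for work the server would do in the background while a real client waits and retries.

```python
from collections import deque


class TemporaryOutOfMemory(Exception):
    """Hypothetical error telling the client to back off and retry."""


class WriteBuffer:
    def __init__(self, quota):
        self.quota = quota    # assumed maximum number of bytes held in RAM
        self.ram = {}         # doc_id -> value not yet written to disk
        self.dirty = deque()  # doc_ids queued for writing to disk
        self.disk = {}
        self.used = 0

    def set(self, doc_id, value):
        """Accept a write into RAM, or refuse it while RAM is full."""
        old = self.ram.get(doc_id, b"")
        if self.used - len(old) + len(value) > self.quota:
            raise TemporaryOutOfMemory(doc_id)
        if doc_id not in self.ram:
            self.dirty.append(doc_id)
        self.ram[doc_id] = value
        self.used += len(value) - len(old)

    def flush(self, max_items=1):
        """One step of the background writer: persist, then eject from RAM."""
        for _ in range(min(max_items, len(self.dirty))):
            doc_id = self.dirty.popleft()
            value = self.ram.pop(doc_id)
            self.disk[doc_id] = value
            self.used -= len(value)


def set_with_retry(buffer, doc_id, value, attempts=3):
    """Client-side retry loop for temporary out-of-memory responses."""
    for _ in range(attempts):
        try:
            buffer.set(doc_id, value)
            return True
        except TemporaryOutOfMemory:
            # A real client would pause here; flushing directly keeps the
            # sketch deterministic.
            buffer.flush()
    return False


buf = WriteBuffer(quota=8)
buf.set("a", b"1234")
buf.set("b", b"5678")
stored = set_with_retry(buf, "c", b"abcd")
```

The third write is refused at first because the buffer is full. After one item has been written to disk and ejected, the retry succeeds.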
Similarly, when the server identifies an item that is not in active memory and must be loaded from disk, a background process works through the load queue and reads the information back from disk into memory. The client waits until the data has been loaded back into memory before the information is returned.
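A load queue serviced by a background worker might look like the following. This is a minimal sketch using Python's standard `queue`, `threading` and `concurrent.futures` modules. The `BackgroundFetcher` class and its `timeout` parameter are hypothetical, and a dictionary stands in for the disk.

```python
import queue
import threading
from concurrent.futures import Future


class BackgroundFetcher:
    """Toy model of a load queue drained by a single background thread."""

    def __init__(self, disk):
        self.disk = disk                 # doc_id -> value, stands in for disk
        self.ram = {}                    # resident values
        self.load_queue = queue.Queue()  # pending (doc_id, future) requests
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def _run(self):
        """Read queued items from disk, make them resident, wake the caller."""
        while True:
            doc_id, future = self.load_queue.get()
            try:
                value = self.disk[doc_id]
            except KeyError as exc:
                future.set_exception(exc)
                continue
            self.ram[doc_id] = value
            future.set_result(value)

    def get(self, doc_id, timeout=1.0):
        """Return at once if resident, otherwise wait for the background load."""
        if doc_id in self.ram:
            return self.ram[doc_id]
        future = Future()
        self.load_queue.put((doc_id, future))
        # Blocks the caller until the worker has loaded the item or the
        # timeout expires.
        return future.result(timeout=timeout)


fetcher = BackgroundFetcher(disk={"k": b"v"})
first = fetcher.get("k")   # served after a background load
second = fetcher.get("k")  # served directly from RAM
```

Only the requesting caller waits on the load. The worker thread handles the disk read, so other requests for resident items are not held up by it.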
The asynchronous design and the use of queues enable reads and writes to be handled at a very fast rate, while smoothing out the load and performance spikes that would otherwise cause a traditional RDBMS to produce erratic performance.