This is Part 2 of a two-part blog. Part 1 discusses the Index service scaling improvements implemented in Couchbase Server 7.2.2. This second part focuses on reducing memory and disk I/O overheads.

Reducing Memory Overheads for the Indexer Process

Couchbase Server’s Index service tries to keep as much “hot” data in memory as possible to serve index scans faster. But the Index service also needs memory for its own index management activities, and these overheads add up. In Couchbase Server 7.2.2, we have reduced much of this overhead by fine-tuning several parameters. The updated values help clusters with low-end configurations, with hardly any impact on the performance of high-end nodes. The following is a list of specific improvements:

Reduced mutation queue memory overhead

The indexer process maintains a mutation queue to cache incoming updates from the Data service (via DCP streams). Every 10ms, the contents of the queue are flushed to index storage, and once the flush completes, an index snapshot is created.

Previously, the maximum size of the mutation queue defaulted to 256MB. In Couchbase Server 7.2.2, the maximum mutation queue size is determined dynamically as 1% of the indexer memory quota. For example, on a node with a 4GB memory quota, the maximum mutation queue size is 40MB.

Please note that this is not a hard limit: the queue always holds a minimum required number of mutations. In Couchbase Server 7.2.2, that minimum (the default value) has also been reduced from 50 elements to 30. In addition, the minimum is tuned dynamically: it drops from 30 to 20 and eventually to 10 as the in-use heap memory of the indexer process grows, and it is tuned back up from 10 to 20 and eventually 30 once heap usage falls.

This improvement is enabled by default for both Capella and self-managed clusters.
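The sizing logic can be pictured roughly as follows. This is a minimal Go sketch of the idea, not the actual indexer code; the function names and the heap-pressure thresholds used to step between 30, 20, and 10 are assumptions made for illustration:

package main

import "fmt"

// maxMutationQueueSize returns the dynamic queue cap introduced in 7.2.2:
// 1% of the indexer memory quota (previously a fixed 256MB default).
func maxMutationQueueSize(memQuotaBytes uint64) uint64 {
    return memQuotaBytes / 100
}

// minQueuedMutations returns the minimum number of mutations the queue
// always holds, stepping down as heap pressure rises. The 70%/90%
// thresholds are assumed for this sketch; the real triggers are internal.
func minQueuedMutations(heapInUse, memQuotaBytes uint64) int {
    switch {
    case heapInUse > memQuotaBytes*90/100:
        return 10
    case heapInUse > memQuotaBytes*70/100:
        return 20
    default:
        return 30
    }
}

func main() {
    quota := uint64(4) << 30 // 4GB memory quota
    fmt.Printf("max queue size: %dMB\n", maxMutationQueueSize(quota)>>20) // ~40MB
    fmt.Printf("min elements: %d\n", minQueuedMutations(quota*80/100, quota))
}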

Reduced storage buffer size

Once mutations are flushed to index storage, they reside in storage buffers before being processed. The size of these buffers was calculated as: number of CPU threads * 200. Storage buffers are unique to each index instance/partition, so as the number of indexes increases, the overhead due to storage buffers increases with it.

In Couchbase Server 7.2.2, we have reduced the maximum size of the storage buffers to: number of CPU threads * 20. The buffer size is scaled down further if the memory quota is less than 16GB. This reduces the storage buffer overhead by a factor of 10 or more. This improvement is enabled by default for both Capella and self-managed clusters.
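To make the arithmetic concrete, here is a small Go sketch of the new sizing rule. The halving applied below a 16GB quota is an assumption for illustration; the exact scale-down factor used by the indexer is internal:

package main

import (
    "fmt"
    "runtime"
)

// storageBufferSize computes a per-index storage buffer size under the
// 7.2.2 rule: CPU threads * 20 (down from CPU threads * 200 previously).
func storageBufferSize(memQuotaBytes uint64) int {
    size := runtime.NumCPU() * 20
    if memQuotaBytes < 16<<30 { // quota below 16GB: scale down further (assumed factor of 2)
        size /= 2
    }
    return size
}

func main() {
    fmt.Println("buffer entries per index:", storageBufferSize(8<<30)) // 8GB quota
}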

Implemented compaction for index snapshot request queue

As mentioned earlier, the indexer process triggers a snapshot creation every 10ms. When a burst of mutations is received, index storage can slow down temporarily, so a mutation queue flush can take longer than 10ms. If this situation persists for a few minutes, a large number of index snapshot requests build up in a queue.

In Couchbase Server 7.2.2, we have introduced compaction of the index snapshot request queue, which reduces this memory overhead. This improvement is enabled by default for both Capella and self-managed clusters.
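Conceptually, a backlog of snapshot requests can be compacted because a snapshot taken at a newer sequence number covers all older pending requests. The following Go sketch illustrates this kind of coalescing; it is not the indexer's actual data structure, and the snapshotRequest type is invented for the example:

package main

import "fmt"

// snapshotRequest represents a pending request to create an index snapshot.
type snapshotRequest struct {
    seqNo uint64 // mutations up to this sequence number are covered
}

// compact collapses a backlog of snapshot requests into a single request,
// since a snapshot at the highest sequence number supersedes earlier ones.
func compact(queue []snapshotRequest) []snapshotRequest {
    if len(queue) <= 1 {
        return queue
    }
    return []snapshotRequest{queue[len(queue)-1]}
}

func main() {
    backlog := []snapshotRequest{{seqNo: 100}, {seqNo: 110}, {seqNo: 120}}
    fmt.Println("after compaction:", compact(backlog)) // keeps only seqNo 120
}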

Sharing LSS across index instances

Couchbase index storage is implemented as a Log Structured Storage (LSS) to achieve the best performance. Each LSS instance has dedicated resources assigned to it (disk files, flush buffers, etc.).

Starting with Couchbase Server 7.2.2, Capella will, by default, enforce sharing of LSS instances across multiple index instances. If an index was created with a dedicated LSS instance, it will continue to use that dedicated LSS instance after an upgrade. This improvement has yielded a substantial reduction in memory overhead: for example, with 484 indexes sharing 6 LSS instances, the overhead dropped from more than 1GB to less than 30MB.

LSS sharing is enabled by default only on Capella clusters. For self-managed clusters, it can be enabled by using the following REST command:

curl -X POST -u <credentials> http://<indexer-node-ip>:9102/settings --data '{"indexer.plasma.useSharedLSS" : true}'

Other Improvements

Bloom Filters

Couchbase Server introduced bloom filters for the Index service in version 7.1, but users had to enable them explicitly. Bloom filters are used by the index storage layer to reduce disk I/O and thereby improve the overall efficiency of the Index service.

Starting in Couchbase Server 7.2.2, the Index service has bloom filters enabled by default. This reduces the Index service's disk lookups, with the trade-off of a small increase in memory overhead. Bloom filters primarily show their benefit on insert-heavy workloads.
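The core idea: before going to disk to check whether a key exists, consult an in-memory bloom filter; a negative answer is definitive, so the disk lookup can be skipped entirely. Below is a minimal Go sketch of that pattern, not the index storage layer's actual implementation; the sizing and hashing choices are arbitrary for the example:

package main

import (
    "fmt"
    "hash/fnv"
)

// bloomFilter is a minimal bloom filter: k hash probes into a bit set.
type bloomFilter struct {
    bits []bool
    k    int
}

func newBloomFilter(m, k int) *bloomFilter {
    return &bloomFilter{bits: make([]bool, m), k: k}
}

func (b *bloomFilter) indexes(key string) []int {
    idxs := make([]int, 0, b.k)
    for i := 0; i < b.k; i++ {
        h := fnv.New64a()
        h.Write([]byte(key))
        h.Write([]byte{byte(i)}) // vary the hash per probe
        idxs = append(idxs, int(h.Sum64()%uint64(len(b.bits))))
    }
    return idxs
}

func (b *bloomFilter) Add(key string) {
    for _, i := range b.indexes(key) {
        b.bits[i] = true
    }
}

// MayContain returns false only if the key is definitely absent,
// which lets the caller skip a disk lookup entirely.
func (b *bloomFilter) MayContain(key string) bool {
    for _, i := range b.indexes(key) {
        if !b.bits[i] {
            return false
        }
    }
    return true
}

func main() {
    bf := newBloomFilter(1<<16, 3)
    bf.Add("doc-123")
    fmt.Println(bf.MayContain("doc-123")) // true: fall through to disk
    fmt.Println(bf.MayContain("doc-999")) // likely false: skip the disk I/O
}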

Zstd Compression

Index storage uses compression in two places: compression of on-disk files and compression of in-memory index pages. Previously, on-disk files were compressed with Snappy while in-memory pages were compressed with Zstd. Because two different algorithms were in play, index storage had to perform a decompress-compress cycle whenever it fetched index pages from disk.

Starting with Couchbase Server 7.2.2, index storage uses Zstd compression for on-disk files as well. This avoids the unnecessary decompress-compress cycle during a disk fetch. Zstd is also known to yield a better compression ratio, which leads to more efficient disk I/O utilization in cloud environments.
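The transcoding cost that mixed codecs forced can be seen in a few lines of Go. This sketch uses the common github.com/golang/snappy and github.com/klauspost/compress/zstd libraries purely to illustrate the eliminated cycle; the indexer's internal code paths are different:

package main

import (
    "github.com/golang/snappy"
    "github.com/klauspost/compress/zstd"
)

// Pre-7.2.2 path: a Snappy-compressed page read from disk had to be
// decompressed and then re-compressed with Zstd before being cached in memory.
func fetchPageMixed(diskPage []byte, enc *zstd.Encoder) ([]byte, error) {
    raw, err := snappy.Decode(nil, diskPage) // decompress the on-disk format
    if err != nil {
        return nil, err
    }
    return enc.EncodeAll(raw, nil), nil // re-compress for the in-memory format
}

// 7.2.2 path: with Zstd used on disk as well, the page can be kept in its
// compressed form end to end; no transcoding is needed on fetch.
func fetchPageZstd(diskPage []byte) []byte {
    return diskPage
}

func main() {
    enc, _ := zstd.NewWriter(nil)
    defer enc.Close()

    page := []byte("index page contents ...")
    onDiskSnappy := snappy.Encode(nil, page)
    if _, err := fetchPageMixed(onDiskSnappy, enc); err != nil {
        panic(err)
    }

    onDiskZstd := enc.EncodeAll(page, nil)
    _ = fetchPageZstd(onDiskZstd)
}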

What’s next?

Learn more about Couchbase products:

Author

Posted by Amit Kulkarni

Amit Kulkarni works as an Engineering Manager at Couchbase on Global Secondary Indexes. He has experience working on technologies like distributed systems, distributed NoSQL databases, cloud storage, and storage virtualization.
