Store hundreds of terabytes of JSON data and query in milliseconds with Couchbase 7.1. Our new storage engine in 7.1, Couchbase Magma, makes Couchbase the most cost-effective and performant database, lowering TCO.

With the Magma storage engine, Couchbase 7.1 becomes the ideal database for your data-intensive use cases. Examples include write-heavy applications such as IoT and logging; read-heavy applications with large datasets such as customer portals; and mixed workloads with large datasets such as metadata and content stores (photo/video stores) and user profiles.

Let us dive further into the benefits of Magma and how you can start using it.

Magma TCO reduction

Magma’s design allows it to function with minimal amounts of memory – it is operationally stable at a memory-to-data ratio as low as 1%. For example, to store 1TB of data on a node, you need only 10GB of memory to run with Magma, provided you are willing to serve most reads from disk. If you want in-memory (sub-millisecond) access speed for your working set, you will probably want a memory-to-data ratio that matches the size of your working set.

Magma can also store larger amounts of data per node – we tested and certified up to 10TB of data per node (this includes primary and replica data). Hence, to store 100TB of data, you need only 10 servers in a Couchbase cluster.

If you’re familiar with the existing Couchbase storage engine, Couchstore, you know that it requires a memory-to-data ratio of at least 10%, and that no more than 2-3TB of data per node is recommended.

The reduction in memory-to-data ratio from 10% to 1%, together with the increase in node density to 10TB of data, translates to a 10x reduction in servers for most scenarios, and a corresponding 10x reduction in TCO.
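
To make the sizing arithmetic above concrete, here is a rough back-of-the-envelope sketch in Python. The ratios and per-node data limits come from the figures in this post; the 100GB of RAM per node and the function names are illustrative assumptions, and real sizing also depends on your working set, replicas, and hardware.

```python
import math

def ram_needed_gb(data_tb, memory_to_data_pct):
    """RAM needed to back data_tb of data at the given ratio (1TB = 1000GB, as in the text)."""
    return data_tb * 1000 * memory_to_data_pct / 100

def data_per_node_tb(ram_gb_per_node, memory_to_data_pct, max_data_tb_per_node):
    """Data one node can hold: limited by its RAM at the given ratio, and by the per-node cap."""
    ram_limited_tb = ram_gb_per_node * 100 / memory_to_data_pct / 1000
    return min(ram_limited_tb, max_data_tb_per_node)

def nodes_needed(total_data_tb, ram_gb_per_node, memory_to_data_pct, max_data_tb_per_node):
    """Number of nodes required to hold total_data_tb (primary plus replica data)."""
    per_node = data_per_node_tb(ram_gb_per_node, memory_to_data_pct, max_data_tb_per_node)
    return math.ceil(total_data_tb / per_node)

# Hypothetical hardware: 100GB of RAM per node, 100TB of total data.
magma_nodes = nodes_needed(100, ram_gb_per_node=100, memory_to_data_pct=1, max_data_tb_per_node=10)
couchstore_nodes = nodes_needed(100, ram_gb_per_node=100, memory_to_data_pct=10, max_data_tb_per_node=3)
print(magma_nodes, couchstore_nodes)
```

Under these assumed numbers the sketch gives 10 nodes for Magma and 100 for Couchstore, because at a 10% ratio the RAM, not the disk, limits how much data each node can hold.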

Magma performance gains

For disk-based workloads, Magma significantly improves performance. Some highlights of performance improvements are:

    • 4x throughput for mixed disk-based workloads compared to the current storage engine (Couchstore). We regularly test this in-house using the YCSB benchmark (see chart below for reference).
    • 10x improvement in tail latency for reads: 99.9 percentile < 10ms on regular SSDs.
    • 2x improvement in throughput and latency for persistent durable writes used in ACID transactional workloads.
    • 20% better compression and hence reduction in disk space consumption.

Couchbase Magma YCSB performance benchmark

The chart above comes from our regular runs of a modified YCSB Workload G. It demonstrates over 4x throughput improvement in a workload that is 100% read-modify-write with a uniform distribution. Since the resident ratio is around 30%, the workload is disk-heavy, with around 70% of the reads going to disk. The fragmentation ratio was set at 50% for both Magma and Couchstore.

The magic behind Magma

How is Magma able to achieve all these benefits to TCO and performance? Magma is based on a proprietary architecture developed in-house at Couchbase that combines log-structured merge (LSM) trees with value separation in a log-structured object store. This combination creates a storage engine with much better read amplification, write amplification, and space amplification than an off-the-shelf LSM-based storage engine such as RocksDB.

Magma storage engine architecture.

Magma has scalable incremental compaction, with micro-compactions that run continually, so compaction causes no noticeable read or write pauses. Compaction is also parallel, which lets it take advantage of the parallelism of the underlying IO subsystem. Granular and continual compaction also reduces space amplification and hence the total amount of disk space required. Magma uses block compression, which provides better compression ratios than document-level compression alone, further reducing the amount of disk space required.
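
Magma's internals are proprietary, so the following is only a toy Python sketch of the general idea of value separation, not Magma's actual design. Keys live in a small index that points into an append-only value log, overwrites leave stale bytes behind (fragmentation), and compaction rewrites only the live values. A plain dict stands in for the LSM key index, and every class and method name here is made up for illustration.

```python
class ToyValueLog:
    """Append-only log of values; append returns an (offset, length) pointer."""

    def __init__(self):
        self._buf = bytearray()

    def append(self, value):
        offset = len(self._buf)
        self._buf += value
        return offset, len(value)

    def read(self, offset, length):
        return bytes(self._buf[offset:offset + length])

    def size(self):
        return len(self._buf)


class ToyStore:
    """Key index (a dict standing in for an LSM tree) plus a separate value log."""

    def __init__(self):
        self.index = {}
        self.log = ToyValueLog()

    def put(self, key, value):
        # The new value is appended; the old one stays in the log as garbage.
        self.index[key] = self.log.append(value)

    def get(self, key):
        pointer = self.index.get(key)
        return None if pointer is None else self.log.read(*pointer)

    def fragmentation(self):
        # Fraction of log bytes no longer referenced by the index.
        total = self.log.size()
        live = sum(length for _, length in self.index.values())
        return 0.0 if total == 0 else 1 - live / total

    def compact(self):
        # Copy only live values into a fresh log and repoint the index.
        new_log = ToyValueLog()
        for key in sorted(self.index):
            offset, length = self.index[key]
            self.index[key] = new_log.append(self.log.read(offset, length))
        self.log = new_log


store = ToyStore()
store.put("doc1", b"xxxx")
store.put("doc1", b"yyyy")          # overwrite leaves 4 stale bytes in the log
before = store.fragmentation()
store.compact()
after = store.fragmentation()
```

The toy compacts everything in one pass; the point of the incremental micro-compactions described above is to avoid exactly that kind of stop-the-world rewrite.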

How to use Magma

Once you have set up your Couchbase cluster, you can create a Couchbase bucket (a bucket is like a database in Couchbase) with Magma selected as the storage engine. Yes, the storage engine is chosen per bucket, so a single Couchbase cluster can have some Magma buckets and some Couchstore buckets. This makes it easier to try Magma in your existing Couchbase clusters. For data-intensive use cases where Magma is ideal, you will probably want to have Magma-only buckets in your cluster.

Note that for Magma buckets, it is recommended to set the ejection method to Full (this implies that even keys will get evicted from the cache), especially if you are running with a low memory-to-data ratio. Full Ejection is the default selection for a Magma bucket.

Once you have created a bucket with Magma as the selection, you can start using it to load and query your data using any of the Couchbase APIs.
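
If you prefer scripting to the UI, a bucket can also be created through the cluster's REST interface. The sketch below only builds the HTTP request in Python; the parameter names (storageBackend, evictionPolicy, ramQuota) reflect my understanding of the bucket-creation endpoint and should be checked against the documentation for your version. The host, credentials, bucket name, and RAM quota are placeholders.

```python
import base64
import urllib.parse
import urllib.request

def build_magma_bucket_request(host, user, password, bucket_name, ram_quota_mb):
    """Build (but do not send) a POST request that asks for a Magma bucket."""
    form = {
        "name": bucket_name,
        "bucketType": "couchbase",
        "storageBackend": "magma",         # assumed parameter name; verify for your version
        "evictionPolicy": "fullEviction",  # Full ejection, as recommended for Magma buckets
        "ramQuota": str(ram_quota_mb),     # per-node quota in MB; older docs use ramQuotaMB
    }
    body = urllib.parse.urlencode(form).encode("ascii")
    request = urllib.request.Request(
        f"http://{host}:8091/pools/default/buckets", data=body, method="POST"
    )
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    return request

# Placeholder values; replace with your own cluster details.
request = build_magma_bucket_request(
    "cb.example.com", "Administrator", "password", "iot-events", 1024
)
# To actually send it against a running cluster:
# urllib.request.urlopen(request)
```

Keeping the request construction separate from the send makes it easy to inspect what would be posted before touching a live cluster.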

Creating a Magma bucket in the Couchbase Admin UI.

Next steps to reducing TCO

The Magma storage engine makes Couchbase 7.1 a high-performance, disk-oriented database for large datasets of hundreds of terabytes. The required memory-to-data ratio is as low as 1%, which dramatically reduces the number of servers required and, with it, your TCO. Couchbase 7.1 can tackle use cases that were previously cost-prohibitive and reduce TCO where data is growing rapidly. Download Couchbase today and give these new innovations a try.

Author

Posted by Shivani Gupta

Shivani Gupta is Director of Product Management at Couchbase for the Core Server. Shivani has over 20 years of varied experience in Big Data, Distributed Systems, and Databases at different companies including Oracle, Microsoft, VMware, Hortonworks and now Couchbase.