Comparing Couchbase Capella vs CosmosDB

CosmosDB is Microsoft’s NoSQL offering, available exclusively on Microsoft Azure. It used to be called DocumentDB, but Microsoft renamed it and added some interesting new features. Let’s go a little deeper and explore its strategy, its documentation, what developers have been saying about it, and how it compares to Couchbase Capella.

One Database to rule them all?

Microsoft claims that CosmosDB is a NoSQL database able to do literally everything: it is a document database, a columnar store, a key-value store and a graph database, all achieved thanks to an abstraction of the data format called atom-record-sequence (ARS).

Let’s look at how data is organized according to each model. First, you have to choose the API you would like to use (SQL, MongoDB API, Microsoft Azure Table, Cassandra or Gremlin) and stick with it, as it can’t be changed later. But behind the scenes, it looks to be a custom JSON format.

CosmosDB is trying to compete with all of the major NoSQL databases, which may be a risky strategy. For one, this approach may limit the features that CosmosDB can ultimately offer: every API has to map onto a single common denominator, and CosmosDB can’t stray too far from it. Also, APIs like MongoDB and Cassandra are not defined or planned by Microsoft. This means that Microsoft will always be catching up to the latest releases, and will ultimately never achieve 100% compatibility. Microsoft maintains documentation about which MongoDB features are supported and which are not (and the same for Cassandra). An all-in-one solution like CosmosDB might be good for simple applications with few functionality demands, but all those abstractions come at a cost: they ultimately hurt simplicity and performance, and they limit the features available.

Couchbase vs CosmosDB – Comparing Apples with “Apples”

This comparison focuses mostly on scenarios where it makes sense to compare the two technologies (for example, Couchbase is not a graph database, so a graph comparison wouldn’t make sense).

One other important note: Couchbase Capella is Couchbase’s DBaaS (database-as-a-service) offering, available in AWS and GCP (soon to be in Azure too). It is basically a managed version of Couchbase Server, which is still available for download, so they are very similar. Unless otherwise stated, the “Couchbase” column applies to both Capella and Server.

Feature	CosmosDB	Couchbase Capella
Licensing	Proprietary, closed-source; a free tier is available	Free trial available for Capella; Couchbase Community and Enterprise editions available for download; BSL license
Type	Key-value, document, graph, columnar	Key-value, document, built-in cache, mobile
Model	2 MB limit per document (16 MB in MongoDB mode only)	20 MB limit per document
Search	Requires a separate, proprietary product: Azure Cognitive Search	Built-in full-text search engine (using the open-source Bleve engine); connector available for Elasticsearch if necessary
Indexing	Indexes every property of every item by default; allows index customization	Unlimited indexes; any field can be indexed; memory-optimized indexes available
Data Integrity	Five configurable consistency levels: Strong, Bounded staleness, Session (default), Consistent prefix, Eventual	Strong consistency; query consistency can be specified on a per-query basis
Scalability	Highly scalable	Highly scalable
Mobile	No plans for mobile, device or offline support	Couchbase Lite provides a mobile/device/edge database; Sync Gateway automatically syncs to/from the data center
Deployment	Azure only, fully managed only; a development version is available (currently Windows only)	Can be deployed anywhere, including Azure, on-premises, Kubernetes, Docker, VM and bare metal; Couchbase Capella offers a fully managed DBaaS
Locking	Optimistic and pessimistic locking available	Optimistic and pessimistic locking available
Backup & Restore	Continuous backup mode for 30 days; periodic backup mode (default)	Automatic backup and restore service with a configurable backup wizard; continuous backup available using XDCR
Querying	Depends on which mode is chosen. Example 1: the SQL API is an extremely limited subset of standard SQL. Example 2: the MongoDB API is an incomplete subset of the Mongo API	Full SQL implementation called SQL++, previously known as “N1QL” (with JOIN, aggregates, CTEs, window functions, CRUD operations, etc.)
Data Center Replication	Push-button global master-master replication between supported Azure data centers	XDCR allows any combination of unidirectional and bidirectional replication between any Couchbase deployments, including data filtering
Speed/performance	More speed and performance can only be obtained by increasing RUs, which will often be prohibitively expensive	Memory-first read and write operations; built-in caching layer; can be tuned by increasing memory or disk, or by adding a node; memory-optimized indexes available
Sharding / partitioning	Partition key(s) must be created and managed manually, requiring a dedicated expert to design them correctly in order to reach performance/scale goals	Sharding is completely automatic
Architecture	Unknown / proprietary	Every node is a master in Couchbase, making the most efficient use of resources
Supported SDKs	.NET (primary, most feature-complete); other SDKs: Java, Node.js, Python (others through the MongoDB/Cassandra APIs)	.NET, C / C++, Go, Java, Node.js, PHP, Python, Ruby, Scala, Kotlin

Success in the Real World

This side-by-side comparison may favor Couchbase, but what about the real-world experiences of an organization that was using CosmosDB and switched to using Couchbase?

Facet Digital cut their database costs by 50%, and improved their performance by 100x by switching to Couchbase Capella.

How was it possible?

Faster deployment time
Easy search integration
Faster indexing
Better DevOps automation (CI/CD index definitions)
Familiar and complete SQL syntax

Summary

CosmosDB has a unique vision, but as a natural consequence of building something focusing on multiple fields at once, CosmosDB’s support for all of your desired features can be uneven.

One of the most prominent features is the ability to choose between several consistency levels weaker than Strong: Bounded staleness, Session, Consistent prefix and Eventual. The fact that Session is the default says a lot about the recommended way to use CosmosDB. It could mean that CosmosDB might not be the best solution if you need strong data consistency (and perhaps Microsoft would want to steer you back towards their flagship SQL Server database).
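To make the difference between Session and Eventual reads concrete, here is a minimal in-memory sketch. It is not CosmosDB code: the `Replica` and `Session` classes, the single lagging secondary, and the use of a log sequence number as a session token are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A replica that has applied writes up to a given log sequence number."""
    applied_lsn: int = 0
    data: dict = field(default_factory=dict)

    def apply(self, lsn: int, key: str, value: str) -> None:
        self.data[key] = value
        self.applied_lsn = lsn


class Session:
    """Hypothetical client session that remembers the last LSN it wrote."""

    def __init__(self, primary: Replica, secondary: Replica) -> None:
        self.primary = primary
        self.secondary = secondary
        self.token = 0  # highest LSN written by this session

    def write(self, key: str, value: str) -> None:
        lsn = self.primary.applied_lsn + 1
        self.primary.apply(lsn, key, value)
        self.token = lsn

    def read_eventual(self, key: str):
        # Eventual: accept whatever the secondary currently holds.
        return self.secondary.data.get(key)

    def read_session(self, key: str):
        # Session: use the secondary only if it has caught up to our token.
        if self.secondary.applied_lsn >= self.token:
            return self.secondary.data.get(key)
        return self.primary.data.get(key)


primary, secondary = Replica(), Replica()
session = Session(primary, secondary)
session.write("cart", "3 items")      # not yet replicated to the secondary
print(session.read_eventual("cart"))  # None
print(session.read_session("cart"))   # 3 items
```

In this sketch the session always sees its own writes, but a different client with no token could still read stale data from the secondary, which is the gap that strong consistency closes.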

Being memory-first is one of the reasons why Couchbase is so fast, and it has been memory-first since its inception. CosmosDB has an integrated cache (currently in preview), but as with search, it’s a separate product that must be added on.

With CosmosDB, every property of every item is indexed by default. That seems like overkill: it is usually easier to specify which fields to index than which fields not to index. As soon as your JSON grows beyond a handful of properties (and especially when it nests JSON objects), these default indexes are going to be overkill, with the costs passed on to you. Too many indexes means too many RUs, which means too many dollars.
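As a rough illustration of the opt-in alternative, the sketch below builds an indexing policy document that excludes everything and then includes only the listed paths. The helper name `opt_in_indexing_policy` is hypothetical, and the `includedPaths`/`excludedPaths` layout follows the JSON policy shape as I understand it from Azure’s documentation, so check the current docs before relying on it.

```python
import json


def opt_in_indexing_policy(indexed_paths):
    """Build a policy that indexes only the listed property paths."""
    return {
        "indexingMode": "consistent",
        # Index only the properties we actually query on...
        "includedPaths": [{"path": f"/{p.strip('/')}/?"} for p in indexed_paths],
        # ...and exclude everything else.
        "excludedPaths": [{"path": "/*"}],
    }


policy = opt_in_indexing_policy(["name", "address/city"])
print(json.dumps(policy, indent=2))
```

On the Couchbase side the equivalent decision is made per index, for example with a SQL++ `CREATE INDEX` statement that names the fields to cover.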

Sharding seems to be one of the trickiest things in CosmosDB. Partitions are moved automatically among nodes, but you still have to specify a partition key. One drawback of this approach is that each partition is indivisible, with a maximum size of 10 GB. If you pick a bad partition key, a lot of frequently accessed documents might end up in the same partition, which caps the throughput of your reads and writes at the capacity of the node where that partition is stored.
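The hot-partition effect can be sketched with a small simulation. Everything here is illustrative: the partition count, the SHA-256-based placement, and the synthetic workload (90% of requests from one tenant) are assumptions, not CosmosDB internals.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 8  # illustrative value, not a CosmosDB setting


def partition_for(key_value: str) -> int:
    """Place a partition-key value on a partition by hashing it."""
    digest = hashlib.sha256(key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def load_per_partition(requests, key_field):
    """Count how many requests land on each partition for a given key field."""
    counts = Counter(partition_for(str(r[key_field])) for r in requests)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


# 10,000 synthetic requests; 9 out of 10 belong to one large tenant.
requests = [
    {"tenant": "big-corp" if i % 10 else f"tenant-{i}", "order_id": f"order-{i}"}
    for i in range(10_000)
]

by_tenant = load_per_partition(requests, "tenant")
by_order = load_per_partition(requests, "order_id")
print("busiest partition share, keyed by tenant:  ", max(by_tenant) / len(requests))
print("busiest partition share, keyed by order_id:", max(by_order) / len(requests))
```

Keyed by tenant, one partition receives at least 90% of the traffic no matter how many partitions exist; keyed by a high-cardinality field, the load spreads out roughly evenly.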

The partition key is also immutable, so in order to change it, you are required to copy all of your data to another collection. In Couchbase, documents are distributed evenly between vBuckets to avoid this problem, and also to increase your read/write performance.
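A simplified sketch of the vBucket idea follows. It assumes the typical count of 1024 vBuckets and a CRC32-based hash of the document key; the exact hash used by Couchbase clients and the way the cluster assigns vBuckets to nodes differ in detail, and the round-robin `build_vbucket_map` helper is a hypothetical stand-in for the real cluster map.

```python
import zlib
from collections import Counter

NUM_VBUCKETS = 1024  # typical vBucket count; treated as an assumption here


def vbucket_for(key: str) -> int:
    """Map a document key to a vBucket (simplified CRC32-based hash)."""
    return (zlib.crc32(key.encode("utf-8")) >> 16) % NUM_VBUCKETS


def build_vbucket_map(nodes):
    """Assign vBuckets to nodes round-robin (stand-in for the cluster map)."""
    return {vb: nodes[vb % len(nodes)] for vb in range(NUM_VBUCKETS)}


def node_for(key: str, vbucket_map) -> str:
    """Find the node that currently owns the vBucket for this key."""
    return vbucket_map[vbucket_for(key)]


four_nodes = build_vbucket_map(["node-a", "node-b", "node-c", "node-d"])
five_nodes = build_vbucket_map(["node-a", "node-b", "node-c", "node-d", "node-e"])

print(Counter(four_nodes.values()))          # vBuckets owned per node
print(vbucket_for("user::42"))               # depends only on the key
print(node_for("user::42", four_nodes), node_for("user::42", five_nodes))
```

The point of the two-level mapping is that the key-to-vBucket step never changes and needs no user-chosen partition key; adding a node only changes which node owns which vBuckets.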

With CosmosDB, throttling up is done only by increasing Request Units (RUs). The challenge with this approach is that RUs are not a very good predictor of query performance, and they make it even harder to boost one specific behavior, such as increasing only write capacity. For some use cases, you may find that a team needs a person working on RUs full-time to figure out and maintain the queries properly.

Microsoft has put a lot of effort into trying to make RUs easier to understand, but it’s common for developers to underestimate their RUs and end up stuck with a bill much higher than expected. On Couchbase, throttling up is very flexible: it can be done by vertical and/or horizontal scaling, running specific services according to the node hardware, keeping indexes in memory, etc.

CosmosDB also provides a cool push-button global data distribution that makes it really simple to replicate data in multiple data centers across the world. However, it can also be easily achieved in a matter of minutes in Couchbase Server without the limitation of running only in Azure.

Benchmarking is difficult, because of CosmosDB’s RUs model, but a third-party benchmark using the YCSB approach shows Couchbase Capella’s clear advantage in throughput and latency.

CosmosDB’s pricing is attractive if you have a small database with few reads/writes per second, but anything above that can cost a lot. CosmosDB’s price calculator shows that a 50/50 mix of reads and writes, plus a handful of queries per second, can add up to thousands of dollars per month. The calculator is helpful, but it’s somewhat unreliable, due to the difficulty in predicting RUs (as mentioned earlier). Also, the calculator does not consider the consistency model you are going to use, so you have to add a few extra dollars to this number for strong consistency.
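The arithmetic behind such an estimate can be sketched as follows. Every number is an assumption for illustration: the per-operation RU costs are rough rules of thumb for small items, the query cost is arbitrary, the `read_cost_multiplier` parameter is a hypothetical way to model stronger consistency levels making reads more expensive, and the price is a placeholder rather than a real Azure rate.

```python
def required_rus_per_second(reads_per_s, writes_per_s, queries_per_s,
                            ru_per_read=1.0, ru_per_write=5.0,
                            ru_per_query=10.0, read_cost_multiplier=1.0):
    """Estimate provisioned RU/s from an operation mix (all costs assumed)."""
    return (reads_per_s * ru_per_read * read_cost_multiplier
            + writes_per_s * ru_per_write
            + queries_per_s * ru_per_query)


def monthly_cost(rus_per_second, price_per_100_rus_hour, hours=730):
    """Convert RU/s to a monthly bill, priced per 100 RU/s per hour."""
    return rus_per_second / 100 * price_per_100_rus_hour * hours


# A 50/50 read/write mix plus a handful of queries per second.
baseline = required_rus_per_second(500, 500, 5)
stronger = required_rus_per_second(500, 500, 5, read_cost_multiplier=2.0)
PLACEHOLDER_PRICE = 0.01  # made-up price per 100 RU/s per hour, not a real rate

print(baseline, monthly_cost(baseline, PLACEHOLDER_PRICE))
print(stronger, monthly_cost(stronger, PLACEHOLDER_PRICE))
```

Small changes to the assumed per-operation costs move the result substantially, which is why estimates of this kind are hard to trust before measuring a real workload.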

Couchbase Capella pricing is much more predictable, and will often be lower cost, especially for larger, mission-critical use cases.


Author

Posted by Matthew Groves
