I’m excited to announce support of “distributed multi-document ACID transactions” in Couchbase Server 6.5.
Whether you’re writing a new application or modernizing an existing relational application, with transactions in Couchbase 6.5 your work is easier than ever before.
Why distributed ACID Transactions?
Couchbase has always supported single document ACID transactions. These are the bread and butter of transactions in a document model database and cover more than 95% of use cases. There are also business-critical use cases where multi-document ACID transactions are needed, and until now our customers have modeled these cases at the application level. With our new multi-document ACID transaction feature, you can let the database layer handle it for you instead. This relieves the application tier from managing all the recovery semantics of system failures during a multi-document update. The database tier now offers ACID transactions across multiple documents, multiple buckets, and multiple nodes.
Here’s how simple the code is:
All work inside a transaction is done using standard Couchbase SDK APIs with access to the programmatic prowess of the underlying platform. Error handling is simplified with built-in retries for the many failures that are bound to occur in a highly concurrent distributed system.
Let’s look at how we address ACID transaction guarantees in our distributed database that’s built on a shared-nothing architecture.
With this release we extend our atomicity guarantees from a single document to multiple documents across multiple nodes. Now you have all-or-nothing semantics for your standard application where you’re updating several documents at once. Within the transaction boundary, Couchbase will change all the impacted documents or none at all. This multi-document atomicity is critical for application scenarios such as multi-asset coordination and microservice sagas orchestration – and now you can rely on Couchbase to provide it.
Consistency and Isolation
Couchbase has always provided strong consistency on reads from key-value APIs and from N1QL with GSI (using request_plus). Now we’ve extended that consistency to multi-document transactions as well. Of course, any discussion of multi-document consistency is incomplete without a description of the isolation levels supported. Couchbase Server 6.5 provides a “Read Committed” level of isolation. According to ANSI standards, the Read Committed isolation level guarantees that any data read is committed at the moment it is read. It also requires that no uncommitted “dirty” data is ever read. Couchbase transactions ensure that you always get Read Committed semantics regardless of how the read is done – be it through the key-value interface, a N1QL query, an XDCR cluster, analytics, mobile, or an eventing function.
Transactions are layered over a new synchronous replication mechanism in Couchbase Server 6.5 to provide durability guarantees. Synchronous replication ensures that a write is not visible until it is durably replicated and/or persisted. Once a transaction is committed, all the updates in the transaction are guaranteed to be durable, regardless of where the documents reside in the cluster.
With synchronous replication, Couchbase now makes it easier to use tunable durability with better resiliency. Tunability of durability comes either from using replication as a strategy for durability, or from using persistence for durability.
The new replication mechanism is undergoing comprehensive in-house testing using Jepsen, a test framework that subjects distributed systems to multiple concurrent failures and checks for data consistency under these failures. The results of this testing will be made public.
Highly available and scalable transactions
As a distributed scale-out data platform, Couchbase has a long-standing distinction of being a leader in scalability, performance, and high availability. With multi-document distributed transactions we remain true to those tenets. We’re not introducing any global schedulers or global coordination, and don’t rely on finely tuned time servers.
By using our smart clients, we avoid the need for a single transaction monitor or distributed lock manager. Historically, transactions are implemented using a 2PC. In a distributed scale-out database 2PC is too slow, induces distributed deadlocks, and most importantly introduces SPOF. In our implementation, we’ve taken a different approach.
Each transaction is attached to some application logic in the smart clients. As the transaction is executed, the smart clients track the transaction state and determine whether to proceed with the transaction. If the system state does not match with the smart client transaction state, the smart client will automatically unwind the transaction state and retry the application logic. Because smart clients are aware of the transaction state, this eliminates the availability and scalability limitations of 2PC protocol.
Further, in databases that are sharded, the scale and performance limitations of 2PC are traditionally overcome by offering the transaction guarantees in a single shard. That requires a pre-partitioning of data into a single shard. But requiring manual sharding of data is also a well-documented issue that helped precipitate the entire NoSQL industry. With our architecture, transactions are partition agnostic and don’t require any special handling or manipulation of data placements. Basically, the transaction semantics are honored for any document no matter where it physically resides in the cluster.
Pay the price only when you need it
Last but not least amongst the virtues of Couchbase ACID transactions is the fact that you don’t pay any performance penalty except when you use them. You can interleave operations that require strong ACID guarantees with those that don’t to get the best of both worlds: the performance and scale of a NoSQL system, plus the transactional guarantees of a traditional database. This gives applications the power to decide when to pay the transaction cost rather than having the database impose it unconditionally for every operation.
This combination of performance at scale, availability, data model flexibility of JSON, programming power of SQL, and ACID transaction guarantees make Couchbase very empowering for modern applications. If your application requires what NoSQL and Couchbase provide, you no longer need a separate system to achieve the same ACID semantics you’re used to in relational databases.
Couchbase Server 7.0 Beta is available for download. There are additional blogs and documentation published by the Couchbase team to dive deeper into Couchbase ACID transactions. I encourage you to try it out and we look forward to your feedback!
Couchbase Transactions 6.5 Documentation
Couchbase Transactions 6.5 How to Guide for SDKs
Understanding Distributed Multi-Document ACID Transactions in Couchbase
Intro to Couchbase Transactions Java API [Video]
Announcing Couchbase Server 6.5 – What’s New and Improved
As I understand, with 6.5.0, achieving ACID transactions is by way of SDK APIs. Is there a plan to let N1QL perform transactions?
Also, for a lot of customers who plan to port from relational databases, is there a way/plan to support porting of existing PL/SQL (Stored Procedures and alike) to something that can execute in couchbase?
N1QL transactions feature is planned for the upcoming release of Couchbase. Look for the announcement of the Beta in Nov 2020.
We will look into N1QL support for transactions in a future release. It is on our roadmap.
N1QL has a preview feature for User Defined Functions that should allow porting of PL/SQL stored procedures.
See the details of functions in Couchbase 6.5(preview) at: https://docs.couchbase.com/server/6.5/n1ql/n1ql-language-reference/createfunction.html
“As the transaction is executed, the smart clients track the transaction state and determine whether to proceed with the transaction. If the system state does not match with the smart client transaction state, the smart client will automatically unwind the transaction state and retry the application logic”
please help me understand the logic behind matching system state and transaction state. I am dead curious to know the logic used to get rid of 2PC.
The key innovation here is that we surface, through transaction metadata specific operations on the network, details about the transaction state. This allows the entire distributed system that makes up Couchbase, including the client operations on behalf of your application code, to coordinate if needed and otherwise let individual transactions proceed with zero overhead.
With 2PC across systems in the classic transaction manager approach to scale beyond a single monolithic system, the cost of the transaction is paid multiple times as you rightly point out.
The Couchbase solution refactors the work across the system rather than paying that cost by trying to hide it behind an existing layer.
Since this blog was published, there are a few new resources. Graham Pople’s presentation from Couchbase Connect 2020 gets into a bit more detail and my presentation demonstrates how it all comes together.
I hope that answers your question and thanks for the interest!