Both NoSQL databases and modern Blockchain ledgers benefit from a set of common principles. When both are implemented for an application, the platforms can complement each other and a lot can be accomplished.
What is Blockchain?
Blockchain is a peer-to-peer technology for developing what are known as distributed ledgers in which data can be stored in thousands of different servers and databases around the world, visible to all participants, and retrieved almost instantly. Blockchain is the foundational technology for cryptocurrencies such as Bitcoin.
In this article, we review two synergistic overlaps of Blockchain and NoSQL that look at how Couchbase’s NoSQL platform could support your next enterprise distributed ledger application, e.g. based on Hyperledger. This topic is very deep, but I only touch on two superficial ideas to help map out the commonalities and opportunities: distributed computing, and world state.
For more information on Blockchain in general, I recommend the academy pages at ledger.com.
Modern enterprise architectures are built on distributed computing at their core, be they parallel processing CPU/GPU environments, multi-node database clusters, or global datacenters with synchronized clusters in different locales.
By leveraging the distributed processing speed, recoverability, and scalability of these architectures (a topic in its own right!), application developers are able to focus on building the desired user experience and let backend data systems do the heavy lifting.
What does it mean to be distributed? In its most basic form, it means having more than one server that is managed as part of a cluster of nodes. Ideally, there is no single point of failure or centralized control in distributed systems.
It also assumes that the workload is broken into pieces small enough for the individual processors to handle, for example by distributing the work across multiple nodes.
Other names for these kinds of systems include peer-to-peer networks, clustered computing, parallel processing, etc. Wikipedia has a great list of the kinds of systems leading up to today.
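To make the idea of breaking work into pieces concrete, here is a minimal sketch of one common approach: hashing each work item's key to pick the node that handles it. The node names and the `assign_node` and `partition` helpers are hypothetical, invented for illustration; this is not how any particular product places data.

```python
import zlib
from collections import defaultdict

# Hypothetical node names, for illustration only.
NODES = ["node-a", "node-b", "node-c"]


def assign_node(key: str, nodes: list[str]) -> str:
    """Pick a node for a key with a stable hash, so every caller agrees."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


def partition(keys, nodes):
    """Group work items by the node responsible for them."""
    plan = defaultdict(list)
    for key in keys:
        plan[assign_node(key, nodes)].append(key)
    return dict(plan)


work_items = [f"order-{i}" for i in range(12)]
plan = partition(work_items, NODES)
for node in NODES:
    print(node, plan.get(node, []))
```

Because the hash is deterministic, any node or client can work out where an item belongs without asking a central coordinator. Simple modulo placement like this moves many keys whenever the node list changes, so production systems generally use more elaborate schemes.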
Blockchain is the Epitome of Distributed
Despite the benefits, distributed computing is not pervasive; even within modern enterprises, centralization of many systems is still quite common. This includes industries that you would expect to be designed with more resiliency in mind, like global financial systems or supply chain management, which have tended to be more centralized around mainframe computing.
By the way, you can always tell when there is a centralized system because when it fails, it fails absolutely! When all data or services are running on a single machine it is quite easy to know when it goes down because everything completely stops.
The outage may drag on because it takes time to start up a replacement machine, because it takes time to notice a failure before re-routing users, or for a myriad of other engineering reasons. A centralized system is the opposite of the peer-to-peer networks we aspire to.
However, with the introduction of platforms like Bitcoin, the next generation of digital currency and “ledgers” are slowly being proven out. Now there are thousands of different cryptocurrencies and dozens of Blockchain backends that are taking advantage of decentralized technology.
As an aside, note that “distributed ledger” does not equate to the proof-of-work scenarios that many cryptocurrencies use. Instead, think of a ledger as already trusting the application that is making updates, so that no particular task has to be completed before adding to the chain. Proof-of-work systems are also less focused on being highly performant, which is what we ultimately need for bringing blockchain applications into the mainstream.
Likewise, enterprises are looking for ways to leverage more distributed approaches for their internal systems to reduce downtime. If it’s to be a ledger-based system, there are several approaches now available. If it’s to be a general-purpose database, there are also options, especially for data management.
In enterprises where there are some distributed systems in play, it is likely that most of the technology will be found in databases, particularly NoSQL platforms. One of the pillars of Couchbase has been this distributed nature from day one, filling a critical gap that legacy databases were not filling effectively.
Introducing yet another JSON document store would not have been special if it only ran on a single node. Similarly, yet another Blockchain technology would be nothing of note if it were not also distributed across a cluster of machines.
If it were centralized, it would be a single point of failure and control, undermining trust in the overall system. Thankfully, Couchbase, Blockchain, and related technologies are all bringing forward the need for and value of distributed systems.
In Hyperledger Fabric speak (a specific distributed ledger implementation), there are two types of data handling components in play in their system.
Operational transactions are the core of any ledger: this component verifies, creates, and logs all transactions in the ledger. Hyperledger Fabric handles all the built-in permissions for acknowledging who can initiate transactions and also stores them in a variety of backend technologies.
World state is the other primary component, another data view that maintains the current account values, not all the individual operational transactions. When a transaction from one entity to another is performed, the world state also gets an update so the new values are kept current. Previous transactions are not stored in the world state.
Each account in the world state system will have a single value, but the overall history is stored in precise detail by the transaction system.
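The relationship between the two components can be sketched in a few lines of Python. This is a toy model under simplifying assumptions, not Hyperledger Fabric's implementation: the `Transaction` and `Ledger` classes, the account names, and the balances are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transaction:
    sender: str
    receiver: str
    amount: int


@dataclass
class Ledger:
    # Full history: every transaction, in order, append-only.
    log: list = field(default_factory=list)
    # World state: one current value per account, no history.
    world_state: dict = field(default_factory=dict)

    def apply(self, tx: Transaction) -> None:
        """Validate a transaction, record it, then update the world state."""
        if tx.amount <= 0:
            raise ValueError("amount must be positive")
        if self.world_state.get(tx.sender, 0) < tx.amount:
            raise ValueError("insufficient funds")
        self.log.append(tx)
        self.world_state[tx.sender] -= tx.amount
        self.world_state[tx.receiver] = (
            self.world_state.get(tx.receiver, 0) + tx.amount
        )


# Hypothetical starting balances.
ledger = Ledger(world_state={"alice": 100, "bob": 20})
ledger.apply(Transaction("alice", "bob", 30))
ledger.apply(Transaction("bob", "carol", 5))

print(ledger.world_state)  # current value per account
print(len(ledger.log))     # number of transactions kept in the history
```

Reading a balance is a single lookup in `world_state`, while answering "how did we get here?" means walking the `log`. That split is why the two stores can have very different access patterns and backends.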
NoSQL Source of Truth
Both of the above components could be implemented with a Couchbase NoSQL database as the backend. Couchbase handles high-throughput operational transactions across many different use cases: finance, fraud detection, IoT, etc. Distributed ACID transactions are also possible, which is a topic unto itself (more in a future post).
If Couchbase sits as the primary backend database for a Blockchain system, it can facilitate both the operational and the world state data storage/retrieval.
Couchbase is often used to store an aggregate of data from multiple different databases, providing what is known as a source of truth (as opposed to a system of record). This can be analogous to the world state, storing the materialized picture of the current data of interest.
This makes Couchbase a good fit for application developers who need to store user profiles for an application. For example, other backend systems may keep the individual pieces of data up to date, but when the user logs in, the user profile is immediately available in a single JSON document.
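As a rough illustration of that aggregation, the sketch below merges records from three imagined backend systems into one profile document. Every field name, the `build_profile` helper, and the `user_profile::` key prefix are assumptions made up for this example rather than a Couchbase convention.

```python
import json

# Hypothetical records, each owned by a different backend system.
identity = {"user_id": "u123", "name": "Ada Example", "email": "ada@example.com"}
preferences = {"user_id": "u123", "language": "en", "newsletter": True}
recent_orders = [
    {"order_id": "o-1", "total": 42.50},
    {"order_id": "o-2", "total": 17.00},
]


def build_profile(identity, preferences, orders):
    """Merge the separate records into one self-contained document."""
    return {
        "type": "user_profile",
        "user_id": identity["user_id"],
        "name": identity["name"],
        "email": identity["email"],
        "preferences": {
            "language": preferences["language"],
            "newsletter": preferences["newsletter"],
        },
        "recent_orders": list(orders),
    }


profile = build_profile(identity, preferences, recent_orders)
document_key = f"user_profile::{profile['user_id']}"

print(document_key)
print(json.dumps(profile, indent=2))
```

At login the application fetches one document by key instead of querying each backend in turn; the other systems remain responsible for keeping their own pieces correct.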
The ultimate benefit of using Couchbase in this context is having all the built-in advantages for developers. Once data is dropped into the database, you have frictionless access to powerful SQL-based query tools, full-text search using natural language, massive big data analytics for huge datasets, and more. This allows developers to focus on the product and user while letting the backend systems keep things managed and in sync.
For example, with full-text queries, users can perform ad hoc full-text searching of keys or other text data attributes in the database. Full-text indexes take search conditions and look for matches in the inverted indexes for the given documents. Instead of standard tabular SQL query results, relevant results from search requests provide pointers to the source documents themselves, and may also point to the field and values in the document where the text occurs.
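The inverted-index lookup described above can be sketched as follows. This is a deliberately small model with hypothetical documents and helper names; it does no ranking or scoring and is not Couchbase's search implementation.

```python
import re
from collections import defaultdict

# Hypothetical documents keyed by document ID.
documents = {
    "tx::1001": {"memo": "Invoice payment for cloud hosting", "payee": "Acme Hosting"},
    "tx::1002": {"memo": "Refund for duplicate payment", "payee": "Acme Hosting"},
    "tx::1003": {"memo": "Office supplies", "payee": "Paper Co"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the (document ID, field) pairs where it occurs."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field_name, value in doc.items():
            for term in tokenize(str(value)):
                index[term].add((doc_id, field_name))
    return index


def search(index, query):
    """Return {doc_id: [fields]} for documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return {}
    per_term_docs = [{doc_id for doc_id, _ in index.get(t, set())} for t in terms]
    matching = set.intersection(*per_term_docs)
    hits = {}
    for doc_id in sorted(matching):
        hits[doc_id] = sorted(
            {f for t in terms for d, f in index.get(t, set()) if d == doc_id}
        )
    return hits


index = build_index(documents)
print(search(index, "payment"))
print(search(index, "acme payment"))
```

Note that the result is a set of pointers (document IDs plus the fields that matched) rather than rows of values, which mirrors the difference from tabular SQL results described above.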
Blockchain systems themselves may offer real-time event stream querying, but full-text searching is more valuable for historical searching of source of truth datasets.
Why NoSQL + Blockchain?
Rather than just compare these two technologies, I also want to encourage developers and architects to look at how they can both be used together. Here is one way.
Couchbase can serve as the application developer layer on top of any distributed ledger or Blockchain technology, either as the operational database component or as the world state. The world state database is a great first use case to investigate if you are building an enterprise solution and need to surface account details quickly and easily to end-users.
For example, when Blockchain transactions occur and the world state is updated, the same update could be sent to Couchbase and made available to users. Because Couchbase has mobile SDKs, comprehensive analytic SQL support, and much more, it provides a more robust data interface than what comes with Blockchain systems out of the box.
This is particularly important when you want users to have rapid access to the most current information.
While Blockchain systems take time to propagate information, Couchbase uses advanced protocols to do it much more quickly, so views of data can be built as changes occur. And because Couchbase runs in a multi-cluster environment, the stability and resiliency of the platform can keep up with the similar demands of the Blockchain system.
While there is no off-the-shelf Couchbase integration with Hyperledger, the Couchbase SDK supports all the main programming languages. Anyone building a Blockchain-based ledger can start sending current world state updates over to the NoSQL database via JSON, using both the Blockchain API and Couchbase API.
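One possible shape for such a bridge is sketched below. It is an assumption-laden illustration, not a tested integration: the update format, the `world_state::` key prefix, and every function and class name here are hypothetical. The sketch writes through any object that offers an `upsert(key, value)` method. To my understanding, a collection object in the Couchbase Python SDK provides a method of that shape, but an in-memory stand-in is used here so the example runs without a cluster.

```python
import json
from typing import Any, Protocol


class KeyValueSink(Protocol):
    """Anything that can store a JSON-like document under a key."""

    def upsert(self, key: str, value: Any) -> Any: ...


class InMemorySink:
    """Stand-in for a real database collection, used only for this demo."""

    def __init__(self) -> None:
        self.docs: dict = {}

    def upsert(self, key: str, value: Any) -> Any:
        self.docs[key] = value
        return value


def world_state_document(account_id: str, balance: int, block_number: int) -> dict:
    """Shape one world state entry as a JSON-serializable document."""
    return {
        "type": "world_state",
        "account_id": account_id,
        "balance": balance,
        "block_number": block_number,
    }


def publish_world_state(sink: KeyValueSink, updates: list) -> list:
    """Write each update to the sink and return the keys that were written."""
    keys = []
    for update in updates:
        doc = world_state_document(**update)
        json.dumps(doc)  # fail early if the document is not valid JSON
        key = f"world_state::{doc['account_id']}"
        sink.upsert(key, doc)
        keys.append(key)
    return keys


# Hypothetical updates, as they might arrive after a block is committed.
updates = [
    {"account_id": "alice", "balance": 70, "block_number": 12},
    {"account_id": "bob", "balance": 45, "block_number": 12},
]
sink = InMemorySink()
print(publish_world_state(sink, updates))
print(sink.docs["world_state::alice"])
```

Using an upsert keeps the write idempotent: replaying the same update after a retry overwrites the document with identical content rather than creating a duplicate.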
If you are interested in building this kind of integration, check out the current backend providers in Hyperledger Fabric and adapt one of those for Couchbase Server.
Also, it may be possible to implement this functionality directly in the chaincode sent during smart contract application updates in a ledger. I’m only just starting to understand this side of the system, but in another post, we could compare/contrast database user-defined functions (UDFs) and Blockchain chaincode to give you more reference.
There is a lot more to Blockchain that we can dive into, but I hope this glimpse into similarities and overlaps helps to get your cognitive juices flowing.
- What Is Blockchain (ledger.com)
- Couchbase – Why NoSQL?
- Couchbase – Cloud to Edge Data Sync
- Couchbase – ACID Transactions
- Hyperledger Fabric — a specific distributed ledger technology
- Couchbase Developer Portal
- Developing Back-end Application with Hyperledger Fabric through SDK (IBM.com)
- Distributed Computing (Wikipedia.org)