Dynamo accelerated the NoSQL revolution that’s driving the database industry.

Recently, Amazon announced PartiQL – A SQL-Compatible Query Language for their flagship NoSQL database Amazon DynamoDB. This has brought the NoSQL “re:evolution” full circle. It’s wonderful to see the collaborative research from UCSD and Couchbase enabling the industry to move forward.

NoSQL had a good run. MapReduce triggered it in 2004. Dynamo and BigTable accelerated it in 2007. NoSQL meant no support for SQL, no support for multi-document transactions, no schema management, no procedural language, and many more No’s. However, NoSQL wasn’t a rebel without a cause. It promised reliability, scalability, and flexibility. Someone quipped: “If you are not ready to scale, you’re not ready to succeed!” To the amusement of SQL folks, MongoDB claimed it was webscale. Despite the reservations and opposition from the database community, the NoSQL movement marched on. Many wondered: it’s useful for Amazon, Google, and Facebook, but who else would find it useful?

Here we are. With more than 200 NoSQL databases.

NoSQL systems haven’t remained simple key-value stores with get/set and map/reduce operations. MongoDB added an aggregation framework, a modern storage engine, and dropped ACID. DynamoDB added transactions, indexes, and now a SQL-like language. Cassandra started with CQL, a SQL-like language, made significant changes with 3.0, and has been adding new datatypes, DDLs and DMLs. Couchbase started with simple APIs, then added views. In 2015, Couchbase introduced N1QL (SQL for JSON) and global secondary indexes, and later added an analytics service with N1QL support and distributed transactions.

If you’ve read The Innovator’s Dilemma, it should remind you of the Toyota and mini mill stories. Incumbents left the low end of the market to the Toyota Corona and the mini mills, which freed them to focus on higher-margin markets. In the database industry, the incumbents regarded NoSQL with the same derision or claimed they had SQL-less databases a long time ago! Still, there have been some half-hearted attempts by the incumbents to get into the market. Oracle even invested in NoSQL by buying Sleepycat Software, maker of the BerkeleyDB NoSQL database. IBM added JSON and MongoDB support to Informix and Db2. This is the equivalent of GM releasing the Geo Metro: let’s release a basic model at the lower end of the market and be done with it. Microsoft probably made the best attempt by building CosmosDB (formerly DocumentDB) from the ground up.

NoSQL isn’t simply the absence of SQL. Just like Toyota and the mini mills, NoSQL has its core strengths. Toyota’s strength was the single-mold frame, which enabled them to produce the car in Japan, ship it 6,000 miles, and still keep it affordable for low-end car buyers. The single-mold frame also gave them a base to iterate and innovate to improve the car. Mini mills continued to move up the stack to produce better-quality steel, and the integrated steel mills usually responded by fleeing the market and avoiding competition. One fine day, GM filed for bankruptcy and the integrated steel mills went out of business.

So, what is the core differentiator of NoSQL? It is a distributed database from the core. Traditional RDBMS systems were built to run on a single machine, then extended for hot-standby situations, and then stretched for scale. NoSQL systems are designed to run on a cluster of machines from the get-go. Hardware and software failures are expected and handled at a systemic level. In fact, COUCH in COUCHBASE stands for Cluster Of Unreliable Commodity Hardware. The core of NoSQL starts with a distributed database providing reliability and scale-out. Sure, there have been distributed database systems for 40 years, but they focused on warehousing, not OLTP.
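To make "failures are expected and handled at a systemic level" concrete, here is a minimal Python sketch. It is not how Couchbase or any other product is implemented: the `Node` and `Cluster` classes, the node names, and the simple hash placement are all assumptions made up for illustration. Each key is hashed to a primary node and copied to the next node, so a read can still be served after the primary fails.

```python
import hashlib


class Node:
    """One machine in the cluster: an in-memory dict that can fail."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class Cluster:
    """Hypothetical cluster that hashes each key to a primary plus replicas."""

    def __init__(self, node_names, replicas=2):
        self.nodes = [Node(n) for n in node_names]
        self.replicas = replicas

    def _owners(self, key):
        # Hash the key to pick a primary; the following nodes hold the copies.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def set(self, key, value):
        # Write to every live owner; fail only if no owner is reachable.
        live = [n for n in self._owners(key) if n.alive]
        if not live:
            raise RuntimeError("no live replica for key " + key)
        for node in live:
            node.data[key] = value

    def get(self, key):
        # Read from the first live owner that holds the key.
        for node in self._owners(key):
            if node.alive and key in node.data:
                return node.data[key]
        raise KeyError(key)


cluster = Cluster(["node-a", "node-b", "node-c"], replicas=2)
cluster.set("cart:42", {"items": ["book", "pen"]})

# Simulate a hardware failure on the primary owner of the key.
primary = cluster._owners("cart:42")[0]
primary.alive = False

# The read is still served by the surviving replica.
print(cluster.get("cart:42"))
```

Real systems add much more (rebalancing, failure detection, consistency guarantees), but the point of the sketch is that replication and failover live in the core data path instead of being bolted on afterward.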

NoSQL started with the measly get()/set() APIs: just enough to cover the basic use cases with shopping carts, session stores, and profile managers. There was no way that their functionality could match the sophistication of SQL. The mockery of NoSQL from the traditional database vendors is reminiscent of the response from GM and integrated steel mills. The “high priests” of Oracle and IBM were happy to cede this low-end market to a bunch of “know nothings.”
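A small Python sketch shows both why get()/set() was enough for a shopping cart and where it falls short. The `store` dict, the helper names, and the sample documents are hypothetical stand-ins and not the API of any real database.

```python
store = {}  # stands in for a key-value database


def set_doc(key, doc):
    """Store a document under its key."""
    store[key] = doc


def get_doc(key):
    """Fetch a document by key, or None if the key is absent."""
    return store.get(key)


set_doc("cart:1", {"user": "ann", "items": ["book", "pen"]})
set_doc("cart:2", {"user": "bob", "items": ["lamp"]})
set_doc("session:9", {"user": "ann", "ttl": 3600})

# Key lookup: the one question a get/set API answers directly.
ann_cart = get_doc("cart:1")


def carts_containing(item):
    # "Which carts contain this item?" has no key to look up, so the
    # application scans every document and filters by hand. This is the
    # work that a query language and secondary indexes take over.
    return sorted(
        key for key, doc in store.items()
        if key.startswith("cart:") and item in doc.get("items", [])
    )


print(ann_cart["items"])
print(carts_containing("pen"))
```

As long as the application always knows the key, the first call is all it needs. The second function is the kind of code every application had to write for itself before these systems added indexes and SQL-style queries.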

NoSQL databases have since added indexes, SQL with joins, aggregates, window functions, search, and transactions. OK, that full list is just Couchbase, but others have some of these. At Couchbase, we work with customers migrating off Oracle, Db2, Informix, SQL Server, and Sybase. These are great databases, but their core is different: built for one, extended for a few. NoSQL systems are built for massive scale and reliability. Even when the higher-level index, query, and search capabilities are added, this core still has to operate at scale and be reliable. With all these capabilities, NoSQL has become a mainstay in enterprises. Mission-critical workloads and thousands of clusters routinely run Couchbase, MongoDB, Elastic, and other NoSQL systems. NoSQL systems have made the modern database ready and affordable for modern enterprises.

SQL itself has been unreasonably effective. Many NoSQL systems have been extended to support many aspects of SQL. The competition between the NoSQL systems to provide better SQL — which means better syntax, functionality, performance, optimizer, and scale — is fierce. As these systems grow to create an even better database, let’s welcome Amazon DynamoDB to the fold and say: NoSQL is dead, long live NoSQL.

References

  1. https://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
  2. https://dzone.com/articles/keep-calm-and-query-json
  3. https://dzone.com/articles/keep-calm-and-json
  4. https://cassandra.apache.org/doc/latest/architecture/overview.html
  5. https://hostingdata.co.uk/nosql-database/
  6. http://www.vldb.org/pvldb/vol2/vldb09-938.pdf
  7. https://www.singlestore.com/blog/why-nosql-databases-wrong-tool-for-modern-application/
  8. https://www.couchbase.com/blog/unreasonable-effectiveness-of-sql/
  9. https://www.couchbase.com/blog/the-unreasonable-effectiveness-of-sql-in-nosql-databases/

Author

Posted by Keshav Murthy

Keshav Murthy is a Vice President at Couchbase R&D. Previously, he was at MapR, IBM, Informix, and Sybase, with more than 20 years of experience in database design & development. He led the SQL and NoSQL R&D team at IBM Informix. He has received two President's Club awards at Couchbase and two Outstanding Technical Achievement Awards at IBM. Keshav has a bachelor's degree in Computer Science and Engineering from the University of Mysore, India, holds ten US patents, and has three US patents pending.
