Press Release

UC San Diego, Couchbase Collaborate on Next-Generation Query Language for Big Data

June 03, 2015
Combines flexibility of JSON with power of SQL

In a major step toward broader adoption of document-oriented data and the JavaScript Object Notation (JSON) data format, University of California, San Diego computer science and engineering professor Yannis Papakonstantinou and Couchbase Inc. today announced their collaboration on a next-generation query language for big data. Their work brings together the full power of SQL with the flexibility of JSON.

Common Vision: SQL + JSON

Prior to their collaboration, both Couchbase and Prof. Papakonstantinou independently concluded that existing approaches did not provide a complete and efficient solution for querying semi-structured data. Both shared a common vision of combining SQL, the leading database query language, with JSON, the leading format for modeling semi-structured data in modern applications. Both had launched work in that direction, and their decision to collaborate is based on this common vision.

Couchbase will fund continued research at UC San Diego to further the development of SQL++, a formally defined, SQL-backwards-compatible declarative language for semi-structured data developed by Papakonstantinou’s team at UC San Diego’s Database Group. Couchbase will also continue to enhance N1QL, the company’s query language that extends SQL for JSON and is consistent with the specifications defined by SQL++.

SQL++ is easy to learn, especially for developers who are familiar with the syntax of SQL. But unlike a relational query language, which assumes that all data fits neatly into tables, SQL++ works on JSON, a lightweight data-interchange format that is easy for humans to read and write, and for machines to generate and parse.

In a recent technical report* from the UC San Diego Database Group, SQL++ co-creators Papakonstantinou and Kian Win Ong (PhD ’12), a researcher and CSE alumnus, specify the syntax and semantics of SQL++, a clean design that introduces only a small number of query language extensions to SQL. “SQL capabilities are most often extended by removing semantic restrictions of SQL, rather than inventing new features,” said Papakonstantinou. “This allows SQL++ to avoid unnecessary extensions over SQL.” Ease of use is further enhanced because SQL++ semantics tend to be significantly shorter than those of prior query languages.

SQL++ and N1QL

After looking at 11 query languages, Papakonstantinou concluded that none provided full-fledged querying of semi-structured data. With funding from the National Science Foundation (NSF) and Informatica through UCSD’s FORWARD project, he and his team developed and launched the SQL++ specification. Concurrently, Couchbase independently developed N1QL to provide a comprehensive query language that combines the query power of SQL with the flexibility of JSON data.

“Enterprises began to ask for declarative queries on semi-structured databases. With SQL++ you have a declarative query language that queries JSON and is backwards compatible with SQL,” said Papakonstantinou. “This is a query language for the new era of big data, because it operates on semi-structured data but is fully declarative and SQL compatible. It gives you the best of both worlds. Couchbase N1QL aligns with the SQL++ specifications and the requirements of querying semi-structured data.”

“We are delighted to work with professor Papakonstantinou and his research team because they share our vision that a declarative query language for JSON should be based on SQL,” said Gerald Sangudi, Chief Architect for query engineering at Couchbase. “SQL++ also brings rigor and completeness that are beneficial to our users.”

In fact, Couchbase and UCSD have formally established that N1QL is a dialect of SQL++. The formal mapping of N1QL to SQL++ is being published separately.

Others to Join Collaboration

In addition to Couchbase, UCSD will invite other academic and industry partners to join a query language collaboration, to benefit users and ease the adoption of semi-structured and NoSQL databases. Already, UC Irvine’s AsterixDB*, led by professor Mike Carey, supports most of SQL++ and is on the path to supporting the full language. The collaboration has already provided important language design feedback.

*Kian Win Ong, Yannis Papakonstantinou, Romain Vernoux, The SQL++ Query Language: Configurable, Unifying and Semi-structured, Technical Report 2015, Department of Computer Science and Engineering, University of California, San Diego, 29 April 2015. http://arxiv.org/pdf/1405.3631v7.pdf

About UC San Diego Database Group

The Database Group is located in UC San Diego’s Computer Science and Engineering department, and is led by CSE professor Yannis Papakonstantinou, a leading expert on databases and data management technologies. He is also a co-director and faculty member of the university’s new professional Master of Advanced Studies in Data Science and Engineering, launched in Fall 2014. Papakonstantinou is also an entrepreneur: in 2000 he founded Enosys Software, which was acquired by BEA Systems in 2003. Enosys was one of the first companies to feature a semi-structured data query processor, using XML, which is currently being rapidly replaced by JSON. More recently, Papakonstantinou, researchers Kian Win Ong and Yannis Katsis, and their team of PhD and MS graduate students worked on the FORWARD project, a rapid development platform for analytics applications that uses SQL++ to create and incrementally update integrated views of data across multiple databases (SQL, NoSQL, or both). FORWARD includes a middleware query processor that uses SQL++ to issue distributed queries over a variety of data sources, including SQL, NoSQL, NewSQL and SQL-on-Hadoop. The FORWARD project's SQL++-based visualization and app development platform has been commercially deployed. More information about the FORWARD project is available at http://forward.ucsd.edu/.

Contact

Doug Ramsey

858-822-5825

dramsey@ucsd.edu

About Couchbase

Couchbase's mission is to be the data platform that revolutionizes digital innovation. To make this possible, Couchbase created the world's first Engagement Database to help deliver ever-richer and ever more personalized customer and employee experiences. Built with the most powerful NoSQL technology, the Couchbase Data Platform was architected on top of an open source foundation for the massively interactive enterprise. Our geo-distributed Engagement Database provides unmatched developer agility and manageability, as well as unparalleled performance at any scale, from any cloud to the edge.

Couchbase has become pervasive in our everyday lives; our customers include industry leaders Amadeus, AT&T, BD (Becton, Dickinson and Company), Carrefour, Cisco, Comcast, Disney, DreamWorks Animation, eBay, Marriott, Neiman Marcus, Tesco, Tommy Hilfiger, United, Verizon, Wells Fargo, as well as hundreds of other household names. For more information, visit www.couchbase.com.

Press Contact

Christina Knittel

christina.knittel@couchbase.com
(775) 209-2461
