A SQL++ Implementation

Summary

With the proliferation of JSON databases in recent years, a new query language SQL++ has begun to emerge that could standardize on how to access these data sources. In fact the language has been around for a number of years now, but has existed mostly in the academic realm. Couchbase N1QL, a SQL language for JSON data, was released in 2015. The company has continued to develop its N1QL/SQL++ language and in October 2018, it announced the release of Couchbase Analytics. Don Chamberlin, co-inventor of the original SQL, has also authored a SQL++ tutorial book with practical examples that work with Couchbase Analytics.

Overview

SQL++ is a database query language that is designed to work with both structured and semi-structured data. The language is based on the original SQL with extensions for it to work with JSON document database. In relational databases, data is represented in a tabular fashion. The rows in a table each have the same flat record structure, with identical field names and field types (according to the table’s schema). Semistructured databases relax these constraints, allowing records to be nested, to have different field names and types, and do not require a schema. Accordingly, SQL++ “extends” SQL, the query language standard used in the relational world, by relaxing its restrictions on the data model. By doing so, SQL++ retains the benefits of SQL, including its high-level declarative nature, while allowing it to handle the more flexible structures commonly found in the semi-structured world. Relational database vendors like IBM , Microsoft, and Oracle, as well as open-source systems like PostgreSQL and MySQL, have extended their own versions of SQL to work with JSON data. They add (often system-specific) extensions for JSON as a column type as well as new functions, and in some cases new syntax, to enable the manipulation of JSON documents[1][2][3][4][5] The ANSI/ISO SQL standard itself was extended in a similar fashion in 2016[6].

In contrast, as explained above, SQL++ was developed by relaxing SQL’s target data model in order to arrive at a query language where JSON data is treated as “first class” data and where relational data is a special case whose records are regular, identically typed, and not nested.[7]

History

SQL++ was originally developed by Yannis Papakonstantinou and others at the University of California, San Diego^[8]. SQL++ was also used in the NSF funded FORWARD project^[9] at the UCSD.

Applications

One of the early adopters of SQL++ was Apache AsterixDB, an open source Big Data Management System, originally co-developed by a team of faculty, staff, and students at UC Irvine and UC Riverside in 2009. Another early SQL++ adopter is Couchbase, Inc., a scalable JSON database vendor whose 6.0 release has adopted SQL++ for its Couchbase Analytics [8] offering.

Books

Donald D Chamberlin, one of the principal designers of the original SQL language specification, authored a tutorial for the SQL++ language[11] that includes practical examples to show how the language works with Couchbase Analytics.

Examples

SELECT & SELECT VALUE

One of the key differences between SQL and SQL++ is in the format of the result. Standard SQL, designed for row and table, returns the result set in a table format. SQL++, on the other hand, returns the result set in JSON format.

(Q1) List the customer id, name, zipcode, and credit rating of all customers, in order by customer id.

SELECT custid, name, address.zipcode, rating 
FROM customers ORDER BY custid LIMIT 2; 

Result: 
[ 
  { 	“custid”	: “C13”, 
	“Name”		: “T. Cruise”, 
	“Zipcode”	: “63101”,
	“Rating”	: 750 
  }, 
  {     “custid”	: “C25”, 
	“Name”		: “M. Streep”, 
	“Zipcode”	: “02340”,
	“Rating”	: 690 
  }
]

SELECT custid, name, address.zipcode, rating

FROM customers ORDER BY custid LIMIT 2;

Result:

[

{ “custid” : “C13”,

“Name” : “T. Cruise”,

“Zipcode” : “63101”,

“Rating” : 750

{ “custid” : “C25”,

“Name” : “M. Streep”,

“Zipcode” : “02340”,

“Rating” : 690

}

]

(Q2) Find the names of customers with a rating greater than 650.

SELECT name FROM customers WHERE rating > 650;

Result:
[   
  { “name”: “T. Cruise” },
  { “name”: “M. Streep” },
  { “name”: “T. Hanks”  } 
]

SELECT name FROM customers WHERE rating > 650;

Result:

[

{ “name”: “T. Cruise” },

{ “name”: “M. Streep” },

{ “name”: “T. Hanks” }

]

(Q3) Shows the effects of SELECT VALUE (compare to Q2).

SELECT VALUE name FROM customers Queries WHERE rating > 650;

Result:
[ “T. Cruise”, “M. Streep”, “T. Hanks” ]

SELECT VALUE name FROM customers Queries WHERE rating > 650;

Result:

[ “T. Cruise”, “M. Streep”, “T. Hanks” ]

A SELECT VALUE query can be used with an object constructor to create labels or to give some structure to a query result, as in the following example.

(Q4) List customers with credit rating greater than 650, in order by descending credit rating, and again in ascending order by zipcode.

SELECT VALUE 
{ “high-rated customers, ordered by rating”: 
  (SELECT c.rating, c.custid, c.name 
      FROM customers AS c WHERE c.rating > 650 
      ORDER BY c.rating DESC), 
  “high-rated customers, ordered by zipcode”: 
  (SELECT c.address.zipcode, c.custid, c.name 
      FROM customers AS c WHERE c.rating > 650 
      ORDER BY c.address.zipcode) 
};
Result:
[ 
   { “high-rated customers, ordered by rating”: 
    [ { “rating”: 750, “custid”: “C13”, “name”: “T. Cruise” }, 
      { “rating”: 750, “custid”: “C37”, “name”: “T. Hanks” }, 
      { “rating”: 690, “custid”: “C25”, “name”: “M. Streep” } 
    ], 
    “high-rated customers, ordered by zipcode”: 
    [ { “zipcode”: “02115”, “custid”: “C37”, “name”: “T. Hanks” }, 
      { “zipcode”: “02340”, “custid”: “C25”, “name”: “M. Streep” }, 
      { “zipcode”: “63101”, “custid”: “C13”, “name”: “T. Cruise” } 
    ] 
   } 
]

SELECT VALUE

{ “high-rated customers, ordered by rating”:

(SELECT c.rating, c.custid, c.name

FROM customers AS c WHERE c.rating > 650

ORDER BY c.rating DESC),

“high-rated customers, ordered by zipcode”:

(SELECT c.address.zipcode, c.custid, c.name

FROM customers AS c WHERE c.rating > 650

ORDER BY c.address.zipcode)

};

Result:

[

{ “high-rated customers, ordered by rating”:

[ { “rating”: 750, “custid”: “C13”, “name”: “T. Cruise” },

{ “rating”: 750, “custid”: “C37”, “name”: “T. Hanks” },

{ “rating”: 690, “custid”: “C25”, “name”: “M. Streep” }

“high-rated customers, ordered by zipcode”:

[ { “zipcode”: “02115”, “custid”: “C37”, “name”: “T. Hanks” },

{ “zipcode”: “02340”, “custid”: “C25”, “name”: “M. Streep” },

{ “zipcode”: “63101”, “custid”: “C13”, “name”: “T. Cruise” }

]

}

]

GROUPING, AGGREGATION and UNNEST

SQL++ supports the same SQL concept of grouping and aggregation. The UNNEST takes the contents of nested arrays, i.e. orders and join them with their parent object, i.e. customers.

(Q6) List the first order by order number and item number, together with total quantity for all the orders made on 2017-05-01.

SELECT   o.orderno, 
         i.itemno AS item_number,
         sum(i.qty) AS quantity
FROM orders AS o 
  UNNEST o.items AS i
WHERE o.order_date = “2017-05-01”
  GROUP BY o.orderno, i.itemno 
  ORDER BY o.orderno, item_number
LIMIT 1;
Result:
[
  {
    “orderno”: 1002,
    “item_number”: 460,
    “quantity”: 95
  }
]

SELECT o.orderno,

i.itemno AS item_number,

sum(i.qty) AS quantity

FROM orders AS o

UNNEST o.items AS i

WHERE o.order_date = “2017-05-01”

GROUP BY o.orderno, i.itemno

ORDER BY o.orderno, item_number

LIMIT 1;

Result:

[

{

“orderno”: 1002,

“item_number”: 460,

“quantity”: 95

}

]

GROUP AS

A query can generate output data at summary level. The level definition is provided in the GROUP BY clause. The Q6 query generates a summary of orders at the order number and order item number level. Often you will want to generate output that includes both summary data and line items within the summaries. For this purpose, SQL++ supports several important extensions to the traditional grouping features of SQL. The familiar GROUP BY and HAVING clauses are still there, and they are joined by a new clause called GROUP AS.

(Q7) List all orders by order number and item number, together with total quantity for all the orders made on 2017-05-01, also include all the orders and order items that made for each summary line. (Compare to Q6)

SELECT   o.orderno, 
         i.itemno AS item_number,
         sum(i.qty) AS quantity,
         od
FROM orders AS o 
  UNNEST o.items AS i
WHERE o.order_date = “2017-05-01”
  GROUP BY o.orderno, i.itemno 
  GROUP AS od
LIMIT 1;

Result:
[
  {
    “od”: [
      {
        “o”: {
          “custid”: “C13”,
          “items”: [
            {
              “itemno”: 460,
              “price”: 100.99,
              “qty”: 95
            },
            {
              “itemno”: 680,
              “price”: 8.75,
              “qty”: 150
            }
          ],
          “order_date”: “2017-05-01”,
          “orderno”: 1002,
          “ship_date”: “2017-05-03”
        },
        “i”: {
          “itemno”: 460,
          “price”: 100.99,
          “qty”: 95
        }
      }
    ],
    “orderno”: 1002,
    “item_number”: 460,
    “quantity”: 95
  }
]

SELECT o.orderno,

i.itemno AS item_number,

sum(i.qty) AS quantity,

FROM orders AS o

UNNEST o.items AS i

WHERE o.order_date = “2017-05-01”

GROUP BY o.orderno, i.itemno

GROUP AS od

LIMIT 1;

Result:

[

{

“od”: [

{

“o”: {

“custid”: “C13”,

“items”: [

{

“itemno”: 460,

“price”: 100.99,

“qty”: 95

{

“itemno”: 680,

“price”: 8.75,

“qty”: 150

}

“order_date”: “2017-05-01”,

“orderno”: 1002,

“ship_date”: “2017-05-03”

“i”: {

“itemno”: 460,

“price”: 100.99,

“qty”: 95

}

“orderno”: 1002,

“item_number”: 460,

“quantity”: 95

}

]

Platform

Services

Self-Managed

Capabilities

By Use Case

By Industry

Popular Docs

Quickstart

Resource Center

About

Partnerships

A SQL++ Implementation

Azure Key Vault for Credentials

Your AI Agents Are Stuck in Pilot. It’s a Data Problem, Not a Model Problem.

When the Internet Goes Down, Your Business Shouldn’t

Distributed Databases: An Overview

On-Device AI: Benefits, Use Cases, and Challenges

Ready to get Started with Couchbase Capella?

Start building

Use Capella free

Get in touch

Platform

Services

Self-Managed

Capabilities

By Use Case

By Industry

Popular Docs

Quickstart

Resource Center

About

Partnerships

A SQL++ Implementation

Summary

Overview

History

Applications

Books

Examples

SELECT & SELECT VALUE

GROUPING, AGGREGATION and UNNEST

GROUP AS

Get Couchbase blog updates in your inbox

Author

Posted by Binh Le

Leave a comment Cancel reply

Ready to get Started with Couchbase Capella?

Start building

Use Capella free

Get in touch