On April 5, 2022, the US Patent and Trademark Office granted a second patent to Couchbase for its novel approach to optimizing document-oriented database queries on arrays! This feature has been available since Couchbase Server 7.1 and Couchbase Capella 7.0 but this patent recognizes our innovation in cost-based optimization for document-oriented databases.

Query optimization has been an active field in relational data systems since the 1970s. Consistent with our leadership in bringing innovations to market, Couchbase has been recognized for deeply technical work in bringing query optimization to data in JSON format. The Couchbase engineering team has been at the forefront of evolving the performance of document databases for the past decade. Our engineers’ commitment to excellence is the reason why some of the largest enterprises in the world now trust Couchbase for their mission-critical applications. As part of this commitment, we have recently patented a novel approach to cost-based optimization (CBO) for document-oriented database queries on arrays. Couchbase engineering continues to bring the power of cost-based optimization to NoSQL, and this patent grant recognizes our continued innovation.

Congratulations to Bingjie Miao, Keshav Murthy, Marco Greco, and Prathibha Bisarahalli for their continued impressive work in the field of cost-based optimization!

This post will dive into cost-based optimization (CBO), why it matters, and why CBO for queries to document databases is unique to Couchbase.

What is cost-based optimization?

Cost-based optimization (CBO) is a process that selects the most efficient way to execute a database query by taking into account the cost of memory, CPU, network transport, and disk usage. CBO compares the cost of alternative query paths and selects the query execution plan with the lowest cost.
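To make the idea concrete, here is a minimal sketch of "estimate a cost for each candidate plan, then pick the cheapest." It is an illustration only and not how the Couchbase optimizer is implemented: the plan names, the resource estimates, and the weights are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEstimate:
    """Estimated resource usage for one candidate query plan (hypothetical units)."""
    name: str
    cpu: float
    memory: float
    network: float
    disk: float


# Hypothetical weights expressing how expensive each resource is relative to CPU.
WEIGHTS = {"cpu": 1.0, "memory": 0.5, "network": 2.0, "disk": 4.0}


def total_cost(plan: PlanEstimate) -> float:
    """Combine the per-resource estimates into a single comparable number."""
    return (
        WEIGHTS["cpu"] * plan.cpu
        + WEIGHTS["memory"] * plan.memory
        + WEIGHTS["network"] * plan.network
        + WEIGHTS["disk"] * plan.disk
    )


def choose_plan(plans: list[PlanEstimate]) -> PlanEstimate:
    """Return the candidate plan with the lowest estimated total cost."""
    if not plans:
        raise ValueError("at least one candidate plan is required")
    return min(plans, key=total_cost)


# Made-up candidates: a full scan and two index-based access paths.
candidates = [
    PlanEstimate("primary_scan", cpu=900, memory=400, network=300, disk=800),
    PlanEstimate("index_on_city", cpu=120, memory=60, network=40, disk=30),
    PlanEstimate("index_on_country", cpu=300, memory=150, network=90, disk=110),
]

best = choose_plan(candidates)
print(best.name, total_cost(best))
```

A real optimizer derives these estimates from collected statistics rather than hard-coded numbers, and it searches a far larger space of plans, but the selection step follows the same shape.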

Keshav Murthy, our VP of Engineering and one of the patent authors, uses the following map analogy to explain what CBO is:

One way to grasp CBO is to consider an airplane flight plan: an airplane can take any number of routes to get from San Francisco to São Paulo, but there are only a few optimal routes once you account for fuel costs, wind resistance, air traffic, and so on. Similarly, a database query needs a query plan. There are many ways to execute the query, but only a few optimal plans.

One way to select a query path is rule-based optimization (RBO), which makes decisions about the query path based on rules (for example, always prefer indexes with more keys). However, RBO can become messy and inefficient very quickly, and it rarely produces the optimal query path. In the NoSQL world, most databases still use rule-based optimization.
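The difference between the two approaches can be sketched with a toy example. The code below is an assumption-laden illustration, not Couchbase's algorithm: the index names, the selectivity figures, and the "documents scanned" cost measure are all hypothetical. It contrasts the rule mentioned above (prefer the index with more keys) with a choice driven by statistics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Index:
    """A candidate index for a query (all values are hypothetical)."""
    name: str
    keys: tuple[str, ...]
    # Estimated fraction of documents the index scan would return for this
    # query, as a statistics-based optimizer might estimate it.
    selectivity: float


def rule_based_choice(indexes: list[Index]) -> Index:
    """Static rule: always prefer the index with the most keys."""
    return max(indexes, key=lambda ix: len(ix.keys))


def cost_based_choice(indexes: list[Index], total_docs: int) -> Index:
    """Pick the index expected to scan the fewest documents."""
    return min(indexes, key=lambda ix: ix.selectivity * total_docs)


# Suppose the query filters on `type` and `email`. The three-key index can
# only use its leading key here, so it still matches 20% of the documents,
# while the single-key index on `email` matches very few.
indexes = [
    Index("idx_type_country_city", ("type", "country", "city"), selectivity=0.20),
    Index("idx_email", ("email",), selectivity=0.0001),
]

rbo_pick = rule_based_choice(indexes)
cbo_pick = cost_based_choice(indexes, total_docs=1_000_000)
print("rule-based:", rbo_pick.name)
print("cost-based:", cbo_pick.name)
```

In this made-up scenario the rule picks the wider index even though it scans far more documents, while the statistics-driven choice picks the narrower, more selective one.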

Cost-based optimization takes a user-submitted query, selects from millions of query plans, and chooses the most performant and resource-efficient plan for query execution based on statistics. 

Why does cost-based optimization matter?

The implications of CBO are that queries leverage less memory, less disk, less IO, fewer partitions, and less overflow, which leads to lower latency and lower cost for users. This is particularly meaningful for databases that handle a large number of transactions––even minor performance improvements can have a significant impact.

Keshav Murthy went on to explain why CBO matters, again using the map analogy:

When it matters, like getting to your kid’s recital or a ballgame on time, would you use a static direction map that doesn’t account for the traffic? Google Maps’ route optimizer will optimize for time. The optimizers develop a plan to execute the query with the least resources: CPU and memory. Knowing this, why would you accept a static rule (or shape of the query!) based optimization of your business-critical database workload?

cost-based optimization in a mapping application

The database query optimizer makes decisions. These decisions have major implications on query performance, system throughput, and your ability to meet the SLAs. Databases with a better optimizer will make it easier to develop, manage and meet the SLAs. 

How CBO for document-oriented database queries on arrays is unique to Couchbase

Cost-based optimization (CBO) for SQL has existed for more than 40 years and has been critical to the success of RDBMS and developer productivity. However, CBO was not generally available for document database queries until Couchbase implemented CBO for SQL++ (formerly known as N1QL) with the Couchbase Server 6.5 release in 2019. Since then, our customers have enjoyed the performance benefits of CBO for their queries, which is particularly important for the many customers that rely on the high performance of Couchbase to power their most mission-critical applications.


The patent grant represents a technical commitment from Couchbase to deliver the best elements of SQL for our NoSQL database platform. And with the recent patent grant, Couchbase is the only document database provider that intelligently executes cost-based optimization for NoSQL database queries, which has enormous implications for performance and cost. Before deciding on a NoSQL database, ask your vendor: Do you have a cost-based optimizer?

Congratulations to our engineering team for their continued hard work to evolve the standard of excellence for document databases.

Learn more about cost-based optimization for Couchbase.

Watch a short video or read the documentation for an overview of Cost-Based Optimization in N1QL.

For a deep dive into CBO for SQL++, I recommend reading the following blog posts by Keshav Murthy, our VP of Engineering:

Thanks for reading.

Author

Posted by Vinai Amble
