Couchbase Server 7.0 (Enterprise Edition) introduced a Cost-Based Optimizer (CBO). Previous versions of Couchbase Server used a Rule-Based Optimizer (RBO). Couchbase Server has naturally evolved from RBO to CBO like most major relational database systems.

The CBO uses cost estimates to choose a query plan calculated from optimizer statistics. Optimizer statistics are gathered using the UPDATE STATISTICS or ANALYZE statements in a SQL++ query (referred to below as N1QL).

UPDATE STATISTICS/ANALYZE statement

The new UPDATE STATISTICS or ANALYZE statements trigger optimizer statistics gathering necessary for CBO functionality. Currently, a user issues these new statements manually, which computes optimizer statistics on keyspace/collection, indexes defined on the keyspace/collection and distribution information (as histograms) on a list of terms (expressions).

When there are multiple query nodes, optimizer statistics gathered from one query node are available on all query nodes.

The new N1QL_SYSTEM_BUCKET

The new optimizer gathers statistics through the UPDATES STATISTICS/ANALYZE statement and stores them in a new bucket (N1QL_SYSTEM_BUCKET), specifically under a new scope (N1QL_SYSTEM_SCOPE) and a new collection (N1QL_CBO_STATS).

The first time it is run, the UPDATE STATISTICS/ANALYZE statement automatically creates this new bucket/scope/collection. A primary index is also automatically created.

Dropping of the new bucket/scope/collection
These new system objects (bucket, scope, collection) can be dropped from the database, resulting in all optimizer statistics no longer being available.

Any future query will not take advantage of CBO functionality until a user requests statistics to be analyzed again. Until that time, the system falls back to RBO for query planning.

Memory quota for the new bucket/scope/collection
The new system bucket (N1QL_SYSTEM_BUCKET) is created with a minimum memory quota of 100 MB. On a large system, DBAs should monitor memory residency on the bucket, and memory quota increased when necessary to avoid its memory residency dropping below the recommended 15% level.

Syntax

At a high level, there are three aspects of an UPDATE STATISTICS/ANALYZE statement:
1. Keyspace/collection statistics – gathered every time an update/analyze statement is issued
2. Index statistics – gathered for “relevant” indexes
3. Distribution statistics (histogram) – gathered on the terms specified.

There are several ways to specify the list of terms for which distribution statistics are collected.

Providing a list of terms to build statistics on

These two statements are equivalent. The UPDATE STATISTICS statement has an optional FOR keyword followed by a keyspace name. The ANALYZE statement has an optional KEYSPACE or COLLECTION keyword followed by a keyspace name.

The keyspace name is typically a collection name. You can specify a bucket name as the keyspace name, in which case optimizer statistics are gathered on the _default collection under the _default scope for that bucket.

The list of terms is similar to the list of expressions used in the CREATE INDEX statement. In fact, any expression supported in a CREATE INDEX statement can also be used as a term in an UPDATE STATISTICS/ANALYZE statement. For example:

The above statements can be simplified if the query-context is set to travel-sample.inventory:

Functional key expression

Similar to specifying a functional index key in a CREATE INDEX statement, you can also specify a function key in an UPDATE STATISTICS/ANALYZE statement. For example:

Array key expression

Like specifying an array index key in a CREATE INDEX statement, you can also specify an array index key in an UPDATE STATISTICS/ANALYZE statement. For example:

or using the shorthand version:

Analyzing one or more indexes

Analyze statements can optionally specify one or more indexes using the INDEX option:

When the INDEX option is specified, the list of index key expressions for all the indexes is used as the list of terms for the UPDATE STATISTICS/ANALYZE statement.

Expressions in the WHERE clause for a partial index are also added to terms for analyzing. Thus, the INDEX option is a convenient shortcut as the user does not need to list all terms explicitly.

The above statements are equivalent to the following since each index only has a single index key, and there is no WHERE clause for either index referenced:

Analyzing all indexes

To go one step further, you can use the INDEX ALL option for an UPDATE STATISTICS/ANALYZE statement, which considers all available indexes defined on the collection and are in an online state. For example:

While other options allow you to specify either a bucket name or a collection name, the INDEX ALL option is only valid for a collection. Specifying INDEX ALL with a bucket name will return an error.

Index residency

Analyzing also updates optimizer statistics for relevant indexes. When a list of terms is specified, the applicable index list is constructed by checking for index keys that reference any terms in the list. Relevant index lists are deduced from the indexes when the INDEX or INDEX ALL option is used.

Optimizer statistics for an index can only be gathered when the index is at least partially memory-resident. The optimizer statistics are gathered from the memory-resident portion of an index. A higher memory-resident ratio produces more accurate optimizer statistics for that index.

If an index has a memory-resident ratio of zero, then the following warnings are returned by the optimization statement:

Optional WITH clause

Similar to the CREATE INDEX statement, you can also specify an optional WITH clause to add options for statistics analysis. For example:

The following options are currently supported in the WITH clause of an UPDATE STATISTICS/ANALYZE statement: resolution, sample_size, batch_size, update_statistics_timeout. Each one is covered below.

The resolution option

The term’s distribution statistics (histogram) is represented as a series of “bins”. The resolution indicates what percentage of documents is represented in each bin. By default, a resolution of 1.0 (i.e., 100 bins) is used.

You can specify a different resolution in the WITH clause. For example, if you need a finer granularity of distribution statistics, you can specify a smaller resolution (corresponding to a larger number of bins).

Note that the smaller the resolution (larger number of bins), the more samples are required to construct the histogram, and the more time/resources are required to complete the UPDATE STATISTICS/ANALYZE statement.

The valid range of the resolution option is 0.02 – 5.0.

The sample_size option

Samples are required to construct the distribution statistics (histogram) for the terms specified in an UPDATE STATISTICS/ANALYZE statement. The resolution determines the sample size and is not directly related to the number of documents in a collection.

You can specify a sample size to be used. However, a minimum sample size is required for a given resolution. If the sample size is smaller than the minimum required sample size, the minimum required sample size will be used instead, and a warning will be returned.

The batch_size option

Construction of histograms requires a sorting operation and thus is resource-intensive. To avoid excessive resource utilization, the histograms are constructed in batches. By default, each batch processes up to 10 terms for histogram construction. If more are listed, or the combination of the index key expressions (from indexes specified by INDEX options) exceeds ten terms, multiple batches are required to process all the terms.

You can specify a batch_size option to let the server know how many terms to process in each batch. For example, a user can request a larger batch size if the system is not busy and plenty of compute resources are available. On the other hand, if the system is already overloaded, then a smaller batch size can be used. It’s a balance of resource usage vs. analyze statement duration.

The update_statistics_timeout option

The UPDATE STATISTICS/ANALYZE statement is a maintenance task and can cause the response time to be significantly higher than a simple query, especially when multiple batches are required.

A timeout value of 60 seconds is applied to an update/analyze statement by default. One can specify an optional value (in seconds) for the timeout value. Note that the timeout value applies to each batch when multiple batches are required.

DELETE clause

Users can drop current optimizer statistics using a DELETE clause with the update/analyze statement. This clause selectively disables CBO for a query or set of queries.

There are a couple of ways to specify the DELETE option – by specifying a list of terms for which optimizer statistics will be deleted or request to delete all optimizer statistics for a keyspace/collection.

DELETE clause with a list of terms

The query will delete the distribution statistics for only the specified terms with this option. Other optimizer statistics such as keyspace and index statistics will not be affected. With this option, if a query has a predicate on the “city” or “country” field of “airport” collection, CBO will be disabled for the query.

DELETE clause for all optimizer statistics of a keyspace/collection

With this option, all optimizer statistics for the collection “airport” will be deleted, including keyspace statistics, index statistics, distribution statistics. Any query referencing the “airport” collection will have CBO disabled in this case.

Which fields/expressions to analyze?

How do I know which fields/expressions to include in an UPDATE STATISTICS/ANALYZE statement? As a general rule, any expression referenced in a query (except those in the projection clause only) requires optimizer statistics.

WHERE, ON, GROUP BY, ORDER BY clauses

Expressions referenced in the WHERE, ON, GROUP BY and ORDER BY clauses require optimizer statistics. For example:

For this query, optimizer statistics are required on:

If a more complex expression is used:

For this query, optimizer statistics are required on:

Array index expressions

For ANY or ANY AND EVERY clauses, utilizing array indexes, optimizer statistics are required for the corresponding array index key expression. For example:

For this query, optimizer statistics are required on:

Note the expression used is the same as the array index key expression for the corresponding array index definition.

For an UNNEST operation:

For this query, optimizer statistics are required on:

Again, the expression used is the same as the array index key expression for the corresponding array index definition.

Subquery and Common-Table-Expression (CTE)

For a query involving a subquery or a CTE, look at the query block for the subquery or CTE to determine what expressions require optimizer statistics:

Looking at the query block defining the CTE, optimizer statistics are required on:

Window functions

If a window function is used, then optimizer statistics are required on the expressions used in the PARTITION BY and ORDER BY clauses:

For this query, optimizer statistics are required on:

View optimizer statistics

There are two new system keyspaces introduced for optimizer statistics – system:dictionary and system:dictionary_cache. The first corresponds to the on-disk documents stored in N1QL_SYSTEM_BUCKET.N1QL_SYSTEM_SCOPE.N1QL_CBO_STATS, while system:dictionary_cache corresponds to the in-memory cached representation of a subset of the same information.

The dictionary cache initializes when a query node starts, and cache information is populated when running queries. The data retrieved from system:dictionary will be the same regardless of which query node gets the info when using multiple query nodes.

However, cached information from all query nodes in a multi-node cluster will be in system:dictionary_cache, where each node may potentially have a different subset of cached information.

This query shows the statistic in system:dictionary:

The query results are:

Each entry contains basic information on the keyspace, an array of distributionKeys for which histograms are available and an array of index information.

Similarly, when querying system:dictionary_cache:

The result is the same as the system:dictionary query except for an extra node field that shows entries from different query nodes.

Explain output with optimizer_estimates

When appropriate optimizer statistics are available, the EXPLAIN output will contain a new optimizer_estimates section for each operator. These sections reflect the optimizer’s estimate for the cost information for this operator and all of its children. For example:

The following output is for the IndexScan3 operator:

Having the optimizer_estimates section for an operator is an indication that the database is using CBO for this stage of the query planning process.

Summary

Innovative new database features were made available in Couchbase 7.0; the addition of the Cost-Based Optimizer (CBO) is just one of many. The update/analyze queries shown in this blog post are essential to collecting the statistics needed by the CBO to function efficiently.

Once the database gathers accurate optimizer statistics, the EXPLAIN output shows the estimates that CBO will use to choose a query plan. This will help guide you to more efficient and predictable query generation in the future.

Author

Posted by Bingjie Miao, Principal Software Engineer Couchbase

Bingjie Miao is a principal software engineer at Couchbase. Bingjie has 20 years of experience in relational and NoSQL databases. His main area of expertise is query optimization and query execution.

2 Comments

  1. This was helpful, thanks Bingjie.

    A query: Currently, there is no out-of-the-box job, like available in Oracle, to gather statistics. So it has to be done manually on a regular basis, perhaps with cron jobs. Is the understanding correct?

    1. Bingjie Miao, Senior Software Engineer, Couchbase December 7, 2023 at 10:18 am

      Yes your understanding is correct. Currently optimizer statistics gathering is a manual process. We are considering enhancements in this area to make it easier to use.

Leave a reply