Couchbase GSI Index partitioning

In Couchbase, data is always partitioned using the consistent hash value of the document key into vbukets which are stored on the data nodes. Couchbase Global Secondary Index (GSI) abstracts the indexing operations and runs as a distinct service within the Couchbase data platform. When a single index can cover a whole type of documents, everything is good. But, there are cases where you’d want to partition an index.

Capacity: You want increased capacity because a single node is unable to hold a big index
Queriability: You want to avoid rewriting the query to work with manual partitioning of the index using a partial index.
Performance: Single index is unable to meet the SLA

To address this, Couchbase 5.5 introduces automatic hash partitioning of the index. You’re used to having bucket data hashed into multiple nodes. Index partitioning enables you to hash the index into multiple nodes as well. There is good symmetry.

Creating the index is easy. Simply add a PARTITION BY clause to the CREATE index definition.


CREATE INDEX ih ON customer(state, name, zip, status) 
PARTITION BY HASH(state) 
WHERE type = "cx" WITH {"num_partition":8}

CREATE INDEX ih ON customer(state, name, zip, status)

PARTITION BY HASH(state)

WHERE type = "cx" WITH {"num_partition":8}

This as the following meta data in the system:indexes. Note the new field partition with the hash expression. The HASH(state) is the basis on which the index logically named customer.ih is divided into a number of physical index partitions. By default, the number of index partitions are 16 and it can be changed by specifying num_partition parameter. In the example above, we create 8 partitions for the index customer.ih.


select * 
from system:indexes 
where keyspace_id = "customer" and name = "ih" ;
  {
    "indexes": {
      "condition": "(`type` = \"cx\")",
      "datastore_id": "http://127.0.0.1:8091",
      "id": "b3ce745f84256319",
      "index_key": [
        "`state`",
        "`name`",
        "`zip`",
        "`status`"
      ],
      "keyspace_id": "customer",
      "name": "ih",
      "namespace_id": "default",
      "partition": "HASH(`state`)",
      "state": "online",
      "using": "gsi"
    }
  }

select *

from system:indexes

where keyspace_id = "customer" and name = "ih" ;

{

"indexes": {

"condition": "(`type` = \"cx\")",

"datastore_id": "http://127.0.0.1:8091",

"id": "b3ce745f84256319",

"index_key": [

"`state`",

"`name`",

"`zip`",

"`status`"

"keyspace_id": "customer",

"name": "ih",

"namespace_id": "default",

"partition": "HASH(`state`)",

"state": "online",

"using": "gsi"

}

Now, issue the following query. You don’t need additional predicate on the hash key for the query to use the index. The index scan simply scans all of the index partitions as part of the index scan.


SELECT * 
FROM customer 
WHERE type = "cx" 
and name = "acme" 
and zip = "94051";

SELECT *

FROM customer

WHERE type = "cx"

and name = "acme"

and zip = "94051";

However, if you do have an equality predicate on the hash key, index scan detects the right index partition having the right range of data and prunes rest of the index nodes from the index scan. This makes the index scan very efficient.


SELECT * 
FROM customer 
WHERE type = "cx" 
and name = "acme" 
and zip = "94051"
and state = "CA";

SELECT *

FROM customer

WHERE type = "cx"

and name = "acme"

and zip = "94051"

and state = "CA";

Now, let’s look at how this index helps you with three things we mentioned before: Capacity, Queriability and Performance.

Capacity

The query customer.ih will be partitioned to a specified number of partitioned with each partition stored on one of the index nodes on the cluster. The indexer uses a stochastic optimization algorithm to determine how to distribute the partitions onto the set of indexer nodes, based on the free resource available on each node. Alternatively, to restrict the index to a specific set of nodes, use the nodes parameter. This index will create eight index partitions and store four each on the four index nodes specified.


CREATE INDEX ih ON customer(state, name, zip, status) 
PARTITION BY HASH(state) 
WHERE type = "cx" WITH {"num_partition":8, 
"nodes":["172.23.125.32:9001", "172.23.125.28:9001", "172.23.93.82:9001","172.23.45.20:9001" ]}

CREATE INDEX ih ON customer(state, name, zip, status)

PARTITION BY HASH(state)

WHERE type = "cx" WITH {"num_partition":8,

"nodes":["172.23.125.32:9001", "172.23.125.28:9001", "172.23.93.82:9001","172.23.45.20:9001" ]}

So, with this hash partitioned index, one logical index (customer.ih) will be partitioned into a number of physical index partitions (in this case, 8 partitions) and give the query an illusion of a single index.

Because this index uses the multiple physical nodes, the index will have more disk, memory and CPU resources available. Increased storage in these nodes makes it possible to create larger indexes.

You write your queries, as usual, requiring predicates only the WHERE clause (type = “cx”) an at least on one of the leading index keys (e.g. name).

Queriability

Limitations in the Couchbase 5.0 indexing:

Until Couchbase 5.0, you could manually partition the index like below. You had to partition them manually using the WHERE clause on the CREATE INDEX. Consider the following indexes, one per state. By using the node parameter, you could place them in specific index nodes or the index will try to automatically spread out within the index nodes.


CREATE INDEX i1 ON customer(name, zip, status) WHERE state = "CA";
CREATE INDEX i2 ON customer(name, zip, status) WHERE state = "NV";
CREATE INDEX i3 ON customer(name, zip, status) WHERE state = "OR";
CREATE INDEX i4 ON customer(name, zip, status) WHERE state = "WA";

CREATE INDEX i1 ON customer(name, zip, status) WHERE state = "CA";

CREATE INDEX i2 ON customer(name, zip, status) WHERE state = "NV";

CREATE INDEX i3 ON customer(name, zip, status) WHERE state = "OR";

CREATE INDEX i4 ON customer(name, zip, status) WHERE state = "WA";

For a simple query with equalify predicate on state, it all works well.


SELECT * 
FROM customer 
WHERE state = "CA" and name = "acme" and zip = "94051";

SELECT *

FROM customer

WHERE state = "CA" and name = "acme" and zip = "94051";

There are two issues with this manual partitioning.

Consider the following with slightly complex predicate on the state. Because the predicate (state IN [“CA”, “OR”]) is not a subset of any of the WHERE clauses of the index, none of the indexes can be used for the query below.


SELECT * FROM customer 
WHERE state IN ["CA", "OR"] and name = ACME; 
SELECT * FROM customer 
WHERE state > "CA" and name = ACME;

SELECT * FROM customer

WHERE state IN ["CA", "OR"] and name = ACME;

SELECT * FROM customer

WHERE state > "CA" and name = ACME;

2. If you get data to a new state, you’d to be aware of it and create the index in advance.


SELECT * FROM customer WHERE state = "CO" and name = ACME

SELECT * FROM customer WHERE state = "CO" and name = ACME

If the field numerical field, you can use the the MOD() function.


CREATE INDEX ix1 ON customer(name, zip, status) WHERE (MOD(cxid) % 4 = 0);
CREATE INDEX ix2 ON customer(name, zip, status) WHERE (MOD(cxid) % 4 = 1);
CREATE INDEX ix3 ON customer(name, zip, status) WHERE (MOD(cxid) % 4 = 2);
CREATE INDEX ix4 ON customer(name, zip, status) WHERE (MOD(cxid) % 4 = 3);

CREATE INDEX ix1 ON customer(name, zip, status) WHERE (MOD(cxid) % 4 = 0);

CREATE INDEX ix2 ON customer(name, zip, status) WHERE (MOD(cxid) % 4 = 1);

CREATE INDEX ix3 ON customer(name, zip, status) WHERE (MOD(cxid) % 4 = 2);

CREATE INDEX ix4 ON customer(name, zip, status) WHERE (MOD(cxid) % 4 = 3);

Even this work around each query block can only use one index and requires queries to be written carefully to match one of the predicates in the WHERE clause.

Solution:

As you see from the figure above, the interaction between the query and index goes through the GSI client sitting inside each query node. Each GSI client gives the illusion of a single logical index (customer.ih) on top of eight physical index partitions.

The GSI client takes all of the index scan request and then using the predicate, tries to see if it can identify which of index partitions has the data needed for the query. This is the process of partition pruning (aka partition elimination). For the hash based partitioning scheme, equality and IN clause predicates get the benefit of partition pruning. All other expressions use the scatter-gather method. After the logical elimination, GSI client sends the request to the remaining nodes, gets the result, merges the result and sends the result back to query. The big benefit of this is that queries can be written without worrying about the manual partitioning expression.

Example query below does not even have a predicate on the hash key, state. The below query does not get the benefit of partition elimination. Therefore, the GSI client issues scan to every index partition in parallel and then merges the result from each of the index scan. The big benefit of this is that queries can be written without worrying about the manual partitioning expression to match the partial index expression and still use the full capacity of the cluster resources.


CREATE INDEX ih1 ON customer(name, zip, status) 
PARTITION BY HASH(state) 
WHERE type = "cx" WITH {"num_partition":8}

SELECT * 
FROM customer 
WHERE type = "cx" 
and name = "acme" 
and zip = "94051";

CREATE INDEX ih1 ON customer(name, zip, status)

PARTITION BY HASH(state)

WHERE type = "cx" WITH {"num_partition":8}

SELECT *

FROM customer

WHERE type = "cx"

and name = "acme"

and zip = "94051";

Additional predicate on the hash key (state = “CA”) in the query below will benefit from partition pruning. For query processing, for simple queries with equality predicates on hash key, you get uniform distribution of the workload on these multiple partitions of the index. For complex queries including the grouping & aggregation we discussed above, the scans, partial aggregations are done in parallel, improving the query latency.


SELECT * 
FROM customer 
WHERE type = "cx" 
and name = "acme" 
and zip = "94051"
and state = "CA";

SELECT *

FROM customer

WHERE type = "cx"

and name = "acme"

and zip = "94051"

and state = "CA";

You can create indexes by hashing on one or more keys, each of which could be an expression. Here are some examples.


CREATE INDEX idx1 ON customer(name) PARTITION BY HASH(META().id);
CREATE INDEX idx2 ON customer(name) PARTITION BY HASH(name, zip);
CREATE INDEX idx3 ON customer(name) 
                      PARTITION BY HASH(SUBSTR(name, 5, 10));
CREATE INDEX idx3 ON customer(name) 
PARTITION BY 
HASH(SUBSTR(META().id, POSITION(META().id, "::")+2), zip)

CREATE INDEX idx1 ON customer(name) PARTITION BY HASH(META().id);

CREATE INDEX idx2 ON customer(name) PARTITION BY HASH(name, zip);

CREATE INDEX idx3 ON customer(name)

PARTITION BY HASH(SUBSTR(name, 5, 10));

CREATE INDEX idx3 ON customer(name)

PARTITION BY

HASH(SUBSTR(META().id, POSITION(META().id, "::")+2), zip)

Performance

For majority of database features, performance is everything. Without a great performance, proven by good benchmarks, the features are simply pretty syntax diagrams!

Index partitioning gives you improved performance in two ways.

Scale out. The partitions are distributed into multiple nodes, increasing the CPU and memory availability of for the index scan.
Parallel scan. Right predicate giving queries the benefit of partition pruning. Even after the pruning process, scans of all the indexes are done in parallel.
Parallel grouping and aggregation. The DZone article Understanding Index Grouping and Aggregation in Couchbase N1QL Queries explains the core performance improvement of grouping and aggregation using indexes.
The parallelism of the index parallel scan (and grouping, aggregation) is determined by the max_parallelism parameter. This parameter can be set per query node and/or per query request.

Consider the following index and query:


CREATE INDEX ih1 ON customer(name, zip, status) 
PARTITION BY HASH(state) 
WHERE type = "cx" WITH {"num_partition":8}

select zip, count(1) zipcount
from customer
where type = "cx" and name is not missing
group by zip;

CREATE INDEX ih1 ON customer(name, zip, status)

PARTITION BY HASH(state)

WHERE type = "cx" WITH {"num_partition":8}

select zip, count(1) zipcount

from customer

where type = "cx" and name is not missing

group by zip;

The index is partitioned by HASH(state), but state predicate is missing from the query. For this query, we cannot do partition pruning or create groups within individual scans of the index partitions. Therefore, it will need a merge phase after the partial aggregation with the query (not shown in the explain). Remember, these partial aggregations happen in parallel and therefore reduces the latency of the query.

Consider the following index and query:


CREATE INDEX ih2 ON customer(state, city, zip, status) 
PARTITION BY HASH(zip) 
WHERE type = "cx" WITH {"num_partition":8}

CREATE INDEX ih2 ON customer(state, city, zip, status)

PARTITION BY HASH(zip)

WHERE type = "cx" WITH {"num_partition":8}

Example a:


select state, count(1) zipcount
from customer
where state is not missing
group by state, city, zip;

select state, count(1) zipcount

from customer

where state is not missing

group by state, city, zip;

In the above example, the group by is on the leading keys (state, city, zip) of the index and hash key (zip) is part of the group by clause. This will help the query to scan the index and simply created the required groups.

Example b:


select zip, count(1) zipcount
from customer
where type = "cx" 
and city = "San Francisco"
and state = "CA"
group by zip;

select zip, count(1) zipcount

from customer

where type = "cx"

and city = "San Francisco"

and state = "CA"

group by zip;

In the above example, the group by is on the third key (zip) of the index and hash key (zip) is part of the group by clause. In the predicate clause (WHERE clause), there is single equality predicate on the leading index keys before the key zip (state and city). Therefore, we implicitly include the keys (state, city) in the group by without affecting the query result. This will help the query to scan the index and simply created the required groups.

Example c:


select zip, count(1) zipcount
from customer
where type = "cx" 
and city like "San%"
and state = "CA"
group by zip;

select zip, count(1) zipcount

from customer

where type = "cx"

and city like "San%"

and state = "CA"

group by zip;

In the above example, the group by is on the third key (zip) of the index and hash key (zip) is part of the group by clause. In the predicate clause (WHERE clause), there is range predicate on city. The index key (city) is before the hash key(zip). So, we create partial aggregates as part of the index scan and then then query will merge these partial aggregates to create the final resultset.

Summary:

Index partition gives you increased capacity for your index, better queriability and higher performance for your queries. By exploiting the Couchbase scale-out architecture, indexes improve your capacity, queriability, performance and TCO.

Reference:

Couchbase documentation: https://developer.couchbase.com/documentation/server/5.5/indexes/gsi-for-n1ql.html
Couchbase N1QL documentation: https://developer.couchbase.com/documentation/server/5.5/indexes/gsi-for-n1ql.html#

Keshav Murthy

2 Comments

Debashis Paul January 4, 2020 at 12:02 pm

Dear Keshav , Thanks for always providing great articles .
Few questions : if we use “num_partition”:8 while creating partition index by hash does it mean that we need to have 8 indexer nodes present in cluster ? or this # can be mutually exclusive and no correlation with this and only correlation with max_parallelism setting in admin ?

Also Can you please help if bumping up “Servicers” in admin settings can make any difference in query performance ?
thanks for helping out

Log in to Reply
Keshav Murthy January 5, 2020 at 6:59 pm

Each node can have multiple partitions of an index. You don’t need to have 8 index nodes to have an index with 8 partitions.

The servicers parameter restricts the maximum number of concurrent queries in a query node. It will only improve performance (throughput) if you have CPUs remaining and you increase the number of concurrent queries. It doesn’t improve the performance of a single query.

Also, we have an active forum. Please feel free to ask questions there. https://www.couchbase.com/forums/

Log in to Reply

Platform

Self-Managed

Services

Capabilities

Why Couchbase?

Migrate to Capella

By Use Case

By Industry

By Application Need

Popular Docs

By Developer Role

Quickstart

Resource Center

About

Partnerships

Our Services

Partners: Register a Deal

Ready to register a deal with Couchbase?

Marriott

All Posts

Divide and Conquer: Couchbase GSI Index partitioning.

Capacity

Queriability

Performance

Summary:

Reference:

Author

Posted by Keshav Murthy

2 Comments

Leave a reply Cancel reply