Index Normalization

btburnett3 · February 28, 2018, 2:30pm

I know this is a bit of a stretch, but I was wondering if I could get a link to the code that handles index normalization? I’m working on this project to help manage indexes in our CI/CD pipeline:

When we create an index, CB is normalizing the form for the “index_key” and “condition” attributes returned when you query system:indexes. This means that if we don’t make our index definition exactly match the normalized form then the index appears different. I’d really like to be able to duplicate the normalization logic so I can more accurately compare to see if the index definition is different from the actual index.

Thanks,
Brant

btburnett3 · April 16, 2018, 7:27pm

Never mind on this one, I found the solution. Running an EXPLAIN CREATE INDEX ... query returns a query plan in which the keys and condition are normalized. This is a great solution since it guarantees consistent normalization patterns even across different versions of Couchbase. The only minor difference is that the keys array in the plan is an array of strings before 5.0 and an array of objects with an expr attribute from 5.0 onwards.
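For anyone comparing definitions across versions, here is a minimal Python sketch of flattening both shapes of the keys array into plain strings. The function name and the sample expressions are hypothetical, and the only plan detail assumed is the one described above (strings before 5.0, objects with an expr attribute afterwards).

```python
from typing import Any, List


def plan_key_expressions(keys: List[Any]) -> List[str]:
    """Return the key expressions from a plan's keys array as plain strings."""
    expressions = []
    for key in keys:
        if isinstance(key, str):
            # Pre-5.0 shape: each entry is already the normalized expression.
            expressions.append(key)
        elif isinstance(key, dict) and "expr" in key:
            # 5.0+ shape: each entry is an object with an expr attribute.
            expressions.append(key["expr"])
        else:
            raise ValueError(f"unrecognized key entry: {key!r}")
    return expressions


# Illustrative values only, not copied from a real plan.
old_style = ["(`k1`)", "(`k2`)"]
new_style = [{"expr": "(`k1`)"}, {"expr": "(`k2`)"}]
assert plan_key_expressions(old_style) == plan_key_expressions(new_style)
```

With both shapes reduced to the same list of strings, the comparison against the desired definition does not need to know which server version produced the plan.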

vsr1 · April 16, 2018, 8:26pm

Hi @btburnett3,

From 5.0 onwards, index keys can have DESC collation, and the format changed to an array of objects to accommodate this.
Example: CREATE INDEX ix1 ON default(k1 DESC, k2, k3 DESC);
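A rough Python sketch of why the collation matters when comparing definitions: two indexes with the same expressions but different sort order should count as different. The `desc` attribute name is an assumption made for illustration (this thread only confirms the `expr` attribute), and `key_signature` is a hypothetical helper.

```python
from typing import Any, Tuple


def key_signature(key: Any) -> Tuple[str, bool]:
    """Reduce one plan key to an (expression, is_descending) pair."""
    if isinstance(key, str):
        # Pre-5.0 string keys have no collation, so treat them as ascending.
        return (key, False)
    # ASSUMPTION: descending keys carry a boolean "desc" attribute.
    return (key["expr"], bool(key.get("desc", False)))


# Illustrative values only, not copied from a real plan.
with_desc = [{"expr": "(`k1`)", "desc": True}, {"expr": "(`k2`)"}]
all_asc = [{"expr": "(`k1`)"}, {"expr": "(`k2`)"}]
pre_50 = ["(`k1`)", "(`k2`)"]

# Same expressions, different collation: the definitions differ.
assert [key_signature(k) for k in with_desc] != [key_signature(k) for k in all_asc]
# An all-ascending 5.0+ plan matches the equivalent pre-5.0 string form.
assert [key_signature(k) for k in all_asc] == [key_signature(k) for k in pre_50]
```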

btburnett3 · April 16, 2018, 8:40pm

Ahhh, I’ll need to make sure I account for that as well, then. Thanks for the tip @vsr1.

vsr1 · April 16, 2018, 8:57pm

Hi @btburnett3,

Also, 5.5.0 adds PARTITION BY HASH(…):
Couchbase GSI Index partitioning - The Couchbase Blog

Also check this out; it will play a key role in the index advisory:
SQL Group by Index | Aggregate Index SQL | Couchbase

btburnett3 · April 16, 2018, 8:58pm

Yeah, I’m working on that one right now! https://github.com/brantburnett/couchbase-index-manager/issues/25

btburnett3 · April 17, 2018, 4:00pm

@vsr1

Actually, I do have one question about partitioned indexes and replicas. I can’t seem to create replicas that appear effective in terms of their node assignments. If I assign nodes, both replicas use all of the nodes listed. If I don’t assign nodes, both replicas use all nodes in the cluster. This isn’t really providing redundancy against node failure.

I’ve experimented with alternative syntax to try to control replica node assignment with more granularity, but haven’t had any luck. Any pointers? Or is this something that isn’t expected until 5.5 GA?

Thanks,
Brant

vsr1 · April 17, 2018, 4:05pm

cc @deepkaran.salooja

deepkaran.salooja · April 17, 2018, 10:15pm

With partitioning, redundancy is provided by placing the replicas of a partition on different nodes. This doesn’t exclude other partitions/replicas from being co-located. At an index level, you’ll see the index being placed on all assigned nodes.

If a node goes down, an equivalent replica copy of that partition will be chosen to answer queries.
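To make the placement idea concrete, here is a toy Python model. It is not Couchbase's actual planner: the round-robin rule, the function name, and the node names are all assumptions chosen only to show how every copy of the index can span all nodes while the copies of any single partition still land on different nodes.

```python
from typing import Dict, List, Tuple


def place_partitions(
    nodes: List[str], num_partitions: int, num_replica: int
) -> Dict[Tuple[int, int], str]:
    """Map (partition, copy) to a node using a simple round-robin offset."""
    copies = num_replica + 1  # the original plus num_replica replicas
    if copies > len(nodes):
        raise ValueError("need at least one node per copy of a partition")
    placement = {}
    for partition in range(num_partitions):
        for copy in range(copies):
            # Offsetting by the copy number keeps copies of one partition apart.
            placement[(partition, copy)] = nodes[(partition + copy) % len(nodes)]
    return placement


nodes = ["n1", "n2", "n3"]  # hypothetical node names
placement = place_partitions(nodes, num_partitions=6, num_replica=1)

for partition in range(6):
    # The two copies of each partition sit on different nodes.
    assert placement[(partition, 0)] != placement[(partition, 1)]

for copy in range(2):
    # Yet each copy of the index as a whole is spread over every node.
    assert {placement[(p, copy)] for p in range(6)} == set(nodes)
```

In this model, losing any one node removes at most one copy of each partition, so the surviving copy can still answer queries for it.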

btburnett3 · April 18, 2018, 1:51pm

@deepkaran.salooja Okay, I think that makes sense to me. Basically replicas are provided using a slightly different methodology for partitioned indexes. And num_replica is effectively decoupled from nodes when creating the index.
