Couchbase partition key vs primary key

@abhinav So partitioning of FTS indexes happens automatically, right? You don't need to specify the PARTITION BY keyword as you do for GSIs?

@vsr1
Take the two index creation queries below.
CREATE INDEX ih ON customer(state, name, zip, status)
PARTITION BY HASH(state)

CREATE INDEX ih ON customer(state, name, zip, status)
PARTITION BY HASH(state, name, zip, status)

It seems that in both cases a search can use the index for the combinations below.
(state)
(state, name)
(state, name, zip)
(state, name, zip, status)
But a search cannot use the index for the following.
(name only)
(name, zip)
(name, zip, status)
(status only), etc.

Please clarify whether the above is right: since GSI uses a general non-inverted mechanism, you can't search by any arbitrary combination?

Also, please clarify the difference between using (state) alone as the partition key and using the full composite (state, name, zip, status) as the partition key. What are the key differences between these two methods?

Thanks,
Isuru

A GSI index will NOT index a document when the leading index key is MISSING in the document.
A GSI index is similar to a b-tree.
Only a query that has a predicate on the leading index key qualifies to use the index. (If the leading index key is not present in the predicate, the query is supposed to return documents where that key is MISSING as well. Since the index has no information about those documents, it doesn't qualify.)
Once state has a query predicate, the index qualifies. Some combinations are efficient, some are not (example: state and status. In this case, for the given state, the scan covers all names and all zips and then has to filter on status).
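
To make the leading-key rule concrete, here is a minimal Python sketch. It is a toy model only, not the Couchbase query planner: the helper names `index_qualifies` and `usable_prefix` are made up for illustration, and the idea that only the leading run of keys narrows the scan is a simplification of the b-tree behaviour described above.

```python
# Toy model of index selection for the index on (state, name, zip, status).
# This is an illustration only; the real planner is more involved.
INDEX_KEYS = ["state", "name", "zip", "status"]

def index_qualifies(predicate_fields, index_keys=INDEX_KEYS):
    """Return True if the query has a predicate on the leading index key."""
    return index_keys[0] in predicate_fields

def usable_prefix(predicate_fields, index_keys=INDEX_KEYS):
    """Return the leading run of index keys that have predicates.

    In this simplified model, keys after the first gap can still be
    filtered during the scan, but they do not narrow the scanned range.
    """
    prefix = []
    for key in index_keys:
        if key not in predicate_fields:
            break
        prefix.append(key)
    return prefix

for fields in [{"state"}, {"state", "status"}, {"name", "zip"}, {"status"}]:
    print(sorted(fields), index_qualifies(fields), usable_prefix(fields))
```

In this model a query on (state, status) qualifies, but only state narrows the range, which matches the "some are efficient, some are not" remark.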

Take the US: there are 50 states, and say you have 16 partitions. It may happen that CA, NY, and FL all fall in one partition (based on the hash code). One partition could hold almost 40-50% of the data, while some might hold only 1-2%.
To distribute the data more evenly, you can add more partition keys.
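
The skew can be illustrated with a small Python sketch. Everything here is an assumption for illustration: SHA-256 modulo the partition count stands in for the real hash function (which is not the one Couchbase uses), and the documents and per-state counts are made up.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 16

def partition_of(*key_values, num_partitions=NUM_PARTITIONS):
    """Map partition key values to a partition id using a stand-in hash."""
    raw = "|".join(str(v) for v in key_values).encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# Made-up documents: three large states and one small one.
docs = []
for state, count in [("CA", 400), ("NY", 300), ("FL", 250), ("WY", 50)]:
    for i in range(count):
        docs.append({"state": state, "name": f"cust{i}",
                     "zip": 10000 + i, "status": "active"})

# Documents per partition when hashing on state only.
by_state = Counter(partition_of(d["state"]) for d in docs)
# Documents per partition when hashing on all four keys.
by_all = Counter(
    partition_of(d["state"], d["name"], d["zip"], d["status"]) for d in docs
)

print("HASH(state):", sorted(by_state.items()))
print("HASH(state, name, zip, status):", sorted(by_all.items()))
```

With state as the only partition key, every document for a given state lands in the same partition, so at most four partitions are used here. With the composite key, the same documents spread over many more partitions.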

NOTE: partition keys must be IMMUTABLE.

@vsr1
Let's assume (state, name, zip, status) are all immutable.
If so, aren't the indexes below identical?
If not, what are the differences?
CREATE INDEX ih ON customer(state, name, zip, status)
PARTITION BY HASH(state)

CREATE INDEX ih ON customer(state, name, zip, status)
PARTITION BY HASH(state, name, zip, status)

As far as the index keys go, the indexes are somewhat identical, but underneath they differ in how the data is placed.

Say you have 8 partitions and state = "CA".
index 1: all those documents go to a single partition (say partition 1 on indexer node 1). The data on that index node is huge.
index 2: the partition is decided by state, name, zip, and status. Documents with state = "CA" might go to all 8 partitions, so the index data is evenly distributed.

WHERE state = "CA": index 1 can do partition elimination and scan only one partition. index 2 has to scatter and gather across all partitions.
WHERE state = "CA" AND name = "xyz" AND zip = 12345 AND status = "success": both can do partition elimination.

When all partition keys have equality predicates, the query can do partition elimination.
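
As a rough sketch of that rule in Python: the function `partitions_to_scan` and the SHA-256 based `partition_of` are hypothetical stand-ins, not the indexer's actual logic or hash function. The sketch only shows when the target partition can be computed up front.

```python
import hashlib

NUM_PARTITIONS = 8

def partition_of(*key_values, num_partitions=NUM_PARTITIONS):
    """Stand-in hash: map partition key values to a partition id."""
    raw = "|".join(str(v) for v in key_values).encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

def partitions_to_scan(partition_keys, equality_predicates,
                       num_partitions=NUM_PARTITIONS):
    """Return the partition ids a scan must visit in this toy model.

    If every partition key has an equality predicate, the single target
    partition can be computed; otherwise all partitions must be scanned.
    """
    if all(key in equality_predicates for key in partition_keys):
        values = [equality_predicates[key] for key in partition_keys]
        return [partition_of(*values, num_partitions=num_partitions)]
    return list(range(num_partitions))

index1_keys = ["state"]                           # PARTITION BY HASH(state)
index2_keys = ["state", "name", "zip", "status"]  # PARTITION BY HASH(all four)

q1 = {"state": "CA"}
q2 = {"state": "CA", "name": "xyz", "zip": 12345, "status": "success"}

for label, keys in [("index 1", index1_keys), ("index 2", index2_keys)]:
    print(label, "q1 ->", len(partitions_to_scan(keys, q1)), "partition(s)")
    print(label, "q2 ->", len(partitions_to_scan(keys, q2)), "partition(s)")
```

In this model, q1 touches one partition on index 1 but all eight on index 2, while q2 touches a single partition on both.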

Thanks @vsr1, it's clear.

@vsr1

Let's say you have 2 index nodes and both have crashed, but the data and query nodes are up.
In this situation, are search queries backed by GSIs served by bypassing the index service, or will they fail so that the user gets a failure response?

If a query uses an index, and the index nodes hosting that index and its replicas are down, the query will return an error.
To get the data from a data node, you need document keys. The index scan provides the document keys. Without document keys, the query cannot complete.

When all index nodes are down, the only queries that will work are queries with USE KEYS.

Thanks @vsr1 for the clear, prompt replies.