We are on Couchbase Server version 6.0.2.
I want to understand how index rebuilding works when adding or removing a field from an FTS index.
What happens when we add or remove a field from an FTS index? Does the index rebuild happen in the background, i.e. can our application still send queries to the search service while the index is rebuilding?
Do we need to take downtime during the index rebuild?
Note: during the index rebuild, I don't care about the newly added field. As long as search still returns data, we are okay. What I mean is: will search remain available? (At the least, it should return the older response instead of an empty response.)
My second question:
As I understand it, an index rebuild can happen in two ways: either through an update/insert/delete of data, or through a change in the index definition. If a rebuild is triggered by a definition change while data is also being modified or added at the same time, will FTS be consistent once the full index build completes?
Yes. Whenever the user edits the index definition, it results in an index rebuild, except for a few cases in recent releases (>6.5):
- Increasing or decreasing the replica partition count won't result in an index rebuild.
- Changing any scorch storage property won't result in an index rebuild.
All other index definition changes result in a rebuild. During this time, live traffic would be affected.
Ideally, this shouldn't be a major concern for production systems, since most of the fields to index/search would have been finalized during the development period itself.
To accommodate any production-time index maintenance tasks, we have the index alias feature, which users can use to manage index rebuilds/recreations without affecting live traffic.
ref - Creating Indexes | Couchbase Docs
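As a rough sketch of how an alias can shield live traffic: the application always queries the alias, a second index is built with the new definition, and once it has caught up the alias is re-pointed to it. The definition below is only an illustration of what an alias pointing at the new index might look like; the names `products_alias` and `products_v2` are hypothetical, and the exact fields should be checked against the alias definition your cluster's UI or REST API shows.

```json
{
  "type": "fulltext-alias",
  "name": "products_alias",
  "sourceType": "nil",
  "params": {
    "targets": {
      "products_v2": {}
    }
  }
}
```

With this arrangement the old index (say, a hypothetical `products_v1`) keeps serving queries through the alias until the targets are switched, after which it can be dropped.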
Indexes never get rebuilt by DML/CRUD operations on the documents; an index is rebuilt only when its definition changes.
An FTS index always streams the latest changes into its index and should give the latest results.
Users can use the consistency level `at_plus` and provide vectors to verify this.
Whenever we need to read your own writes (RYOW), we are supposed to pass a consistency vector conveying that to the FTS back end.
Today FTS supports only `at_plus` consistency level.
In short, it looks like one has to use consistentWith(MutationState) in the API calls while searching, where the MutationState is derived from the previous write operations that one wants to read.
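MutationResult mutationResult = collection.upsert("key", JsonObject.create());
MutationState mutationState = MutationState.from(mutationResult.mutationToken().get());
// "index" and "query" are placeholders for your own index name and query string.
SearchResult searchResult = cluster.searchQuery(
    "index", SearchQuery.queryString("query"),
    SearchOptions.searchOptions().consistentWith(mutationState));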
Thanks @sreeks for the explanation.
What are the scorch storage properties?
And from the explanation, do I understand correctly that we need to take downtime during the index rebuild to avoid inconsistent results?
If you are not ready to use, or don't want to use, the index alias feature (the recommended practice) to guard against runtime maintenance of your production index, then such rebuilds can affect your consistency.
In all circumstances, clients with strict consistency requirements can decorate their queries with consistency levels and vectors to protect against such consistency worries.
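For clients that talk to the search REST endpoint directly rather than through an SDK, the consistency level and vector travel in the `ctl` section of the query body. The fragment below is a sketch only: the index name `my_index`, the query string, and the vector entry (written as `vbucketId/vbucketUUID` mapped to a sequence number) are placeholder values, which would in practice come from the mutation tokens of the writes you want to read.

```json
{
  "query": { "query": "some text" },
  "ctl": {
    "consistency": {
      "level": "at_plus",
      "vectors": {
        "my_index": {
          "607/205096593892159": 2
        }
      }
    },
    "timeout": 10000
  }
}
```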
What are scorch storage properties?
Those are the configuration knobs for changing the storage-level properties of an index, e.g. compaction aggressiveness.
We don't expose them to users unless they run into issues with the default configuration values.
We want the FTS users not to worry about those intricacies.