FTS and Bucket strategy

We are using a bucket with 21 different types of documents.
For one document type (type='Patient') we have defined an FTS index for patient search.
We observe that when we add new documents that are not of type='Patient', the FTS indexer starts running and consumes a lot of CPU.
What is the best strategy when creating FTS and bucket assignment?
Creating a new bucket for every new FTS index?

Thanks.
Gunter

@gunter, interesting question. This is the first time I have heard concerns about CPU consumption caused by mutations of document types that are not indexed. This may be an area where we can look further at how to minimise the impact of such mutation noise in the bucket. There are already some long-term plans for handling this.

What is the best strategy when creating FTS and bucket assignment?
Creating a new bucket for every new FTS index?

Are you on an EE version of Couchbase? Do you have MDS enabled, i.e. a dedicated FTS node?
If the CPU consumption is really of a worrying magnitude, you could segregate all the FTS-indexable data into a single bucket and then create a single FTS index with multiple type mappings.

One bucket per FTS index may not scale well as the number of buckets and indexes grows.
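
To make the "single index with multiple type mappings" idea concrete, here is a rough sketch that only builds the index definition as JSON; it does not talk to a cluster. The field names follow my recollection of the definition JSON that the FTS UI shows, so please compare with what your server version exports before relying on it. The index name "search_idx", the bucket name "searchable" and the second type "Practitioner" are made up for illustration.

```python
import json


def build_multi_type_index(name, bucket, type_field, doc_types):
    """Build one FTS index definition that maps several document types."""
    return {
        "name": name,
        "type": "fulltext-index",
        "sourceType": "couchbase",
        "sourceName": bucket,  # assumption: the bucket holding only searchable docs
        "params": {
            # Pick the type mapping from the value of a JSON field in each doc.
            "doc_config": {"mode": "type_field", "type_field": type_field},
            "mapping": {
                # Disable the default mapping so unmapped types are not indexed.
                "default_mapping": {"enabled": False},
                # One enabled mapping per document type that should be searchable.
                "types": {t: {"enabled": True, "dynamic": True} for t in doc_types},
            },
            "store": {"indexType": "scorch"},
        },
    }


# Hypothetical names, for illustration only.
index_def = build_multi_type_index(
    "search_idx", "searchable", "type", ["Patient", "Practitioner"]
)
print(json.dumps(index_def, indent=2))
```

With "dynamic" set to True every field of a mapped type gets indexed; in practice you would probably list only the fields you search on.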

Cheers!

Our project is not yet on production.
We are on the EE version, testing 6.5 Beta2 in a Docker container with only one node; we are not yet using MDS or a dedicated FTS node.
Since switching from 6.0.3 to 6.5 Beta2, we are seeing increased CPU usage when FTS synchronization runs.

Thanks.
Gunter

Note that regardless of what type of documents you are indexing within the FTS index, all documents added to the data node will be shipped to the FTS indexer. The "doc count" in the FTS UI will reflect that. Only documents of the relevant type or types will be indexed; the rest will be dropped. (This is definitely an area we want to improve, so that we can stream documents based on a filter, perhaps in the next release.)

So that said, it is more or less expected that the indexer starts consuming CPU when documents are added to the data node.
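
As a toy model of that behaviour (this is not Couchbase code, just plain Python with made-up documents), every mutation is handed to the indexer and the type check only happens afterwards, so the work of receiving and inspecting a document is paid even for types that are never indexed:

```python
def simulate_fts_feed(mutations, indexed_types, type_field="type"):
    """Count how many mutations the indexer receives versus actually indexes."""
    received = 0
    indexed = 0
    for doc in mutations:
        received += 1  # every mutation in the bucket reaches the indexer
        if doc.get(type_field) in indexed_types:
            indexed += 1  # only documents of a mapped type are kept
    return received, indexed


# Hypothetical mutations: two of type 'Patient', three of other types.
mutations = [
    {"type": "Patient"},
    {"type": "Invoice"},
    {"type": "Invoice"},
    {"type": "Patient"},
    {"type": "Visit"},
]
received, indexed = simulate_fts_feed(mutations, {"Patient"})
print(f"received={received} indexed={indexed} dropped={received - indexed}")
# prints: received=5 indexed=2 dropped=3
```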

@gunter, a couple of questions for you:

  • Now when you mention …

    Switching from 6.0.3 to 6.5Beta2 increase of CPU usage when synchronization of FTS runs is something we are facing.

    Do you mean that CPU utilization is higher with the later release under exactly the same document-intake scenario?

  • What indexer are you using - scorch or upside_down?

Yes, using the same QA tests.
We are using the scorch indexer for FTS.
We also found that some GSI indexes were not used after switching to 6.5 Beta2, which can add to the CPU increase.
Is there any documentation on your website about index migration considerations when upgrading from 6.0.3 to 6.5?
Thanks
Gunter

@gunter,

In your original question, were you sure that the high CPU utilisation came specifically from the FTS indexer?
IIRC, for FTS scorch indexes there aren't any specific upgrade considerations documented for going from 6.0.3 to 6.5.

Another thing to note is that usage observations from a Docker container on a single machine are mostly not very relevant for a production cluster:
-there may be other processes on your machine, besides the Couchbase Server container, competing for resources.
-all services running within a single container also compete with each other for resources.

I'm also not sure whether you have already applied any CPU settings to the container, for example https://docs.docker.com/engine/reference/run/#cpuset-constraint
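
For example, a small sketch that assembles a `docker run` command line with CPU limits. `--cpus` and `--cpuset-cpus` are the standard `docker run` options from the page linked above; the container name "cb-test", the limit of 2 CPUs pinned to cores 0 and 1, and the untagged "couchbase" image are only placeholders, so substitute whatever you actually run. The script just prints the command and does not execute it.

```python
import shlex


def docker_run_args(image, name, cpus, cpuset):
    """Assemble a docker run command line that caps the container's CPU usage."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "--cpus", str(cpus),      # upper bound on how much CPU the container may use
        "--cpuset-cpus", cpuset,  # restrict the container to these host cores
        image,
    ]


# Placeholder values, for illustration only.
args = docker_run_args("couchbase", "cb-test", 2, "0,1")
print(shlex.join(args))
```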

Even if you adopt a Docker container based deployment strategy, my understanding is that:
-Running all service containers on a single node is strongly discouraged for performance-sensitive production systems. Placing all containers on a single physical machine means they all compete for the same resources. It also eliminates the built-in protection that replication provides against Couchbase Server node failures: when that single physical machine fails, all containers become unavailable at the same time and all replicas are lost.
-Running each Couchbase Server container on its own server machine is recommended for production deployments.

Cheers!

Ok thanks.
We will apply the CPU recommendations to our Docker containers.
For the moment we are using the Docker infrastructure only for development and testing. We will follow the recommendations when setting up the production environment.

Have a nice day.
Gunter