Metadata overhead memory best practices

We have a 3 node cluster - Community Edition 6.5.0 build 4966.
We have a bucket with 40 million docs whose user data ranges between 500 and 900 bytes.
Ejection method is Value-only.
Replicas: 1
Bucket RAM quota: 15GB and Cluster RAM quota: 17GB.
Metadata in RAM is 4GB.
User data in RAM is 6GB.
We started facing this warning - [20 Jun, 2021, 9:36:29 AM] - Metadata overhead warning. Over 55% of RAM allocated to bucket “my-bucket” on node “couchbase1” is taken up by keys and metadata.

It's worth noting that we do many sub-document counter increments - for example, we have a field called inboundCounter that we increment pretty often when a key is popular.

A few questions:

  1. What's the best practice for this kind of bucket usage? If our ejection is value-only, how come keys+metadata have grown so large? An example key is “00000b9d22ac762db5b8b24921409ef6:9c70933aff6b2a6d08c687a6cbb6b765”. What do we need to do in this situation - just increase RAM, or add more nodes?
  2. Can such sub-document counter increments increase the metadata size?

Current operation latency is pretty good (p90 around 60ms), and I'd rather not hurt it too much… what do you suggest?

Value-eviction requires that all documents’ key+metadata is kept resident. The warning is basically telling you “you’re spending a majority of your memory on just keys+metadata - even ones which are never accessed. You might get better performance by only spending that memory on items in your working set.”
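
For a rough sense of why the warning fires with keys of that shape, here's a back-of-envelope sketch in Python. The ~56 bytes of metadata per document is an assumed figure (the exact overhead varies by Couchbase version), so treat the result as an order-of-magnitude check rather than an exact number:

```python
# Back-of-envelope estimate of the per-node key+metadata footprint under
# value-only eviction. The ~56-byte per-document metadata overhead is an
# assumption (it varies by version); the key length matches the example key
# above, and items (active + replica) are assumed evenly spread across nodes.

NUM_DOCS = 40_000_000
KEY_BYTES = 65            # "00000b9d...:9c70933a..." style keys are ~65 bytes
META_BYTES = 56           # assumed in-memory metadata per document
REPLICAS = 1
NODES = 3
BUCKET_QUOTA_GB = 15

items = NUM_DOCS * (1 + REPLICAS)                      # active + replica copies
total_meta_gb = items * (KEY_BYTES + META_BYTES) / 1024**3
per_node_meta_gb = total_meta_gb / NODES
per_node_quota_gb = BUCKET_QUOTA_GB / NODES

print(f"keys+metadata per node: ~{per_node_meta_gb:.1f} GB")
print(f"share of per-node bucket quota: ~{per_node_meta_gb / per_node_quota_gb:.0%}")
```

With 40M docs, one replica, and ~65-byte keys, that works out to roughly 3 GB of keys+metadata per node - around 60% of the 5 GB per-node share of the bucket quota, which lines up with the warning you're seeing.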

I would suggest looking at your cache miss ratio - if that is low (close to 0%), then there's not really any advantage to using the key+metadata memory for other values, as you already have the vast majority of your working set cached - and value eviction also has the advantage that it never has to go to disk to check whether a key exists or not.
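
If it's useful, here's a rough sketch of pulling that figure from the cluster REST API rather than the UI. The per-bucket stats endpoint is standard, but the sample name (ep_cache_miss_rate) is an assumption and may differ between versions:

```python
# Sketch: read the bucket's recent stats samples over the REST API and print the
# cache miss rate. Stat key names are assumptions - verify against your version.
import requests

CLUSTER = "http://couchbase1:8091"        # any node in the cluster
BUCKET = "my-bucket"
AUTH = ("Administrator", "password")      # placeholder credentials

resp = requests.get(f"{CLUSTER}/pools/default/buckets/{BUCKET}/stats", auth=AUTH)
resp.raise_for_status()
samples = resp.json()["op"]["samples"]

# ep_cache_miss_rate is the figure the UI graphs as "cache miss ratio";
# ep_bg_fetched vs cmd_get gives a comparable picture if that key is missing.
miss_rate = samples.get("ep_cache_miss_rate", [])
if miss_rate:
    print(f"cache miss rate (latest sample): {miss_rate[-1]:.1f}%")
else:
    print("ep_cache_miss_rate not present - check the stat name for this version")
```

The same samples should also include the resident-item ratio stats if you want to cross-check how much of the value data is actually in memory.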

No. Metadata size is fixed per document; the number of updates it receives is irrelevant.

Thanks, we have a cache miss ratio of 6% currently.
So if I understand correctly, as this metadata “warning” percentage grows, it means that our keys+metadata are taking up more space in RAM while more of the user data values are ejected to disk, right?

So as soon as our cache miss ratio increases and our operation latencies grow, we will need to increase memory / add nodes to make more room for user data in memory, right?

It means the keys+metadata are taking up more room, but it doesn't necessarily mean user data values are getting ejected - if your dataset is 100% resident, for example, then nothing has been ejected; it's just that key+meta is taking up a significant amount of that RAM.
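
To make that distinction concrete, here's a rough sketch reusing the assumed figures from the earlier estimate: whether values actually get ejected depends on the memory left over for them versus the total value size, not on the warning percentage itself. The 85% high water mark is the default point at which ejection starts:

```python
# Sketch: estimate what fraction of document values can stay resident after the
# key+metadata footprint is accounted for. Figures reuse the assumptions above;
# the 85% high water mark is the default fraction of quota before ejection.

PER_NODE_QUOTA_GB = 15 / 3          # 15 GB bucket quota across 3 nodes
KEY_META_GB = 3.0                   # rough per-node keys+metadata estimate
HIGH_WATER_MARK = 0.85              # default ejection threshold
ITEMS_PER_NODE = 80_000_000 / 3     # active + replica items per node
AVG_VALUE_BYTES = 700               # middle of the 500-900 byte range

value_budget_gb = PER_NODE_QUOTA_GB * HIGH_WATER_MARK - KEY_META_GB
total_value_gb = ITEMS_PER_NODE * AVG_VALUE_BYTES / 1024**3
resident_fraction = min(1.0, max(0.0, value_budget_gb / total_value_gb))

print(f"memory left for values per node: ~{value_budget_gb:.2f} GB")
print(f"estimated fraction of values resident: ~{resident_fraction:.0%}")
```

In your case the values clearly don't all fit, yet the 6% miss ratio shows the working set largely does - which is why the warning on its own doesn't tell you whether anything you care about has been ejected.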

Yes - or you could look at switching to full eviction. Note that this requires the bucket to be restarted.
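
If you do go that route, the ejection policy can be changed per bucket; here's a hedged sketch over the REST API (the evictionPolicy=fullEviction parameter is taken from the bucket-settings API - double-check it against the 6.5 docs, and plan for the bucket restart and warmup it triggers):

```python
# Sketch: switch the bucket's ejection policy to full eviction via the REST API.
# The parameter name/value are believed correct but should be verified for 6.5;
# the change restarts the bucket, so expect a warmup period afterwards.
import requests

CLUSTER = "http://couchbase1:8091"
BUCKET = "my-bucket"
AUTH = ("Administrator", "password")      # placeholder credentials

resp = requests.post(
    f"{CLUSTER}/pools/default/buckets/{BUCKET}",
    auth=AUTH,
    data={"evictionPolicy": "fullEviction"},
)
resp.raise_for_status()
print("eviction policy change accepted - the bucket will restart and warm up")
```

The same change should also be possible through couchbase-cli bucket-edit or the bucket settings page in the UI.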