Functional index component and write performance

naftali · February 18, 2020, 10:53pm

We have an index that is updated and read throughout our application, and we are considering adding a functional component as its final key. The current final key, the one that would become penultimate should we add this one, is likely to be unique for each document, so the added functional component shouldn’t affect the actual sort order of the records.

But since this is one of our most write-heavy indexes, we are concerned about a potential degradation in index write performance. We really don’t know how to reason about what this will cost us in performance.

Here are the two possibilities I can see:

The only thing the functional component will add is cost ‘C’ of running the function statement on the data field. If this is the case, the added cost per record is simply C. It’s also possible that cost C is significantly higher in the context of index writing than it is when, say, mutating a projection, and if so, we really need to know.
The very existence of a functional component on the index necessitates a different handling flow/path/parser… This is our largest concern and in this case we’d certainly be in the dark concerning its performance impact.

I’ll note that this index already has an array component, if that makes a difference.

Please help, as this is super important to us.

prathibha · February 19, 2020, 5:17am

@naftali

What is the expected size of the new key being added to this index? If the size is small enough, then the real cost is C, which is the additional evaluation that needs to be done as part of projecting the full index entry.
I am not sure I follow the concern here. How is this new key going to be scanned? Will it be part of the scan predicates, or is it just going to be added as a covering field?
Also, which version of the server are you using? I would suggest actually adding this new component in a test bed and checking whether you see any impact on performance.

naftali · February 19, 2020, 3:57pm

@prathibha

Thank you for the response:

The actual key will be small, because it’s the output of a function. But the input to the function is potentially large, namely a string of arbitrary length. I’m hoping to find a function that doesn’t need to traverse the entire string, something like slicing out the first character. The point is to identify whether the field exists on the document and whether it contains a non-blank string. We are trying to retrieve keys of eligible documents without fetching anything extraneous from the data nodes.
It will be used as a scan predicate. We do not need to include it in the projection.
We are using 6.0, and upgrading to 6.5 is on our midterm roadmap.

vsr1 · February 19, 2020, 5:20pm

You can also try a partial index on IFMISSINGORNULL(f1,"") = "", which indexes a document only when f1 IS MISSING, NULL, or "".

CREATE INDEX ix1 ON default(c1,c2) WHERE  IFMISSINGORNULL(f1,"") = "";
SELECT META(d).id 
FROM default AS d
WHERE c1 ..... AND IFMISSINGORNULL(f1,"") = "";

naftali · February 19, 2020, 5:29pm

@vsr1

Thank you. I would need to do IFMISSINGORNULL(f1, "") != "" (looking for when f1 is a substantial string), but it would work the same way, correct?

vsr1 · February 19, 2020, 5:39pm

Index WHERE clause evaluated by projector when true only indexes. How big the string is has no impact on the index.
If you are looking for a non-empty string, you can use f1 > "" (this filters out MISSING, NULL, and "", assuming f1 is a string). If you use != the query may not be covered unless f1 is explicitly present in the index, so verify it.
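
A minimal sketch of that partial-index variant for non-empty strings, assuming f1 always holds a string when it is present. The index name ix2 and the value c1 = 1 are placeholders for illustration, not part of the real schema. Running EXPLAIN on the query is one way to check whether the planner picks the index and whether the plan is covered (no Fetch operator in the output).

```sql
/* Partial index: only documents where f1 is a non-empty string qualify. */
CREATE INDEX ix2 ON default(c1, c2) WHERE f1 > "";

/* The query repeats the index condition so the planner can match ix2. */
EXPLAIN SELECT META(d).id
FROM default AS d
WHERE d.c1 = 1 AND d.f1 > "";
```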

In addition, you always have the option of a functional key that yields a single character as the index key.
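
A hedged sketch of what such a functional key could look like, using SUBSTR (0-based in N1QL) to keep only the first character of f1 as the final index key. The index name ix3 and the value c1 = 1 are hypothetical, and how SUBSTR treats an empty string should be checked on your server version. Whether it yields "" or NULL there, the comparison > "" is not true, so blank strings are still excluded by the predicate.

```sql
/* Functional key: the final index key is the first character of f1. */
CREATE INDEX ix3 ON default(c1, c2, SUBSTR(f1, 0, 1));

/* The predicate uses the same expression as the index key so it can be matched. */
EXPLAIN SELECT META(d).id
FROM default AS d
WHERE d.c1 = 1 AND SUBSTR(d.f1, 0, 1) > "";
```

With this variant the function is evaluated for each mutated document, which is the cost C discussed at the top of the thread, so it is worth measuring in a test bed next to the partial-index approach.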

naftali · February 19, 2020, 5:45pm

thank you so much.
this information is solid gold.

Can you clarify this “Index WHERE clause evaluated by projector when true only indexes.” I can’t understand the sentence.

vsr1 · February 19, 2020, 5:49pm

CREATE INDEX ix1 ON default(c1,c2) WHERE IFMISSINGORNULL(f1,"") = "";
"k01" {"c1":1, "c2":10, "f1": "----------10k"}
"k02" {"c1":1, "c2":10, "f1": ""}
In the above example, the index WHERE clause is evaluated by the projector (a separate process used by the indexer). Only when the clause evaluates to true is the document passed to the indexer for indexing. In this case "k01" evaluates to false, so the projector never sends it to the indexer; it sends only "k02".
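
To try this on a test bucket, the two sample documents can be inserted and then queried with the same condition as the index. This is only a sketch: it assumes ix1 has been created as shown, and the f1 value of k01 is a short stand-in for a 10k-character string.

```sql
/* Insert the two sample documents from the example. */
INSERT INTO default (KEY, VALUE)
VALUES ("k01", {"c1": 1, "c2": 10, "f1": "----------10k"}),
VALUES ("k02", {"c1": 1, "c2": 10, "f1": ""});

/* Of the two documents, only k02 satisfies the predicate. */
SELECT META(d).id
FROM default AS d
WHERE d.c1 = 1 AND IFMISSINGORNULL(d.f1, "") = "";
```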

Projector and Router Using Indexes | Couchbase Docs

If needed, check out the chapter "Designing Index For Query In Couchbase N1QL" in https://blog.couchbase.com/wp-content/uploads/2017/10/N1QL-A-Practical-Guide-2nd-Edition.pdf
