Use GSI to speed up query, no effect

This is a typically query in my application. I tested it from the browser workbench.

SELECT * FROM `feeder` WHERE META(self).id like 'GO_SF_USER::%' order by LASTNAME offset 100 limit 25

It took 3 seconds with only the primary index to get my 25 results out of 110.000.
I created an index

CREATE INDEX sf_user ON `feeder` (LASTNAME) WHERE META(self).id like 'GO_SF_USER::%' USING GSI;

Then I tested my first query again and it still took 3 seconds. I used EXPLAIN and saw, that it still only uses the primary index.
What is my fault?

I played around a little. For me it looks like using like causes the performance problems.
Additional Info: I user 4.6 DB…

After changing my data structure a bit and uploading the 110.000 documents with new fields, it works much better.

CREATE INDEX sf_user ON feeder (_eType,_oType,LASTNAME,FIRSTNAME) WHERE _eType = "GenericObject" and _oType = "SF_User" USING GSI;
SELECT * FROM feeder WHERE _eType = 'GenericObject' and _oType = 'SF_User' order by LASTNAME, FIRSTNAME offset 100 limit 25

Now it took just 23ms. This seems a reasonable value to me.
As soon as I add another sorting, it gets slower then without any index. I expected it to be slightly worse but much better than without index.

SELECT * FROM feeder WHERE _eType = 'GenericObject' and _oType = 'SF_User' order by LASTNAME,FIRSTNAME,USERNAME offset 100 limit 25

Even if I change the sort order, it falls down to the no index speed (3 seconds) or worse.

SELECT * FROM feeder WHERE _eType = 'GenericObject' and _oType = 'SF_User' order by LASTNAME DESC offset 100 limit 25

How can I increase my N1QL speed for dynamic querying?
What indexes do I have to create, to query by eType and oType and sort by LASTNAME, FIRSTNAME and USERNAME in different sort orders?

I would suggest creating an index on _eType and _oType only. This is the simplest solution.

create index feeder_eTypeoType on feeder(_eType, _oType)

With that index in place, the execution pipeline should be an index scan (feeder_eTypeoType), followed by a fetch, followed by a sort. That means the index helps with the selection, but not with the sorting.

Does that deliver acceptable performance?

Sorting is expensive.

How about the following three indexes:

LASTNAME, _oType, _eType, FIRSTNAME, USERNAME
FIRSTNAME, _oType, _eType, LASTNAME, USERNAME
USERNAME, _oType, _eType, LASTNAME, FIRSTNAME

In all your queries, add

WHERE
LASTNAME IS NOT NULL
AND FIRSTNAME IS NOT NULL
AND USERNAME IS NOT NULL

in addition to your conditions on _oType and _eType.

Instead of SELECT *, SELECT those five fields above, or SELECT META().id.

It might be simple, but the result is not acceptable.
The users of my product should not have to wait 3 seconds to get the next 25 records, when the click next page at the browser.

Good Idea, thank you.
Also the proposal to create more indexes works well. I already did that while testing the behaviour of Couchbase.

Never the less, It surprised me, that Couchbase falls back to no index speed in the following cases:

  • Index with LASTNAME,FIRSTNAME > N1QL with order by LASTNAME,FIRSTNAME,USERNAME: the correct index is used, but the query has the same speed as if there is no index at all. The pre-sorting by LASTNAME and FIRSTNAME has no positive effect
  • Index with LASTNAME > N1QL with order by LASTNAME desc: the correct index is used, but the query is as slow as without an index. I though just reversing the order of an existing index should be as fast as using the current index order

Conclusion:

  • If one build the correct indexes, Couchbase reaches the expected speed
  • Without MOI Couchbase is to slow (only with memory optimized indexes I reached the response time in milliseconds, without MOI the response time is over 1 second (on SSD), that is unacceptable slow for navigation or type ahead)
  • MOI is reserved for Enterprise Edition as fare as I read. Question: Will it be available in the Community Edition?
  • Question: How can I create an index with inverse order? CREATE INDEX sf_user ON feeder (_eType,_oType,LASTNAME,FIRSTNAME) I want to inverse the order of LASTNAME. I don’t see how to do it here. I tried order by during creation, but the workbench complains about an error.

We’re looking into to ASC and DESC support.

Now:
For numerical fields, you can exploit the index ordering by negating the field.
This does affect how you write your predicates…

Create index idx1 on t(-id);
select -(-id)
from t
where (-id) < (-10)
order by (-id)
limit 1;

Did you measure MOI vs. standard GSI? The performance difference should be mostly in writes, not reads. Make sure the standard GSI has enough memory.

For pre-sorting, put all the ORDER BY keys in your index, at the beginning of the index.

I mean the difference between

CREATE INDEX sf_user ON feeder (_eType,_oType,LASTNAME,FIRSTNAME) WHERE _eType = "GenericObject" and _oType = "SF_User"

and

CREATE INDEX sf_user ON feeder (_eType,_oType,LASTNAME,FIRSTNAME) WHERE _eType = "GenericObject" and _oType = "SF_User" USING GSI;

Isn’t that what’s different between MOI and standard GSI?
My Couchbase Server has 4GB RAM and tells me 3.79 is unused.

Those two are the same. MOI vs. standard GSI is configured once at install / setup time.

Really? Why then do I have different response times?
When I tried it yesterday I got different results.But you are right, I retested it and now it does not matter.
If it does not matter, what is the purpose of USING GSI vs not using it?

Thank you for this information.
This really disqualifies Couchbase from being our new database back end.
Our users want to access the data in different customizable ways.
In addition the lack of full text search capabilities is a big big problem. Elastisearch is not supported any more and the new suggested tool for it, is beta.
It was an interesting proof of concept but it seems it does not fit our needs.

USING GSI is the default.

Search is available in N1QL 4.6.0 GA (see TOKENS function), and also in Developer Preview of Full Text Search.

ORDER BY DESC is underway. When is your GA or go-live date?