I have a small Couchbase bucket consisting of 10k documents; each document has around 5 fields, and each field is a string of length less than 20. I’m running Couchbase 4.1 with the latest Java SDK, all on the same machine (i5, 12 GB RAM, SSD). I repeatedly run the following query (adhoc = false, and phone is indexed):
select phone, name from users where phone=$1
From the Couchbase console, I only get around 200 N1QL queries per second. Is that normal?
@geralds or @keshav_m can confirm, but it might be that parameterized queries don’t use index scans. I’m not sure if that applies to 4.1, though. Can you try it without the parameter and see whether latency improves?
Parameterized queries do use indexes. Run EXPLAIN and verify.
If the field phone is indexed, you should be issuing a query like this:
SELECT phone FROM users WHERE phone=$1;
If you’re running Couchbase 4.1, this query would be running just using the index (you’ll see “cover” in the explain plan), which makes the query even faster.
The combination of a prepared statement and a covering index should make the query execute in about a millisecond on the server.
You can simply test it on CBQ with:
CREATE INDEX idxphone ON users(phone) USING GSI;
PREPARE p1 FROM SELECT phone FROM users WHERE phone = "650-925-4823";
EXECUTE p1;
Please check what you’re getting both directly via cbq and via the Java SDK. Thank you.
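On the SDK side, the same prepared, parameterized query would look roughly like this (a sketch against the Java SDK 2.x API; bucket is assumed to be an already-opened Bucket and the phone number is just an example):

```java
import com.couchbase.client.java.document.json.JsonArray;
import com.couchbase.client.java.query.N1qlParams;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryResult;

// adhoc(false) tells the SDK to PREPARE the statement once and
// EXECUTE the cached plan on subsequent calls.
N1qlQuery query = N1qlQuery.parameterized(
        "SELECT phone FROM users WHERE phone = $1",
        JsonArray.from("650-925-4823"),
        N1qlParams.build().adhoc(false));
N1qlQueryResult result = bucket.query(query);
```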
The index is indeed used. If I don’t use an index, it’s like 10 queries per second (which makes sense, since each query requires scanning over 10k documents). But with the index, it should be much faster than the 200 queries per second I’m getting. For comparison, when I use get (using the primary id), I get a couple of thousand gets per second. A secondary index lookup should be about as fast, as far as expectations go, right?
Running a one off query is pretty fast: less than 2ms per query. So as far as latency is concerned, it’s as fast as gets. However throughput is much worse somehow.
For some reason, adhoc=true has around 1.5-2 times the throughput of adhoc=false. I also tried running two JVMs at the same time and got the server to execute around 600 queries per second. So this might suggest that the SDK could be improved in terms of parallelism?
In any case, 600 queries per second is still very low compared to what I get with get, which can go up to 30k per second!
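A back-of-the-envelope check (my own sketch, not a measurement): if requests on a single connection are fully serialized, Little’s law caps throughput at concurrency divided by latency, which is in the right ballpark for the numbers above:

```java
public class ThroughputSketch {
    // Little's law: max throughput (ops/sec) = concurrency / latency (sec)
    static double maxThroughput(int connections, double latencyMs) {
        return connections * 1000.0 / latencyMs;
    }

    public static void main(String[] args) {
        // One serialized connection at ~2 ms per round trip: ~500 qps ceiling,
        // so 200-600 qps observed looks latency-bound rather than server-bound.
        System.out.println(maxThroughput(1, 2.0));
        // Two JVMs (roughly two connections) raise the ceiling accordingly.
        System.out.println(maxThroughput(2, 2.0));
    }
}
```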
@a.reisberg by default the Java SDK limits the open connections to the server to 1, but you can tune that easily. On the environment, bump the number of queryEndpoints up to 4 and re-test, that will probably help with your throughput.
We are looking into more intelligent pooling solutions on the client side, but you can easily increase the number of connections in the meantime. Can you try it out and report if it helps?
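For reference, queryEndpoints is an environment-level setting (a sketch against the Java SDK 2.x API; the cluster address and bucket name are placeholders):

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.env.CouchbaseEnvironment;
import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;

// Open 4 N1QL connections per node instead of the default 1.
CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
        .queryEndpoints(4)
        .build();
Cluster cluster = CouchbaseCluster.create(env, "127.0.0.1");
Bucket bucket = cluster.openBucket("users");
```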
@daschl Thanks for the suggestion! Changing the number of queryEndpoints indeed helps. It gets me as high as 600 queries per second on average. I’ve experimented with different numbers of queryEndpoints, up to 500, but the max is still 600 queries per second. Somehow, this is still very low compared to K/V operations, which easily give me 25k gets per second. In your opinion, what creates this large difference?
One difference for sure is that a bucket.get() uses our tight, binary wire protocol. This is the binary protocol that came out of the memcached project, which a number of us at Couchbase had worked with and helped develop before we founded Couchbase. The cluster implementation is also very tight, requiring only a quick in-memory lookup after parsing the header and the key.
N1QL is exposed over a REST interface, and as you have learned, N1QL is very expressive. In order to give you that expressiveness, more work is done by the query engine. There’s a good overview in @geraldss’s presentation from Couchbase Connect (see slides 25-27 in particular).
With our Global Secondary Indexes (and 4.1’s covering indexes), you can quickly express complex application logic you may need and N1QL’s planner/optimizer do the hard work. In other situations where you’re doing a simple fetch of a document, the bucket.get() option requires less work and system resources, giving you lower latencies/higher throughputs.
@ingenthr Thanks for your detailed answer! Something still puzzles me though. Back to the example that I had at the beginning:
select phone, name from users where phone=$1
It looks like (from slide 26 of @geraldss’s presentation) that if the statement is already prepared (adhoc=false), then only steps 4, 5, 6, and 7 need to be done. Steps 5 and 7 are the same as a get, and step 6 is already handled by the covering index. So only step 4 is left, and in this case step 4 is just another key lookup, right? Since the predicate is an equality, it’s like a get. So, very naively, it looks like the performance should be around 3-4 times slower than a bare get. Do you think the performance will improve in the near future? That is, would it be reasonable to hope for 10k queries per second in the near future (on a machine that can easily do 25k gets)?
Why is there no performance gain when I choose adhoc=false (sometimes it’s even slower)?
What are the alternative ways with Cb to achieve the same functionality as the query above, but with higher performance?
When adhoc=false, the N1QL engine simply makes a copy of the plan, binds the parameters, instantiates the plan, and executes it. The plan for this query is quite simple: scan using the index, then fetch the corresponding document, then project it. The rough key-value-SDK equivalent of simply GETting the document would be:
SELECT phone, name FROM users USE KEYS ["user:427823:482"];
Here "user:427823:482" is the document ID. This plan accesses the KV store directly without going through the index. While index access itself is quick, query plan execution takes resources (memory and CPU to set up), etc.
To improve the speed, you could create a composite index so your access path is fully covered by the index:
CREATE INDEX myidx on users(phone, name);
select phone, name from users where phone=$1
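To confirm the query is actually covered after creating that index, you can run EXPLAIN and look for a "covers" section in the IndexScan step (the literal phone value here is just an example; the exact plan JSON layout varies by version):

```sql
EXPLAIN SELECT phone, name FROM users WHERE phone = "650-925-4823";
-- The IndexScan operator in the plan should list "covers": [...] entries;
-- if the scan is followed by a Fetch operator, the index is not covering.
```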
For queries like this, we’d always expect the PREPARE-EXECUTE path to be at least as fast as adhoc execution. That’s what we see in our lab measurements.
Michael pointed out some configuration options in the SDK. We’ll look into end-to-end performance starting from the Java SDK and post an update.
Fully covering the query with an index will improve your query performance and throughput. If you need to scale, you can also create multiple indexes with the same definition (but different names) on different nodes to scale out.
@keshav_m Thanks for the very detailed answer. So essentially, I underestimated how much work is needed to execute the plan. I still have two related questions below; it would be great if you could answer them as well.
Is USE KEYS for secondary indexes planned? From your reply, it seems that such a thing might speed up certain queries.
Also to confirm: to scale out, the names of the indices on different machines have to be different? If they have the same name, load balancing won’t kick in?
USE KEYS does not apply to indexes, because it bypasses indexes altogether and does a key-value GET. Were you thinking about JOINs instead?
Indexes on a given keyspace must have distinct names. This is true whether they are on the same machine or different machines. However, indexes on separate keyspaces can have overlapping names, because the names are qualified by the keyspace. As for load balancing, the GSI indexer currently provides index failover: if two indexes have the same definition, the indexer will fail over if one index becomes unavailable. To actually obtain load balancing, there are two options: (1) use index hints, via the USE INDEX syntax; (2) partition your index by adding WHERE clauses to the CREATE INDEX statements. You can put different data ranges in each index, and your queries will then be routed to the appropriate index.
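Option (2), partitioning by WHERE clause, might look like this (the index names and data ranges here are made up purely for illustration):

```sql
-- Each index holds one data range; a query whose predicate falls inside
-- a range is routed to the matching index.
CREATE INDEX idxphone_low  ON users(phone) WHERE phone <  "555-000-0000";
CREATE INDEX idxphone_high ON users(phone) WHERE phone >= "555-000-0000";
```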
Is there any resolution for this one?
I’ve found the same issue: I have 2 query nodes and get around 200 ops in total,
but I need more than 1k ops.
I’m using Java SDK 2.7.9 and Couchbase 6.5.1.