Stream phase of the N1QL query is taking time without and with covering index

mreiche · July 25, 2025, 7:48pm

What does HandleDocument do? Could calling it 193,215 times take a while? Have you tried removing the HandleDocument call?

btw - your cbq output shows 18577 rows. The metrics (that vsr1 posted) shows 193215 rows. It’s not surprising that the cbq execution is 10x faster.

we verified, in both case the same index is used.

Just to be really sure - try the EXACT same query. You’d be surprise the number of times that kinda-sorta the same was not exactly the same.

pccb · July 28, 2025, 3:24pm

thanks

What does HandleDocument do? Could calling it 193,215 times take a while? Have you tried removing the HandleDocument call?

Dev is checking on this

btw - your cbq output shows 18577 rows. The metrics (that vsr1 posted) shows 193215 rows. It’s not surprising that the cbq execution is 10x faster.

It could be some misreading. Currently the data is 221k docs and we have verified from multiple angles and the count of docs returned is same for both - cbq initiated execution as well as the app initiated one.

Just to be really sure - try the EXACT same query. You’d be surprise the number of times that kinda-sorta the same was not exactly the same.

didn’t quite get this, can you pls elaborate?

mreiche · July 28, 2025, 4:12pm

@pccb by any chance is it a query that you provide parameters for through the SDK, but hardcode those parameters into the statement in cbq?

thanks @graham.pople , we verified, in both case the same index is used.

don’t justy verify that the same index is used. Run the EXACT same query.

I don’t know where I got the 18577 from.

pccb · July 29, 2025, 8:45am

@mreiche , @graham.pople, An additional observation, while Dev is looking in the HandleDocument suggestion

We compared the completed requests and noticed the following differences:

Executed From	cpuTime	elapsedTime	phaseTimes.run	phaseTimes.stream	resultSize	serviceTime
cbq	16s	13s	13s	9s	290,143,635	13s
app	38s	35s	35s	32s	200,370,930	35s

I believe the key difference to focus on is the phaseTimes.stream, which seems to be significantly higher when executed from the app. This, in turn, appears to be driving the higher values seen in cpuTime , elapsedTime , phaseTimes.run and serviceTime .

mreiche · July 29, 2025, 3:09pm

The query results are not the same size. definitely not the same query, so no point in trying to guess why they execute differently- beyond them being different queries.

The time taken for streaming should not affect cpu time. The difference in cpu time points to the queries being different.

vsr1 · July 30, 2025, 12:39am

cbq by default pretty=true (this add ups space size by approximately 1/3) vs all other(s)

It depends. stream writes buffered write. If client doesn’t consume network sockets can backedup which in turn buffered write that can all they way backup every operator, which can do context switching etc…

Why don’t test with standalone SDK just run the query vs app and see.

pccb · July 30, 2025, 2:44pm

cbq by default pretty=true (this add ups space size by approximately 1/3) vs all other(s)

Thanks, @vsr1 The pretty parameter was indeed the reason for the difference in result size. re-executed the query with pretty=false, and now the sizes match. Appreciate the clarification!

srinu471 · July 30, 2025, 2:47pm

The query results are not the same size. definitely not the same query, so no point in trying to guess why they execute differently- beyond them being different queries.

@mreiche, it’s the same query. Even though we verified that the query triggered from CBQ and the application in the completed requests is identical.

mreiche · July 30, 2025, 4:15pm

@mreiche, it’s the same query.

Please show the query used in cbq and the code that calls it in the SDK. Thanks. I cannot count the number of times I’ve been told “It’s the same query” when there are differences such as literals vs parameters being used, or the projection was ‘*’ vs all the properties etc.

What was the result of commenting out the call to HandleDocument() ?

which can do context switching etc…

It’s difficult to imagine context switching adding 22 seconds to 16 seconds of cpu. My point was that slower streaming will not change the query plan, the number of fetches, the number of rows returned, etc, etc, etc. 16 seconds vs 38 seconds looks very much like a different query plan. Comparing the execution from completed_requests would be useful.

pccb · August 4, 2025, 2:06pm

What was the result of commenting out the call to HandleDocument() ?

The callback couldn’t be removed, as it appears to be mandatory. However, it was modified to always return TRUE, and with that change, the execution time matched what we were seeing in cbq. Thanks a ton

Any thoughts on what might have been going wrong in the original callback?

mreiche · August 4, 2025, 3:54pm

Any thoughts on what might have been going wrong in the original callback?

Probably nothing. The only issue was that there seems to have been an expectation that executing it would take zero time.

Topic		Replies	Views
N1QL query Performance is getting low irrespective of iterations SQL++ n1ql	11	2808	January 29, 2016
Couchbase 4.1 N1QL - perfomance issue SQL++	14	4587	August 31, 2016
N1QL Performances - Are 2,5K requests by second at 2ms each an expected average? SQL++ query , n1ql , sdk , dot-net	23	3869	May 25, 2017
N1Ql with cover index performance very slow SQL++ query , n1ql	41	4430	October 21, 2016
N1QL SELECT * VS SELECT <lots of fieldnames> performance SQL++ query , n1ql	19	5079	January 8, 2017

Stream phase of the N1QL query is taking time without and with covering index

Related topics