- We observed that the N1QL query takes over a minute to retrieve approximately 200,000 documents.
- To improve performance, we created a covering index, which significantly reduced the execution time.
- Since reducing the number of columns in the query is not an option, are there any optimization recommendations you suggest we try before creating a covering index, considering the associated overhead?
- The stream phase time remains high even with the covering index in place, although it is slightly reduced compared to the non-covering index scenario.
Can you show the query and the execution metrics? Can you show the client code that calls the query and consumes the stream?
Could it be the client that is slowing the streaming? Could it be the network? The server cannot produce the stream faster than the client consumes it.
Do you have a ticket open with support? If so, what is the ticket number?
The Community Edition query service is limited to 4 cores.
The best approach: create an index that covers the predicates, use a covering query to project only the document keys, and then get the documents from KV (asynchronously, in parallel). If you only need a few fields, use the sub-document (SUBDOC) API.
A high Stream phase time means the client might not be reading fast enough; also check the upstream operators' times for the bottleneck, or post the profile timings.
https://www.couchbase.com/blog/optimize-n1ql-performance-using-request-profiling/
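Per that blog post, profiling can be requested per statement. As an illustration, a request body along these lines (the profile parameter accepts off, phases, or timings) asks the query service to include operator-level timings in the response:

```json
{
  "statement": "SELECT RAW 1",
  "profile": "timings"
}
```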
Hi @mereiche,
Thank you for the help.
Query_CI_WithoutCI.zip (9.1 KB)
Can you show the query and the execution metrics? Can you show the client code that calls the query and consumes the stream?
Attached is a sheet with the query's metrics with a CI (covering index) and without a CI, where the stream takes time. Refer to column AE for the CI case and column AB for the non-CI case.
Could it be the client that is slowing the streaming? Could it be the network? The server cannot produce the stream faster than the client consumes it.
We need to understand why the stream phase takes so long, then optimize the stream time.
Do you have a ticket open with support? If so, what is the ticket number?
We raised a ticket with Couchbase support; ticket number #70847.
Hi @vsr1 ,
Thanks for getting back so quickly.
We are verifying whether this is currently enabled, as mentioned in the blog.
Stream: high execution time
The request is struggling to convert the result to JSON form and send it to the client. The pretty REST parameter may have been set to true in error.
Looked at the support case.
The client appears to be libcouchbase, and the SDKs set pretty to false by default, so that is not the issue.
Some quick observations (this may be a different request than the one you are looking at):
- resultSize values are in MBs.
- The Stream execTime is high. There are two possibilities: either the client is not reading fast enough (resultSize is very high), or an upstream operator, in this case Order, is the issue. The Order operator's execTime is low and its kernTime is high, so look further upstream: InitialProject's execTime is high. That tells us InitialProject is taking time, which is due to the length of the projection list and the high number of qualifying items.
"#operator": "InitialProject",
"#stats": {
    "#itemsIn": 207794,
    "#itemsOut": 207794,
    "#phaseSwitches": 831180,
    "execTime": "40.064461391s",
    "kernTime": "1.709503988s"
},

"#operator": "Order",
"#stats": {
    "#itemsIn": 207794,
    "#itemsOut": 193215,
    "#phaseSwitches": 608808,
    "execTime": "2.637306576s",
    "kernTime": "41.692093248s"
},

"#operator": "Stream",
"#stats": {
    "#itemsIn": 193215,
    "#itemsOut": 193215,
    "#phaseSwitches": 193218,
    "execTime": "32.312172445s",
    "kernTime": "2.364µs"
}
- Some requests timed out because they are not optimized. If you look at the IndexScan, its item count is very high, and after the Filter the count is very low; that causes a lot of wasted work.
- For some requests I have seen a very high FetchTime; early projection in the Fetch helps with this: https://jira.issues.couchbase.com/browse/MB-66614
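As a rough illustration of the heuristic used above (high kernTime means an operator is waiting on its upstream; high execTime means the operator itself is busy), the quoted timings can be checked mechanically. This is only a sketch: the profile dict is hand-copied from the stats quoted in this thread, and to_seconds is a helper written for this example, not a Couchbase API.

```python
# Find the busiest operator in a (hand-copied) query profile excerpt.
# High execTime = the operator itself is doing the work;
# high kernTime = the operator is mostly waiting on its upstream.
import re

def to_seconds(d: str) -> float:
    """Parse Go-style durations like '40.064461391s' or '2.364µs'."""
    m = re.fullmatch(r"([\d.]+)(µs|ms|s)", d)
    value, unit = float(m.group(1)), m.group(2)
    return value * {"µs": 1e-6, "ms": 1e-3, "s": 1.0}[unit]

profile = {
    "InitialProject": {"execTime": "40.064461391s", "kernTime": "1.709503988s"},
    "Order":          {"execTime": "2.637306576s",  "kernTime": "41.692093248s"},
    "Stream":         {"execTime": "32.312172445s", "kernTime": "2.364µs"},
}

busiest = max(profile, key=lambda op: to_seconds(profile[op]["execTime"]))
print(busiest)  # InitialProject
```

Order's 41.7s kernTime is almost exactly InitialProject's 40.1s execTime plus Order's own work, which is what points the finger upstream.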
Hi @vsr1
Thanks for the update.
- resultSize values are in MBs.
- The Stream execTime is high. There are two possibilities: either the client is not reading fast enough (resultSize is very high), or an upstream operator, in this case Order, is the issue. The Order operator's execTime is low and its kernTime is high, so look further upstream: InitialProject's execTime is high. That tells us InitialProject is taking time, which is due to the length of the projection list and the high number of qualifying items.
For the first two points, we cannot alter the projection list or the result size.
- Some requests timed out because they are not optimized. If you look at the IndexScan, its item count is very high, and after the Filter the count is very low; that causes a lot of wasted work.
Is this observation from before the covering index was created? This is the area where we are looking to optimize.
- For some requests I have seen a very high FetchTime; early projection in the Fetch helps with this: https://jira.issues.couchbase.com/browse/MB-66614
Is this also from before the covering index was created?
If the query's result count and result size are high and you want it faster, the best option for non-aggregate/non-join queries (no subqueries, etc.) will be:
- Change the query to project only the document key, and make that query covered.
- Once you have the document keys, get the documents from KV through the SDK, asynchronously and in parallel. This avoids the indexer and avoids the Fetch's two hops (KV to N1QL to client). I know this requires an application change and may not be convenient.
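A sketch of that two-step pattern, in Python for brevity: the keys come from a covering query (e.g. SELECT RAW META().id FROM ... WHERE ...), and the documents are then fetched in parallel with bounded concurrency. Note that kv_get below is a hypothetical stub standing in for the SDK's KV get (lcb_get in libcouchbase, or the equivalent in other SDKs); it is not a real Couchbase API call.

```python
# Step 2 of the pattern: parallel, bounded-concurrency KV gets for a list of
# document keys obtained from a covering query.
import asyncio

async def kv_get(key: str) -> dict:
    """Stand-in for the SDK's KV get; a real client would hit the server here."""
    await asyncio.sleep(0)           # simulate network I/O
    return {"id": key, "doc": {}}    # pretend payload

async def fetch_all(keys, concurrency: int = 64):
    """Fetch documents for `keys`, capping the number of in-flight gets."""
    sem = asyncio.Semaphore(concurrency)

    async def one(key):
        async with sem:              # bound parallelism so KV is not flooded
            return await kv_get(key)

    # gather preserves input order, so results line up with `keys`
    return await asyncio.gather(*(one(k) for k in keys))

keys = [f"doc::{i}" for i in range(1000)]   # would come from the covering query
docs = asyncio.run(fetch_all(keys))
print(len(docs))  # 1000
```

The same shape works in the C SDK by scheduling many lcb_get operations before a single lcb_wait, which is what makes the KV path faster than streaming full documents through the query service.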
Thanks @vsr1.
This is indeed a simple query (no joins, subqueries, or aggregates), but the number of documents returned is quite large (~210k), and each document includes several fields. There's an ORDER BY clause, and we did test the query without it, but that didn't significantly reduce the time; the Stream phase continues to be where most of the time is spent.
The workaround you suggested is definitely worth exploring, though it comes with some complexity. However, before heading down that path, we’re trying to confirm whether the time spent in the Stream phase is justified — i.e., whether there’s any network bottleneck or client-side slowness (SDK or app logic). That’s been difficult to isolate - any suggestions here would be helpful.
In parallel, we’re looking into tuning options to help the Stream phase complete faster. We came across a few parameters in the documentation (https://docs.couchbase.com/server/current/n1ql/n1ql-manage/query-settings.html):
pipeline-cap
scan-cap
pipeline-batch
Would tweaking these help in this scenario, or are they unlikely to have much impact when streaming large result sets?
Appreciate any insights you might have.
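For completeness, the settings page above suggests these can also be supplied per request rather than node-wide. Assuming the request-level spellings (pipeline_batch, pipeline_cap, scan_cap; values here are arbitrary), a request body would look something like:

```json
{
  "statement": "SELECT RAW t FROM default AS t",
  "pipeline_batch": 1024,
  "pipeline_cap": 1024,
  "scan_cap": 1024
}
```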
Can you show the client code that calls the query and consumes the stream?
Could it be the client that is slowing the streaming? Could it be the network? The server cannot produce the stream faster than the client consumes it.
Still waiting for the client code.
The measured network bandwidth and the size of the documents would be useful (e.g. if you try to stream 100 GB over a 1 Mb/s network, it’s going to be slow).
From the ticket, the result size is 219 MB. If you can get 100 Mbit/s sustained from the server to the client, that is going to take about 20 seconds.
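The arithmetic behind that estimate, as a quick sanity check (219 MB is the result size from the ticket; 100 Mbit/s is the assumed sustained throughput; protocol overhead is ignored):

```python
# Transfer-time estimate: bytes -> bits, divided by link throughput.
result_mb = 219          # result size from the ticket, in megabytes
link_mbit_per_s = 100    # assumed sustained server-to-client throughput

seconds = result_mb * 8 / link_mbit_per_s
print(round(seconds, 1))  # 17.5 -> "about 20 seconds" once overhead is added
```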
If you want to eliminate the client, you can execute the query with curl using the REST API…
curl -v http://10.143.210.101:8093/query/service -u Administrator:password -d 'statement=SELECT meta().id FROM my_collection WHERE type = "hotel" and meta().id LIKE $1 &args=["hotel_1002%25"]&query_context=default:travel-sample.my_scope' -o result.out
Those are unlikely to improve the Stream phase.
This is not 100% proof, but here are some techniques I use to avoid transferring huge data to the client while still letting the query process the rest of the pipeline:
SELECT COUNT(1) FROM (actual query) AS t;
SELECT 1 FROM (actual query) AS t WHERE t.xxx IS NOT NULL;
Thanks @mreiche. On our side it would be a 1 Gb/s network (but let us get that re-verified), so even after considering protocol overheads, it should not have exceeded 5 seconds.
We will get the client code. Thanks.
FYI - when one has a “1Gb/s” network, the actual achievable throughput is much less. Maybe 50%, maybe 10% of that “1Gb/s”.
Idle system with only the query running:
UPSERT INTO default (KEY d.k, VALUE d)
SELECT {"k":TO_STR(t), "c1": LPAD("a",1024,"b")} AS d FROM ARRAY_RANGE(0,200000) AS t;
SELECT RAW t FROM default AS t;
time curl -u Administrator:password http://localhost:8093/query/service -H "Content-Type: application/json" -d '{"statement":"SELECT RAW t FROM default AS t", "pretty":false}' > /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 199M 0 199M 100 62 21.1M 6 0:00:10 0:00:09 0:00:01 21.5M
real 0m9.433s
user 0m0.200s
sys 0m0.285s
Hi @mreiche, We have asked the developer for the code and here it is.
int CCouchbaseDB::ExecuteQuery(KeyValueDBQuery *qry, CCallBackDetailParams *userdata, SKeyValueDBParams *parameters)
{
    if (!m_connected)
        return KVDB_CONNECTION_FAILED;

    Query *query = (Query *)qry;
    if (!query->m_queryHandle)
        return KVDB_INVALID_PARAMS;

    userdata->cmd = query->m_queryHandle;

    lcb_STATUS rc;
    if ((rc = lcb_cmdquery_callback(query->m_queryHandle, query_callback_param)) != LCB_SUCCESS)
    {
        CLogger::Error(__FILE__, __LINE__, "Failed to make n1ql command. Error[%s]", GetErrorDesc(rc));
        return rc;
    }

    if ((rc = lcb_query(m_instance, userdata, query->m_queryHandle)) != LCB_SUCCESS)
    {
        CLogger::Error(__FILE__, __LINE__, "Failed to schedule n1ql query. Error[%s]", GetErrorDesc(rc));
        return rc;
    }

    lcb_wait(m_instance, LCB_WAIT_DEFAULT);
    return userdata->rc;
}
Please provide the callback
Hi @mreiche,
Thank you for your patience as we share the callback code.
static void query_callback_param(lcb_INSTANCE *instance, int type, const lcb_RESPQUERY *resp)
{
    CCallBackDetailParams *result = NULL;
    CSTR row = NULL;
    size_t row_len = 0;

    lcb_STATUS rc = lcb_respquery_status(resp);
    lcb_respquery_cookie(resp, (void **)(&result));
    lcb_respquery_row(resp, &row, &row_len);

    if (lcb_respquery_is_final(resp))
    {
        // In the final success response, only the Couchbase status is received in the format below, so HandleDocument is not called.
        /* In the timeout scenario, the status will be closed/timeout and a partial record may be returned, which we do not pass to the application.
         * [{
         *   "requestID": "58bdd66c-bdee-45cf-b226-8cc4beb08f52",
         *   "clientContextID": "6a14c164e6fa3c7d",
         *   "signature": {"*":"*"},
         *   "results": [
         *   ],
         *   "status": "success",
         *   "metrics": {"elapsedTime": "2.287467ms","executionTime": "2.134303ms","resultCount": 4,"resultSize": 18557}
         * }]
         */
        if (rc != LCB_SUCCESS)
        {
            // Log the error when the return code is anything other than success. The same is sent to the application.
            CLogger::Error(__FILE__, __LINE__, "Error returned by Couchbase [0x%x], row_len[%d]", rc, (int)row_len);
            if ((int)row_len > 0)
            {
                char buf[5000] = {0};
                int len = (row_len > 5000) ? 5000 : (int)row_len;
                STRNCPY(buf, row, len);
                CLogger::Error(__FILE__, __LINE__, "Error returned by Couchbase while executing query[%s]", buf);
            }
        }
        // When no document is present, CB returns success; use an internal code so the application can handle the no-document case.
        if (result->nRec == 0 && rc == LCB_SUCCESS)
            result->rc = KVDB_LCB_KEY_ENOENT;
        else
        {
            result->rc = rc;
        }
    }
    else
    {
        BOOL res = result->HandleDocument(row, (int)row_len, (int)rc, result->userArg, result); /* Sends error description to the application */
        result->nRec++;
        if (res != TRUE)
        {
            char buf[5000] = {0};
            int len = (row_len > 5000) ? 5000 : (int)row_len;
            STRNCPY(buf, row, len);
            CLogger::Info(__FILE__, __LINE__, "Error found while parsing document [%s]", buf);
            lcb_QUERY_HANDLE *handle;
            lcb_respquery_handle(resp, &handle);
            lcb_query_cancel(instance, handle);
            result->rc = LCB_ERR_GENERIC;
        }
    }
}
@mreiche, Do you happen to see anything off with the callback or the part of the code that invokes the query?
What’s puzzling is that the exact same query, when run via cbq on the same machine, completes in about 11–12 seconds — but through the app using the embedded C SDK, it takes around 35–37 seconds. So at least for now, we’ve ruled out any network-related issues. It seems more likely that the delay is within the SDK or the app logic itself.
Would appreciate your thoughts.
@pccb by any chance is it a query that you provide parameters for through the SDK, but hardcode those parameters into the statement in cbq?
This has been a cause of ‘SDK is slower than UI/cbq’ issues in the past, e.g. see Query is fast in couchbase query window but slow in C# code - #9 by btburnett3
Thanks @graham.pople, we verified that in both cases the same index is used.