Couchbase Query Question (SQL, KV Scan)

bellpumpkin · October 30, 2024, 7:25am

When selecting in Couchbase, I want to select all document values. I understand that using select * from bk.sc.coll scans the primary index. Would it be better to create a different index, or perhaps use a KV range scan instead? If there’s a recommended approach, please let me know

version 7.6.X

dh · October 30, 2024, 7:51am

If you are just retrieving all the documents without any filtering or ordering etc, you can just use SELECT * FROM ...; in 7.6 as long as you’re an admin user or you have been granted permission to use sequential scans, a primary index isn’t needed. Ref: Sequential Scans

Obviously this does mean all documents move from Data nodes to the Query node to the client; they are moved untouched if you are just passing them through unfiltered/otherwise accessed. Depending on the volume of data this may be acceptable. If not and/or overheads must be reduced, then you can use KV range scans (with documents or key-only in conjunction with GETs) in the application to remove that extra hop. Ref: KV Range Scans

Any recommendation must relate to the frequency of the operation, the chances for changes in the future (ease of maintenance) and the volume of data.

If this is something done infrequently, without particular constraint on the time it takes, I’d use the SQL statement with a sequential scan.

If it is performed a bit more frequently and key-ordering is necessary, then I’d create a primary index (this effectively pre-sorts the keys and statements then use the cached pre-sorted results), otherwise it is unnecessary overhead. (If sufficiently infrequently then the overhead of maintaining the primary index may not be worth it.)

If it is something that will be performed often or must be as highly optimised as possible, then I’d go the direct application approach - it just makes sense in such circumstances to eliminate the extra hop the data must take. You can also choose in your application to drive your data nodes to their limits at the expense of concurrency; the Query service will attempt to not do this ever.

That said, if you may want to start filtering or varying the order etc. in the future then SQL++ would once again likely make the most sense.

HTH.

system · January 28, 2025, 7:52am

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Index and range scan for primary id Couchbase Server	5	4878	September 17, 2015
Issues on querying data SQL++ query , java , n1ql	4	574	October 3, 2022
Clarifications on Couchbase index SQL++ query , n1ql	4	1555	March 2, 2022
Sub-queries vs. KV Couchbase Server query , n1ql , index	7	1343	February 8, 2022
Range Queries - How to build primary OR secondary index Couchbase Server	4	2239	January 4, 2015

Couchbase Query Question (SQL, KV Scan)

Related topics