Optimize count on IS NOT MISSING subdocuments

tiste · August 17, 2017, 10:14am

Hello,

I'm currently seeing bad query performance on a large dataset of documents that have publication keys such as draft, staging, etc., each of which is a subdocument.

Our goal is to count the documents in which a given publication state is present (not missing), like:
SELECT count(*) FROM data WHERE draft IS NOT MISSING AND domain='www'

We tried multiple things, like creating an index on data(draft,domain), but it takes up a huge part of our disk space.
We also tried an index like data(IFMISSING(draft,null),domain), but the query takes too long as well.

Thanks for your help.

atom_yang · August 17, 2017, 10:50am

How about creating an index like this:

CREATE INDEX `idx_draft_domain` ON `data`(IFMISSINGORNULL(draft,1), domain) USING GSI;

and querying with:

SELECT count(1) FROM `data` WHERE IFMISSINGORNULL(draft,1) !=1 AND domain == 'www'

vsr1 · August 17, 2017, 12:13pm

Try with one of the following indexes.

CREATE INDEX ix1 ON `data`(draft) WHERE domain = "www";
CREATE INDEX ix2 ON `data`(domain, draft);

tiste · August 17, 2017, 12:14pm

Just ran it, but what will be indexed? The entire draft subdocument?

Actually, these are really large documents and we don’t want to index their contents, only count them.

tiste · August 17, 2017, 12:20pm

Creating a simple index on the draft key kills the server because of the size of the documents…

vsr1 · August 17, 2017, 12:30pm

CREATE INDEX ix1 ON default(domain) WHERE draft IS NOT MISSING;
SELECT COUNT(1) FROM default WHERE draft IS NOT MISSING AND domain = "www";
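
To answer the earlier question about what gets indexed: with a partial index like the one above, the draft subdocument is not an index key at all. It only appears in the index's WHERE condition, which decides which documents get an entry. Here is a rough sketch of the same idea against the `data` bucket from the original question. The index name `ix_domain_has_draft` is made up for illustration, and how the plan looks will depend on the server version.

```sql
/* Partial index: only documents that have a draft key get an entry.
   Each entry holds just the domain value plus the document ID,
   not the draft subdocument itself.
   The index name is hypothetical. */
CREATE INDEX ix_domain_has_draft ON `data`(domain) WHERE draft IS NOT MISSING;

/* The query has to repeat the index's WHERE condition to be eligible
   for the partial index. EXPLAIN shows which index the planner picked. */
EXPLAIN SELECT COUNT(1) FROM `data` WHERE draft IS NOT MISSING AND domain = "www";
```

In the EXPLAIN output, look for an index scan that names this index. If the plan shows a primary scan instead, the index is not being picked up, usually because the query predicate does not match the index condition.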
