Hi all,
I have a collection with up to one millions documents and I’d like to get a combined data from more than one documents.
Document has five fields, [id, key, category, label, value] where id always consists of key:category:label in lower casePreformatted text
For example,
DocA
{
id: "a:my-category:my-label",
key: "a",
category: "my-category",
label: "my-label",
value:"This is DocA"
}
DocB
{
id: "b:my-category:my-label",
key: "b",
category: "my-category",
label: "my-label",
value:"This is DocB"
}
DocA and DocB have same category and label
I have a requirement to fetch the combined data based on key
Desired Data format
{
id: "a:my-category:my-label",
key: "a",
category: "my-category",
label: "my-label",
value:"This is DocA"
extraValue:"This is DocB"
}
This is my n1ql query:
SELECT d1.*, d2.value AS extraValue
FROM `doc-collection` AS d1
LEFT JOIN `doc-collection` AS d2 ON KEYS REPLACE(d1.`id`,"a","b")
WHERE d1.`key` = "a" OFFSET 0 LIMIT 10
It works well when offset is low, but it takes forever when offset is huge.
I assuming this query needs to scan the entire index.
Here is my index
CREATE INDEX by_category_label
ON doc-collection
(category
,label
)
CREATE INDEX by_key_category
ON doc-collection
(key
,category
)
What would be the most efficient way to achieve this?
Much appreciate for any suggestions.