Very simple query, returns wrong result. Probably size problem

cecca_92 · July 8, 2017, 11:09am

Hi everybody,
today I faced an unusual problem.
{
[
…
],
“cityname”: “Geneve”,
“countryname”: “Switzerland”,
“lat”: “46.2043907”,
“lon”: “6.1431577”
}

The following query:

SELECT cityname, countryname
FROM bucket
WHERE cityname = ‘Geneve’

works just fine, and it gives me the following result:

[
{
“cityname”: “Geneve”,
“countryname”: “Switzerland”
}
]

But when i execute the query (query which works fine for the other cities):

SELECT cityname, countryname
FROM bucket
WHERE cityname = ‘Geneve’ AND countryname = ‘Switzerland’

returns an empty result.

At first, I thought it was some kind of unexplicable bug of Couchbase, but then I’ve noticed that this particular document has a ~ 258KB dimension.
So, I’ve tried the same query with the only other city document which is bigger than 256KB and it gave me the same problem.

Could please someone explain me why some query (the second one, in this case) doesn’t work in some case, probably related to the size?
Thanks.

vsr1 · July 8, 2017, 2:28pm

The document size should be fine. Post the EXPLAIN of the query and index definition. What is the Couchbase Version.
Also check indexer.log any skipped entries due to index size limitations.

Also try the following index.

CREATE INDEX ix1 bucket(cityname,countryname);

cecca_92 · July 8, 2017, 3:05pm

EXPLAIN of the not working query, before creating the index you suggested:

[
{
“plan”: {
“#operator”: “Sequence”,
“~children”: [
{
“#operator”: “DistinctScan”,
“scan”: {
“#operator”: “IndexScan”,
“covers”: [
“cover ((bucket.countryname))”,
“cover ((bucket.cityname))”,
“cover ((distinct (array (city.auth_id) for city in (bucket.authors) end)))”,
“cover ((meta(bucket).id))”
],
“index”: “idx_countryname_cityname_auths”,
“index_id”: “b291df65543e8d06”,
“keyspace”: “bucket”,
“namespace”: “default”,
“spans”: [
{
“Exact”: true,
“Range”: {
“High”: [
“"Switzerland"”,
“successor("Geneve")”
],
“Inclusion”: 1,
“Low”: [
“"Switzerland"”,
“"Geneve"”
]
}
}
],
“using”: “gsi”
}
},
{
“#operator”: “Parallel”,
“~child”: {
“#operator”: “Sequence”,
“~children”: [
{
“#operator”: “Filter”,
“condition”: “((cover ((bucket.cityname)) = "Geneve") and (cover ((bucket.countryname)) = "Switzerland"))”
},
{
“#operator”: “InitialProject”,
“result_terms”: [
{
“expr”: “cover ((bucket.cityname))”
},
{
“expr”: “cover ((bucket.countryname))”
}
]
},
{
“#operator”: “FinalProject”
}
]
}
}
]
},
“text”: “SELECT cityname, countryname\r\nFROM bucket\r\nWHERE cityname = ‘Geneve’ AND countryname = ‘Switzerland’”
}
]

The default index choosen from the RDBMS, is the following one:

CREATE INDEX idx_countryname_cityname_auths ON bucket(countryname,cityname,(distinct (array (city.auth_id) for city in authors end)))

I created the index you told me, and now it works! But in this moment, I’m really confused: how could that be possible? I thought that the use of indexes could not affect the result of a query, but just the execution time as indexes are just useful as optimization.

By the way the EXPLAIN with your index is the following one:

[
{
“plan”: {
“#operator”: “Sequence”,
“~children”: [
{
“#operator”: “IndexScan”,
“covers”: [
“cover ((bucket.cityname))”,
“cover ((bucket.countryname))”,
“cover ((meta(bucket).id))”
],
“index”: “idx_cityname_countryname”,
“index_id”: “e0cc44b3419b750”,
“keyspace”: “bucket”,
“namespace”: “default”,
“spans”: [
{
“Exact”: true,
“Range”: {
“High”: [
“"Geneve"”,
“"Switzerland"”
],
“Inclusion”: 3,
“Low”: [
“"Geneve"”,
“"Switzerland"”
]
}
}
],
“using”: “gsi”
},
{
“#operator”: “Parallel”,
“~child”: {
“#operator”: “Sequence”,
“~children”: [
{
“#operator”: “Filter”,
“condition”: “((cover ((bucket.cityname)) = "Geneve") and (cover ((bucket.countryname)) = "Switzerland"))”
},
{
“#operator”: “InitialProject”,
“result_terms”: [
{
“expr”: “cover ((bucket.cityname))”
},
{
“expr”: “cover ((bucket.countryname))”
}
]
},
{
“#operator”: “FinalProject”
}
]
}
}
]
},
“text”: “SELECT cityname, countryname\r\nFROM bucket\r\nWHERE cityname = ‘Geneve’ AND countryname = ‘Switzerland’”
}
]

vsr1 · July 8, 2017, 3:18pm

The above index has ARRAY index key. Indexer has limit on index key size for each entry and total index key size for each document (cc @deepkaran.salooja) . If it exceed that limit it will skip indexing document and log the entry in indexer.log. If it is skipped indexing query will not find the entry. If the number of elements in array has lot it might exceeded and skipped.

cecca_92 · July 8, 2017, 3:47pm

Oh right, I hadn’t considered the size of the index!
Thank you, you’ve been very helpful.

Topic		Replies	Views
C# performance query SQL++ query , n1ql	15	3000	September 21, 2017
More than 11 seconds to execute an N1QL query SQL++ query , n1ql	3	1173	June 15, 2017
Same N1QL query sometimes works and sometimes not SQL++	3	987	January 11, 2018
Document Size with N1QL query SQL++	7	11154	April 26, 2019
The same query returns different results SQL++ java , n1ql	2	769	July 9, 2020

Very simple query, returns wrong result. Probably size problem

Related topics