Very simple query, returns wrong result. Probably size problem

Hi everybody,
today I faced an unusual problem.
{
[

],
“cityname”: “Geneve”,
“countryname”: “Switzerland”,
“lat”: “46.2043907”,
“lon”: “6.1431577”
}

The following query:

SELECT cityname, countryname
FROM bucket
WHERE cityname = ‘Geneve’

works just fine, and it gives me the following result:

[
{
“cityname”: “Geneve”,
“countryname”: “Switzerland”
}
]

But when i execute the query (query which works fine for the other cities):

SELECT cityname, countryname
FROM bucket
WHERE cityname = ‘Geneve’ AND countryname = ‘Switzerland’

returns an empty result.

At first, I thought it was some kind of unexplicable bug of Couchbase, but then I’ve noticed that this particular document has a ~ 258KB dimension.
So, I’ve tried the same query with the only other city document which is bigger than 256KB and it gave me the same problem.

Could please someone explain me why some query (the second one, in this case) doesn’t work in some case, probably related to the size?
Thanks.

The document size should be fine. Post the EXPLAIN of the query and index definition. What is the Couchbase Version.
Also check indexer.log any skipped entries due to index size limitations.

Also try the following index.

CREATE INDEX ix1 bucket(cityname,countryname);

EXPLAIN of the not working query, before creating the index you suggested:

[
{
“plan”: {
#operator”: “Sequence”,
“~children”: [
{
#operator”: “DistinctScan”,
“scan”: {
#operator”: “IndexScan”,
“covers”: [
“cover ((bucket.countryname))”,
“cover ((bucket.cityname))”,
“cover ((distinct (array (city.auth_id) for city in (bucket.authors) end)))”,
“cover ((meta(bucket).id))”
],
“index”: “idx_countryname_cityname_auths”,
“index_id”: “b291df65543e8d06”,
“keyspace”: “bucket”,
“namespace”: “default”,
“spans”: [
{
“Exact”: true,
“Range”: {
“High”: [
“"Switzerland"”,
“successor("Geneve")”
],
“Inclusion”: 1,
“Low”: [
“"Switzerland"”,
“"Geneve"”
]
}
}
],
“using”: “gsi”
}
},
{
#operator”: “Parallel”,
“~child”: {
#operator”: “Sequence”,
“~children”: [
{
#operator”: “Filter”,
“condition”: “((cover ((bucket.cityname)) = "Geneve") and (cover ((bucket.countryname)) = "Switzerland"))”
},
{
#operator”: “InitialProject”,
“result_terms”: [
{
“expr”: “cover ((bucket.cityname))”
},
{
“expr”: “cover ((bucket.countryname))”
}
]
},
{
#operator”: “FinalProject”
}
]
}
}
]
},
“text”: “SELECT cityname, countryname\r\nFROM bucket\r\nWHERE cityname = ‘Geneve’ AND countryname = ‘Switzerland’”
}
]

The default index choosen from the RDBMS, is the following one:

CREATE INDEX idx_countryname_cityname_auths ON bucket(countryname,cityname,(distinct (array (city.auth_id) for city in authors end)))

I created the index you told me, and now it works! But in this moment, I’m really confused: how could that be possible? I thought that the use of indexes could not affect the result of a query, but just the execution time as indexes are just useful as optimization.

By the way the EXPLAIN with your index is the following one:

[
{
“plan”: {
#operator”: “Sequence”,
“~children”: [
{
#operator”: “IndexScan”,
“covers”: [
“cover ((bucket.cityname))”,
“cover ((bucket.countryname))”,
“cover ((meta(bucket).id))”
],
“index”: “idx_cityname_countryname”,
“index_id”: “e0cc44b3419b750”,
“keyspace”: “bucket”,
“namespace”: “default”,
“spans”: [
{
“Exact”: true,
“Range”: {
“High”: [
“"Geneve"”,
“"Switzerland"”
],
“Inclusion”: 3,
“Low”: [
“"Geneve"”,
“"Switzerland"”
]
}
}
],
“using”: “gsi”
},
{
#operator”: “Parallel”,
“~child”: {
#operator”: “Sequence”,
“~children”: [
{
#operator”: “Filter”,
“condition”: “((cover ((bucket.cityname)) = "Geneve") and (cover ((bucket.countryname)) = "Switzerland"))”
},
{
#operator”: “InitialProject”,
“result_terms”: [
{
“expr”: “cover ((bucket.cityname))”
},
{
“expr”: “cover ((bucket.countryname))”
}
]
},
{
#operator”: “FinalProject”
}
]
}
}
]
},
“text”: “SELECT cityname, countryname\r\nFROM bucket\r\nWHERE cityname = ‘Geneve’ AND countryname = ‘Switzerland’”
}
]

The above index has ARRAY index key. Indexer has limit on index key size for each entry and total index key size for each document (cc @deepkaran.salooja) . If it exceed that limit it will skip indexing document and log the entry in indexer.log. If it is skipped indexing query will not find the entry. If the number of elements in array has lot it might exceeded and skipped.

1 Like

Oh right, I hadn’t considered the size of the index!
Thank you, you’ve been very helpful.