Possible Inconsistency in Index

Here is sample docs:
We have the following two sets of docs
[
{
“Key_”: “u:3:u:4:rel”,
“moddt”: 1500492737924,
“src”: [
“a”
],
“status”: “confirmed”,
“tp”: “rel”,
“uidhi”: 4,
“uidlow”: 3
},
{
“Key_”: “u:4:u:6:rel”,
“moddt”: 1500494209607,
“src”: [
“a”
],
“status”: “confirmed”,
“tp”: “rel”,
“uidhi”: 6,
“uidlow”: 4
},
{
“Key_”: “u:5:u:6:rel”,
“moddt”: 1500494827427,
“src”: [
“a”
],
“status”: “confirmed”,
“tp”: “rel”,
“uidhi”: 6,
“uidlow”: 5
},
{
“Key_”: “u:5:u:7:rel”,
“moddt”: 1500586041280,
“src”: [
“a”
],
“status”: “confirmed”,
“tp”: “rel”,
“uidhi”: 7,
“uidlow”: 5
}
]

Second Set:

[
{
“Key_”: “u:4”,
“fsname”: “cristiano”,
“id”: 4,
“tp”: “u”
},
{
“Key_”: “u:7”,
“fsname”: “juan”,
“id”: 7,
“tp”: “u”
},
{
“Key_”: “u:9”,
“fsname”: “eliana”,
“id”: 9,
“tp”: “u”
},
{
“Key_”: “u:2”,
“fsname”: “samsung”,
“id”: 2,
“tp”: “u”
},
{
“Key_”: “u:6”,
“fsname”: “raul”,
“id”: 6,
“tp”: “u”
},
{
“Key_”: “u:5”,
“fsname”: “gareth”,
“id”: 5,
“tp”: “u”
},
{
“Key_”: “u:3”,
“fsname”: “John”,
“id”: 3,
“tp”: “u”
},
{
“Key_”: “u:8”,
“fsname”: “Don Omar”,
“id”: 8,
“tp”: “u”
},
{
“Key_”: “u:1”,
“fsname”: “Daddy Yankee”,
“id”: 1,
“tp”: “u”
}
]

We have the following indexes:

CREATE INDEX idx_relationship_syntax1 ON users(‘u:’ || to_string(uidhi), uidlow, moddt, status, src) WHERE (tp = ‘rel’) USING GSI;
CREATE INDEX idx_relationship_syntax2 ON users(‘u:’ || to_string(uidlow), uidhi, moddt, status, src) WHERE (tp = ‘rel’) USING GSI;
CREATE INDEX idx_relationship_join_users ON users (fsname) WHERE (tp = ‘u’) USING GSI;

Two almost identical Queries:

Q1)
SELECT r.status, r.src, u.fsname
FROM users u
JOIN users r ON KEY ‘u:’ || to_string(r.uidhi) FOR u
WHERE r.uidlow = 4 and (r.moddt is missing or r.moddt >= 0) and r.tp = ‘rel’ and u.fsname is not missing and u.tp = ‘u’

Q2)
SELECT r.status, r.src, u.fsname
FROM users u
JOIN users r ON KEY ‘u:’ || to_string(r.uidlow) FOR u
WHERE r.uidhi = 4 and (r.moddt is missing or r.moddt >= 0) and r.tp = ‘rel’ and u.fsname is not missing and u.tp = ‘u’

Expected Results:

Q1)
[
{
“fsname”: “raul”,
“src”: [
“a”
],
“status”: “confirmed”
}
]

Q2)

[
{
“fsname”: “John”,
“src”: [
“a”
],
“status”: “confirmed”
}
]

Actual Results:

Q1 Erroneous Results)

{
“results”: [],
“metrics”: {
“elapsedTime”: “44.337062ms”,
“executionTime”: “44.289982ms”,
“resultCount”: 0,
“resultSize”: 0
}
}
Q2 Correct Results)
[
{
“fsname”: “john”,
“src”: [
“a”
],
“status”: “confirmed”
}
]

Explain of Q1)

[
{
“plan”: {
"#operator": “Sequence”,
"~children": [
{
"#operator": “IndexScan”,
“covers”: [
“cover ((u.tp))”,
“cover ((u.fsname))”,
“cover ((meta(u).id))”
],
“filter_covers”: {
“cover ((u.tp))”: “u”
},
“index”: “idx_relationship_join_users”,
“index_id”: “2d313c395fc5bd19”,
“keyspace”: “users”,
“namespace”: “default”,
“spans”: [
{
“Exact”: true,
“Range”: {
“High”: [
“successor(“u”)”
],
“Inclusion”: 1,
“Low”: [
"“u”",
“null”
]
}
}
],
“using”: “gsi”
},
{
"#operator": “Parallel”,
"~child": {
"#operator": “Sequence”,
"~children": [
{
"#operator": “IndexJoin”,
“as”: “r”,
“for”: “u”,
“keyspace”: “users”,
“namespace”: “default”,
“on_key”: “cover ((“u:” || to_string((r.uidhi))))”,
“scan”: {
“covers”: [
“cover ((“u:” || to_string((r.uidhi))))”,
“cover ((r.uidlow))”,
“cover ((r.moddt))”,
“cover ((r.status))”,
“cover ((r.src))”,
“cover ((meta(r).id))”
],
“filter_covers”: {
“cover ((r.tp))”: “rel”
},
“index”: “idx_relationship_syntax1”,
“index_id”: “acd7095af6f981bb”,
“using”: “gsi”
}
},
{
"#operator": “Filter”,
“condition”: “(((((cover ((r.uidlow)) = 4) and ((cover ((r.moddt)) is missing) or (0 <= cover ((r.moddt))))) and (cover ((r.tp)) = “rel”)) and (cover ((u.fsname)) is not missing)) and (cover ((u.tp)) = “u”))”
},
{
"#operator": “InitialProject”,
“result_terms”: [
{
“expr”: “cover ((r.status))”
},
{
“expr”: “cover ((r.src))”
},
{
“expr”: “cover ((u.fsname))”
}
]
},
{
"#operator": “FinalProject”
}
]
}
}
]
},
“text”: “SELECT r.status, r.src, u.fsname\nFROM users u \nJOIN users r ON KEY ‘u:’ || to_string(r.uidhi) FOR u\nWHERE r.uidlow = 4 and (r.moddt is missing or r.moddt >= 0) and r.tp = ‘rel’ and u.fsname is not missing and u.tp = ‘u’”
}
]
Problem:
Both queries are almost identical. Both queries appear to be using the correct indexes. However, query Q1 fails to fetch the right results while Q2 manages to get the proper results. Interestingly, if we force the query to use a Primary Index or any other index it fetches the proper results! We are doing that by using the following query:

SELECT r.status, r.src, u.fsname
FROM users u
JOIN users r ON KEY ‘u:’ || to_string(r.uidhi) FOR u
WHERE r.uidlow = 4 and (r.moddt is missing or r.moddt >= 0) and r.tp = ‘rel’

We are a bit confused and concerned about this situation. Please help us find what exactly wrong we are doing.

Thanks,

B

Looks like you are hitting https://issues.couchbase.com/browse/MB-24594. The fix is available in 4.6.3 and 5.0.0

Work around will be make Right side of the Index JOIN not covered by removing some key from the Index.

Example: CREATE INDEX idx_relationship_syntax1 ON users(‘u:’ || to_string(uidhi)) WHERE (tp = ‘rel’) USING GSI;
CREATE INDEX idx_relationship_syntax2 ON users(‘u:’ || to_string(uidlow)) WHERE (tp = ‘rel’) USING GSI;

I don’t know much of data model and applications but you have high selective predicate on right side i.e r.uidlow = 4 why are not considering the LOOK UP joins (even though right side is not covered)

CREATE INDEX idx_4 ON default(uidlow, moddt, uidhi, status, src) WHERE (tp = 'rel') USING GSI;
CREATE INDEX idx_4 ON default(uidhi, moddt, uidlow, status, src) WHERE (tp = 'rel') USING GSI;
SELECT r.status, r.src, u.fsname FROM default r
JOIN default u ON KEYS 'u:' || to_string(r.uidhi)
WHERE r.uidlow = 4 and (r.moddt is missing or r.moddt >= 0) and r.tp = 'rel' and u.fsname is not missing and u.tp = 'u';
SELECT r.status, r.src, u.fsname FROM default r
JOIN default u ON KEYS  'u:' || to_string(r.uidlow)
WHERE r.uidhi = 4 and (r.moddt is missing or r.moddt >= 0) and r.tp = 'rel' and u.fsname is not missing and u.tp = 'u';

When will 4.6.3 and 5.0.0 be available for Enterprise community?
We used have LOOK UP JOINs and are now moving to INDEXJOIN JOINs to improve performance. I guess we will have to wait till 4.6.3 to do that.
Thank you :slight_smile:
B

@Baker a sidenote: with 5.0, you can use Index Replicas : https://blog.couchbase.com/transition-index-replicas/