Indexer crashes and restarts very frequently - Community Edition 7.1.0 build 2556, IPv4

We're frequently facing indexer restarts, to the point where the process becomes unavailable, all the indexes start showing as stale, and eventually the node fails over.
Sharing a few of the errors:
1st Type of Error:
9, 135729, 3837 committed:true
2022-06-29T14:05:23.834+00:00 [Info] StorageMgr::handleCreateSnapshot Added New Snapshot Index: 13519174020281430894 PartitionId: 0 sliceId: 0 Crc64: 9751788094360975075 (SnapshotInfo: segos: 135729, 135729, 3837 committed:true) SnapType FORCE_COMMIT SnapAligned true SnapCreateDur 11.26834ms SnapOpenDur 39.69us
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x7f01f888c73b]

runtime stack:
runtime.throw(0x12ad0e5, 0x2a)
    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/panic.go:1117 +0x72
runtime.sigpanic()
    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/signal_unix.go:718 +0x2e5

goroutine 116565 [syscall]:
runtime.cgocall(0xfe4e62, 0xc002639768, 0xc002639700)
    /home/couchbase/.cbdepscache/exploded/x86_64/go-1.16.5/go/src/runtime/cgocall.go:154 +0x56 fp=0xc002639738 sp=0xc002639700 pc=0x409796

2nd Type of Error:
[Info] ScanCoordinator::handleAddIndexInstance ... navik
defnum index forestdb navik 2c6cf1c29d1a1e89ffc1f0f14791519d false ["reference number"] N1QL SINGLE ("document type" = "awb generated")
[false] false false 10.128.0.61:8091 false false 0 false 0 (true 0 0 0)
default default 0 0 false 0 0 0 0 0 0 0 0 [] ([] 0 0) 4 0 1 0xc0036c6ac0 [] 0 0 false forestdb
map[0:{[0 0 1]:9105}] 0xc0003783850 1 0xc005646800 0xc0036d62c0
2022-06-29T11:39:45.241+00:00 [Info] Indexer::initPartnInstance Initialized Partition: Index: 18327922870596519179 Partition: PartitionId: 0 Endpoints: [:9105]
2022-06-29T11:39:45.241+00:00 [INFO][FDB] Forestdb opened database file /opt/couchbase/var/lib/couchbase/data/021/navik_navik_phone_filter_index_1_18327922870596519179_0.index/data.fdb.7
2022-06-29T11:39:45.242+00:00 [ERRO][FDB] doc length body checksum mismatch error in a database file '/opt/couchbase/var/lib/couchbase/data/021/navik_navik_phone_filter_index_1_18327922870596519179_0.index/data.fdb.7' crc 6c != 39 (crc in doc) keylen 14128 metalen 12850 bodylen 875771186 bodylen_ondisk 859256632 offset 32643467
2022-06-29T11:39:45.242+00:00 [ERRO][FDB] Error in reading a stale region info document from a database file '/opt/couchbase/var/lib/couchbase/data/021/navik_navik_phone_filter_index_1_18327922870596519179_0.index/data.fdb.7': revnum 2, offset 32643467
2022-06-29T11:39:45.242+00:00 [ERRO][FDB] doc length body checksum mismatch error in a database file '/opt/couchbase/var/lib/couchbase/data/021/navik_navik_phone_filter_index_1_18327922870596519179_0.index/data.fdb.7' crc e9 != 30 (crc in doc) keylen 12322 metalen 11298 bodylen 740438050 bodylen_ondisk 740438050 offset 32874935
2022-06-29T11:39:45.242+00:00 [ERRO][FDB] Error in reading a stale region info document from a database file '/opt/couchbase/var/lib/couchbase/data/021/navik_navik_phone_filter_index_1_18327922870596519179_0.index/data.fdb.7': revnum 3, offset 32874935
2022-06-29T11:39:45.242+00:00 [ERRO][FDB] doc length body checksum mismatch error in a database file '/opt/couchbase/var/lib/couchbase/data/021/navik_navik_phone_filter_index_1_18327922870596519179_0.index/data.fdb.7' crc 9 != 37 (crc in doc) keylen 14646 metalen 11313 bodylen 825833009 bodylen_ondisk 861613149 offset 33075307
2022-06-29T11:39:45.242+00:00 [ERRO][FDB] Error in reading a stale region info document from a database file '/opt/couchbase/var/lib/couchbase/data/021/navik_navik_phone_filter_index_1_18327922870596519179_0.index/data.fdb.7': revnum 4, offset 33075307
2022-06-29T11:39:45.242+00:00 [ERRO][FDB] doc length body checksum mismatch error in a database file '/opt/couchbase/var/lib/couchbase/data/021/navik_navik_phone_filter_index_1_18327922870596519179_0.index/data.fdb.7' crc 71 != 33 (crc in doc) keylen 13104 metalen 13875 bodylen 741882160 bodylen_ondisk 959788852 offset 33276011
2022-06-29T11:39:45.242+00:00 [ERRO][FDB] Error in reading a stale region info document from a database file '/opt/couchbase/var/lib/couchbase/data/021/navik_navik_phone_filter_index_1_18327922870596519179_0.index/data.fdb.7': revnum 5, offset 33276011

3rd Type of Error:
101 115 116 97 109 112 0 0 6 50 48 50 50 45 48 54 45 50 57 84 49 48 58 53 56 58 51 53 46 54 49 50 45 48 54
B 240671467) </ud> in Slice: 0. Error: Encoded secondary key is too long (> 13824). Skipped.
Previously we had indexer.settings.max_array_seckey_size=10024; after changing it to 51200 we still face this issue, and some documents are getting skipped from the index.

The index key size is exceeding the limit, and that is what causes this crash. It looks like the index is an array index that may be storing the whole array for covering. You should remove the whole-array part of the index.
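To make that suggestion concrete, here is a minimal N1QL sketch, assuming an array index on the navik bucket; the bucket and field names ("reference number", "document type") are taken from the logs above, and the exact index definition is only an assumption:

-- Hypothetical current definition: the trailing whole-array key `reference number`
-- is kept only for covering and inflates the encoded secondary key size.
CREATE INDEX idx_awb_refs ON `navik`(
    DISTINCT ARRAY r FOR r IN `reference number` END,
    `reference number`)
WHERE `document type` = "awb generated";

-- Dropping the whole-array key keeps element lookups working while keeping
-- encoded secondary keys small; queries that previously covered the whole
-- array will fetch it from the documents instead.
CREATE INDEX idx_awb_refs_v2 ON `navik`(
    DISTINCT ARRAY r FOR r IN `reference number` END)
WHERE `document type` = "awb generated";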

Thank you for the solution.
We haven't observed the issue for the past month after making the changes and removing the array from one of our indexes.
Regards


Hi @vsr1, we are facing a similar issue. How do we find the index key sizes and list the array indexes? We have around 200-300 indexes.
Could you help me figure out how to find their sizes?
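One way to list the array indexes is to query the system:indexes catalog from the Query service. This is only a sketch; it filters index definitions whose keys contain an ARRAY expression and shows the definitions, not the key sizes, since key size depends on the per-document data:

-- List indexes that have an array key expression, so they can be reviewed
-- for whole-array keys that may produce oversized secondary keys.
SELECT idx.name, idx.keyspace_id, idx.index_key
FROM system:indexes AS idx
WHERE ANY k IN idx.index_key SATISFIES UPPER(k) LIKE '%ARRAY %' END;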

Due to this issue, the app team is facing the error below:

"timestamp": "2024-03-05T22:10:59.771",
    "message": "Resolved [com.couchbase.client.core.error.InternalServerFailureException: Internal Couchbase Server error {\"completed\":true,\"coreId\":\"0x821ac3eb00000001\",\"errors\":[{\"code\":5000,\"message\":\"*All indexer replica is down or unavailable or unable to process request - cause:* queryport.client.noHost\",\"retry\":false}],\"httpStatus\":500,\"idempotent\":false,\"lastDispatchedFrom\":\"[240b:c0e0:204:5400:b434:2:0:5fe4%0]:48828\",\"lastDispatchedTo\":\"[240d:c0e0:104:5451:9737:2:7:b10]:8093\",\"requestId\":40934315,\"requestType\":\"QueryRequest\",\"retried\":0,\"service\":{\"operationId\":\"94333647-e3d5-49e8-b27a-a6a9771fc19c\",\"statement\":\"SELECT `sessiondb_dep`.* , META(`sessiondb_dep`).cas AS version FROM `sessiondb_dep` WHERE `classId` = '201' AND `accountId`= ? AND `packageId` IN ? \",\"type\":\"query\"},\"timeoutMs\":50000,\"timings\":{\"dispatchMicros\":11558,\"totalDispatchMicros\":11558,\"totalMicros\":15993}}]",
    "class": "org.springframework.web.servlet.handler.AbstractHandlerExceptionResolver",
    "method": "resolveException",

Hi @kanamani92, the error posted seems to be a bit different from the original issue:

\"message\":\"*All indexer replica is down or unavailable or unable to process request - cause:* queryport.client.noHost


  1. Can you let me know if you see this log in indexer.log? "Error: Encoded secondary key is too long"
  2. Furthermore, can you check if the following log is present in query.log? "[WARN] Fail to find indexers to satisfy query request"
  3. Lastly, look through the indexer.log and query.log files and check for any other warning or error logs popping up. Also please check whether the indexer nodes are healthy in the cluster; a quick query-side check is sketched below.
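For that last point, a minimal query-side sketch, assuming access to the system:indexes catalog, lists any indexes that are not online:

-- Indexes stuck in a non-online state point at unhealthy index nodes
-- rather than at the query itself.
SELECT name, keyspace_id, state
FROM system:indexes
WHERE state != 'online';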

There is no message like that, but the combined length of the index keys exceeds 64 KB, which is causing this issue. It looks like the indexer process is then not able to respond normally, and that causes the failure.
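If the combined key length is the suspect, one rough way to spot offending documents is to approximate the encoded key size with ENCODED_SIZE() over the indexed fields. This is only a sketch: the bucket and field names are taken from the query in the error above and stand in for whatever the failing index actually covers, and a JSON encoding is only an approximation of the index's internal key encoding:

-- Flag documents whose combined indexed fields encode to more than 64 KB.
SELECT META(d).id,
       ENCODED_SIZE([d.classId, d.accountId, d.packageId]) AS approx_key_bytes
FROM `sessiondb_dep` AS d
WHERE ENCODED_SIZE([d.classId, d.accountId, d.packageId]) > 65536;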