Persistent read error - "Indexer rollback from 127.0.0.1:9101 - cause: Indexer rollback from 127.0.0.1:9101"

Yesterday, I reached a state in Couchbase where I could successfully save to the bucket, but I could not query the bucket. Unfortunately I don’t know how to recreate this, and the problem has now resolved itself. But I would like any info on how to properly deal with this in the future, if it comes up again.

Details:

  • Saving programmatically to the bucket was successful
  • Any query on the bucket produced an error:

[
  {
    "code": 5000,
    "msg": "Indexer rollback from 127.0.0.1:9101 - cause: Indexer rollback from 127.0.0.1:9101",
    "query_from_user": "select * from default"
  }
]

  • This problem continued to happen after flushing the bucket, deleting the bucket and adding a new bucket, stopping & starting the Couchbase server multiple times, and restarting my computer.

  • I was able to drop the index of the bucket, but it would hang if I tried to create the index again. All new buckets I created seemed to have the index already created - I would receive an error (“Index #primary already exists”) when running CREATE PRIMARY INDEX ON 'bucketName' with a new bucket.

  • After stopping & starting the Couchbase server for probably the 5th time, the problem resolved itself. I don’t know if it was specifically the restart that fixed the problem.

Does anyone have any information about how I may have gotten into this state, and how to resolve this issue predictably in the future?

I’m experiencing the same error when querying with request_plus scan consistency.

@amit2, was there any bucket flush operation done on the cluster? We’ll need the logs to see why indexer got the rollback. Which Couchbase server version are you using?

Yes, there was a bucket flush operation done on the cluster. I used the Node.js SDK. It was “completed” in the sense that the flush function callback returned.

I’m using the latest Couchbase Docker image.

Facing the same issue with the Java SDK. The query fails if fired immediately after a flush.

After a bucket is flushed, the indexes need to roll back the indexed data. The notification of the bucket flush reaches the indexer asynchronously, so there can be a short lag between the flush and the indexes being rolled back. During that window, consistent scans can fail with the above error. Does this error go away after, let’s say, a few seconds? If not, please collect the logs from UI -> Logs and share them with us so we can investigate further.

If the error persists, you can restart the indexer process after collecting the logs to unblock.

Yes, if there is a delay of a few seconds between the flush and the query being fired, it works fine. However, I am querying with a scan consistency of ScanConsistency.REQUEST_PLUS. As per my understanding, shouldn’t this ensure that the index is up to date before firing the query? Or is that not the case?

The consistency guarantee is maintained by the index service. It is not checked by the SDK client or the query service before the query is sent down to the indexer. The index service will return the above error for any waiting/in-flight queries whenever it has to process a rollback.

Each insert/update/delete operation in the data service is assigned a unique sequence number (which keeps incrementing). The consistency is based on this sequence number. During rollback, these sequence numbers go back in time and get reused, e.g. a flush would cause the sequence number to go to 0 and be reused. To avoid inconsistency, the scan needs to be retried.
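
For anyone who wants to handle the retry on the client side, here is a minimal sketch in Python of what that could look like. It is not tied to any Couchbase SDK: run_query is a hypothetical zero-argument callable that you would wrap around your own query call, QueryError is a stand-in exception for the example, and the rollback is detected purely by matching the "Indexer rollback" text from the error quoted in this thread (an assumption; your SDK may expose a more specific error type or code). The attempt count and delays are arbitrary starting points.

```python
import time


class QueryError(Exception):
    """Stand-in for whatever exception your SDK raises on a failed query."""


def is_indexer_rollback(error):
    # Assumption: the failure can be recognised by the message text
    # quoted earlier in this thread.
    return "Indexer rollback" in str(error)


def query_with_retry(run_query, max_attempts=5, initial_delay=0.5, sleep=time.sleep):
    """Call run_query(), retrying only when the indexer reports a rollback.

    run_query is a hypothetical zero-argument callable wrapping your query.
    Any other error, or a rollback on the final attempt, is re-raised.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except Exception as error:
            if not is_indexer_rollback(error) or attempt == max_attempts:
                raise
            sleep(delay)  # give the indexer time to finish the rollback
            delay *= 2    # back off a little more on each retry
```

You would call it as query_with_retry(lambda: my_query_function(...)), where my_query_function is whatever you already use to run the request_plus query. If the error still shows up after all attempts, that matches the "persists" case above, where collecting the logs and restarting the indexer is the suggested route.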