Hello,
I’m running a CE 7.2.4 cluster and would like to upgrade to 7.6.2. I read this notice:
Warning: Upgrading from Versions 7.1 or 7.2 to Versions 7.6.0 or 7.6.1
If there are index service nodes running in your cluster, you must use the swap rebalance method when upgrading from Couchbase Server 7.1 or 7.2 to Server 7.6.0 or 7.6.1.
If I understand correctly, this message doesn’t apply in my case because the CE upgrade path goes to 7.6.2 directly. However, I’m a bit unsure whether I can really just do a rolling upgrade without a swap rebalance and a spare node. My fear is that this notice might be outdated because it simply wasn’t updated once 7.6.2 was released. Could you please confirm that it really doesn’t apply when going to 7.6.2 directly? Thank you very much!
I assume that warning exists because of https://jira.issues.couchbase.com/browse/MB-62547. However, plasma is only used in Enterprise Edition, so none of that would apply to CE. If this assumption is correct, then upgrading CE to any 7.6.x would not require swap rebalance.
From 7.6.0 onwards, plasma (Standard Index Storage) uses a changed on-disk format for shard metadata. This includes changing the shard naming scheme to be UUID-based (MB-53667: Use UUID for ShardId; Resolved). A shard is basically a logical entity which stores plasma instances (GSI indexes).
Can you please help summarize the issue and any workaround or guidance we’d give to users?
This affects clusters running the index service.
Due to the issue, after an offline (i.e. in-place) upgrade from 7.0/7.1/7.2 to 7.6.0/7.6.1, shards with both the old and the new naming scheme continue to exist on disk. This means that for an index on a non-default scope/collection, any new index data will be written to the old shard path instead of the new one. The shard metadata, which persists information about the instances, also continues to exist in two places. Over time (especially on restart after the upgrade), this issue can manifest itself in various forms: storage corruption, crashes, rollbacks to zero, mutations getting stuck, or shards being destroyed under the hood.
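The conditions discussed in this thread can be condensed into a small check. This is only a sketch: the function and constant names are made up for illustration, the affected source versions follow the 7.0/7.1/7.2 list from the Jira summary (the documentation notice only names 7.1 and 7.2), and the Enterprise-only condition rests on the assumption stated above that plasma is not used in CE.

```python
from typing import Tuple

# Version sets taken from this thread; names are hypothetical.
AFFECTED_SOURCES = {(7, 0), (7, 1), (7, 2)}   # major.minor of the running cluster
AFFECTED_TARGETS = {(7, 6, 0), (7, 6, 1)}     # the issue is reported fixed in 7.6.2


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse '7.2.4' or '7.6.2-3721' into (major, minor, patch)."""
    parts = text.strip().split(".")
    major, minor = int(parts[0]), int(parts[1])
    # Drop an optional build suffix such as '-3721'; default patch to 0.
    patch = int(parts[2].split("-")[0]) if len(parts) > 2 else 0
    return (major, minor, patch)


def swap_rebalance_required(source: str, target: str,
                            has_index_nodes: bool, enterprise: bool) -> bool:
    """Return True when the warning discussed in this thread would apply."""
    src = parse_version(source)
    tgt = parse_version(target)
    return bool(
        has_index_nodes
        and enterprise                      # assumption: plasma is EE-only
        and src[:2] in AFFECTED_SOURCES
        and tgt in AFFECTED_TARGETS
    )


if __name__ == "__main__":
    # The case from the original question: CE 7.2.4 going to 7.6.2.
    print(swap_rebalance_required("7.2.4", "7.6.2",
                                  has_index_nodes=True, enterprise=False))
```

For the case in the question (CE 7.2.4 to 7.6.2) the check returns False on two independent grounds: the target is not 7.6.0/7.6.1, and the cluster is not Enterprise Edition.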
Workaround:
- The recommended approach is to use swap rebalance (instead of an offline upgrade) when upgrading to 7.6.0/7.6.1. (https://couchbasecloud.atlassian.net/browse/DOC-12334)
- The issue is fixed in 7.6.2.
- For a customer who has already upgraded:
  a) We must drop all the indexes. This can affect production traffic, but I cannot think of any other approach.
  b) Then restart the indexer. (This will clean up both the older and the newly created shards from disk.)
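For step a), the drop statements could be generated from a listing of the existing indexes rather than typed by hand. The sketch below is an illustration, not an official procedure: the helper names are hypothetical, and the input rows are assumed to be dictionaries carrying the `name`, `keyspace_id`, `bucket_id`, `scope_id` and `is_primary` fields as they appear in the `system:indexes` catalog (worth verifying against your version). It only builds statement strings for review and does not execute anything.

```python
from typing import Dict, List


def quote(identifier: str) -> str:
    """Wrap an identifier in backticks; refuse names that contain one."""
    if "`" in identifier:
        raise ValueError(f"unsupported identifier: {identifier!r}")
    return "`" + identifier + "`"


def keyspace_path(row: Dict) -> str:
    """Build the keyspace reference for an index row.

    Assumption: collection-level indexes carry bucket_id and scope_id,
    while bucket-level indexes only carry keyspace_id.
    """
    if row.get("bucket_id"):
        parts = (row["bucket_id"], row["scope_id"], row["keyspace_id"])
        return ".".join(quote(p) for p in parts)
    return quote(row["keyspace_id"])


def drop_statements(rows: List[Dict]) -> List[str]:
    """Return one DROP statement per index row, in input order."""
    statements = []
    for row in rows:
        target = keyspace_path(row)
        if row.get("is_primary") and row["name"] == "#primary":
            # Unnamed primary index: use the dedicated statement form.
            statements.append(f"DROP PRIMARY INDEX ON {target};")
        else:
            statements.append(f"DROP INDEX {quote(row['name'])} ON {target};")
    return statements


if __name__ == "__main__":
    # Hypothetical sample rows, shaped like system:indexes entries.
    sample = [
        {"name": "idx_a", "bucket_id": "travel",
         "scope_id": "inventory", "keyspace_id": "hotel"},
        {"name": "#primary", "keyspace_id": "travel", "is_primary": True},
    ]
    for stmt in drop_statements(sample):
        print(stmt)
```

Reviewing the generated list before running it makes it easier to schedule the drop around production traffic, which the summary above flags as a concern.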