[MB-5050] autofailover should not auto failover anybody when rebalance is marked as in-progress. Or at least be more careful there
Created: 11/Apr/12  Updated: 13/May/12  Resolved: 18/Apr/12

Status: Resolved
Project: Couchbase Server
Component/s: ns_server
Affects Version/s: 1.8.0
Fix Version/s: 1.8.1
Security Level: Public

Type: Bug
Priority: Major
Reporter: Aleksey Kondratenko
Assignee: Aleksey Kondratenko
Resolution: Fixed
Votes: 0
Labels: 1.8.1-release-notes
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

At one customer's deployment:

* rebalance was running
* node orchestrating rebalance failed
* node orchestrating rebalance was automatically failed over by new master
* node orchestrating rebalance re-joined, still thinking it was rebalancing (see MB-5049)
* we got a config conflict, and bad things happened as a result

It looks like we should simply disable autofailover while a rebalance is in progress, because during a rebalance humans can take the correct action themselves.
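
The proposed guard can be sketched as follows. This is a minimal illustration in Python, not the ns_server code (which is Erlang, in src/auto_failover.erl and src/auto_failover_logic.erl). The names `ClusterState` and `pick_node_to_auto_failover` are hypothetical, and the rule that only a single down node is ever failed over is an assumption made for the sketch, not something stated in this ticket.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClusterState:
    """Hypothetical snapshot of what the current master knows about the cluster."""
    rebalance_in_progress: bool
    down_nodes: List[str] = field(default_factory=list)
    auto_failover_enabled: bool = True


def pick_node_to_auto_failover(state: ClusterState) -> Optional[str]:
    """Return the node to fail over automatically, or None to take no action."""
    if not state.auto_failover_enabled:
        return None
    # Guard proposed in this ticket: while a rebalance is marked as in
    # progress, never auto-failover; leave the decision to a human operator.
    if state.rebalance_in_progress:
        return None
    # Assumption for this sketch only: act when exactly one node is down.
    if len(state.down_nodes) != 1:
        return None
    return state.down_nodes[0]


if __name__ == "__main__":
    # The incident above: the orchestrator fails while a rebalance is running.
    during_rebalance = ClusterState(rebalance_in_progress=True, down_nodes=["orchestrator"])
    print(pick_node_to_auto_failover(during_rebalance))
```

With this check in place, the scenario described above would leave the failed orchestrator in the cluster until an operator decides what to do, rather than having the new master fail it over while the rebalance is still marked as running.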

Comment by Dipti Borkar [ 11/Apr/12 ]
Looks like this is a major code change; we should move it to 1.8.2.
Comment by Aleksey Kondratenko [ 11/Apr/12 ]
Hm. It's not that major, and I've just uploaded a fix to Gerrit: http://review.couchbase.org/14789
Comment by Aliaksey Artamonau [ 18/Apr/12 ]
Fix merged.
Comment by Thuan Nguyen [ 20/Apr/12 ]
Integrated in github-ns-server-2-0 #333 (See [http://qa.hq.northscale.net/job/github-ns-server-2-0/333/])
    disallow automatic failover during rebalance. MB-5050 (Revision e7db1454427a6e9cfe0bb2730ba27f025c1fafab)

     Result = SUCCESS
Aliaksey Artamonau :
Files :
* src/auto_failover_logic.erl
* src/auto_failover.erl