[MB-5108] Rolling upgrade from 172 to latest 181 fails with failed rebalance {type,exit}, {what,{noproc, {gen_fsm,sync_send_event,}}} Created: 18/Apr/12 Updated: 09/Jan/13 Resolved: 27/Apr/12 |
|
| Status: | Closed |
| Project: | Couchbase Server |
| Component/s: | ns_server |
| Affects Version/s: | 1.8.1-release-candidate |
| Fix Version/s: | 1.8.1 |
| Security Level: | Public |
| Type: | Bug | Priority: | Blocker |
| Reporter: | Karan Kumar | Assignee: | Aleksey Kondratenko |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | 1.8.1-release-notes | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | Ubuntu 10.04 | ||
| Attachments: |
|
| Description |
|
Failing test
upgradetests.MultipleNodeUpgradeTests.test_upgrade,initial_version=1.7.2,create_buckets=True,insert_data=True,start_upgraded_first=False,load_ratio=10,online_upgrade=True
2012-04-18 05:05:23,369 - root - INFO - adding node : 10.3.121.92:8091 to the cluster
2012-04-18 05:05:23,370 - root - INFO - adding remote node : 10.3.121.92 to this cluster @ : 10.3.121.98
2012-04-18 05:05:24,121 - root - INFO - added node : ns_1@10.3.121.92 to the cluster
2012-04-18 05:05:24,134 - root - INFO - rebalance params : password=password&ejectedNodes=&user=Administrator&knownNodes=ns_1%4010.3.121.94%2Cns_1%4010.3.121.92%2Cns_1%4010.3.121.98%2Cns_1%4010.3.121.93%2Cns_1%4010.3.121.97%2Cns_1%4010.3.121.95
2012-04-18 05:05:24,140 - root - ERROR - http://10.3.121.98:8091/controller/rebalance error 500 reason: unknown ["Unexpected server error, request logged."]
2012-04-18 05:05:24,140 - root - ERROR - rebalance operation failed

INFO REPORT <0.6440.0> 2012-04-18 05:06:53
===============================================================================
ns_log: logging menelaus_web:19:Server error during processing: ["web request failed",
 {path,"/controller/rebalance"},
 {type,exit},
 {what,
  {noproc,
   {gen_fsm,sync_send_event,
    [{global,ns_orchestrator},
     {start_rebalance,
      ['ns_1@10.3.121.94','ns_1@10.3.121.92',
       'ns_1@10.3.121.98','ns_1@10.3.121.93',
       'ns_1@10.3.121.97','ns_1@10.3.121.95'],
      [],[]}]}}},
 {trace,
  [{gen_fsm,sync_send_event,2},
   {menelaus_web,do_handle_rebalance,3},
   {menelaus_web,loop,3},
   {mochiweb_http,headers,5},
   {proc_lib,init_p_do_apply,3}]}] |
| Comments |
| Comment by Karan Kumar [ 18/Apr/12 ] |
| This looks to be a regression in ns_server |
| Comment by Aleksey Kondratenko [ 18/Apr/12 ] |
|
Found this to be an issue in the old ns_server. Rebalance requests need to either be sent to a node running the new version, or the caller needs to work around the issue by waiting and retrying. The commit that fixed it (for 1.8.0) is:
commit d45ccaab92158d4a4fc882d3216d1557b7b39816
Author: Aliaksey Kandratsenka <alk@tut.by>
Date: Tue Nov 29 15:10:09 2011 +0300
    wait for orchestrator presense for key operations. |
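The "waiting and retrying" workaround for the noproc failure could look roughly like the Python sketch below. It is only an illustration, not the test suite's actual code: the helper names (build_rebalance_params, post_rebalance, rebalance_with_retry), the retry count and the delay are all assumptions, and authentication is left out. The form fields mirror the "rebalance params" line in the log above. It targets only the HTTP 500 returned while ns_orchestrator is not yet registered.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def build_rebalance_params(known_nodes, ejected_nodes, user, password):
    """URL-encode the form fields seen in the logged rebalance request."""
    return urllib.parse.urlencode({
        "knownNodes": ",".join(known_nodes),
        "ejectedNodes": ",".join(ejected_nodes),
        "user": user,
        "password": password,
    })


def post_rebalance(base_url, params):
    """POST to /controller/rebalance and return the HTTP status code."""
    request = urllib.request.Request(
        base_url + "/controller/rebalance", data=params.encode("ascii"))
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # A 500 here is what the test saw while the orchestrator was absent.
        return err.code


def rebalance_with_retry(send, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call send() until it returns something other than 500.

    Returns the last status seen; the attempt count and delay are
    illustrative values, not taken from the ticket.
    """
    status = None
    for attempt in range(attempts):
        status = send()
        if status != 500:
            return status
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return status
```

A caller would wrap the request in a closure, for example rebalance_with_retry(lambda: post_rebalance("http://10.3.121.98:8091", params)). Passing the sender as a callable keeps the retry logic testable without a running cluster.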
| Comment by Aleksey Kondratenko [ 18/Apr/12 ] |
| Not a "bug". |
| Comment by Karan Kumar [ 26/Apr/12 ] |
|
Still failing. The suggested workaround does not work. In the test we are issuing the rebalance call to the newly upgraded 1.8.1 node, and the rebalance still fails:
Rebalance exited with reason {{case_clause,
    {badrpc,
     {'EXIT',
      {{badfun,#Fun<erl_eval.4.88154533>},
       [{erlang,apply,2},
        {rpc,'-handle_call_call/6-fun-0-',5}]}}}},
 [{ns_vbm_sup,change_vbucket_filter,4},
  {ns_vbm_sup,'-set_replicas/3-fun-2-',5},
  {lists,foldl,3},
  {ns_vbm_sup,set_replicas,3},
  {ns_vbm_sup,'-set_replicas/2-fun-1-',3},
  {lists,foreach,2},
  {ns_vbm_sup,apply_changes,2},
  {ns_vbucket_mover,sync_replicas,0}]} |
| Comment by Karan Kumar [ 26/Apr/12 ] |
| Waiting for the newly added node to become orchestrator does not solve this issue either. |
| Comment by Aleksey Kondratenko [ 26/Apr/12 ] |
| That's a different failure. Thanks for finding it. |
| Comment by Aleksey Kondratenko [ 27/Apr/12 ] |
| Fixed in http://review.couchbase.org/15366 |
| Comment by Thuan Nguyen [ 28/Apr/12 ] |
|
Integrated in github-ns-server-2-0 #342 (See [http://qa.hq.northscale.net/job/github-ns-server-2-0/342/])
reimplemented backwards-compat for change_vbucket_filter.
forward-ported new change_filter code (023a90b14).
Result = SUCCESS
Aliaksey Kandratsenka : Files : * src/ns_vbm_sup.erl
Aliaksey Kandratsenka : Files : * src/ns_vbm_sup.erl * src/cb_gen_vbm_sup.erl |