Details
Description
Failing test case:
swaprebalance.SwapRebalanceFailedTests.test_add_back_failed_node,replica=1,num-buckets=1,num-swap=3
[user:warn] [2012-05-21 18:30:02] [ns_1@10.1.3.74:ns_node_disco:ns_node_disco:handle_info:150] Node 'ns_1@10.1.3.74' saw that node 'ns_1@10.1.3.77' went down.
[ns_server:info] [2012-05-21 18:30:02] [ns_1@10.1.3.74:ns_node_disco_events:ns_node_disco_log:handle_event:46] ns_node_disco_log: nodes changed: ['ns_1@10.1.3.74','ns_1@10.1.3.76',
'ns_1@10.1.3.79','ns_1@10.1.3.80']
[ns_server:warn] [2012-05-21 18:30:02] [ns_1@10.1.3.74:mb_master:mb_master:master:399] Master got candidate heartbeat from node 'ns_1@10.1.3.75' which is not in peers ['ns_1@10.1.3.74',
'ns_1@10.1.3.76',
'ns_1@10.1.3.79',
'ns_1@10.1.3.80']
[rebalance:warn] [2012-05-21 18:30:03] [ns_1@10.1.3.74:<0.975.0>:ebucketmigrator_srv:do_confirm_sent_messages:321] Got error while trying to read close ack:{error,closed}
[ns_server:info] [2012-05-21 18:30:03] [ns_1@10.1.3.74:<0.1440.0>:ns_vbm_sup:kill_child:214] Stopped replicator:{child_id,[0,1],'ns_1@10.1.3.75'} on {'ns_1@10.1.3.74',
"default"}
[user:info] [2012-05-21 18:30:03] [ns_1@10.1.3.74:<0.217.0>:ns_orchestrator:handle_info:245] Rebalance exited with reason {shutdown,
{gen_server,call,
[{'ns_vbm_sup-default','ns_1@10.1.3.76'},
which_children,infinity]}}
[ns_server:info] [2012-05-21 18:30:03] [ns_1@10.1.3.74:<0.1688.0>:diag_handler:log_all_tap_and_checkpoint_stats:123] logging tap & checkpoint stats
[ns_server:debug] [2012-05-21 18:30:03] [ns_1@10.1.3.74:ns_config_log:ns_config_log:log_common:111] config change:
counters ->
[{rebalance_fail,1},
{rebalance_start,2},
{failover_node,3},
{rebalance_success,1}]
*) During rebalance, the replication supervisor on .76 finally died after exceeding max_restart_intensity, because it could not replicate into the bucket on .77, which no longer existed.
*) Right at that moment we were asking it for its children (which_children) in order to make replication changes, so the rebalance failed.
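
The race described above can be illustrated with a small toy model. This is only a sketch in Python, not ns_server or OTP code: `ToySupervisor`, `SupervisorDown`, and the limits of 3 restarts in 10 seconds are hypothetical names and values chosen for illustration. It assumes the usual OTP rule that a supervisor shuts itself down once more than the allowed number of restarts happens within the configured period. In the real failure the which_children call was already in flight when the supervisor died; the toy simplifies that to a call made just after the shutdown.

```python
from collections import deque


class SupervisorDown(Exception):
    """Hypothetical error: the caller talked to a supervisor that has shut down."""


class ToySupervisor:
    """Toy model of a supervisor with a restart-intensity limit (illustrative only)."""

    def __init__(self, max_restarts, period_seconds):
        self.max_restarts = max_restarts      # restarts tolerated within the period
        self.period_seconds = period_seconds  # length of the sliding window
        self.restart_times = deque()          # timestamps of recent restarts
        self.alive = True
        self.children = ["replicator"]

    def child_crashed(self, now):
        """Record a child crash at time `now` and restart it, or give up."""
        if not self.alive:
            return
        self.restart_times.append(now)
        # Forget restarts that fell out of the sliding window.
        while self.restart_times and now - self.restart_times[0] > self.period_seconds:
            self.restart_times.popleft()
        # Too many restarts inside the window: the supervisor itself goes down.
        if len(self.restart_times) > self.max_restarts:
            self.alive = False
            self.children = []

    def which_children(self):
        """Return the current children, or fail if the supervisor is gone."""
        if not self.alive:
            raise SupervisorDown("shutdown")
        return list(self.children)


# The replicator keeps crashing because its target bucket no longer exists.
sup = ToySupervisor(max_restarts=3, period_seconds=10)
for t in range(4):
    sup.child_crashed(now=t)

# The rebalance step then asks the dead supervisor for its children and fails.
try:
    sup.which_children()
except SupervisorDown as exc:
    print("rebalance step failed:", exc)
```

In this model, the same four crashes spread further apart than the period would never trip the limit, which is why the failure only shows up when the replicator fails repeatedly in quick succession.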