Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 2.0
Fix Version/s: 2.0
Component/s: couchbase-bucket, ns_server
Security Level: Public
Labels:
Environment: build 10.3.3.59
Description
One node went down while loading data on a 22-node cluster (possibly related to the Xen hypervisor, as the node could not ping its gateway and the network interface had to be restarted).
While the node was down, I tried to fail it over and rebalance.
However, the rebalance never completes, and there appears to be no rebalance activity occurring on TAP.
Some activity seen in the logs at the time the node went down:
10.3.3.59 sees .60 go down:
[user:warn,2012-10-22T11:06:38.896,ns_1@10.3.3.59:ns_node_disco:ns_node_disco:handle_info:168]Node 'ns_1@10.3.3.59' saw that node 'ns_1@10.3.3.60' went down.
At around the same time, node .60 shows:
[ns_server:error,2012-10-22T11:06:00.350,ns_1@10.3.3.60:<0.12281.36>:ns_janitor:cleanup_with_states:84]Bucket "default" not yet ready on ['ns_1@10.3.2.84','ns_1@10.3.2.85',
'ns_1@10.3.2.110','ns_1@10.3.2.111',
'ns_1@10.3.2.112','ns_1@10.3.2.113',
'ns_1@10.3.2.114','ns_1@10.3.2.115',
'ns_1@10.3.3.59','ns_1@10.3.3.62',
'ns_1@10.3.3.65','ns_1@10.3.3.66',
'ns_1@10.3.3.69','ns_1@10.3.3.70',
'ns_1@10.3.121.90','ns_1@10.3.121.91',
'ns_1@10.3.2.107','ns_1@10.3.2.108',
'ns_1@10.3.2.109']
[ns_server:debug,2012-10-22T11:06:07.388,ns_1@10.3.3.60:<0.12508.36>:janitor_agent:new_style_query_vbucket_states_loop:116]Exception from query_vbucket_states of "default":'ns_1@10.3.2.85'
{'EXIT',{{nodedown,'ns_1@10.3.2.85'},
{gen_server,call,
[{'janitor_agent-default','ns_1@10.3.2.85'},
query_vbucket_states,infinity]}}}
Activity
Mike Wiederhold
made changes -
| Field | Original Value | New Value |
|---|---|---|
| Assignee | Mike Wiederhold [ mikew ] | Aleksey Kondratenko [ alkondratenko ] |
Chisheng Hong
made changes -
| Labels | system-test |
Farshid Ghods
made changes -
| Priority | Major [ 3 ] | Blocker [ 1 ] |
Aleksey Kondratenko
made changes -
| Assignee | Aleksey Kondratenko [ alkondratenko ] | Chiyoung Seo [ chiyoung ] |
Chiyoung Seo
made changes -
| Sprint Status | Current Sprint | |
| Component/s | couchbase-bucket [ 10173 ] |
Chiyoung Seo
made changes -
| Status | Open [ 1 ] | Resolved [ 5 ] |
| Resolution | Fixed [ 1 ] |
Chiyoung Seo
made changes -
| Sprint Status | Current Sprint |
Farshid Ghods
made changes -
| Status | Resolved [ 5 ] | Closed [ 6 ] |
[couchdb:error,2012-10-21T23:07:11.144,ns_1@127.0.0.1:couch_view:couch_log:error:42]Exit on non-updater process: config_change
[couchdb:error,2012-10-21T23:07:11.144,ns_1@127.0.0.1:couch_set_view:couch_log:error:42]Exit on non-updater process: config_change
[error_logger:error,2012-10-21T23:07:11.144,ns_1@127.0.0.1:error_logger:ale_error_logger_handler:log_report:72]
=========================CRASH REPORT=========================
crasher:
initial call: couch_server:init/1
pid: <0.216.0>
registered_name: couch_server
exception exit: {function_clause,
[{couch_server,'-terminate/2-fun-0-',
[{<<"_replicator">>,<0.444.0>}]},
{lists,foreach,2},
{gen_server,terminate,6},
{proc_lib,init_p_do_apply,3}]}
in function gen_server:terminate/6
ancestors: [couch_primary_services,couch_server_sup,cb_couch_sup,
ns_server_cluster_sup,<0.60.0>]
messages: []
links: [<0.230.0>,<0.444.0>,<0.211.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 2584
stack_size: 24
reductions: 4530
neighbours: