Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Affects Version/s: 1.8.1-release-candidate
- Fix Version/s: 2.0
- Component/s: couchbase-bucket
- Security Level: Public
- Labels:
- Environment: Ubuntu 64 bit, 181-831-rel
Description
Failing test:
rebalancetests.RebalanceInOutWithParallelLoad.test_load,get-logs:True,replica:2,num_nodes:7
[ns_server:info] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15105.17>:ns_replicas_builder:kill_a_bunch_of_tap_names:209] Killed the following tap names on 'ns_1@10.1.3.109': [<<"replication_building_62_'ns_1@10.1.3.112'">>]
[ns_server:info] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15104.17>:ns_single_vbucket_mover:mover_inner:88] Got exit message (parent is <0.13796.17>). Exiting...
{'EXIT',<0.15105.17>,{replicator_died,{'EXIT',<16541.21868.10>,normal}}}
[ns_server:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15104.17>:ns_replicas_builder:sync_shutdown_many:147] Shutdown of the following failed: [{<0.15105.17>,
{replicator_died,
{'EXIT',<16541.21868.10>,normal}}}]
[error_logger:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:error_logger:ale_error_logger_handler:log_report:72]
=========================CRASH REPORT=========================
crasher:
initial call: erlang:apply/2
pid: <0.15105.17>
registered_name: []
exception exit: {replicator_died,{'EXIT',<16541.21868.10>,normal}}
in function ns_replicas_builder:'-build_replicas_main/6-fun-0-'/1
in call from ns_replicas_builder:observe_wait_all_done_tail/5
in call from ns_replicas_builder:observe_wait_all_done/5
in call from ns_replicas_builder:'-build_replicas_main/6-fun-1-'/8
in call from ns_replicas_builder:try_with_maybe_ignorant_after/2
in call from ns_replicas_builder:build_replicas_main/6
ancestors: [<0.15104.17>,<0.13796.17>,<0.13763.17>]
messages: [{'EXIT',<16541.21868.10>,normal}]
links: [<0.15104.17>]
dictionary: []
trap_exit: true
status: running
heap_size: 121393
stack_size: 24
reductions: 12423
neighbours:
[ns_server:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15104.17>:ns_replicas_builder:try_with_maybe_ignorant_after:68] Eating exception from ignorant after-block:
{error,{badmatch,[{<0.15105.17>,
{replicator_died,{'EXIT',<16541.21868.10>,normal}}}]},
[{ns_replicas_builder,sync_shutdown_many,1},
{ns_replicas_builder,try_with_maybe_ignorant_after,2},
{ns_single_vbucket_mover,mover,6},
{proc_lib,init_p_do_apply,3}]}
[rebalance:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.13796.17>:ns_vbucket_mover:handle_info:158] <0.15104.17> exited with {exited,
{'EXIT',<0.15105.17>,
{replicator_died,
{'EXIT',<16541.21868.10>,normal}}}}
[ns_server:info] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.9348.1>:ns_port_server:log:161] memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Schedule the backfill for vbucket 61
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "opaque_enable_auto_nack" and vbucket 0
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "enable_checkpoint_sync" and vbucket 0
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_VBUCKET_SET with vbucket 61 and state "pending"
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "initial_vbucket_stream" and vbucket 61
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Backfill is completed with VBuckets 61,
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "close_backfill" and vbucket 61
memcached<0.9348.1>: Vbucket <61> is going dead.
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_VBUCKET_SET with vbucket 61 and state "active"
memcached<0.9348.1>: TAP takeover is completed. Disconnecting tap stream <eq_tapq:rebalance_61>
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Schedule the backfill for vbucket 62
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "opaque_enable_auto_nack" and vbucket 0
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "enable_checkpoint_sync" and vbucket 0
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "initial_vbucket_stream" and vbucket 62
[error_logger:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:error_logger:ale_error_logger_handler:log_report:72]
=========================CRASH REPORT=========================
crasher:
initial call: ns_single_vbucket_mover:mover/6
pid: <0.15104.17>
registered_name: []
exception exit: {exited,
{'EXIT',<0.15105.17>,
{replicator_died,
{'EXIT',<16541.21868.10>,normal}}}}
in function ns_single_vbucket_mover:mover_inner/6
in call from ns_replicas_builder:try_with_maybe_ignorant_after/2
in call from ns_single_vbucket_mover:mover/6
ancestors: [<0.13796.17>,<0.13763.17>]
messages: []
links: [<0.13796.17>]
dictionary: [{cleanup_list,[<0.15105.17>]}]
trap_exit: true
status: running
heap_size: 987
stack_size: 24
reductions: 4014
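For triage, the log excerpt above can be split into entries and filtered by level. The following is a minimal sketch in Python. The header layout `[logger:level] [timestamp] [node:process:module:function:line]` is inferred only from the lines quoted in this ticket, and the field names (`logger`, `process`, `continuation`, and so on) are labels chosen for illustration, not ns_server terminology. Lines that do not start with a header are treated as continuations of the previous entry.

```python
import re

# Header layout inferred from the log lines in this ticket (an assumption,
# not a documented format):
#   [logger:level] [timestamp] [node:process:module:function:line] message
HEADER = re.compile(
    r"^\[(?P<logger>[^:\]]+):(?P<level>[^\]]+)\] "
    r"\[(?P<timestamp>[^\]]+)\] "
    r"\[(?P<node>[^:\]]+):(?P<process>[^:\]]+):(?P<module>[^:\]]+):"
    r"(?P<function>[^:\]]+):(?P<line>\d+)\]"
    r"\s*(?P<message>.*)$"
)


def parse_entries(lines):
    """Group raw log lines into entries; non-header lines attach to the
    most recent entry as continuation text."""
    entries = []
    for raw in lines:
        text = raw.rstrip("\n")
        match = HEADER.match(text)
        if match:
            entry = match.groupdict()
            entry["line"] = int(entry["line"])
            entry["continuation"] = []
            entries.append(entry)
        elif entries:
            entries[-1]["continuation"].append(text)
    return entries


def errors_only(entries):
    """Keep only entries logged at the 'error' level."""
    return [e for e in entries if e["level"] == "error"]


if __name__ == "__main__":
    sample = [
        "[ns_server:info] [2012-05-21 8:17:31] "
        "[ns_1@10.1.3.109:<0.15104.17>:ns_single_vbucket_mover:mover_inner:88] "
        "Got exit message (parent is <0.13796.17>). Exiting...",
        "[ns_server:error] [2012-05-21 8:17:31] "
        "[ns_1@10.1.3.109:<0.15104.17>:ns_replicas_builder:sync_shutdown_many:147] "
        "Shutdown of the following failed: [{<0.15105.17>,",
        "{replicator_died,",
        "{'EXIT',<16541.21868.10>,normal}}}]",
    ]
    for e in errors_only(parse_entries(sample)):
        print(e["module"], e["function"], e["line"])
```

Applied to the full excerpt, filtering on the `error` level would leave the `sync_shutdown_many`, `try_with_maybe_ignorant_after`, and `ns_vbucket_mover:handle_info` entries plus the two crash reports, which is the chain that ends the rebalance.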
Activity
Aleksey Kondratenko
made changes -
| Field | Original Value | New Value |
|---|---|---|
| Assignee | Aleksey Kondratenko [ alkondratenko ] | Chiyoung Seo [ chiyoung ] |
Chiyoung Seo
made changes -
| Sprint Status | Current Sprint | |
| Sprint Priority | 0 |
Chiyoung Seo
made changes -
| Component/s | couchbase-bucket [ 10173 ] | |
| Component/s | ns_server [ 10019 ] |
Farshid Ghods
made changes -
| Labels | 1.8.1-release-notes | |
| Fix Version/s | 2.0-developer-preview-5 [ 10290 ] | |
| Fix Version/s | 1.8.1 [ 10295 ] | |
| Priority | Blocker [ 1 ] | Critical [ 2 ] |
| Sprint Status | Current Sprint | |
| Sprint Priority | 0 |
Karan Kumar
made changes -
| Attachment | 2ae93467-b468-4d0e-966e-7f14cdc3bb1f-10.1.3.85-diag.gz [ 13566 ] | |
| Attachment | 2ae93467-b468-4d0e-966e-7f14cdc3bb1f-10.1.3.82-diag.gz [ 13567 ] | |
| Attachment | 2ae93467-b468-4d0e-966e-7f14cdc3bb1f-10.1.3.83-diag.gz [ 13568 ] | |
| Attachment | 2ae93467-b468-4d0e-966e-7f14cdc3bb1f-10.1.3.84-diag.gz [ 13569 ] | |
| Attachment | 2ae93467-b468-4d0e-966e-7f14cdc3bb1f-10.1.3.86-diag.gz [ 13570 ] |
Karan Kumar
made changes -
| Attachment | 32326eea-21d4-4ef0-bf82-233b681a5336-10.1.3.111-diag.gz [ 13643 ] | |
| Attachment | 32326eea-21d4-4ef0-bf82-233b681a5336-10.1.3.120-diag.gz [ 13644 ] | |
| Attachment | 32326eea-21d4-4ef0-bf82-233b681a5336-10.1.3.119-diag.gz [ 13645 ] | |
| Attachment | 32326eea-21d4-4ef0-bf82-233b681a5336-10.1.3.112-diag.gz [ 13646 ] | |
| Attachment | 32326eea-21d4-4ef0-bf82-233b681a5336-10.1.3.121-diag.gz [ 13647 ] |
Ketaki Gangal
made changes -
| Comment | [ Yes, the password for the bucket Bucket1 was changed at some point on the cluster, before the last rebalance was issued. ] |
Peter Wansch
made changes -
| Fix Version/s | 2.0-beta [ 10113 ] | |
| Fix Version/s | 2.0-developer-preview-5 [ 10290 ] |
Peter Wansch
made changes -
| Summary | Rebalance failed due to replicator_died: exited (ns_single_vbucket_mover) | memcached dropped connections: Rebalance failed due to replicator_died: exited (ns_single_vbucket_mover) |
Peter Wansch
made changes -
| Assignee | Chiyoung Seo [ chiyoung ] | Farshid Ghods [ farshid ] |
Farshid Ghods
made changes -
| Sprint Status | Next Sprint |
Andrei Baranouski
made changes -
| Attachment | 10.5.2.13-8091-diag.txt.gz [ 14253 ] | |
| Attachment | 10.5.2.14-8091-diag.txt.gz [ 14254 ] | |
| Attachment | 10.5.2.15-8091-diag.txt.gz [ 14255 ] | |
| Attachment | 10.5.2.16-8091-diag.txt.gz [ 14256 ] | |
| Attachment | 10.5.2.18-8091-diag.txt.gz [ 14257 ] | |
| Attachment | 10.5.2.19-8091-diag.txt.gz [ 14258 ] |
Peter Wansch
made changes -
| Fix Version/s | 2.0 [ 10114 ] | |
| Fix Version/s | 2.0-beta [ 10113 ] |
Peter Wansch
made changes -
| Status | Open [ 1 ] | Closed [ 6 ] |
| Resolution | Fixed [ 1 ] |
Peter Wansch
made changes -
| Resolution | Fixed [ 1 ] | |
| Status | Closed [ 6 ] | Reopened [ 4 ] |
| Sprint Status | Next Sprint |
Peter Wansch
made changes -
| Status | Reopened [ 4 ] | Closed [ 6 ] |
| Resolution | Fixed [ 1 ] |
Frank Weigel
made changes -
| Resolution | Fixed [ 1 ] | |
| Status | Closed [ 6 ] | Reopened [ 4 ] |
| Assignee | Farshid Ghods [ farshid ] | Frank Weigel [ frank ] |
Frank Weigel
made changes -
| Status | Reopened [ 4 ] | Resolved [ 5 ] |
| Resolution | Fixed [ 1 ] |