Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 1.8.1-release-candidate
Fix Version/s: 2.0
Component/s: couchbase-bucket
Security Level: Public
Labels:
Environment: Ubuntu 64 bit
181-831-rel
Description
Failing test:
rebalancetests.RebalanceInOutWithParallelLoad.test_load,get-logs:True,replica:2,num_nodes:7
[ns_server:info] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15105.17>:ns_replicas_builder:kill_a_bunch_of_tap_names:209] Killed the following tap names on 'ns_1@10.1.3.109': [<<"replication_building_62_'ns_1@10.1.3.112'">>]
[ns_server:info] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15104.17>:ns_single_vbucket_mover:mover_inner:88] Got exit message (parent is <0.13796.17>). Exiting...
{'EXIT',<0.15105.17>,{replicator_died,{'EXIT',<16541.21868.10>,normal}}}
[ns_server:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15104.17>:ns_replicas_builder:sync_shutdown_many:147] Shutdown of the following failed: [{<0.15105.17>,
{replicator_died,
{'EXIT',<16541.21868.10>,normal}}}]
[error_logger:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:error_logger:ale_error_logger_handler:log_report:72]
=========================CRASH REPORT=========================
crasher:
initial call: erlang:apply/2
pid: <0.15105.17>
registered_name: []
exception exit: {replicator_died,{'EXIT',<16541.21868.10>,normal}}
in function ns_replicas_builder:'-build_replicas_main/6-fun-0-'/1
in call from ns_replicas_builder:observe_wait_all_done_tail/5
in call from ns_replicas_builder:observe_wait_all_done/5
in call from ns_replicas_builder:'-build_replicas_main/6-fun-1-'/8
in call from ns_replicas_builder:try_with_maybe_ignorant_after/2
in call from ns_replicas_builder:build_replicas_main/6
ancestors: [<0.15104.17>,<0.13796.17>,<0.13763.17>]
messages: [{'EXIT',<16541.21868.10>,normal}]
links: [<0.15104.17>]
dictionary: []
trap_exit: true
status: running
heap_size: 121393
stack_size: 24
reductions: 12423
neighbours:
[ns_server:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.15104.17>:ns_replicas_builder:try_with_maybe_ignorant_after:68] Eating exception from ignorant after-block:
{error,{badmatch,[{<0.15105.17>,
{replicator_died,{'EXIT',<16541.21868.10>,normal}}}]},
[{ns_replicas_builder,sync_shutdown_many,1},
{ns_replicas_builder,try_with_maybe_ignorant_after,2},
{ns_single_vbucket_mover,mover,6},
{proc_lib,init_p_do_apply,3}]}
[rebalance:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.13796.17>:ns_vbucket_mover:handle_info:158] <0.15104.17> exited with {exited,
{'EXIT',<0.15105.17>,
{replicator_died,
{'EXIT',<16541.21868.10>,normal}}}}
[ns_server:info] [2012-05-21 8:17:31] [ns_1@10.1.3.109:<0.9348.1>:ns_port_server:log:161] memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Schedule the backfill for vbucket 61
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "opaque_enable_auto_nack" and vbucket 0
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "enable_checkpoint_sync" and vbucket 0
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_VBUCKET_SET with vbucket 61 and state "pending"
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "initial_vbucket_stream" and vbucket 61
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Backfill is completed with VBuckets 61,
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_OPAQUE with command "close_backfill" and vbucket 61
memcached<0.9348.1>: Vbucket <61> is going dead.
memcached<0.9348.1>: TAP (Producer) eq_tapq:rebalance_61 - Sending TAP_VBUCKET_SET with vbucket 61 and state "active"
memcached<0.9348.1>: TAP takeover is completed. Disconnecting tap stream <eq_tapq:rebalance_61>
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Schedule the backfill for vbucket 62
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "opaque_enable_auto_nack" and vbucket 0
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "enable_checkpoint_sync" and vbucket 0
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "initial_vbucket_stream" and vbucket 62
[error_logger:error] [2012-05-21 8:17:31] [ns_1@10.1.3.109:error_logger:ale_error_logger_handler:log_report:72]
=========================CRASH REPORT=========================
crasher:
initial call: ns_single_vbucket_mover:mover/6
pid: <0.15104.17>
registered_name: []
exception exit: {exited,
{'EXIT',<0.15105.17>,
{replicator_died,
{'EXIT',<16541.21868.10>,normal}}}}
in function ns_single_vbucket_mover:mover_inner/6
in call from ns_replicas_builder:try_with_maybe_ignorant_after/2
in call from ns_single_vbucket_mover:mover/6
ancestors: [<0.13796.17>,<0.13763.17>]
messages: []
links: [<0.13796.17>]
dictionary: [{cleanup_list,[<0.15105.17>]}]
trap_exit: true
status: running
heap_size: 987
stack_size: 24
reductions: 4014
[ns_server:info] [2012-05-21 8:17:31] [ns_1@10.1.3.112:<0.21868.10>:ebucketmigrator_srv:init:181] Setting {"10.1.3.112",11209} vbucket 62 to state replica
[rebalance:debug] [2012-05-21 8:17:31] [ns_1@10.1.3.112:<0.21868.10>:ebucketmigrator_srv:init:186] CheckpointIdsDict:
{dict,64,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[[0|1],[16|1],[32|1],[48|1]],
[[3|1],[19|1],[35|1],[51|1]],
[[6|1],[22|1],[38|1],[54|1]],
[[9|1],[25|1],[41|1],[57|1]],
[[12|1],[28|1],[44|1],[60|1]],
[[15|1],[31|1],[47|1],[63|1]],
[[2|1],[18|1],[34|1],[50|1]],
[[5|1],[21|1],[37|1],[53|1]],
[[8|1],[24|1],[40|1],[56|1]],
[[11|1],[27|1],[43|1],[59|1]],
[[14|1],[30|1],[46|1],[62|1]],
[[1|1],[17|1],[33|1],[49|1]],
[[4|1],[20|1],[36|1],[52|1]],
[[7|1],[23|1],[39|1],[55|1]],
[[10|1],[26|1],[42|1],[58|1]],
[[13|1],[29|1],[45|1],[61|1]]}}}
[ns_server:debug] [2012-05-21 8:17:31] [ns_1@10.1.3.112:<0.21868.10>:ebucketmigrator_srv:init:209] killing tap named: replication_building_62_'ns_1@10.1.3.112'
[rebalance:debug] [2012-05-21 8:17:31] [ns_1@10.1.3.112:<0.21868.10>:ebucketmigrator_srv:init:247] upstream_sender pid: <0.21869.10>
[rebalance:info] [2012-05-21 8:17:31] [ns_1@10.1.3.112:<0.21868.10>:ebucketmigrator_srv:process_upstream:447] Initial stream for vbucket 62
[rebalance:info] [2012-05-21 8:17:31] [ns_1@10.1.3.112:<0.21868.10>:ebucketmigrator_srv:do_confirm_sent_messages:315] Got close ack!
And these are the memcached logs from the source node (.109):
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Schedule the backfill for vbucket 62
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "opaque_enable_auto_nack" and vbucket 0
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "enable_checkpoint_sync" and vbucket 0
memcached<0.9348.1>: TAP (Producer) eq_tapq:replication_building_62_'ns_1@10.1.3.112' - Sending TAP_OPAQUE with command "initial_vbucket_stream" and vbucket 62
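To follow one process across these excerpts (for example <0.15105.17> on .109 versus <0.21868.10> on .112), it can help to split each entry header into its fields and group entries by node and pid. The following is only a rough sketch: the header layout is inferred from the lines quoted in this ticket, not from any ns_server specification, and the names HEADER_RE, parse_header and group_by_process are hypothetical helpers, not part of ns_server or testrunner.

```python
import re
from collections import defaultdict

# Assumed header layout, inferred from the log lines quoted in this ticket:
# [category:level] [timestamp] [node:process:module:function:line] message
HEADER_RE = re.compile(
    r"^\[(?P<category>\w+):(?P<level>\w+)\] "
    r"\[(?P<timestamp>[^\]]+)\] "
    r"\[(?P<node>[^:\]]+):(?P<process>[^:\]]+):"
    r"(?P<module>\w+):(?P<function>\w+):(?P<line>\d+)\]"
    r"\s*(?P<message>.*)$"
)


def parse_header(raw):
    """Return the header fields of one log line, or None for a continuation line."""
    match = HEADER_RE.match(raw)
    if match is None:
        return None
    entry = match.groupdict()
    entry["line"] = int(entry["line"])
    return entry


def group_by_process(lines):
    """Group entries by (node, process); continuation lines join the previous entry."""
    groups = defaultdict(list)
    current = None
    for raw in lines:
        entry = parse_header(raw)
        if entry is not None:
            current = entry
            groups[(entry["node"], entry["process"])].append(entry)
        elif current is not None:
            # Lines without a header (crash report bodies, Erlang terms)
            # are treated as part of the preceding entry's message.
            current["message"] += "\n" + raw
    return dict(groups)


# Two lines copied from the description above.
sample = [
    "[ns_server:info] [2012-05-21 8:17:31] "
    "[ns_1@10.1.3.109:<0.15104.17>:ns_single_vbucket_mover:mover_inner:88] "
    "Got exit message (parent is <0.13796.17>). Exiting...",
    "{'EXIT',<0.15105.17>,{replicator_died,{'EXIT',<16541.21868.10>,normal}}}",
]

for (node, process), entries in group_by_process(sample).items():
    for entry in entries:
        print(node, process, entry["module"], entry["function"], entry["level"])
```

Grouping on the (node, process) pair rather than on the pid alone matters here because the same replicator shows up under different pids depending on the node: <16541.21868.10> as seen from .109 appears to be <0.21868.10> in the .112 log.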