[MB-7280] [windows] rebalance hangs while load is running Created: 28/Nov/12  Updated: 02/Apr/13  Resolved: 31/Jan/13

Status: Resolved
Project: Couchbase Server
Component/s: ns_server
Affects Version/s: 2.0
Fix Version/s: 2.0.1
Security Level: Public

Type: Bug Priority: Blocker
Reporter: Iryna Mironava Assignee: Thuan Nguyen
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Windows Server 2008 R2
build 1967
<manifest>
<remote name="couchbase" fetch="git://10.1.1.210/"/>
<remote name="membase" fetch="git://10.1.1.210/"/>
<remote name="apache" fetch="git://github.com/apache/"/>
<remote name="erlang" fetch="git://github.com/erlang/"/>
<default remote="couchbase" revision="master"/>
<project name="tlm" path="tlm" revision="12abea946eafd7411273d18a10ae1f84390db3d4">
<copyfile dest="Makefile" src="Makefile.top"/>
</project>
<project name="bucket_engine" path="bucket_engine" revision="70b3624abc697b7d18bf3d57f331b7674544e1e7"/>
<project name="ep-engine" path="ep-engine" revision="de42c8cab65eb4ec7107c1c43c333df0ef8ff73d"/>
<project name="libconflate" path="libconflate" revision="2cc8eff8e77d497d9f03a30fafaecb85280535d6"/>
<project name="libmemcached" path="libmemcached" revision="ca739a890349ac36dc79447e37da7caa9ae819f5" remote="membase"/>
<project name="libvbucket" path="libvbucket" revision="00d3763593c116e8e5d97aa0b646c42885727398"/>
<project name="membase-cli" path="membase-cli" revision="50a8ec94023aff2e2a756c1b0c144a9a6b82dc9b" remote="membase"/>
<project name="memcached" path="memcached" revision="7ea975a93a0231393502af4ca98976eee8a83386" remote="membase"/>
<project name="moxi" path="moxi" revision="52a5fa887bfff0bf719c4ee5f29634dd8707500e"/>
<project name="ns_server" path="ns_server" revision="4216b8613f5457efd9b6391bfec57529a074b291"/>
<project name="portsigar" path="portsigar" revision="1bc865e1622fb93a3fe0d1a4cdf18eb97ed9d600"/>
<project name="sigar" path="sigar" revision="63a3cd1b316d2d4aa6dd31ce8fc66101b983e0b0"/>
<project name="couchbase-examples" path="couchbase-examples" revision="c59551adf11860963c1bba028adf33529a4a4d4a"/>
<project name="couchbase-python-client" path="couchbase-python-client" revision="006c1aa8b76f6bce11109af8a309133b57079c4c"/>
<project name="couchdb" path="couchdb" revision="31560d74c3bbe8c019186923a9db3468a8197ab8"/>
<project name="couchdbx-app" path="couchdbx-app" revision="7b8eb60114fdc3b8e79baf8cc8ec87d32be71983"/>
<project name="couchstore" path="couchstore" revision="b5937c4479bf05dcc67264efe19abaf52870a127"/>
<project name="geocouch" path="geocouch" revision="849d5443689b1924f097548af864c539bffcc929"/>
<project name="mccouch" path="mccouch" revision="88701cc326bc3dde4ed072bb8441be83adcfb2a5"/>
<project name="testrunner" path="testrunner" revision="5b5b7c2a69d0ae5225baea541cf6df6e4478a1ab"/>
<project name="otp" path="otp" revision="b6dc1a844eab061d0a7153d46e7e68296f15a504" remote="erlang"/>
<project name="icu4c" path="icu4c" revision="26359393672c378f41f2103a8699c4357c894be7" remote="couchbase"/>
<project name="snappy" path="snappy" revision="5681dde156e9d07adbeeab79666c9a9d7a10ec95" remote="couchbase"/>
<project name="v8" path="v8" revision="447decb75060a106131ab4de934bcc374648e7f2" remote="couchbase"/>
<project name="gperftools" path="gperftools" revision="8f60ba949fb8576c530ef4be148bff97106ddc59" remote="couchbase"/>
<project name="pysqlite" path="pysqlite" revision="0ff6e32ea05037fddef1eb41a648f2a2141009ea" remote="couchbase"/>
</manifest>

Attachments: Text File tap_rebal_hangs_upgrade.txt    

 Description   
 Rebalance hangs while load is running.
1 node, rebalance 1 -> 2, 5 views (1 view per ddoc), load running via mcsoda.
Once the load starts, the rebalance hangs.
Statistics all and tap:
 accepting_conns: 1
 auth_cmds: 4
 auth_errors: 0
 bucket_active_conns: 1
 bucket_conns: 26
 bytes: 755258232
 bytes_read: 2673284540
 bytes_written: 2824684578
 cas_badval: 0
 cas_hits: 0
 cas_misses: 0
 cmd_flush: 0
 cmd_get: 1280
 cmd_set: 4134710
 conn_yields: 113110
 connection_structures: 5000
 curr_connections: 26
 curr_conns_on_port_11209: 18
 curr_conns_on_port_11210: 6
 curr_items: 2027168
 curr_items_tot: 3618437
 curr_temp_items: 0
 daemon_connections: 4
 decr_hits: 0
 decr_misses: 0
 delete_hits: 0
 delete_misses: 0
 ep_access_scanner_last_runtime: 22
 ep_access_scanner_num_items: 3383889
 ep_access_scanner_task_time: 2012-11-29 10:00:00
 ep_allow_data_loss_during_shutdown: 1
 ep_alog_block_size: 4096
 ep_alog_path: c:/Program Files/Couchbase/Server/var/lib/couchbase/data/default/access.log
 ep_alog_sleep_time: 1440
 ep_alog_task_time: 10
 ep_backend: couchdb
 ep_bg_fetch_delay: 0
 ep_bg_fetched: 38
 ep_bg_load: 931092
 ep_bg_load_avg: 24502
 ep_bg_max_load: 618147
 ep_bg_max_wait: 3941
 ep_bg_meta_fetched: 0
 ep_bg_min_load: 397
 ep_bg_min_wait: 66
 ep_bg_num_samples: 38
 ep_bg_remaining_jobs: 0
 ep_bg_wait: 16623
 ep_bg_wait_avg: 437
 ep_chk_max_items: 5000
 ep_chk_period: 1800
 ep_chk_persistence_remains: 0
 ep_chk_persistence_timeout: 20
 ep_chk_remover_stime: 5
 ep_commit_num: 657
 ep_commit_time: 6759
 ep_commit_time_total: 235174
 ep_concurrentDB: 1
 ep_config_file:
 ep_couch_bucket: default
 ep_couch_host: localhost
 ep_couch_port: 11213
 ep_couch_reconnect_sleeptime: 250
 ep_couch_response_timeout: 180000
 ep_data_age: 2888
 ep_data_age_highwat: 3580
 ep_data_traffic_enabled: 0
 ep_dbinit: 1
 ep_dbname: c:/Program Files/Couchbase/Server/var/lib/couchbase/data/default
 ep_degraded_mode: 0
 ep_diskqueue_drain: 4935087
 ep_diskqueue_fill: 5721297
 ep_diskqueue_items: 824556
 ep_diskqueue_memory: 25198208
 ep_diskqueue_pending: 43177657
 ep_exp_pager_stime: 3600
 ep_expired_access: 0
 ep_expired_pager: 0
 ep_expiry_window: 3
 ep_failpartialwarmup: 0
 ep_flush_all: false
 ep_flush_duration_total: 8736
 ep_flushall_enabled: 0
 ep_flusher_state: running
 ep_flusher_todo: 579935
 ep_getl_default_timeout: 15
 ep_getl_max_timeout: 30
 ep_ht_locks: 5
 ep_ht_size: 3079
 ep_inconsistent_slave_chk: 0
 ep_initfile:
 ep_io_num_read: 38
 ep_io_num_write: 4842366
 ep_io_read_bytes: 10745
 ep_io_write_bytes: 1356419842
 ep_item_begin_failed: 0
 ep_item_commit_failed: 0
 ep_item_flush_expired: 0
 ep_item_flush_failed: 0
 ep_item_num_based_new_chk: 1
 ep_items_rm_from_checkpoints: 4644424
 ep_keep_closed_chks: 0
 ep_klog_block_size: 4096
 ep_klog_compactor_queue_cap: 500000
 ep_klog_compactor_stime: 3600
 ep_klog_flush: commit2
 ep_klog_max_entry_ratio: 10
 ep_klog_max_log_size: 2147483647
 ep_klog_path:
 ep_klog_sync: commit2
 ep_kv_size: 575124879
 ep_max_checkpoints: 2
 ep_max_data_size: 838860800
 ep_max_item_size: 20971520
 ep_max_size: 838860800
 ep_max_txn_size: 10000
 ep_max_vbuckets: 1024
 ep_mem_high_wat: 629145600
 ep_mem_low_wat: 503316480
 ep_mem_tracker_enabled: true
 ep_min_data_age: 0
 ep_mlog_compactor_runs: 0
 ep_mutation_mem_threshold: 0
 ep_num_access_scanner_runs: 1
 ep_num_eject_failures: 276090282
 ep_num_expiry_pager_runs: 3
 ep_num_non_resident: 2884986
 ep_num_not_my_vbuckets: 6
 ep_num_ops_del_meta: 0
 ep_num_ops_get_meta: 1
 ep_num_ops_set_meta: 0
 ep_num_pager_runs: 352
 ep_num_value_ejects: 3245555
 ep_oom_errors: 0
 ep_overhead: 93729316
 ep_pager_active_vb_pcnt: 40
 ep_pager_unbiased_period: 60
 ep_pending_ops: 0
 ep_pending_ops_max: 0
 ep_pending_ops_max_duration: 0
 ep_pending_ops_total: 0
 ep_postInitfile:
 ep_queue_age_cap: 900
 ep_queue_size: 244621
 ep_startup_time: 1354086142
 ep_storage_age: 2888
 ep_storage_age_highwat: 3579
 ep_store_max_concurrency: 10
 ep_store_max_readers: 9
 ep_store_max_readwrite: 1
 ep_stored_val_type:
 ep_tap_ack_grace_period: 300
 ep_tap_ack_initial_sequence_number: 1
 ep_tap_ack_interval: 1000
 ep_tap_ack_window_size: 10
 ep_tap_backfill_resident: 0.9
 ep_tap_backlog_limit: 5000
 ep_tap_backoff_period: 5
 ep_tap_bg_fetch_requeued: 0
 ep_tap_bg_fetched: 1978
 ep_tap_bg_max_pending: 500
 ep_tap_keepalive: 300
 ep_tap_noop_interval: 20
 ep_tap_requeue_sleep_time: 0.1
 ep_tap_throttle_cap_pcnt: 10
 ep_tap_throttle_queue_cap: 1000000
 ep_tap_throttle_threshold: 90
 ep_tmp_oom_errors: 408666
 ep_too_old: 245644
 ep_too_young: 0
 ep_total_cache_size: 531703635
 ep_total_del_items: 0
 ep_total_enqueued: 5943389
 ep_total_new_items: 3060472
 ep_total_persisted: 4842366
 ep_uncommitted_items: 4544
 ep_value_size: 246296364
 ep_vb0: 0
 ep_vb_snapshot_total: 602
 ep_vb_total: 922
 ep_vbucket_del: 102
 ep_vbucket_del_avg_walltime: 1184789
 ep_vbucket_del_fail: 0
 ep_vbucket_del_max_walltime: 6511942
 ep_version: 2.0.0r_140_gde42c8c
 ep_waitforwarmup: 0
 ep_warmup: 1
 ep_warmup_batch_size: 1000
 ep_warmup_dups: 0
 ep_warmup_min_items_threshold: 100
 ep_warmup_min_memory_threshold: 100
 ep_warmup_oom: 0
 ep_warmup_thread: complete
 ep_warmup_time: 2049
 get_hits: 1165
 get_misses: 115
 incr_hits: 0
 incr_misses: 0
 libevent: 2.0.11-stable
 limit_maxbytes: 67108864
 listen_disabled_num: 0
 max_conns_on_port_11209: 1000
 max_conns_on_port_11210: 9000
 mem_used: 755258232
 pid: 2940
 pointer_size: 64
 rejected_conns: 0
 tap_checkpoint_end_received: 6023
 tap_checkpoint_end_sent: 9611
 tap_checkpoint_start_received: 6778
 tap_checkpoint_start_sent: 10551
 tap_connect_received: 310
 tap_mutation_received: 3981711
 tap_mutation_sent: 3854114
 tap_opaque_received: 1877
 tap_opaque_sent: 2073
 tap_vbucket_set_received: 242
 tap_vbucket_set_sent: 204
 threads: 4
 time: 1354098953
 total_connections: 1606
 uptime: 13936
 vb_active_curr_items: 2027168
 vb_active_eject: 1930964
 vb_active_expired: 0
 vb_active_ht_memory: 13219776
 vb_active_itm_memory: 266946416
 vb_active_meta_data_memory: 171586408
 vb_active_num: 531
 vb_active_num_non_resident: 1736718
 vb_active_num_ref_ejects: 1734941
 vb_active_num_ref_items: 78790
 vb_active_ops_create: 1758886
 vb_active_ops_delete: 0
 vb_active_ops_reject: 0
 vb_active_ops_update: 1015722
 vb_active_perc_mem_resident: 14
 vb_active_queue_age: 703101627000
 vb_active_queue_drain: 2918662
 vb_active_queue_fill: 3212663
 vb_active_queue_memory: 9447520
 vb_active_queue_pending: 16159447
 vb_active_queue_size: 295235
 vb_dead_num: 0
 vb_pending_curr_items: 0
 vb_pending_eject: 0
 vb_pending_expired: 0
 vb_pending_ht_memory: 0
 vb_pending_itm_memory: 0
 vb_pending_meta_data_memory: 0
 vb_pending_num: 0
 vb_pending_num_non_resident: 0
 vb_pending_num_ref_ejects: 0
 vb_pending_num_ref_items: 0
 vb_pending_ops_create: 0
 vb_pending_ops_delete: 0
 vb_pending_ops_reject: 0
 vb_pending_ops_update: 0
 vb_pending_perc_mem_resident: 0
 vb_pending_queue_age: 0
 vb_pending_queue_drain: 0
 vb_pending_queue_fill: 0
 vb_pending_queue_memory: 0
 vb_pending_queue_pending: 0
 vb_pending_queue_size: 0
 vb_replica_curr_items: 1591269
 vb_replica_eject: 1313893
 vb_replica_expired: 0
 vb_replica_ht_memory: 9734336
 vb_replica_itm_memory: 264757219
 vb_replica_meta_data_memory: 134821945
 vb_replica_num: 391
 vb_replica_num_non_resident: 1148268
 vb_replica_num_ref_ejects: 1277577
 vb_replica_num_ref_items: 60822
 vb_replica_ops_create: 1172973
 vb_replica_ops_delete: 0
 vb_replica_ops_reject: 0
 vb_replica_ops_update: 679912
 vb_replica_perc_mem_resident: 27
 vb_replica_queue_age: 1161936223000
 vb_replica_queue_drain: 2016425
 vb_replica_queue_fill: 2508634
 vb_replica_queue_memory: 15750688
 vb_replica_queue_pending: 27018210
 vb_replica_queue_size: 492209
 version: 1.4.4_600_g7ea975a

 ep_tap_ack_grace_period: 300
 ep_tap_ack_interval: 1000
 ep_tap_ack_window_size: 10
 ep_tap_backoff_period: 5
 ep_tap_bg_fetch_requeued: 0
 ep_tap_bg_fetched: 1978
 ep_tap_bg_max_pending: 500
 ep_tap_count: 6
 ep_tap_deletes: 0
 ep_tap_fg_fetched: 3973157
 ep_tap_noop_interval: 20
 ep_tap_queue_backfillremaining: 0
 ep_tap_queue_backoff: 79439
 ep_tap_queue_drain: 3837591
 ep_tap_queue_fill: 0
 ep_tap_queue_itemondisk: 0
 ep_tap_throttle_queue_cap: 1000000
 ep_tap_throttle_threshold: 90
 ep_tap_throttled: 892516
 ep_tap_total_backlog_size: 8
 ep_tap_total_fetched: 3997574
 ep_tap_total_queue: 0
 eq_tapq:anon_1:connected: true
 eq_tapq:anon_1:created: 1127
 eq_tapq:anon_1:num_checkpoint_end: 6057
 eq_tapq:anon_1:num_checkpoint_end_failed: 0
 eq_tapq:anon_1:num_checkpoint_start: 6569
 eq_tapq:anon_1:num_checkpoint_start_failed: 0
 eq_tapq:anon_1:num_delete: 0
 eq_tapq:anon_1:num_delete_failed: 0
 eq_tapq:anon_1:num_flush: 0
 eq_tapq:anon_1:num_flush_failed: 0
 eq_tapq:anon_1:num_mutation: 3246722
 eq_tapq:anon_1:num_mutation_failed: 970546
 eq_tapq:anon_1:num_opaque: 1270
 eq_tapq:anon_1:num_opaque_failed: 0
 eq_tapq:anon_1:num_unknown: 0
 eq_tapq:anon_1:num_vbucket_set: 0
 eq_tapq:anon_1:num_vbucket_set_failed: 0
 eq_tapq:anon_1:pending_disconnect: false
 eq_tapq:anon_1:reserved: 0
 eq_tapq:anon_1:supports_ack: true
 eq_tapq:anon_1:type: consumer
 eq_tapq:anon_448:connected: true
 eq_tapq:anon_448:created: 13890
 eq_tapq:anon_448:num_checkpoint_end: 1
 eq_tapq:anon_448:num_checkpoint_end_failed: 0
 eq_tapq:anon_448:num_checkpoint_start: 2
 eq_tapq:anon_448:num_checkpoint_start_failed: 0
 eq_tapq:anon_448:num_delete: 0
 eq_tapq:anon_448:num_delete_failed: 0
 eq_tapq:anon_448:num_flush: 0
 eq_tapq:anon_448:num_flush_failed: 0
 eq_tapq:anon_448:num_mutation: 558
 eq_tapq:anon_448:num_mutation_failed: 6915
 eq_tapq:anon_448:num_opaque: 2
 eq_tapq:anon_448:num_opaque_failed: 0
 eq_tapq:anon_448:num_unknown: 0
 eq_tapq:anon_448:num_vbucket_set: 0
 eq_tapq:anon_448:num_vbucket_set_failed: 0
 eq_tapq:anon_448:pending_disconnect: false
 eq_tapq:anon_448:reserved: 0
 eq_tapq:anon_448:supports_ack: true
 eq_tapq:anon_448:type: consumer
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':ack_log_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':ack_seqno: 7386
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':ack_window_full: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':backfill_completed: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':backfill_start_timestamp: 1354092278
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':bg_jobs_completed: 1978
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':bg_jobs_issued: 1978
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':bg_result_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':connected: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':created: 7261
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':flags: 85 (ack,backfill,vblist,checkpoints)
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':has_queued_item: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':idle: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':paused: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':pending_backfill: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':pending_disconnect: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':pending_disk_backfill: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':qlen: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':qlen_high_pri: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':qlen_low_pri: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_backfillremaining: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_backoff: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_drain: 7352
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_fill: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_itemondisk: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_memory: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':rec_fetched: 5403
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':recv_ack_seqno: 7385
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':reserved: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':seqno_ack_requested: 7385
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':supports_ack: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':suspended: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':total_backlog_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':total_noops: 103
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':type: producer
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':vb_filter: { 272 }
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':vb_filters: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':ack_log_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':ack_seqno: 4205
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':ack_window_full: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':backfill_completed: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':backfill_start_timestamp: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':bg_jobs_completed: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':bg_jobs_issued: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':bg_result_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':connected: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':created: 7261
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':flags: 85 (ack,backfill,vblist,checkpoints)
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':has_queued_item: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':idle: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':paused: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':pending_backfill: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':pending_disconnect: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':pending_disk_backfill: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':qlen: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':qlen_high_pri: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':qlen_low_pri: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_backfillremaining: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_backoff: 154
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_drain: 4175
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_fill: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_itemondisk: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_memory: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':rec_fetched: 4202
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':recv_ack_seqno: 4204
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':reserved: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':seqno_ack_requested: 4204
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':supports_ack: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':suspended: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':total_backlog_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':total_noops: 99
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':type: producer
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':vb_filter: { 272 }
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':vb_filters: 1
 eq_tapq:replication_ns_1@10.1.3.135:ack_log_size: 0
 eq_tapq:replication_ns_1@10.1.3.135:ack_seqno: 561390
 eq_tapq:replication_ns_1@10.1.3.135:ack_window_full: false
 eq_tapq:replication_ns_1@10.1.3.135:backfill_completed: true
 eq_tapq:replication_ns_1@10.1.3.135:backfill_start_timestamp: 0
 eq_tapq:replication_ns_1@10.1.3.135:bg_jobs_completed: 0
 eq_tapq:replication_ns_1@10.1.3.135:bg_jobs_issued: 0
 eq_tapq:replication_ns_1@10.1.3.135:bg_result_size: 0
 eq_tapq:replication_ns_1@10.1.3.135:connected: true
 eq_tapq:replication_ns_1@10.1.3.135:created: 4873
 eq_tapq:replication_ns_1@10.1.3.135:flags: 85 (ack,backfill,vblist,checkpoints)
 eq_tapq:replication_ns_1@10.1.3.135:has_queued_item: true
 eq_tapq:replication_ns_1@10.1.3.135:idle: false
 eq_tapq:replication_ns_1@10.1.3.135:paused: 1
 eq_tapq:replication_ns_1@10.1.3.135:pending_backfill: false
 eq_tapq:replication_ns_1@10.1.3.135:pending_disconnect: false
 eq_tapq:replication_ns_1@10.1.3.135:pending_disk_backfill: false
 eq_tapq:replication_ns_1@10.1.3.135:qlen: 0
 eq_tapq:replication_ns_1@10.1.3.135:qlen_high_pri: 0
 eq_tapq:replication_ns_1@10.1.3.135:qlen_low_pri: 0
 eq_tapq:replication_ns_1@10.1.3.135:queue_backfillremaining: 0
 eq_tapq:replication_ns_1@10.1.3.135:queue_backoff: 0
 eq_tapq:replication_ns_1@10.1.3.135:queue_drain: 557409
 eq_tapq:replication_ns_1@10.1.3.135:queue_fill: 0
 eq_tapq:replication_ns_1@10.1.3.135:queue_itemondisk: 0
 eq_tapq:replication_ns_1@10.1.3.135:queue_memory: 0
 eq_tapq:replication_ns_1@10.1.3.135:rec_fetched: 561267
 eq_tapq:replication_ns_1@10.1.3.135:recv_ack_seqno: 561389
 eq_tapq:replication_ns_1@10.1.3.135:reserved: 1
 eq_tapq:replication_ns_1@10.1.3.135:seqno_ack_requested: 561389
 eq_tapq:replication_ns_1@10.1.3.135:supports_ack: true
 eq_tapq:replication_ns_1@10.1.3.135:suspended: false
 eq_tapq:replication_ns_1@10.1.3.135:total_backlog_size: 4
 eq_tapq:replication_ns_1@10.1.3.135:total_noops: 79
 eq_tapq:replication_ns_1@10.1.3.135:type: producer
 eq_tapq:replication_ns_1@10.1.3.135:vb_filter: { [682,802] }
 eq_tapq:replication_ns_1@10.1.3.135:vb_filters: 121
 eq_tapq:replication_ns_1@10.1.3.147:ack_log_size: 0
 eq_tapq:replication_ns_1@10.1.3.147:ack_seqno: 3286570
 eq_tapq:replication_ns_1@10.1.3.147:ack_window_full: false
 eq_tapq:replication_ns_1@10.1.3.147:backfill_completed: true
 eq_tapq:replication_ns_1@10.1.3.147:backfill_start_timestamp: 1354086144
 eq_tapq:replication_ns_1@10.1.3.147:bg_jobs_completed: 0
 eq_tapq:replication_ns_1@10.1.3.147:bg_jobs_issued: 0
 eq_tapq:replication_ns_1@10.1.3.147:bg_result_size: 0
 eq_tapq:replication_ns_1@10.1.3.147:connected: true
 eq_tapq:replication_ns_1@10.1.3.147:created: 1127
 eq_tapq:replication_ns_1@10.1.3.147:flags: 85 (ack,backfill,vblist,checkpoints)
 eq_tapq:replication_ns_1@10.1.3.147:has_queued_item: true
 eq_tapq:replication_ns_1@10.1.3.147:idle: false
 eq_tapq:replication_ns_1@10.1.3.147:paused: 1
 eq_tapq:replication_ns_1@10.1.3.147:pending_backfill: false
 eq_tapq:replication_ns_1@10.1.3.147:pending_disconnect: false
 eq_tapq:replication_ns_1@10.1.3.147:pending_disk_backfill: false
 eq_tapq:replication_ns_1@10.1.3.147:qlen: 0
 eq_tapq:replication_ns_1@10.1.3.147:qlen_high_pri: 0
 eq_tapq:replication_ns_1@10.1.3.147:qlen_low_pri: 0
 eq_tapq:replication_ns_1@10.1.3.147:queue_backfillremaining: 0
 eq_tapq:replication_ns_1@10.1.3.147:queue_backoff: 79285
 eq_tapq:replication_ns_1@10.1.3.147:queue_drain: 3268655
 eq_tapq:replication_ns_1@10.1.3.147:queue_fill: 0
 eq_tapq:replication_ns_1@10.1.3.147:queue_itemondisk: 0
 eq_tapq:replication_ns_1@10.1.3.147:queue_memory: 0
 eq_tapq:replication_ns_1@10.1.3.147:rec_fetched: 3285440
 eq_tapq:replication_ns_1@10.1.3.147:recv_ack_seqno: 3286569
 eq_tapq:replication_ns_1@10.1.3.147:reserved: 1
 eq_tapq:replication_ns_1@10.1.3.147:seqno_ack_requested: 3286569
 eq_tapq:replication_ns_1@10.1.3.147:supports_ack: true
 eq_tapq:replication_ns_1@10.1.3.147:suspended: false
 eq_tapq:replication_ns_1@10.1.3.147:total_backlog_size: 4
 eq_tapq:replication_ns_1@10.1.3.147:total_noops: 143
 eq_tapq:replication_ns_1@10.1.3.147:type: producer
 eq_tapq:replication_ns_1@10.1.3.147:vb_filter: { [0,169], [273,511] }
 eq_tapq:replication_ns_1@10.1.3.147:vb_filters: 409
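
One way to compare the snapshot above with a later one is to diff the integer counters. The sketch below is illustrative only and not part of the original report: the helper names parse_stats and stat_deltas are hypothetical, and the sample lines are a handful of values copied from the two dumps in this ticket (the first snapshot and the one taken 30 minutes later).

```python
def parse_stats(text):
    """Parse ' name: value' lines into a dict, keeping only integer-valued stats."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the LAST ': ' because tap queue names contain colons themselves.
        name, sep, value = line.rpartition(": ")
        if not sep:
            continue  # e.g. 'ep_config_file:' has no value
        try:
            stats[name] = int(value)
        except ValueError:
            pass  # skip strings, floats, timestamps, vb_filter lists, flags
    return stats


def stat_deltas(before, after):
    """Return after - before for every integer stat present in both snapshots."""
    return {name: after[name] - before[name] for name in before if name in after}


# A few lines copied from the first snapshot in this ticket.
BEFORE = """
 ep_tmp_oom_errors: 408666
 vb_active_queue_drain: 2918662
 vb_replica_queue_drain: 2016425
 vb_replica_queue_fill: 2508634
 eq_tapq:anon_1:num_mutation_failed: 970546
 ep_flusher_state: running
"""

# The same stats from the snapshot taken 30 minutes later.
AFTER = """
 ep_tmp_oom_errors: 755084
 vb_active_queue_drain: 3012854
 vb_replica_queue_drain: 2016425
 vb_replica_queue_fill: 2538024
 eq_tapq:anon_1:num_mutation_failed: 1160954
 ep_flusher_state: running
"""

deltas = stat_deltas(parse_stats(BEFORE), parse_stats(AFTER))
for name in sorted(deltas):
    print(f"{name}: {deltas[name]:+d}")
```

Counters whose delta is 0 across the interval (in this sample, vb_replica_queue_drain) did not advance between the two samples, while the others kept growing.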


Statistics after 30 minutes:
 accepting_conns: 1
 auth_cmds: 4
 auth_errors: 0
 bucket_active_conns: 1
 bucket_conns: 26
 bytes: 755127144
 bytes_read: 2885054305
 bytes_written: 2937374687
 cas_badval: 0
 cas_hits: 0
 cas_misses: 0
 cmd_flush: 0
 cmd_get: 1280
 cmd_set: 4523945
 conn_yields: 122744
 connection_structures: 5000
 curr_connections: 26
 curr_conns_on_port_11209: 18
 curr_conns_on_port_11210: 6
 curr_items: 2053924
 curr_items_tot: 3674443
 curr_temp_items: 0
 daemon_connections: 4
 decr_hits: 0
 decr_misses: 0
 delete_hits: 0
 delete_misses: 0
 ep_access_scanner_last_runtime: 22
 ep_access_scanner_num_items: 3383889
 ep_access_scanner_task_time: 2012-11-29 10:00:00
 ep_allow_data_loss_during_shutdown: 1
 ep_alog_block_size: 4096
 ep_alog_path: c:/Program Files/Couchbase/Server/var/lib/couchbase/data/default/access.log
 ep_alog_sleep_time: 1440
 ep_alog_task_time: 10
 ep_backend: couchdb
 ep_bg_fetch_delay: 0
 ep_bg_fetched: 38
 ep_bg_load: 931092
 ep_bg_load_avg: 24502
 ep_bg_max_load: 618147
 ep_bg_max_wait: 3941
 ep_bg_meta_fetched: 0
 ep_bg_min_load: 397
 ep_bg_min_wait: 66
 ep_bg_num_samples: 38
 ep_bg_remaining_jobs: 0
 ep_bg_wait: 16623
 ep_bg_wait_avg: 437
 ep_chk_max_items: 5000
 ep_chk_period: 1800
 ep_chk_persistence_remains: 0
 ep_chk_persistence_timeout: 20
 ep_chk_remover_stime: 5
 ep_commit_num: 666
 ep_commit_time: 5106
 ep_commit_time_total: 280469
 ep_concurrentDB: 1
 ep_config_file:
 ep_couch_bucket: default
 ep_couch_host: localhost
 ep_couch_port: 11213
 ep_couch_reconnect_sleeptime: 250
 ep_couch_response_timeout: 180000
 ep_data_age: 4156
 ep_data_age_highwat: 4200
 ep_data_traffic_enabled: 0
 ep_dbinit: 1
 ep_dbname: c:/Program Files/Couchbase/Server/var/lib/couchbase/data/default
 ep_degraded_mode: 0
 ep_diskqueue_drain: 5029279
 ep_diskqueue_fill: 5789725
 ep_diskqueue_items: 798546
 ep_diskqueue_memory: 24373760
 ep_diskqueue_pending: 41692209
 ep_exp_pager_stime: 3600
 ep_expired_access: 0
 ep_expired_pager: 0
 ep_expiry_window: 3
 ep_failpartialwarmup: 0
 ep_flush_all: false
 ep_flush_duration_total: 8736
 ep_flushall_enabled: 0
 ep_flusher_state: running
 ep_flusher_todo: 485497
 ep_getl_default_timeout: 15
 ep_getl_max_timeout: 30
 ep_ht_locks: 5
 ep_ht_size: 3079
 ep_inconsistent_slave_chk: 0
 ep_initfile:
 ep_io_num_read: 38
 ep_io_num_write: 4933340
 ep_io_read_bytes: 10745
 ep_io_write_bytes: 1382289598
 ep_item_begin_failed: 0
 ep_item_commit_failed: 0
 ep_item_flush_expired: 0
 ep_item_flush_failed: 0
 ep_item_num_based_new_chk: 1
 ep_items_rm_from_checkpoints: 4644424
 ep_keep_closed_chks: 0
 ep_klog_block_size: 4096
 ep_klog_compactor_queue_cap: 500000
 ep_klog_compactor_stime: 3600
 ep_klog_flush: commit2
 ep_klog_max_entry_ratio: 10
 ep_klog_max_log_size: 2147483647
 ep_klog_path:
 ep_klog_sync: commit2
 ep_kv_size: 571960225
 ep_max_checkpoints: 2
 ep_max_data_size: 838860800
 ep_max_item_size: 20971520
 ep_max_size: 838860800
 ep_max_txn_size: 10000
 ep_max_vbuckets: 1024
 ep_mem_high_wat: 629145600
 ep_mem_low_wat: 503316480
 ep_mem_tracker_enabled: true
 ep_min_data_age: 0
 ep_mlog_compactor_runs: 0
 ep_mutation_mem_threshold: 0
 ep_num_access_scanner_runs: 1
 ep_num_eject_failures: 347203302
 ep_num_expiry_pager_runs: 3
 ep_num_non_resident: 2976309
 ep_num_not_my_vbuckets: 6
 ep_num_ops_del_meta: 0
 ep_num_ops_get_meta: 1
 ep_num_ops_set_meta: 0
 ep_num_pager_runs: 412
 ep_num_value_ejects: 3336878
 ep_oom_errors: 0
 ep_overhead: 93967377
 ep_pager_active_vb_pcnt: 40
 ep_pager_unbiased_period: 60
 ep_pending_ops: 0
 ep_pending_ops_max: 0
 ep_pending_ops_max_duration: 0
 ep_pending_ops_total: 0
 ep_postInitfile:
 ep_queue_age_cap: 900
 ep_queue_size: 313049
 ep_startup_time: 1354086142
 ep_storage_age: 4154
 ep_storage_age_highwat: 4200
 ep_store_max_concurrency: 10
 ep_store_max_readers: 9
 ep_store_max_readwrite: 1
 ep_stored_val_type:
 ep_tap_ack_grace_period: 300
 ep_tap_ack_initial_sequence_number: 1
 ep_tap_ack_interval: 1000
 ep_tap_ack_window_size: 10
 ep_tap_backfill_resident: 0.9
 ep_tap_backlog_limit: 5000
 ep_tap_backoff_period: 5
 ep_tap_bg_fetch_requeued: 0
 ep_tap_bg_fetched: 1978
 ep_tap_bg_max_pending: 500
 ep_tap_keepalive: 300
 ep_tap_noop_interval: 20
 ep_tap_requeue_sleep_time: 0.1
 ep_tap_throttle_cap_pcnt: 10
 ep_tap_throttle_queue_cap: 1000000
 ep_tap_throttle_threshold: 90
 ep_tmp_oom_errors: 755084
 ep_too_old: 336634
 ep_too_young: 0
 ep_total_cache_size: 527866909
 ep_total_del_items: 0
 ep_total_enqueued: 6011817
 ep_total_new_items: 3142974
 ep_total_persisted: 4933340
 ep_uncommitted_items: 6277
 ep_value_size: 237995966
 ep_vb0: 0
 ep_vb_snapshot_total: 603
 ep_vb_total: 922
 ep_vbucket_del: 102
 ep_vbucket_del_avg_walltime: 1184789
 ep_vbucket_del_fail: 0
 ep_vbucket_del_max_walltime: 6511942
 ep_version: 2.0.0r_140_gde42c8c
 ep_waitforwarmup: 0
 ep_warmup: 1
 ep_warmup_batch_size: 1000
 ep_warmup_dups: 0
 ep_warmup_min_items_threshold: 100
 ep_warmup_min_memory_threshold: 100
 ep_warmup_oom: 0
 ep_warmup_thread: complete
 ep_warmup_time: 2049
 get_hits: 1165
 get_misses: 115
 incr_hits: 0
 incr_misses: 0
 libevent: 2.0.11-stable
 limit_maxbytes: 67108864
 listen_disabled_num: 0
 max_conns_on_port_11209: 1000
 max_conns_on_port_11210: 9000
 mem_used: 755127144
 pid: 2940
 pointer_size: 64
 rejected_conns: 0
 tap_checkpoint_end_received: 6135
 tap_checkpoint_end_sent: 9611
 tap_checkpoint_start_received: 6890
 tap_checkpoint_start_sent: 10551
 tap_connect_received: 310
 tap_mutation_received: 4241246
 tap_mutation_sent: 3973519
 tap_opaque_received: 1877
 tap_opaque_sent: 2073
 tap_vbucket_set_received: 242
 tap_vbucket_set_sent: 204
 threads: 4
 time: 1354099572
 total_connections: 1607
 uptime: 14555
 vb_active_curr_items: 2053924
 vb_active_eject: 2022287
 vb_active_expired: 0
 vb_active_ht_memory: 13219776
 vb_active_itm_memory: 252869508
 vb_active_meta_data_memory: 173860668
 vb_active_num: 531
 vb_active_num_non_resident: 1828041
 vb_active_num_ref_ejects: 1735016
 vb_active_num_ref_items: 105546
 vb_active_ops_create: 1841439
 vb_active_ops_delete: 0
 vb_active_ops_reject: 0
 vb_active_ops_update: 1024159
 vb_active_perc_mem_resident: 10
 vb_active_queue_age: 555566181000
 vb_active_queue_drain: 3012854
 vb_active_queue_fill: 3251701
 vb_active_queue_memory: 7682592
 vb_active_queue_pending: 13086939
 vb_active_queue_size: 240081
 vb_dead_num: 0
 vb_pending_curr_items: 0
 vb_pending_eject: 0
 vb_pending_expired: 0
 vb_pending_ht_memory: 0
 vb_pending_itm_memory: 0
 vb_pending_meta_data_memory: 0
 vb_pending_num: 0
 vb_pending_num_non_resident: 0
 vb_pending_num_ref_ejects: 0
 vb_pending_num_ref_items: 0
 vb_pending_ops_create: 0
 vb_pending_ops_delete: 0
 vb_pending_ops_reject: 0
 vb_pending_ops_update: 0
 vb_pending_perc_mem_resident: 0
 vb_pending_queue_age: 0
 vb_pending_queue_drain: 0
 vb_pending_queue_fill: 0
 vb_pending_queue_memory: 0
 vb_pending_queue_pending: 0
 vb_pending_queue_size: 0
 vb_replica_curr_items: 1620519
 vb_replica_eject: 1313893
 vb_replica_expired: 0
 vb_replica_ht_memory: 9734336
 vb_replica_itm_memory: 274997401
 vb_replica_meta_data_memory: 137308195
 vb_replica_num: 391
 vb_replica_num_non_resident: 1148268
 vb_replica_num_ref_ejects: 1277577
 vb_replica_num_ref_items: 90072
 vb_replica_ops_create: 1172973
 vb_replica_ops_delete: 0
 vb_replica_ops_reject: 0
 vb_replica_ops_update: 679912
 vb_replica_perc_mem_resident: 29
 vb_replica_queue_age: 1475291239000
 vb_replica_queue_drain: 2016425
 vb_replica_queue_fill: 2538024
 vb_replica_queue_memory: 16691168
 vb_replica_queue_pending: 28605270
 vb_replica_queue_size: 521599
 version: 1.4.4_600_g7ea975a


 ep_tap_ack_grace_period: 300
 ep_tap_ack_interval: 1000
 ep_tap_ack_window_size: 10
 ep_tap_backoff_period: 5
 ep_tap_bg_fetch_requeued: 0
 ep_tap_bg_fetched: 1978
 ep_tap_bg_max_pending: 500
 ep_tap_count: 6
 ep_tap_deletes: 0
 ep_tap_fg_fetched: 4042426
 ep_tap_noop_interval: 20
 ep_tap_queue_backfillremaining: 0
 ep_tap_queue_backoff: 118231
 ep_tap_queue_drain: 3906860
 ep_tap_queue_fill: 0
 ep_tap_queue_itemondisk: 0
 ep_tap_throttle_queue_cap: 1000000
 ep_tap_throttle_threshold: 90
 ep_tap_throttled: 1074548
 ep_tap_total_backlog_size: 181
 ep_tap_total_fetched: 4066843
 ep_tap_total_queue: 3359
 eq_tapq:anon_1:connected: true
 eq_tapq:anon_1:created: 1127
 eq_tapq:anon_1:num_checkpoint_end: 6057
 eq_tapq:anon_1:num_checkpoint_end_failed: 0
 eq_tapq:anon_1:num_checkpoint_start: 6569
 eq_tapq:anon_1:num_checkpoint_start_failed: 0
 eq_tapq:anon_1:num_delete: 0
 eq_tapq:anon_1:num_delete_failed: 0
 eq_tapq:anon_1:num_flush: 0
 eq_tapq:anon_1:num_flush_failed: 0
 eq_tapq:anon_1:num_mutation: 3257674
 eq_tapq:anon_1:num_mutation_failed: 1160954
 eq_tapq:anon_1:num_opaque: 1270
 eq_tapq:anon_1:num_opaque_failed: 0
 eq_tapq:anon_1:num_unknown: 0
 eq_tapq:anon_1:num_vbucket_set: 0
 eq_tapq:anon_1:num_vbucket_set_failed: 0
 eq_tapq:anon_1:pending_disconnect: false
 eq_tapq:anon_1:reserved: 0
 eq_tapq:anon_1:supports_ack: true
 eq_tapq:anon_1:type: consumer
 eq_tapq:anon_448:connected: true
 eq_tapq:anon_448:created: 13890
 eq_tapq:anon_448:num_checkpoint_end: 1
 eq_tapq:anon_448:num_checkpoint_end_failed: 0
 eq_tapq:anon_448:num_checkpoint_start: 2
 eq_tapq:anon_448:num_checkpoint_start_failed: 0
 eq_tapq:anon_448:num_delete: 0
 eq_tapq:anon_448:num_delete_failed: 0
 eq_tapq:anon_448:num_flush: 0
 eq_tapq:anon_448:num_flush_failed: 0
 eq_tapq:anon_448:num_mutation: 974
 eq_tapq:anon_448:num_mutation_failed: 8152
 eq_tapq:anon_448:num_opaque: 2
 eq_tapq:anon_448:num_opaque_failed: 0
 eq_tapq:anon_448:num_unknown: 0
 eq_tapq:anon_448:num_vbucket_set: 0
 eq_tapq:anon_448:num_vbucket_set_failed: 0
 eq_tapq:anon_448:pending_disconnect: false
 eq_tapq:anon_448:reserved: 0
 eq_tapq:anon_448:supports_ack: true
 eq_tapq:anon_448:type: consumer
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':ack_log_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':ack_seqno: 7446
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':ack_window_full: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':backfill_completed: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':backfill_start_timestamp: 1354092278
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':bg_jobs_completed: 1978
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':bg_jobs_issued: 1978
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':bg_result_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':connected: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':created: 7261
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':flags: 85 (ack,backfill,vblist,checkpoints)
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':has_queued_item: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':idle: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':paused: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':pending_backfill: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':pending_disconnect: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':pending_disk_backfill: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':qlen: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':qlen_high_pri: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':qlen_low_pri: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_backfillremaining: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_backoff: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_drain: 7412
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_fill: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_itemondisk: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':queue_memory: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':rec_fetched: 5463
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':recv_ack_seqno: 7445
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':reserved: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':seqno_ack_requested: 7445
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':supports_ack: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':suspended: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':total_backlog_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':total_noops: 106
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':type: producer
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':vb_filter: { 272 }
 eq_tapq:replication_building_272_'ns_1@10.1.3.135':vb_filters: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':ack_log_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':ack_seqno: 4344
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':ack_window_full: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':backfill_completed: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':backfill_start_timestamp: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':bg_jobs_completed: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':bg_jobs_issued: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':bg_result_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':connected: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':created: 7261
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':flags: 85 (ack,backfill,vblist,checkpoints)
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':has_queued_item: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':idle: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':paused: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':pending_backfill: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':pending_disconnect: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':pending_disk_backfill: false
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':qlen: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':qlen_high_pri: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':qlen_low_pri: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_backfillremaining: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_backoff: 234
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_drain: 4314
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_fill: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_itemondisk: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':queue_memory: 8
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':rec_fetched: 4341
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':recv_ack_seqno: 4343
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':reserved: 1
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':seqno_ack_requested: 4343
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':supports_ack: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':suspended: true
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':total_backlog_size: 0
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':total_noops: 101
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':type: producer
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':vb_filter: { 272 }
 eq_tapq:replication_building_272_'ns_1@10.1.3.147':vb_filters: 1
 eq_tapq:replication_ns_1@10.1.3.135:ack_log_size: 0
 eq_tapq:replication_ns_1@10.1.3.135:ack_seqno: 569269
 eq_tapq:replication_ns_1@10.1.3.135:ack_window_full: false
 eq_tapq:replication_ns_1@10.1.3.135:backfill_completed: true
 eq_tapq:replication_ns_1@10.1.3.135:backfill_start_timestamp: 0
 eq_tapq:replication_ns_1@10.1.3.135:bg_jobs_completed: 0
 eq_tapq:replication_ns_1@10.1.3.135:bg_jobs_issued: 0
 eq_tapq:replication_ns_1@10.1.3.135:bg_result_size: 0
 eq_tapq:replication_ns_1@10.1.3.135:connected: true
 eq_tapq:replication_ns_1@10.1.3.135:created: 4873
 eq_tapq:replication_ns_1@10.1.3.135:flags: 85 (ack,backfill,vblist,checkpoints)
 eq_tapq:replication_ns_1@10.1.3.135:has_queued_item: false
 eq_tapq:replication_ns_1@10.1.3.135:idle: true
 eq_tapq:replication_ns_1@10.1.3.135:paused: 1
 eq_tapq:replication_ns_1@10.1.3.135:pending_backfill: false
 eq_tapq:replication_ns_1@10.1.3.135:pending_disconnect: false
 eq_tapq:replication_ns_1@10.1.3.135:pending_disk_backfill: false
 eq_tapq:replication_ns_1@10.1.3.135:qlen: 0
 eq_tapq:replication_ns_1@10.1.3.135:qlen_high_pri: 0
 eq_tapq:replication_ns_1@10.1.3.135:qlen_low_pri: 0
 eq_tapq:replication_ns_1@10.1.3.135:queue_backfillremaining: 0
 eq_tapq:replication_ns_1@10.1.3.135:queue_backoff: 0
 eq_tapq:replication_ns_1@10.1.3.135:queue_drain: 565288
 eq_tapq:replication_ns_1@10.1.3.135:queue_fill: 0
 eq_tapq:replication_ns_1@10.1.3.135:queue_itemondisk: 0
 eq_tapq:replication_ns_1@10.1.3.135:queue_memory: 0
 eq_tapq:replication_ns_1@10.1.3.135:rec_fetched: 569146
 eq_tapq:replication_ns_1@10.1.3.135:recv_ack_seqno: 569268
 eq_tapq:replication_ns_1@10.1.3.135:reserved: 1
 eq_tapq:replication_ns_1@10.1.3.135:seqno_ack_requested: 569268
 eq_tapq:replication_ns_1@10.1.3.135:supports_ack: true
 eq_tapq:replication_ns_1@10.1.3.135:suspended: false
 eq_tapq:replication_ns_1@10.1.3.135:total_backlog_size: 2
 eq_tapq:replication_ns_1@10.1.3.135:total_noops: 79
 eq_tapq:replication_ns_1@10.1.3.135:type: producer
 eq_tapq:replication_ns_1@10.1.3.135:vb_filter: { [682,802] }
 eq_tapq:replication_ns_1@10.1.3.135:vb_filters: 121
 eq_tapq:replication_ns_1@10.1.3.147:ack_log_size: 0
 eq_tapq:replication_ns_1@10.1.3.147:ack_seqno: 3347761
 eq_tapq:replication_ns_1@10.1.3.147:ack_window_full: false
 eq_tapq:replication_ns_1@10.1.3.147:backfill_completed: true
 eq_tapq:replication_ns_1@10.1.3.147:backfill_start_timestamp: 1354086144
 eq_tapq:replication_ns_1@10.1.3.147:bg_jobs_completed: 0
 eq_tapq:replication_ns_1@10.1.3.147:bg_jobs_issued: 0
 eq_tapq:replication_ns_1@10.1.3.147:bg_result_size: 0
 eq_tapq:replication_ns_1@10.1.3.147:connected: true
 eq_tapq:replication_ns_1@10.1.3.147:created: 1127
 eq_tapq:replication_ns_1@10.1.3.147:flags: 85 (ack,backfill,vblist,checkpoints)
 eq_tapq:replication_ns_1@10.1.3.147:has_queued_item: true
 eq_tapq:replication_ns_1@10.1.3.147:idle: false
 eq_tapq:replication_ns_1@10.1.3.147:paused: 1
 eq_tapq:replication_ns_1@10.1.3.147:pending_backfill: false
 eq_tapq:replication_ns_1@10.1.3.147:pending_disconnect: false
 eq_tapq:replication_ns_1@10.1.3.147:pending_disk_backfill: false
 eq_tapq:replication_ns_1@10.1.3.147:qlen: 3358
 eq_tapq:replication_ns_1@10.1.3.147:qlen_high_pri: 0
 eq_tapq:replication_ns_1@10.1.3.147:qlen_low_pri: 0
 eq_tapq:replication_ns_1@10.1.3.147:queue_backfillremaining: 0
 eq_tapq:replication_ns_1@10.1.3.147:queue_backoff: 117997
 eq_tapq:replication_ns_1@10.1.3.147:queue_drain: 3329846
 eq_tapq:replication_ns_1@10.1.3.147:queue_fill: 0
 eq_tapq:replication_ns_1@10.1.3.147:queue_itemondisk: 0
 eq_tapq:replication_ns_1@10.1.3.147:queue_memory: 26864
 eq_tapq:replication_ns_1@10.1.3.147:rec_fetched: 3346631
 eq_tapq:replication_ns_1@10.1.3.147:recv_ack_seqno: 3347760
 eq_tapq:replication_ns_1@10.1.3.147:reserved: 1
 eq_tapq:replication_ns_1@10.1.3.147:seqno_ack_requested: 3347760
 eq_tapq:replication_ns_1@10.1.3.147:supports_ack: true
 eq_tapq:replication_ns_1@10.1.3.147:suspended: true
 eq_tapq:replication_ns_1@10.1.3.147:total_backlog_size: 179
 eq_tapq:replication_ns_1@10.1.3.147:total_noops: 143
 eq_tapq:replication_ns_1@10.1.3.147:type: producer
 eq_tapq:replication_ns_1@10.1.3.147:vb_filter: { [0,169], [273,511] }
 eq_tapq:replication_ns_1@10.1.3.147:vb_filters: 409


 Comments   
Comment by Iryna Mironava [ 28/Nov/12 ]
 https://s3.amazonaws.com/bugdb/jira/MB-7280/d8baed39/10.1.3.146-collect.zip
 https://s3.amazonaws.com/bugdb/jira/MB-7280/d8baed39/10.1.3.147-collect.zip


https://s3.amazonaws.com/bugdb/jira/MB-7280/d8baed39/10.1.3.147-diag.txt.gz
https://s3.amazonaws.com/bugdb/jira/MB-7280/d8baed39/10.1.3.146-diag.txt.gz
Comment by Aleksey Kondratenko [ 28/Nov/12 ]
K/V only?
Comment by Steve Yen [ 28/Nov/12 ]
Hi Alk, not K/V only -- there are views.
Comment by Aleksey Kondratenko [ 02/Jan/13 ]
This may already be fixed by +A 16. Worth re-testing IMHO.
Comment by Farshid Ghods (Inactive) [ 09/Jan/13 ]
Alk,

The Windows installation was already using +A when this test was run.
Comment by Aliaksey Artamonau [ 16/Jan/13 ]
Node 147 is stuck waiting for an index update on vbucket 272:

[rebalance:debug,2012-11-28T0:44:43.651,ns_1@10.1.3.147:<0.18797.2>:janitor_agent:handle_call:651]Going to wait for persistence of checkpoint 7 in vbucket 272
[ns_server:debug,2012-11-28T0:44:43.729,ns_1@10.1.3.147:janitor_agent-default<0.3997.0>:janitor_agent:handle_info:682]Got done message from subprocess: <0.18797.2> (ok)
[ns_server:debug,2012-11-28T0:44:43.854,ns_1@10.1.3.147:<0.18800.2>:capi_set_view_manager:do_wait_index_updated:608]References to wait: [#Ref<0.0.132.154454>,#Ref<0.0.132.154418>,
                     #Ref<0.0.132.154398>,#Ref<0.0.132.154371>,
                     #Ref<0.0.132.154347>] ("default", 272)

And corresponding process backtrace:

         {<0.18800.2>,
          [{registered_name,[]},
           {status,waiting},
           {initial_call,{proc_lib,init_p,5}},
           {backtrace,
               [<<"rogram counter: 0x0637b1ec (capi_set_view_manager:'-do_wait_index_updated/4-lc$^0/1-0-'/3 ">>,
                <<"CP: 0x00000000 (invalid)">>,<<"arity = 0">>,<<>>,
                <<"0x0a69a1a8 Return addr 0x0637b100 (capi_set_view_manager:do_wait_index_updated/4 + 440)">>,
                <<"y(0) #Ref<0.0.132.154347>">>,
                <<"y(1) #Ref<0.0.132.154480>">>,
                <<"y(2) #Ref<0.0.132.154479>">>,<<"y(3) []">>,<<>>,
                <<"0x0a69a1bc Return addr 0x017a2da8 (proc_lib:init_p_do_apply/3 + 28)">>,
                <<"y(0) {<0.18799.2>,#Ref<0.0.132.154343>}">>,<<>>,
                <<"0x0a69a1c4 Return addr 0x00b509b4 (<terminate process normally>)">>,
                <<"y(0) Catch 0x017a2db8 (proc_lib:init_p_do_apply/3 + 44)">>,
                <<>>]},
           {error_handler,error_handler},
           {garbage_collection,
               [{min_bin_vheap_size,46368},
                {min_heap_size,233},
                {fullsweep_after,512},
                {minor_gcs,8}]},
           {heap_size,987},
           {total_heap_size,1597},
           {links,[]},
           {memory,6968},
           {message_queue_len,0},
           {reductions,2135},
           {trap_exit,false}]},

The relevant messages in couch_set_view_group are logged at debug level, which we deliberately disable for couchdb, so I was not able to deduce anything else. Filipe, could you please take a look?
Comment by Filipe Manana [ 18/Jan/13 ]
Same as MB-7535, can you re-test with build 2.0.1-138+?
It adds more useful logging to help find where the problem is.

thanks
Comment by Andrei Baranouski [ 24/Jan/13 ]
Maybe my logs will be useful, although my steps differ (online upgrade):
1. 2 nodes on 1976 (10.3.3.211, 10.3.3.212), 3 different buckets with 1 ddoc/view in each of them
2. start data loading
3. remove 10.3.3.212, add 2 nodes with 2.0.1-141 (10.3.3.41 & 10.3.3.197) and start rebalance
4. rebalance hangs almost at the beginning and stays hung for over 4 hours; see tap stats during this period
5. then I stopped loading and rebalance progress began to change

https://s3.amazonaws.com/bugdb/jira/MB-7280/10.3.3.211-1242013-537-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-7280/10.3.3.212-1242013-617-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-7280/10.3.3.197-1242013-555-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-7280/10.3.3.41-1242013-553-diag.zip




Comment by Filipe Manana [ 24/Jan/13 ]
Thanks Andrei.

After a quick look, this seems to have a different cause, unrelated to views.

Aliaksey, can you give them a quick look as well? Thanks.
Comment by Aliaksey Artamonau [ 24/Jan/13 ]
Filipe is right. It's not views related. Node .211 misbehaves badly: there are lots of timeouts all over the place. Rebalance is actually not stuck, it's just really, really slow. I'm assigning this back to Iryna, since all of this is completely unrelated to the original issue and we have other tickets open specifically for timeouts on Windows.
Comment by Iryna Mironava [ 25/Jan/13 ]
repeated steps from description, build 2.0.1-141-rel
https://s3.amazonaws.com/bugdb/jira/MB-7280/295a4423/collect_10.1.3.146.zip
https://s3.amazonaws.com/bugdb/jira/MB-7280/295a4423/collect_10.1.3.147.zip
Comment by Filipe Manana [ 25/Jan/13 ]
Aliaksey, can you look at Iryna's latest logs?
From what I see there's no problem on the indexes side.

I see several timeouts in one node:

** Last message in was {'EXIT',<0.23239.6>,
                           {timeout,
                               {gen_server,call,
                                   [os_mon_sysinfo,get_mem_info]}}}
** When Server state == [{data,[{"Timeout",60000}]},
                         {items,{"Memory Usage",
                                 [{"Allocated",2255761408},
                                  {"Total",4294500352}]}},
                         {items,{"Worst Memory User",
                                 [{"Pid",<0.69.0>},{"Memory",13463180}]}}]
** Reason for termination ==
** {timeout,{gen_server,call,[os_mon_sysinfo,get_mem_info]}}

[ns_server:error,2013-01-25T3:49:59.456,ns_1@10.1.3.146:<0.23232.6>:menelaus_web_alerts_srv:can_listen:345]Cannot listen due to nxdomain from inet:getaddr


And the rebalancer caught this unexpected process exit:

[ns_server:warn,2013-01-25T3:56:50.865,ns_1@10.1.3.146:<0.388.0>:ns_orchestrator:handle_info:342]Got unexpected message {'EXIT',<20763.2738.5>,normal} in state rebalancing with d
                                                                                      <0.20482.3>,
                                                                                      {dict,
                                                                                       2,
                                                                                       16,
                                                                                       16,
                                                                                       8,
                                                                                       80,
                                                                                       48,
                                                                                       {[],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        [],
                                                                                        []},
                                                                                       {{[],
                                                                                         [],
                                                                                         [],
                                                                                         [],
                                                                                         [],
                                                                                         [],
                                                                                         [['ns_1@10.1.3.146'|
                                                                                           0.0859375]],
                                                                                         [['ns_1@10.1.3.147'|
                                                                                           0.25]],
                                                                                         [],
                                                                                         [],
                                                                                         [],
                                                                                         [],
                                                                                         [],
                                                                                         [],
                                                                                         [],
                                                                                         []}}},
                                                                                      ['ns_1@10.1.3.146',
                                                                                       'ns_1@10.1.3.147'],
                                                                                      [],
                                                                                      []}



Thanks
Comment by Aliaksey Artamonau [ 25/Jan/13 ]
I took a peek at the logs. Again it's not the issue that we saw originally. As Filipe correctly noted there are again lots of timeouts. What happens in particular is that all the vbucket moves are blocked by index compaction on node .146. And this compaction just takes a lot of time (20 minutes to compact 3 out of 5 indexes). And in the end rebalance just fails.
Comment by Filipe Manana [ 28/Jan/13 ]
Index compaction histogram of node .146 (main and replica included):

fdmanana 04:29:37 ~/Downloads/cbcollect_info_ns_1@10.1.3.146_20130125-115741 > egrep 'compaction complete' ns_server.couchdb.log | cut -d ' ' -f 10 | perl -MStatistics::Histogram -e '@data = <>; chomp @data; print get_histogram(\@data);'
Count: 67
Range: 0.000 - 389.016; Mean: 35.598; Median: 18.797; Stddev: 69.323
Percentiles: 90th: 63.031; 95th: 110.484; 99th: 389.016
   0.000 - 1.170: 12 ###################################
   1.170 - 2.864: 3 #########
   2.864 - 5.879: 3 #########
   5.879 - 11.247: 2 ######
  11.247 - 20.804: 18 #####################################################
  20.804 - 37.820: 15 ############################################
  37.820 - 68.113: 7 #####################
  68.113 - 122.046: 3 #########
 122.046 - 218.066: 1 ###
 218.066 - 389.016: 2 ######
fdmanana 04:29:57 ~/Downloads/cbcollect_info_ns_1@10.1.3.146_20130125-115741 >
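
For readers without Perl's Statistics::Histogram module, the extraction step of the pipeline above can be approximated in Python. This is only a minimal sketch, not the command that was actually run: the function names and the regular expression are assumptions, and it matches the "compaction complete in N seconds" text rather than cutting field 10. The two sample lines are copied from the node .146 log excerpts quoted in this ticket.

```python
import re
import statistics

# Assumed pattern: matches the duration in "compaction complete in 385.750 seconds".
PATTERN = re.compile(r"compaction complete in ([0-9]+(?:\.[0-9]+)?) seconds")


def compaction_durations(lines):
    """Return the logged compaction durations (seconds) found in the given lines."""
    durations = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations


def summarize(durations):
    """Basic summary statistics, similar to the header of the histogram output."""
    return {
        "count": len(durations),
        "min": min(durations),
        "max": max(durations),
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
    }


# Two lines quoted in this ticket; only the second one reports a duration.
sample = [
    "[couchdb:info,2013-01-25T3:46:41.549,ns_1@10.1.3.146:<0.19862.3>:couch_log:info:39]"
    "Set view `default`, main group `_design/view2`, compaction starting",
    "[couchdb:info,2013-01-25T3:56:39.459,ns_1@10.1.3.146:<0.19862.3>:couch_log:info:39]"
    "Set view `default`, main group `_design/view2`, compaction complete in 385.750 seconds, "
    "filtered 0 key-value pairs",
]

durations = compaction_durations(sample)
summary = summarize(durations)
print(summary)
```

To run it over a whole log, pass an open file instead of the sample list, for example `compaction_durations(open("ns_server.couchdb.log"))`.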
Comment by Aliaksey Artamonau [ 28/Jan/13 ]
It seems that the logged compaction time covers only part of the actual compaction time. For example:

[couchdb:info,2013-01-25T3:46:41.549,ns_1@10.1.3.146:<0.19862.3>:couch_log:info:39]Set view `default`, main group `_design/view2`, compaction starting
[couchdb:info,2013-01-25T3:46:41.768,ns_1@10.1.3.146:<0.19862.3>:couch_log:info:39]Set view `default`, main group `_design/view2`, linked PID <0.22523.6> stopped normally
[couchdb:info,2013-01-25T3:46:42.674,ns_1@10.1.3.146:<0.11551.6>:couch_log:info:39]Set view `default`, main group `_design/view2`, updater received compactor <0.22530.6> notification, ref #Ref<0.0.458.109826>, writer <0.11560.6>
[couchdb:info,2013-01-25T3:50:13.706,ns_1@10.1.3.146:<0.11551.6>:couch_log:info:39]Set view `default`, main group `_design/view2`, updater received compaction ack from writer <0.11560.6>
[couchdb:info,2013-01-25T3:56:39.459,ns_1@10.1.3.146:<0.19862.3>:couch_log:info:39]Set view `default`, main group `_design/view2`, compact group up to date - restarting updater
[couchdb:info,2013-01-25T3:56:39.459,ns_1@10.1.3.146:<0.19862.3>:couch_log:info:39]Set view `default`, main group `_design/view2`, compaction complete in 385.750 seconds, filtered 0 key-value pairs

So the compaction time from the caller's perspective is about 600 seconds (from "compaction starting" to "compaction complete"). It seems that the logged duration starts counting from "updater received compaction ack".

It's also important to note that when ns_server decides to perform compaction during rebalance, it has to wait for compaction to complete for all design documents. So, roughly, we have to multiply these 600 seconds by the number of design documents, which is 5 in this particular case.
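
The arithmetic behind these numbers can be checked with a short Python sketch. The timestamps are copied from the log lines quoted above; the helper name `seconds_between` and the constant names are illustrative only, and the final figure is just the rough "multiply by the number of design documents" estimate from this comment, not a measured value.

```python
from datetime import datetime

# Timestamp layout used in the log lines (the hour is not zero-padded there,
# which strptime's %H accepts).
FMT = "%Y-%m-%dT%H:%M:%S.%f"


def seconds_between(start, end):
    """Elapsed seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


STARTING = "2013-01-25T3:46:41.549"  # "compaction starting"
ACK = "2013-01-25T3:50:13.706"       # "updater received compaction ack"
COMPLETE = "2013-01-25T3:56:39.459"  # "compaction complete in 385.750 seconds"
DESIGN_DOCS = 5                      # number of design documents in this case

wall_clock = seconds_between(STARTING, COMPLETE)  # what the caller waits for
since_ack = seconds_between(ACK, COMPLETE)        # what the log line reports
rough_total = wall_clock * DESIGN_DOCS            # rough estimate for all ddocs

print(f"wall clock: {wall_clock:.3f} s")
print(f"since ack:  {since_ack:.3f} s")
print(f"rough total for {DESIGN_DOCS} design documents: {rough_total:.0f} s")
```

The wall-clock interval comes out just under 600 seconds, while the interval measured from the ack agrees with the logged 385.750 seconds to within a few milliseconds, which supports the reading that the logged duration starts at the ack.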
Comment by Farshid Ghods (Inactive) [ 28/Jan/13 ]
Per bug scrub.

Iryna,
can you rerun the test and see if it's reproducible? Please also mention the sets/updates per second and the drain rate you see on this node.
Comment by Filipe Manana [ 28/Jan/13 ]
You're right, Aliaksey: the logged compaction duration is not completely fair. I will fix this in master. Thanks
Comment by Farshid Ghods (Inactive) [ 29/Jan/13 ]
I think MB-7618 (Excessive compactions are scheduled during rebalance) was opened to address this issue.
Comment by Aliaksey Artamonau [ 29/Jan/13 ]
MB-7618 is mostly unrelated. It's part of our current design that we compact views after every few moves. And in this case one of those compactions just took a long time.
Comment by Farshid Ghods (Inactive) [ 30/Jan/13 ]
Iryna,

did the test succeed after the rerun?
Comment by Iryna Mironava [ 31/Jan/13 ]
tested with 2.0.1-141-rel, 2.0.1-145-rel. Not reproduced
Comment by Farshid Ghods (Inactive) [ 31/Jan/13 ]
Given that this is not reproducible and can be worked around by rerunning the rebalance, I recommend deferring this to 2.0.2
Comment by Farshid Ghods (Inactive) [ 31/Jan/13 ]
or closing this as unable to reproduce if we have not reproduced this with 2.0.1 builds
Comment by Maria McDuff (Inactive) [ 02/Apr/13 ]
Tony: can you please try this use case and see whether you still see the hang in RC1 build 185? Thanks.
Generated at Sun Sep 21 19:56:14 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.