[MB-6550] [longevity] Rebalance hangs after failover and node removal because of a memory leak on a couple of nodes Created: 06/Sep/12  Updated: 14/May/14  Resolved: 07/Sep/12

Status: Closed
Project: Couchbase Server
Component/s: couchbase-bucket
Affects Version/s: 2.0-beta
Fix Version/s: 2.0-beta
Security Level: Public

Type: Bug Priority: Major
Reporter: Thuan Nguyen Assignee: Chiyoung Seo
Resolution: Fixed Votes: 0
Labels: system-test
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: centos 6.2 64bit

Attachments: Text File 9nodes-1663-reb-hang-20120906_checkpoint.txt     Text File 9nodes-1663-reb-hang-20120906_stats_all.txt     Text File 9nodes-1663-reb-hang-20120906_tap.txt    
Issue Links:
Relates to

Cluster information:
- 11 CentOS 6.2 64-bit servers with 4-core CPUs
- Each server has 10 GB RAM and a 150 GB disk.
- 8 GB RAM allocated to Couchbase Server on each node (80% of total system memory)
- ext3 filesystem on both data and root partitions
- Each server has its own drive; no disk is shared with other servers.
- Loaded 9 million items into both buckets
- Cluster has 2 buckets: default (3 GB) and saslbucket (3 GB)
- Each bucket has one design doc with 2 views (default: d1, saslbucket: d11)
- Added one more design doc, d2, with 2 views to the default bucket

* Started the cluster with 10 nodes running Couchbase Server 2.0.0-1663
* Data path /data
* View path /data

* In the last run, I did a swap rebalance: removed node 13 and added node 26.
* Then node 26 went down due to a hardware failure. I failed over node 26 and rebalanced.
* Rebalance failed with known issue MB-6497 at the end of rebalancing saslbucket.
* Node 22 went down after running out of disk space. Failed over node 22.
* Removed node 13. Started rebalance at 19:26:35 - Wed Sep 5, 2012.

The "default" bucket rebalance does not appear to be a swap rebalance (ns_vbucket_mover000 on ns_1@, 19:26:35 - Wed Sep 5, 2012).

Rebalance has been hanging ever since, and is still stuck as of Thu Sep 6 19:25:29 PDT 2012.

CPU and memory stats per process (Vm = virtual size, Rm = resident size, CPU = %CPU):
    Vm: 2796m Rm: 613m CPU: 13.7 beam.smp
    Vm: 6091m Rm: 4.2g CPU: 9.8 memcached
    Vm: 1845m Rm: 338m CPU: 9.9 beam.smp
    Vm: 1230m Rm: 1.0g CPU: 2.0 memcached
    Vm: 2443m Rm: 652m CPU: 9.8 beam.smp
    Vm: 4969m Rm: 3.4g CPU: 7.9 memcached
    Vm: 3304m Rm: 907m CPU: 19.4 beam.smp
    Vm: 5440m Rm: 4.0g CPU: 3.9 memcached
    Vm: 3462m Rm: 665m CPU: 30.7 beam.smp
    Vm: 6329m Rm: 4.1g CPU: 5.1 memcached
    Vm: 2702m Rm: 642m CPU: 13.2 beam.smp
    Vm: 4845m Rm: 3.5g CPU: 5.0 memcached
    Vm: 4498m Rm: 1.4g CPU: 91.2 beam.smp
    Vm: 5359m Rm: 3.6g CPU: 1.7 memcached
    Vm: 3793m Rm: 1.0g CPU: 11.7 beam.smp
    Vm: 5356m Rm: 3.7g CPU: 1.7 memcached

Swap stats in MB
       Total   Used   Free
Swap:   5199   1815   3384
Swap:   5199     10   5189
Swap:   5199     15   5184
Swap:   5199   2503   2696
Swap:   5199   1037   4162
Swap:   5199   1543   3656
Swap:   5199   2156   3043
Swap:   5199   1156   4043
Swap:   5199   1949   3250

Link to diags of all nodes

Comment by Chiyoung Seo [ 07/Sep/12 ]
The memory usage on nodes 14 and 15 is above 90% of their bucket quota even after most of the active and replica items were ejected. This is why rebalance got stuck:

Chiyoung-MacBook:ep-engine chiyoung$ ./management/cbstats raw memory
 ep_kv_size: 2436606624
 ep_max_data_size: 3145728000
 ep_mem_high_wat: 2359296000
 ep_mem_low_wat: 1887436800
 ep_mem_tracker_enabled: true
 ep_oom_errors: 0
 ep_overhead: 221345920
 ep_tmp_oom_errors: 0
 ep_value_size: 2214922031
 mem_used: 2831961568
 tcmalloc_current_thread_cache_bytes: 2281472
 tcmalloc_max_thread_cache_bytes: 4194304
 tcmalloc_unmapped_bytes: 7356416
 total_allocated_bytes: 5440249488
 total_fragmentation_bytes: 919716208
 total_free_bytes: 2457600
 total_heap_bytes: 6362423296

Chiyoung-MacBook:ep-engine chiyoung$ ./management/cbstats all | grep resident
 ep_num_non_resident: 2427780
 vb_active_num_non_resident: 1005950
 vb_active_perc_mem_resident: 0
 vb_pending_num_non_resident: 0
 vb_pending_perc_mem_resident: 0
 vb_replica_num_non_resident: 1421830
 vb_replica_perc_mem_resident: 0

It seems to me that there is a serious memory leak on nodes 14 and 15. In particular, ep_value_size (2214922031) means that most Blob value instances are not freed even after we ejected them. Those Blob values are referenced in many places (hash table, flusher, TAP replicator, etc.).
Comment by Thuan Nguyen [ 08/Sep/12 ]
Integrated in github-ep-engine-2-0 #426 (See [http://qa.hq.northscale.net/job/github-ep-engine-2-0/426/])
    MB-6550 Free bg-fetched items if the TAP connection is invalid. (Revision 25f4791191a3c3aca670781357b61559191a7f65)

     Result = SUCCESS
Chiyoung Seo :
Files :
* src/tapconnmap.cc
Comment by Farshid Ghods (Inactive) [ 12/Sep/12 ]
Is this a system test blocker? If so, please add the sblocker label.
Comment by kzeller [ 17/Sep/12 ]
Beta RN: Fixed a rebalance failure. Rebalance had stalled
after performing a failover and removing a node, due to a memory leak on
cluster nodes.
Generated at Wed Sep 17 05:36:25 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.