[MB-7552] CPU usage of beam.smp is too high during initial loading of system test for 2.0.1 build Created: 17/Jan/13  Updated: 04/Feb/13  Resolved: 30/Jan/13

Status: Closed
Project: Couchbase Server
Component/s: ns_server
Affects Version/s: 2.0.1
Fix Version/s: 2.0.1
Security Level: Public

Type: Bug Priority: Major
Reporter: Chisheng Hong (Inactive) Assignee: Chisheng Hong (Inactive)
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: couchbase-server-community_x86_64_2.0.1-129-rel on Centos

Attachments: Text File 01-16-10.3.2.115.diags.txt     Zip Archive ns-diag-20130116182321.txt.zip    

 Description   
15-node cluster; each node has an HDD, 8 GB RAM, and a 4-core CPU. During initial loading, I have 1k "set" ops per second on each node. Some of the nodes in the cluster consume all the CPU resources:
memcached (200%~250%), beam.smp (100%~150%). Only compaction is running; there are no views and no rebalance. It's just a key-value load.

I am using couchbase-python-client to do the multi-set loading.

I attach the log from the node which suffers from this issue.

 Comments   
Comment by Aliaksey Artamonau [ 21/Jan/13 ]
Per previous discussion, please try to reproduce with the latest build.
Comment by Chisheng Hong (Inactive) [ 21/Jan/13 ]
It is reproduced in build 2.0.1-139 with R 15 on the same cluster with the same load.
Comment by Aliaksey Artamonau [ 21/Jan/13 ]
I ran measure-sched-delays program on one of the machines while it was experiencing high cpu load:

1358816552.787212133 7704073
1358816553.781080246 1571073
1358816554.851312160 71804073
1358816556.044983149 265474073
1358816556.781110048 1602073
1358816557.781501055 1993073

So sometimes it doesn't get a CPU share for more than two seconds. This is most likely an indication of environment problems. I talked with Farshid, and he'll be working with Chisheng to understand whether this is really a virtualization issue.
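
The measure-sched-delays program itself is not included in this ticket. As a rough illustration of the technique, the sketch below sleeps for a fixed interval and records how much longer than requested the sleep actually took. The function names, the one-second interval, and the reading of the second output column as a delay in nanoseconds are all assumptions, not details taken from the original tool.

```python
import time


def measure_sched_delays(samples=5, interval_s=1.0,
                         sleep=time.sleep, clock=time.monotonic_ns):
    """Hypothetical sketch: return (wall_clock_timestamp, overshoot_ns) pairs.

    The overshoot is how much longer than interval_s each sleep took,
    which approximates how long the process waited to be scheduled.
    """
    interval_ns = int(interval_s * 1e9)
    results = []
    for _ in range(samples):
        start = clock()
        sleep(interval_s)
        elapsed = clock() - start
        # Clamp at zero in case the clock reports a slightly short sleep.
        overshoot = max(0, elapsed - interval_ns)
        results.append((time.time(), overshoot))
    return results


def format_sample(timestamp, delay_ns):
    """Format one sample as 'timestamp delay_ns' (assumed column layout)."""
    return f"{timestamp:.9f} {delay_ns}"


if __name__ == "__main__":
    for ts, delay in measure_sched_delays():
        print(format_sample(ts, delay))
```

On an idle, non-virtualized host the overshoot would normally stay small; large or erratic values suggest the process is not being scheduled promptly, which is what points at the hypervisor rather than at beam.smp itself.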
Comment by Farshid Ghods (Inactive) [ 24/Jan/13 ]
Last update:
We changed the VMware settings to avoid oversubscribing CPU to any VM more than the others, and to use only as much CPU as is available on the host.

Chisheng, however, is still seeing the same behavior, and we will have to investigate this further on system test. We will assign this ticket back to engineering if we find out more about the environment.
Comment by Farshid Ghods (Inactive) [ 30/Jan/13 ]
The specific test has now been moved to EC2.

Will reopen if this is observed in the EC2 environment.
Comment by Chisheng Hong (Inactive) [ 31/Jan/13 ]
Cannot reproduce it in the EC2 environment. After disabling memory ballooning on the previous cluster that exhibited this issue, CPU usage of beam.smp is back to normal.
Generated at Thu Apr 24 21:00:53 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.