Home | Forums | Membase | Memcached Server 1.0.3

Massive CPU spike on cache server

Mon, 08/02/2010 - 02:46
theburningmonk
Offline
Joined: 06/29/2010
Groups: None

Hi,

We're running two Membase memcache servers in our dev environment at the moment on Amazon EC2 instances. Over the weekend the CPU on one of the servers shot up from close to zero to 100% in an instant, under very little load (we had a few bots running, but the load was consistent throughout).
I've attached a snapshot of the CPU graph and the diagnostic output on that server.

This is the second time we have seen this in about a week. I have read through a similar thread:
http://forums.membase.com/showthread.php?211-Memcached.exe-cpu-consumption
I checked the history, and this is a different server from the one that spiked last time.

Strangely, though, the average response time from the cache cluster (these two servers) is only slightly higher than what we were seeing before the server spiked.

I'm just wondering whether any of you managed to get to the bottom of why these sudden CPU spikes happen, and what we could do to stabilize the server without restarting the memcached.exe process?

Many thanks,

Mon, 08/02/2010 - 11:08
Perry Krug
Offline
Joined: 06/02/2010
Groups: None

We're still investigating some CPU issues, but we've never really experienced a 100% spike.

Can you confirm that it is in fact the memcached process that is taking up the full CPU? Have you reset the service yet?

Perry

__________________

Forum support is great for free but sometimes you need a guaranteed response time and dedicated resources for your questions or issues.
Consider purchasing enterprise-level support from Membase: http://www.membase.com/products-and-services/overview
Call or email "sales -at- membase -dot- com" today!

Mon, 08/02/2010 - 11:42
theburningmonk
Offline
Joined: 06/29/2010
Groups: None

Hi Perry,

Yes, I can confirm that it's the memcached.exe process that's taking up all the CPU. Maybe the CPU spike is exaggerated as these are small Amazon instances (single CPU, 1.7GB memory) we're using for development.

I have not reset the service yet. Is there any other information I could provide to help the investigation?

Cheers,

Mon, 08/02/2010 - 11:44
Perry Krug
Offline
Joined: 06/02/2010
Groups: None

Thanks. Can you telnet to this server on port 11211 and run the command 'stats', wait 10 seconds, and then run 'stats' again? I'd like to see if there is anything in the traffic statistics that might be causing the CPU to spike.

Perry
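
For anyone who would rather script the two 'stats' samples than type them into telnet, here is a minimal sketch in Python. It assumes the server speaks the standard memcached text protocol on port 11211 (as in the telnet instructions above); the function names and the host you pass in are placeholders of my own, not anything from Membase.

```python
import socket
import time


def parse_stats(raw: str) -> dict:
    """Turn 'STAT name value' lines into a dict; skip 'END' and blank lines."""
    stats = {}
    for line in raw.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host: str, port: int = 11211, timeout: float = 5.0) -> dict:
    """Send 'stats' over a plain TCP connection and read until the END marker."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        buf = b""
        while not buf.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection early
                break
            buf += chunk
    return parse_stats(buf.decode("ascii", errors="replace"))


def diff_counters(before: dict, after: dict) -> dict:
    """Return the numeric stats whose value changed between two samples."""
    changes = {}
    for name, new in after.items():
        old = before.get(name)
        try:
            delta = float(new) - float(old)
        except (TypeError, ValueError):
            continue  # non-numeric (e.g. version) or missing in the first sample
        if delta != 0:
            changes[name] = delta
    return changes


def sample_twice(host: str, interval: float = 10.0) -> dict:
    """Take two samples 'interval' seconds apart and return what moved."""
    first = fetch_stats(host)
    time.sleep(interval)
    second = fetch_stats(host)
    return diff_counters(first, second)
```

Calling `sample_twice("your-cache-host")` returns only the counters that moved between the two samples. Time-based values such as uptime will always show up; if those are the only entries, the server is most likely not receiving any traffic.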

Tue, 08/03/2010 - 02:22
theburningmonk
Offline
Joined: 06/29/2010
Groups: None

Hi Perry,

Here's the output from running stats roughly 10 seconds apart. It's worth noting that we stopped all the bots yesterday and the CPU is still sitting at 100% consistently, even though there's no traffic hitting that server. It almost looks as if the memcached process has got itself into an infinite loop somehow.

Tue, 08/03/2010 - 13:10
Perry Krug
Offline
Joined: 06/02/2010
Groups: None

What version of Windows are these systems running? Do you have any experience with "perfmon"?

Thu, 08/12/2010 - 09:32
theburningmonk
Offline
Joined: 06/29/2010
Groups: None

Hi Perry, thanks for getting back. I didn't get an email notification for your response for some reason. We have some experience with perfmon; what would you like us to do with it?

Interestingly, though, we've been running NorthScale in production for about 2 weeks now with a fair amount of traffic going through it, and we haven't seen that CPU spike again. From one of the other threads in the forum, it sounds like messing around with the 'default' bucket can leave you with a memory leak. Considering that we were playing with the cache servers quite a bit in testing, could adding/removing the default bucket have introduced the CPU spike?

Thanks,

Thu, 08/12/2010 - 09:38
Perry Krug
Offline
Joined: 06/02/2010
Groups: None

I didn't think we had seen CPU issues with the default bucket being removed (definitely a memory leak...) but I'm sure it's possible. Glad to hear everything's stable now. If it does happen again, I'd like to see if perfmon shows us any in-depth information about where the CPU is being taken up.

Thanks.

Perry

Mon, 08/16/2010 - 01:49
theburningmonk
Offline
Joined: 06/29/2010
Groups: None

Hi Perry, funny how things pan out: one of our cache servers spiked to 100% over the weekend and has stayed there. We have started perfmon on that instance. The CPU usage readings from Amazon CloudWatch seem to be legitimate, and at first glance a large percentage of the CPU is being recorded in the '% Privileged Time' counter. Is there anything in particular that we should monitor in order to gain an insight into what's happening with the memcached process?
