Home | Forums | Membase | Membase Server 1.6.x

Seeking advice on tuning Cluster/Bucket Performance when under sustained get/put load

Mon, 03/21/2011 - 12:43
jshao@contextweb.com

We're running a Membase cluster that looks roughly like this: 4x Dell R410, each with 128GB RAM, 3x 512GB SSD, and 1 HDD (OS, logs, etc.). We're in the process of taking that cluster from 4 nodes to 6 to address general scalability concerns. We're seeing odd performance on the cluster (wild swings between 0 and 10k ops/sec) and timeouts reported by our clients (currently set at 15ms and 22ms, depending on which of a couple of apps is accessing the shared pool).

We're hoping to get some advice on tuning and general performance principles.

We currently have about 18 buckets allocated, serving a mix of workloads, and we see very different performance per bucket. We have approximately 10 servers running Java -> NetSpy -> Moxi -> Membase. So far we have tuned the size of the DRAM allocations and tried to ensure there is some available space at all times, but we have not tuned the low/high water marks or made other tweaks. Some other usage stats:

Small-Bucket Workload (2/3 of buckets in this category - per-bucket):

  • Approximately 2500 get/sec (4000 peak)
  • Approximately 300 put/sec (500 peak)

Large-Bucket Workload (per-bucket):

  • Approximately 4500 get/sec (8000-10000 peak)
  • Approximately 3000 put/sec (6000 peak)
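To put the per-bucket figures in cluster-wide terms, here is a rough back-of-the-envelope sketch. It assumes the "about 18 buckets" split into 12 small and 6 large (the post only says roughly two thirds are small), and the helper name `aggregate_ops` is purely illustrative. The peak figures assume every bucket peaks at the same moment, so they are an upper bound rather than an expected load.

```python
def aggregate_ops(small_buckets, large_buckets, small_rate, large_rate):
    """Sum a per-bucket ops/sec rate across both bucket classes."""
    return small_buckets * small_rate + large_buckets * large_rate


# Assumption: about 18 buckets, two thirds small -> 12 small, 6 large.
SMALL, LARGE = 12, 6

# Typical rates quoted in the post (ops/sec per bucket).
typical_gets = aggregate_ops(SMALL, LARGE, 2500, 4500)
typical_puts = aggregate_ops(SMALL, LARGE, 300, 3000)

# Peak rates, using the lower end (8000) of the large-bucket get range.
peak_gets = aggregate_ops(SMALL, LARGE, 4000, 8000)
peak_puts = aggregate_ops(SMALL, LARGE, 500, 6000)

print("typical:", typical_gets, "gets/sec,", typical_puts, "puts/sec")
print("peak (upper bound):", peak_gets, "gets/sec,", peak_puts, "puts/sec")
```

Under these assumptions the large buckets account for most of the write traffic even though they are the minority of buckets.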

In both cases we see odd performance graphs (below) from Membase, though client apps seem to report good response times (<10ms) from small buckets and terrible response times (~85% of requests time out at the 22ms timeout) from large buckets. There are 2 types of app workloads:

#1 (2/3 traffic) : HTTP Request -> Membase Get (check current value) -> Update value -> Membase Put (store updated value) 

#2 (1/3 traffic) : HTTP Request -> Membase Get (check current value) -> App performs some action based on value
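The two workloads above can be sketched as follows. This is a minimal illustration, not the actual application code: `CountingStore` is a hypothetical in-memory stand-in for a bucket, and the function names are made up for this example. It only shows how many gets and puts each request type issues.

```python
from fractions import Fraction


class CountingStore:
    """Hypothetical in-memory stand-in for a bucket; counts gets and puts."""

    def __init__(self):
        self.data = {}
        self.gets = 0
        self.puts = 0

    def get(self, key):
        self.gets += 1
        return self.data.get(key)

    def put(self, key, value):
        self.puts += 1
        self.data[key] = value


def workload_1(store, key, update):
    """Get the current value, update it, and put the result back."""
    current = store.get(key)
    store.put(key, update(current))


def workload_2(store, key, act):
    """Get the current value and let the app act on it; nothing is written."""
    return act(store.get(key))


store = CountingStore()
# Replay three requests in the stated 2:1 traffic mix.
workload_1(store, "a", lambda v: (v or 0) + 1)
workload_1(store, "a", lambda v: (v or 0) + 1)
seen = workload_2(store, "a", lambda v: v)

put_get_ratio = Fraction(store.puts, store.gets)
print(store.gets, "gets,", store.puts, "puts, ratio", put_get_ratio)
```

With this mix there are two puts for every three gets, which lines up with the large-bucket figures quoted above (roughly 3000 puts to 4500 gets per second). Every workload #1 request also needs two round trips, so it pays the client timeout budget twice.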

There is no relationship between app workloads 1 and 2 above in terms of when a given key is accessed; in fact, they're very likely to be pulling from different key sets due to different business factors.

Overall Membase stat snapshot:

You can see that Disk Ops per second, even at the cluster level, swings significantly up and down around the otherwise consistent operation-rate baseline.

Snapshot from our larger bucket:

It looks to me like there's a correlation between when transactions are blocked or dip and the Items persisted / Disk write queue size stats. I had understood from reading the docs that a background thread walks the memory buffer from the high-water mark down to the low-water mark, but I thought that it was supposed to be trumped by real-time accesses.
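For readers unfamiliar with the high/low water mark idea mentioned here, the following is a toy model of a generic watermark pass, not Membase's actual implementation. The function name, the tuple layout, and the rule that only already-persisted items may be dropped from RAM are all assumptions made for illustration.

```python
def eject_to_low_water(items, mem_used, high_water, low_water):
    """Toy watermark pass.

    items: list of (key, size, persisted) tuples, in walk order.
    Returns (ejected_keys, mem_used_after).
    """
    ejected = []
    if mem_used <= high_water:
        return ejected, mem_used  # below the trigger, nothing to do
    for key, size, persisted in items:
        if mem_used <= low_water:
            break  # reached the target, stop walking
        if not persisted:
            continue  # assumption: unpersisted items must stay in RAM
        ejected.append(key)
        mem_used -= size
    return ejected, mem_used


# Hypothetical numbers: four 30-unit items, one still waiting to be written.
items = [("a", 30, True), ("b", 30, False), ("c", 30, True), ("d", 30, True)]
ejected, after = eject_to_low_water(items, mem_used=120, high_water=100, low_water=60)
print(ejected, after)
```

In a model like this, a deep disk write queue leaves fewer items eligible for ejection, which is one way persistence and memory pressure could become coupled. Whether that explains the graphs described here is not something this sketch can establish.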

A smaller bucket for comparison:

I'm hoping to get some ideas on how to handle this workload, plus any tuning tips or pointers to documentation we may have missed.

 

__________________

Jason Shao
Manager, Platform Development
jshao@contextweb.com

22 Cortlandt Street, 9th Floor
New York, NY 10007

646.421.6721 tel
212.349.2191 fax

Mon, 03/21/2011 - 12:46
jshao@contextweb.com

Sadly, image uploads in the forum don't seem to work, so I have posted the linked photos at:

* Cluster stats: http://www.flickr.com/photos/jayshao/5547325711/
* Large bucket: http://www.flickr.com/photos/jayshao/5547325737/
* Small bucket: http://www.flickr.com/photos/jayshao/5547907742/

Tue, 03/22/2011 - 12:53
TimSmith

Hello, Jason. You have an interesting use case. The stat that stands out most to me about the large bucket is the combination of a low resident item ratio (< 20%) and a fairly high cache miss ratio (4%). Allocating more RAM to the large bucket will relieve that by keeping more of your working set in memory.
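As a rough illustration of the two ratios mentioned here, the sketch below computes them from raw counters. The definitions (resident items over total items, disk fetches over gets) are an assumption about how such ratios are typically derived, and the function names and item counts are hypothetical. Only the 4% miss ratio and the 4500 gets/sec figure come from the thread.

```python
def resident_ratio(total_items, non_resident_items):
    """Percentage of items whose values are currently held in RAM."""
    if total_items == 0:
        return 100.0
    return 100.0 * (total_items - non_resident_items) / total_items


def cache_miss_ratio(gets, disk_fetches):
    """Percentage of gets that had to be served from disk."""
    if gets == 0:
        return 0.0
    return 100.0 * disk_fetches / gets


# Hypothetical item counts giving a resident ratio below 20%.
resident = resident_ratio(total_items=10_000_000, non_resident_items=8_500_000)

# 4% of 4500 gets/sec is 180 disk reads/sec for one large bucket.
miss = cache_miss_ratio(gets=4500, disk_fetches=180)

print(f"resident ratio: {resident:.1f}%")
print(f"cache miss ratio: {miss:.1f}%")
```

Even a miss ratio that looks small in percentage terms becomes a steady stream of disk reads at this get rate, and each of those reads has to finish inside a 15ms or 22ms client timeout.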

We can get into more detail with various statistics on this, and I would like to see screenshots for Manager -> Data bucket and Monitor -> Data bucket.

Regards,

Tim

Wed, 03/23/2011 - 08:24
jshao@contextweb.com

I think there's a support ticket paralleling this thread. Tim, Perry, et al. at Couchbase: feel free to cross-post items into the forum here if they'd be useful examples for others considering similar use cases or workloads.

Wed, 03/23/2011 - 16:58
perry

Thanks Jason.

The key thing here seems to be appropriate sizing with respect to the amount of RAM available.

We're still looking into the overall performance of the system, but all the signs point to accessing too much data from disk.

Perry
