Failing sets and "Hard Out Of Memory Error"
We recently (Monday, 11.07) moved from Memcached to Membase. Our setup is a cluster of 3 servers, each with 2GB of RAM dedicated to Membase (4GB total per server). The server nodes are configured with 1 replica copy, so the cluster has 6GB of RAM in total for our buckets and plenty of HDD space (over 600GB).
We are running a heavily loaded website, so the number of items grew enormously.
Today we noticed some problems with the cluster: adding a new element (via Memcache::set) returns "false". In addition, we got the following log messages:
11:32:49 - Fri Jul 15, 2011 - Hard Out Of Memory Error. Bucket "bucket2" on node 85.***.***.*** is full. All memory allocated to this bucket is used for metadata.
11:38:06 - Fri Jul 15, 2011 - Hard Out Of Memory Error. Bucket "bucket2" on node 85.***.***.*** is full. All memory allocated to this bucket is used for metadata. (repeated 90 times)
11:38:07 - Fri Jul 15, 2011 - Hard Out Of Memory Error. Bucket "bucket2" on node 85.***.***.*** is full. All memory allocated to this bucket is used for metadata.
11:44:06 - Fri Jul 15, 2011 - Hard Out Of Memory Error. Bucket "bucket2" on node 85.***.***.*** is full. All memory allocated to this bucket is used for metadata. (repeated 2 times)
Any ideas about fixing this?
Don't forget that, unlike memcached, membase buckets will never throw away data, and the metadata (i.e. the key and some info on the object) is always kept in memory (which is necessary to give super low latency on hits and misses). So if you really need all the data you have in your bucket, then you will need to grow the cluster. However, it might also be worthwhile checking whether you really need to keep all that data stored permanently. Expiration of items via TTL still works just like in memcached, so you can use that to make sure that old data gets thrown out.
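To get a feel for when metadata alone can fill a bucket, here is a rough back-of-the-envelope sketch. The per-item overhead, key length, item count and quota are all made-up parameters for illustration (check the sizing documentation for your version for the real overhead), and the sketch assumes replica copies keep their metadata in RAM as well:

```python
def metadata_bytes(items, avg_key_len, per_item_overhead, replicas=1):
    """Rough RAM needed just for keys plus per-item metadata.

    per_item_overhead is an assumed figure, not an official Membase number.
    """
    copies = 1 + replicas  # one active copy plus the replica copies
    return items * (avg_key_len + per_item_overhead) * copies


def fits_in_quota(items, avg_key_len, per_item_overhead, quota_bytes, replicas=1):
    """True if the metadata alone still fits into the bucket's RAM quota."""
    needed = metadata_bytes(items, avg_key_len, per_item_overhead, replicas)
    return needed <= quota_bytes


if __name__ == "__main__":
    # Hypothetical workload: 20 million items, 40-byte keys,
    # 120 bytes of assumed overhead per item, 1 replica, 2 GiB bucket quota.
    needed = metadata_bytes(20_000_000, 40, 120, replicas=1)
    print("metadata needs about %.2f GiB" % (needed / 2**30))
    print("fits in quota:", fits_in_quota(20_000_000, 40, 120, 2 * 2**30))
```

If the estimate comes out at or above the bucket quota, no amount of disk will help: you either need more RAM for the bucket or fewer items (for example by letting them expire).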
We now also have TOUCH and GAT (get-and-touch), which allow you to refresh the TTL on objects. That is a way to keep recently accessed data around while old data expires.
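To illustrate the idea, here is a small in-memory toy model of TTL, TOUCH and GAT semantics. It is not a Membase client and does not talk to a server; the class name TtlStore, its methods and the injectable clock are made up for this sketch, so check your client library for the actual calls it offers:

```python
import time


class TtlStore:
    """Toy in-memory model of TTL, TOUCH and GAT (hypothetical, not a client)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the example can fake time
        self._items = {}      # key -> (value, expiry time or None)

    def _expiry(self, ttl):
        # A TTL of 0 means "never expire", as in memcached.
        return None if ttl == 0 else self._clock() + ttl

    def set(self, key, value, ttl=0):
        self._items[key] = (value, self._expiry(ttl))

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]  # lazily drop the expired item
            return None
        return value

    def touch(self, key, ttl):
        """Reset the TTL without returning the value; False if the key is gone."""
        value = self.get(key)
        if value is None:
            return False
        self._items[key] = (value, self._expiry(ttl))
        return True

    def gat(self, key, ttl):
        """Get-and-touch: return the value and reset its TTL in one call."""
        value = self.get(key)
        if value is not None:
            self._items[key] = (value, self._expiry(ttl))
        return value


if __name__ == "__main__":
    now = [0.0]
    store = TtlStore(clock=lambda: now[0])
    store.set("session:1", "alice", ttl=60)
    store.set("session:2", "bob", ttl=60)
    now[0] = 50
    store.gat("session:1", 60)      # accessed, so its expiry moves to t=110
    now[0] = 70
    print(store.get("session:1"))   # prints alice (refreshed by GAT)
    print(store.get("session:2"))   # prints None (expired at t=60)
```

Reading with GAT instead of a plain get gives you a sliding expiration: items that keep being read stay around, and items nobody touches expire and free their memory, metadata included.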
Hope that helps
Cheers,
Frank
Thank you for your advice, Perry and Frank. In the end we divided the cluster into 4 buckets, of which 2 are standard Memcache buckets and 2 act as Membase buckets.
It seems that you'll need to increase the size of bucket2 either by adding another node to the cluster or increasing the RAM quota for that bucket if possible.
If it's still in this state (or you can get it back there) can you please run the following command on all 3 servers and post the output:
/opt/membase/bin/mbstats localhost:11210 all bucket2
Thanks
Perry