a2b out of memory

Wed, 11/03/2010 - 15:53
bfolkens
Offline
Joined: 10/06/2010
Groups: None

Trying to load up some data into a membase install and getting "a2b out of memory" errors after a while. The persistent storage seems to be working ok for quite some time before this happens, though - I notice the sqlite databases growing consistently over the course of the data load. We're not doing anything dramatic (maybe 100-300 ops/sec to load it). Our memory allocation is just set to the minimum (256MB) on this test server.

My expectation was that as the size of the data grew and began to hit the watermarks it would begin filling the disk with these items (which is what it appeared to do), then we began seeing the error above. What's interesting is that once we hit the "a2b out of memory" error, membase doesn't appear to flush the rest of the data to disk. So, for example, if I resume the data load again after a few minutes, it would still give the error (even after an hour). We're using the 1.6.0.1 community release.

Wed, 11/03/2010 - 16:38
perry
Offline
Joined: 10/11/2010
Groups:

 Take a look at this wiki entry and see if that helps clear up the behavior: http://wiki.membase.org/display/membase/Growing+Data+Sets+Beyond+Memory

 

I would expect that you would be able to put more items in (unless the disk is actually full, which I doubt) after the write queues have drained sufficiently.

 

You can watch the write queue in the UI by going to a particular bucket's statistics page, clicking on "Configure View" and selecting "Disk Write Queue".

 

Perry

__________________

Forum support is great for free but sometimes you need a guaranteed response time and dedicated resources for your questions or issues.
Consider purchasing enterprise-level support from Couchbase: http://www.couchbase.com/products-and-services/overview
Call or email "sales -at- couchbase-dot- com" today!

Wed, 11/03/2010 - 18:46
bfolkens
Offline
Joined: 10/06/2010
Groups: None

Thanks for the quick reply, Perry. After watching it for a while, I think that's exactly the case, and at some point the server just gets too far behind. I'm trying to use flushctl to set the mem_low_wat and mem_high_wat params, but it's giving me this error:

TypeError: set_flush_param() takes exactly 3 arguments (2 given)

Thu, 11/04/2010 - 06:25
bfolkens
Offline
Joined: 10/06/2010
Groups: None

Had some more time to investigate this... It seems like the trouble starts when I issue a "flush" after having a series of those error messages above. After doing so the ep disk queue value never changes (it stays at 0) so the server quickly begins reporting 'temporary failure' and never recovers. At this point even the "stats" tool reports the following:

Traceback (most recent call last):
  File "./stats", line 163, in <module>
    main()
  File "./stats", line 160, in main
    c.execute()
  File "/opt/membase/1.6.0.1/ep-engine/management/clitool.py", line 42, in execute
    f[0](mc, *args[2:], **opts.__dict__)
  File "./stats", line 34, in g
    f(*args[:n])
  File "./stats", line 53, in stats_all
    stats_formatter(mc.stats())
  File "/opt/membase/1.6.0.1/ep-engine/management/mc_bin_client.py", line 244, in stats
    cmd, opaque, cas, klen, extralen, data = self._handleKeyedResponse(None)
  File "/opt/membase/1.6.0.1/ep-engine/management/mc_bin_client.py", line 83, in _handleKeyedResponse
    raise MemcachedError(errcode,  rv)
mc_bin_client.MemcachedError: Memcached error #130:  Out of memory

Thu, 11/04/2010 - 17:01
bhawana@membase
Offline
Joined: 10/29/2010
Groups: None

bfolkens,

Our Engineers here at Membase are looking into your problem. I will get back to you as soon as we resolve this.

 

Thanks

Bhawana

__________________

Forum support is great for free but sometimes you need a guaranteed response time and dedicated resources for your questions or issues.
Consider purchasing enterprise-level support from Membase: http://www.membase.com/products-and-services/overview
Call or email "sales -at- membase -dot- com" today!

Thu, 11/04/2010 - 18:46
bfolkens
Offline
Joined: 10/06/2010
Groups: None

Thanks so much for the help. Not sure if this helps your team or not, but I just tried sizing the instance up to 512MB instead of 256 and I still experienced the same symptoms (where the system stops flushing and ep_queue_size stays at 0). I can also confirm it happens from a fresh install without a flush first (at first it seemed like it happened after a flush, but it happens regardless).

Mon, 11/08/2010 - 15:51
bhawana@membase
Offline
Joined: 10/29/2010
Groups: None

What os are you running this on? Is it 32 bit or 64 bit?

 

Bhawana

Mon, 11/08/2010 - 21:05
bfolkens
Offline
Joined: 10/06/2010
Groups: None

 32-bit

Wed, 11/10/2010 - 15:17
bhawana@membase
Offline
Joined: 10/29/2010
Groups: None

I haven't been able to reproduce the error that you are seeing. How long after you started loading the data did you see it? You mentioned in your first post that the error only appears after the load has been running for quite some time.

 

Bhawana

Wed, 11/10/2010 - 20:28
bfolkens
Offline
Joined: 10/06/2010
Groups: None

Yes, it was several hours in, after a few hundred thousand keys were set.

Thu, 11/11/2010 - 11:07
bhawana@membase
Offline
Joined: 10/29/2010
Groups: None

Hello,

 

I have not been able to reproduce the errors you see. What are the values of your mem_high_wat and mem_low_wat?

I plan to keep adding data and see if I get the error and will let you know if I see anything.

Bhawana

Fri, 11/12/2010 - 06:18
xinit
Offline
Joined: 10/11/2010
Groups: None

We're currently experiencing the same issue. To test membase read/write performance when the RAM is full (not the disk), I've set up a membase server with a 100MB bucket and started writing to it. After a while, we get "SERVER_ERROR a2b out of memory". The disk queue seems to flush constantly and we see some RAM evictions happening.

Fri, 11/12/2010 - 06:56
bfolkens
Offline
Joined: 10/06/2010
Groups: None

ep_mem_high_wat: 402653184

ep_mem_low_wat: 322122547

 

I'll try running this again and giving you a list of the ./stats output.  Is there another debug output you'd like to see?

 

Fri, 11/12/2010 - 10:49
bhawana@membase
Offline
Joined: 10/29/2010
Groups: None

bfolkens,

Are you using the default bucket?

Bhawana

 

Fri, 11/12/2010 - 12:04
bfolkens
Offline
Joined: 10/06/2010
Groups: None

Here's a fresh run:

membase-server v1.6.0.1
x86 platform
gcc v4.3.4

Configured to use 256MB of ram (out of 1.7GB), default bucket, 0 replicas, 1 server

$ ./stats localhost:11210 all
auth_cmds: 1
auth_errors: 0
bucket_conns: 3
bytes_read: 4111710999
bytes_written: 451661495
cas_badval: 0
cas_hits: 0
cas_misses: 0
cmd_flush: 1
cmd_get: 1162293
cmd_set: 2325506
conn_yields: 0
connection_structures: 8
curr_connections: 8
curr_items: 2324571
curr_items_tot: 2324571
daemon_connections: 5
decr_hits: 0
decr_misses: 0
delete_hits: 0
delete_misses: 0
ep_bg_fetched: 0
ep_commit_num: 10398
ep_commit_time: 0
ep_commit_time_total: 2144
ep_data_age: 2
ep_data_age_highwat: 12
ep_dbinit: 0
ep_dbname: /home/membase/1.6.0.1/data/ns_1/default
ep_dbshards: 4
ep_expired: 0
ep_flush_duration: 0
ep_flush_duration_highwat: 8
ep_flush_duration_total: 2175
ep_flush_preempts: 0
ep_flusher_state: running
ep_flusher_todo: 0
ep_io_num_read: 0
ep_io_num_write: 2324571
ep_io_read_bytes: 0
ep_io_write_bytes: 3937810793
ep_item_commit_failed: 0
ep_item_flush_expired: 0
ep_item_flush_failed: 0
ep_kv_size: 255507713
ep_max_data_size: 268435456
ep_max_txn_size: 250000
ep_mem_high_wat: 201326592
ep_mem_low_wat: 161061273
ep_min_data_age: 0
ep_num_eject_failures: 639999137
ep_num_expiry_pager_runs: 4
ep_num_non_resident: 2324571
ep_num_not_my_vbuckets: 0
ep_num_pager_runs: 939
ep_num_value_ejects: 2324571
ep_oom_errors: 228
ep_overhead: 12927624
ep_pending_ops: 0
ep_pending_ops_max: 0
ep_pending_ops_max_duration: 0
ep_pending_ops_total: 0
ep_queue_age_cap: 900
ep_queue_size: 0
ep_storage_age: 0
ep_storage_age_highwat: 9
ep_storage_type: featured
ep_tap_keepalive: 0
ep_tmp_oom_errors: 707
ep_too_old: 0
ep_too_young: 0
ep_total_cache_size: 4076044533
ep_total_del_items: 0
ep_total_enqueued: 2324572
ep_total_new_items: 2324571
ep_total_persisted: 2324571
ep_vbucket_del: 0
ep_vbucket_del_fail: 0
ep_version: 1.6.0_10_g3b4878a
ep_warmed_up: 0
ep_warmup: true
ep_warmup_dups: 0
ep_warmup_oom: 0
ep_warmup_thread: complete
ep_warmup_time: 0
get_hits: 0
get_misses: 1162293
incr_hits: 0
incr_misses: 0
libevent: 2.0.8-rc
limit_maxbytes: 67108864
mem_used: 268435337
pid: 6304
pointer_size: 32
rejected_conns: 0
rusage_system: 50.975250
rusage_user: 502.402623
threads: 4
time: 1289588201
total_connections: 14413
uptime: 17354
version: 1.4.4_298_g250909b
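
For anyone reading a dump like the one above, here is a minimal Python sketch that turns the "name: value" lines into a dict and compares memory use against the quota and watermarks. The helper names (parse_stats, memory_summary) are hypothetical and not part of the membase tools; the sample values are copied from the output above.

```python
def parse_stats(text):
    """Parse 'name: value' lines (as printed by the stats tool) into a dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon at all
            stats[name.strip()] = value.strip()
    return stats


def memory_summary(stats):
    """Express memory use and the watermarks as fractions of the quota."""
    quota = int(stats["ep_max_data_size"])
    used = int(stats["mem_used"])
    high_wat = int(stats["ep_mem_high_wat"])
    low_wat = int(stats["ep_mem_low_wat"])
    return {
        "used_fraction": used / quota,
        "high_wat_fraction": high_wat / quota,
        "low_wat_fraction": low_wat / quota,
        "above_high_wat": used > high_wat,
    }


# Values copied from the stats output in this post.
SAMPLE = """\
ep_max_data_size: 268435456
ep_mem_high_wat: 201326592
ep_mem_low_wat: 161061273
mem_used: 268435337
"""

summary = memory_summary(parse_stats(SAMPLE))
for name, value in summary.items():
    print(name, value)
```

With these figures the watermarks sit at roughly 75% and 60% of the quota, and mem_used is well above the high watermark.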

Fri, 11/12/2010 - 13:24
perry
Offline
Joined: 10/11/2010
Groups:

Thanks bfolkens, looks like there's definitely something unexpected going on there.

We're engaging with our engineers now to look at it.

Can you tell me how large your items are?

 

Perry

Fri, 11/12/2010 - 14:58
bfolkens
Offline
Joined: 10/06/2010
Groups: None

 Around 6-8kB

Fri, 11/12/2010 - 16:38
perry
Offline
Joined: 10/11/2010
Groups:

So we've done a bit of analysis here.

 

Your stats output was very helpful, and showed that all the available memory is being taken up by item metadata.

 

Looking at some stats in specific:

 

-mem_used (mem_used: 268435337) is right up against the memory limit (ep_max_data_size: 268435456).

-The software has ejected all of the item values (ep_num_value_ejects: 2324571 and ep_num_non_resident: 2324571 both equal curr_items), yet the memory has not been reclaimed.

-If we take the memory used (mem_used: 268435337) and divide it by the number of items (curr_items: 2324571), it comes to about 115 bytes per item, which is close to the amount of per-item overhead we have.
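
The arithmetic behind these three points can be checked with a short Python sketch. The function names are made up for illustration; the inputs are the values from the stats output earlier in the thread.

```python
def bytes_per_item(mem_used, curr_items):
    """Average memory footprint per stored item, in bytes."""
    return mem_used / curr_items


def resident_fraction(curr_items, num_non_resident):
    """Fraction of items whose values are still held in RAM."""
    return (curr_items - num_non_resident) / curr_items


# Figures taken from the stats dump posted above.
mem_used = 268435337
ep_max_data_size = 268435456
curr_items = 2324571
ep_num_non_resident = 2324571

per_item = bytes_per_item(mem_used, curr_items)
resident = resident_fraction(curr_items, ep_num_non_resident)
headroom = ep_max_data_size - mem_used

print(f"bytes per item: {per_item:.1f}")        # roughly 115 with these figures
print(f"resident fraction: {resident:.2f}")     # no values left in RAM
print(f"bytes left under the quota: {headroom}")
```

Since no values remain resident, the whole quota is being spent on per-item bookkeeping, which is why ejecting values cannot free any more room.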


The solution here is to add more memory or store fewer items. I have also filed a bug to improve the behavior when this happens, so that it is easier to figure out what is going on.

Let me know if you need any further clarification on this.

Thanks  Perry

 

Fri, 11/12/2010 - 19:50
bfolkens
Offline
Joined: 10/06/2010
Groups: None

Thanks for clarifying, Perry. So that's the total memory across the cluster that needs to be available to store the metadata, correct? And each node only stores the metadata for the keys on that particular node?

Mon, 11/15/2010 - 11:06
ducci
Offline
Joined: 11/15/2010
Groups: None

bfolkens wrote:

Thanks for clarifying, Perry. So that's the total memory across the cluster that needs to be available to store the metadata, correct? And each node only stores the metadata for the keys on that particular node?

Running into the exact same issue and wondering the same as above.

Mon, 11/15/2010 - 11:36
perry
Offline
Joined: 10/11/2010
Groups:

Correct, each node only stores the metadata for the keys that are on that particular node.  One thing to keep in mind is that a node not only stores its active items but any replica items that it is also responsible for.

 

Perry

Mon, 11/15/2010 - 13:04
bfolkens
Offline
Joined: 10/06/2010
Groups: None

Hrm, ok - I think that just killed it for us, since we've got something like 80 million keys (and growing) and were hoping to spread them across commodity hardware (~1.5GB per node). Even if we had 6 nodes with 1.5GB apiece, it would be mostly metadata.

Are there any future plans to flush LRU metadata out to disk as well?

Mon, 11/15/2010 - 16:25
perry
Offline
Joined: 10/11/2010
Groups:

Sounds like you are correct in that you need more than 9GB of RAM in order to store over 80 million keys...not much can be done about that.
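
A rough sizing sketch in Python for that estimate. It assumes about 115 bytes of metadata per key (derived from the stats posted in this thread, not an official figure) and treats the whole 1.5GB of each node as usable quota, which is optimistic. The function names are hypothetical.

```python
import math


def metadata_ram_bytes(total_keys, bytes_per_key, replicas=0):
    """RAM needed cluster-wide just for item metadata.

    Each replica copy carries its own metadata on the node that holds it.
    """
    return total_keys * (1 + replicas) * bytes_per_key


def nodes_needed(total_keys, bytes_per_key, ram_per_node_bytes, replicas=0):
    """Smallest node count whose combined RAM covers the metadata alone."""
    needed = metadata_ram_bytes(total_keys, bytes_per_key, replicas)
    return math.ceil(needed / ram_per_node_bytes)


TOTAL_KEYS = 80_000_000
BYTES_PER_KEY = 115                   # assumption, estimated from the stats above
RAM_PER_NODE = int(1.5 * 1024 ** 3)   # ~1.5GB per node, all of it counted as quota

print(metadata_ram_bytes(TOTAL_KEYS, BYTES_PER_KEY) / 1e9, "GB of metadata")
print(nodes_needed(TOTAL_KEYS, BYTES_PER_KEY, RAM_PER_NODE), "nodes, no replicas")
print(nodes_needed(TOTAL_KEYS, BYTES_PER_KEY, RAM_PER_NODE, replicas=1),
      "nodes, one replica")
```

Under these assumptions, six 1.5GB nodes would only just hold the metadata with no replicas, leaving almost nothing for cached values.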

 

As far as flushing metadata out to disk, we have certainly considered it but I don't know that there are any concrete plans to implement that.  One of the nice features about Membase (inherited from memcached) is the ability to VERY quickly tell you that an item DOES NOT exist rather than possibly spending multiple seconds looking up an item's location on disk just to return with "not found" to the client after making it wait so long.  

 

There are improvements that can be done to reduce the amount of overhead per-item, and those will be evaluated and implemented as necessary.

 

Perry
