Membase will exhaust free disk space soon - how to determine what is being stored?
Hi Folks,
I presently have a single m2.xlarge membase instance powering a membase install for my app. This instance provides 15.7GB for membase and 403 GB for storage. I brought this instance up last Wednesday. After approximately 7 days of continuous operation, my membase instance is consuming 195GB of on-disk storage, and reports 13.8M entries in the cache. While I do cache a certain class of data for two weeks, there are only about 200k objects that fall into this class. The remainder are short-lived pieces of data that expire after 15 minutes.
Bottom line is I have absolutely no clue what is represented in those 13.8M entries or 195GB of disk data. I see my disk write queue size hovering around 600/sec, but I do not ever see any disk reads.
If I do not diagnose and correct this issue my node will exhaust disk space in the next week and crash.
How can I inspect what is actually being stored on disk, and determine whether this is a bug in my code or a bug in the membase code that is supposed to expire old data? Is it possible that the data expiry code of membase is just being really, really, really lazy, given the very large amount of disk space available to it on this kind of EC2 instance?
Thanks,
Eric
Hi Perry - It is the latest - 1.6.5
Here is the output from the stats command:
auth_cmds: 174
auth_errors: 1
bucket_conns: 114
bytes_read: 703420870361
bytes_written: 1596923768227
cas_badval: 0
cas_hits: 0
cas_misses: 0
cmd_flush: 0
cmd_get: 3011178029
cmd_set: 77332916
conn_yields: 27011744
connection_structures: 164
curr_connections: 124
curr_items: 14302943
curr_items_tot: 14302943
daemon_connections: 10
decr_hits: 0
decr_misses: 0
delete_hits: 10754276
delete_misses: 189150004
ep_bg_fetched: 0
ep_commit_num: 804834
ep_commit_time: 0
ep_commit_time_total: 373500
ep_data_age: 5
ep_data_age_highwat: 439
ep_db_cleaner_status: complete
ep_db_strategy: multiMTVBDB
ep_dbinit: 0
ep_dbname: /mnt/membase/default
ep_dbshards: 4
ep_expired: 1243268012
ep_flush_duration: 2
ep_flush_duration_highwat: 348
ep_flush_duration_total: 404519
ep_flush_preempts: 0
ep_flusher_state: running
ep_flusher_todo: 0
ep_io_num_read: 0
ep_io_num_write: 63467697
ep_io_read_bytes: 0
ep_io_write_bytes: 505878277095
ep_item_begin_failed: 0
ep_item_commit_failed: 0
ep_item_flush_expired: 15449460
ep_item_flush_failed: 0
ep_kv_size: 6770457729
ep_max_data_size: 16872636416
ep_max_txn_size: 1000
ep_mem_high_wat: 12654477312
ep_mem_low_wat: 10123581849
ep_min_data_age: 0
ep_num_active_non_resident: 0
ep_num_eject_failures: 0
ep_num_eject_replicas: 0
ep_num_expiry_pager_runs: 165
ep_num_non_resident: 0
ep_num_not_my_vbuckets: 0
ep_num_pager_runs: 0
ep_num_value_ejects: 0
ep_oom_errors: 0
ep_overhead: 93244037
ep_pending_ops: 0
ep_pending_ops_max: 0
ep_pending_ops_max_duration: 0
ep_pending_ops_total: 0
ep_queue_age_cap: 900
ep_queue_size: 385
ep_storage_age: 3
ep_storage_age_highwat: 424
ep_storage_type: featured
ep_store_max_concurrency: 10
ep_store_max_readers: 9
ep_store_max_readwrite: 1
ep_tap_bg_fetch_requeued: 0
ep_tap_bg_fetched: 0
ep_tap_keepalive: 0
ep_tmp_oom_errors: 0
ep_too_old: 0
ep_too_young: 0
ep_total_cache_size: 541785202697
ep_total_del_items: 24119294
ep_total_enqueued: 103472341
ep_total_new_items: 38309967
ep_total_persisted: 87586991
ep_vbucket_del: 0
ep_vbucket_del_fail: 0
ep_version: 1.6.5
ep_warmed_up: 0
ep_warmup: true
ep_warmup_dups: 0
ep_warmup_oom: 0
ep_warmup_thread: complete
ep_warmup_time: 11704
get_hits: 2186393906
get_misses: 824784123
incr_hits: 0
incr_misses: 0
libevent: 2.0.7-rc
limit_maxbytes: 67108864
mem_used: 6863701766
pid: 9296
pointer_size: 64
rejected_conns: 0
rusage_system: 68901.530000
rusage_user: 36600.190000
threads: 4
time: 1298477810
total_connections: 189
uptime: 596473
version: 1.4.4_364_g056e303
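To put a few of the raw counters above into perspective, here is a minimal Python sketch that parses `name: value` lines and derives some ratios. The helper names (`parse_stats`, `summarize`) and the choice of ratios are illustrative assumptions, not part of any membase tooling; the sample values are copied from the stats output above.

```python
# Counters copied from the stats output above; only the ones used below.
SAMPLE_STATS = """\
cmd_get: 3011178029
get_hits: 2186393906
curr_items: 14302943
ep_expired: 1243268012
ep_io_write_bytes: 505878277095
ep_version: 1.6.5
uptime: 596473
"""


def parse_stats(text):
    """Turn 'name: value' lines into a dict, keeping non-integer values as strings."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'name: value'
        value = value.strip()
        try:
            stats[name.strip()] = int(value)
        except ValueError:
            stats[name.strip()] = value
    return stats


def summarize(stats):
    """Derive a few ratios from the raw counters (hypothetical selection)."""
    return {
        "hit_ratio": stats["get_hits"] / stats["cmd_get"],
        # Average bytes written to disk per day since the process started.
        "disk_bytes_written_per_day": stats["ep_io_write_bytes"] / stats["uptime"] * 86400,
        "expirations_per_current_item": stats["ep_expired"] / stats["curr_items"],
    }


summary = summarize(parse_stats(SAMPLE_STATS))
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

The per-day figure is an average over the whole uptime, so it hides bursts; it is only meant as a rough sanity check against what the disk usage graph shows.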
Looks like we are actually expiring things:
ep_expired: 1243268012
ep_item_flush_expired: 15449460
ep_num_expiry_pager_runs: 165
What we don't do is actually "reclaim" disk space. When we delete an item, it makes a hole that gets reused later on, but the disk space doesn't shrink.
You can take a backup (http://wiki.membase.org/display/membase/Backup+and+Restore+with+Membase) and then scan through the db files using sqlite syntax to see what's actually stored in there.
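A minimal sketch of that inspection in Python, assuming the backup files are ordinary SQLite databases readable by the standard `sqlite3` module. The table layout membase uses is not described in this thread, so the sketch does not assume one: it lists whatever tables exist and counts their rows. The function name and the file path in the usage comment are hypothetical.

```python
import sqlite3
from pathlib import Path


def summarize_sqlite_file(path):
    """Return {table_name: row_count} for every table in one SQLite file."""
    # Open read-only so the inspection cannot modify the backup copy.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for table in tables:
            # Quote the identifier in case a table name contains odd characters.
            quoted = '"' + table.replace('"', '""') + '"'
            counts[table] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()


# Usage (hypothetical path to one file from the backup):
# print(summarize_sqlite_file("/path/to/backup/file"))
```

Once the table names are known, the same connection can be used for ad hoc queries against the columns that actually exist in the backup.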
Perry
Interesting, thanks for the info. Given that there is no compaction step for membase, what you are saying then is for a healthy membase instance, over time I should expect to see the used disk space reach the total capacity and just stay there indefinitely? If we looked at Farmville's membase servers, the disk usage graph would be maxed out all the time?
Can you provide any insight into that 13M "Total Items" statistic? Does that mean there have been 13M total items ever written to membase? Or does that mean that there are 13M total items in membase at this moment? The former makes sense to me; the latter is totally perplexing and I'd definitely need to take the backup and attempt to crack open in SQLite.
Thanks again for all your help, Perry; I really do appreciate it. I always try to write SEO-able forum posts so these exchanges will be findable by other membase users in the future.
Best,
Eric
There actually is a compaction step; it just can't be done "online". We can use the sqlite 'vacuum' command:
-Either take the servers down and vacuum the files in place
OR
-Take a live backup, vacuum it, and then shut the servers down, replace the data files and start back up. The disadvantage to this is that any data that changes after the backup is taken will not be reflected in the restored files...whether that is acceptable depends on whether your application can deal with it.
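As a rough illustration of the vacuum step in either option, here is a minimal Python sketch that runs SQLite's `VACUUM` on a single file and reports its size before and after. It assumes the file is not in use by a running server (servers down, or a backup copy) and that it opens with the standard `sqlite3` module; the function name and the path in the usage comment are hypothetical.

```python
import os
import sqlite3


def vacuum_file(path):
    """Run VACUUM on one SQLite file and return (bytes_before, bytes_after)."""
    before = os.path.getsize(path)
    # isolation_level=None puts the connection in autocommit mode;
    # VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
    return before, os.path.getsize(path)


# Usage (hypothetical path; run once per data file):
# before, after = vacuum_file("/path/to/data/file")
# print(f"reclaimed {before - after} bytes")
```

VACUUM rebuilds the database into a temporary copy, so it needs free disk space roughly on the order of the file being compacted; that matters on a volume that is already close to full.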
You are correct about the disk usage, it "should" reach capacity and then stay there provided there isn't any net-new data added.
As far as the "Total Items" goes, that is the count of active items AND replicas. "Unique" items is just active items and should match what you've put in.
Glad to help Eric, let me know what else I can do to help.
Perry
Hi Perry,
I went ahead and tested what membase does in the face of resource exhaustion, given the problems I have had with it in the past. From what I can tell, membase does not handle resource reclamation or exhaustion well at all.
Here is my test environment:
- One m1.micro EC2 instance
- Membase 1.6.5 32-bit
- 476MB allocated in RAM, 7.87 GB allocated in disk
I created a file 1MB in size using the first method listed here, and wrote the following ruby script:
Click here for the script on github (code doesn't display properly in this forum).
I ran this script four times to enter this state. Here is a screen shot of my dashboard. It would appear membase is unable to clear the expired data, which should only have lasted for 10 seconds. I am unable to run a 'stats' command because membase is constantly crashing/exiting, as per this screen shot.
This is an extremely basic test configuration that anyone should be able to reproduce on their own EC2 micro instance.
Perry, can you explain what is going on, and whether or not this is by design, a bug in membase, or something else? Really I was hoping to test disk exhaustion, which I didn't even get to in this experiment.
Thanks,
Eric
The memcache crashing you're experiencing is a bug: http://forums.membase.org/thread/membase-fall-down-every-2-days#comment-1002648
It's already been fixed and will be included in an upcoming release.
Membase actually does handle resource exhaustion fairly well...given that you are staying within the bounds of the system. You should make sure to follow the guidelines outlined here: http://wiki.membase.org/display/membase/Sizing+Guidelines
Also, you may want to consider engaging with our sales and system-engineering teams to help you along in the process. If you're planning on going into production, you really want to be using the Enterprise Edition as it goes through a much more rigorous QA and regression testing process. It will also include hotfixes (like the one you're running into) long before the Community Edition gets them.
Perry
Thanks for this info Perry. I really appreciate all your help! Unfortunately, given the repeated challenges and problems I've faced with membase, I've decided to replace it and use memcached directly. There were three big reasons why:
1) Membase used more memory than I thought it should. Memcached stable state is around 250MB; membase, 4-6GB. This had a very real cost to me: an m1.small is much cheaper than an m2.xlarge
2) Membase wrote gigabytes of data to disk, inexplicably. By the end, I had a working set of 6GB in RAM and about 25GB written daily to disk. This makes no sense, given that my working set (as observed using memcached directly) is around 250MB. The forcing function in my migration decision was the fact that I had no faith that membase would not fall over when it exhausted disk space, even though (per this thread) it should continue to function correctly.
3) Clustered membase servers went down together, obviating the advertised redundancy benefit. Twice I had whole clusters die. Auto-rebalancing was the cause of this failure at least once.
All the promised features of membase, I love, want, and need, but in the end it seems like the product still has a ways to go to deliver on its core features. I'd buy an enterprise support license, but unfortunately I am working on a breakeven mobile app, and cannot afford any additional infrastructure expenses.
I have now written my own app logic that auto-discovers my memcached ec2 instances using the ec2 API, prioritizes them based on AZ (not a concept known to membase), and uses a short socket timeout to fail over to the backup memcached server if the first one doesn't reply in time.
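The prioritize-then-fail-over idea can be sketched roughly as follows in Python. This is not the actual app code: the server dictionaries, the function names, and the 0.2 second timeout are assumptions for illustration, the EC2 discovery step is left out, and the check only verifies that a TCP connection is accepted in time rather than that memcached answers a request.

```python
import socket


def order_by_az(servers, local_az):
    """Put servers in the caller's availability zone first, keeping the original order otherwise."""
    # sorted() is stable and False sorts before True, so same-AZ servers move to the front.
    return sorted(servers, key=lambda s: s["az"] != local_az)


def first_responsive(servers, timeout=0.2):
    """Return the first server that accepts a TCP connection within `timeout` seconds, else None."""
    for server in servers:
        try:
            with socket.create_connection((server["host"], server["port"]), timeout=timeout):
                return server
        except OSError:
            continue  # refused, unreachable, or timed out: try the next one
    return None


# Usage (hypothetical hosts; 11211 is the usual memcached port):
# servers = [
#     {"host": "10.0.0.5", "port": 11211, "az": "us-east-1b"},
#     {"host": "10.0.0.9", "port": 11211, "az": "us-east-1a"},
# ]
# chosen = first_responsive(order_by_az(servers, "us-east-1a"))
```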
Best of luck, and hopefully I will be able to revisit this decision in the coming months after membase has had some additional bake time.
Cheers,
Eric
Thanks for your detailed response Eric.
If you're willing, I'd love to try and address your concerns above in order to gain your confidence and continued use of Membase. The first 2 issues seem related and the third is just strange. We have a number of very large deployments both within EC2 and data centers without issue, handling >100k ops/sec and utilizing hundreds of gigabytes of space (nearly terabytes in some cases). I don't mean to boast, just to provide evidence that the software does work as intended, when given the right attention and monitoring.
If Membase doesn't provide you any value over memcached then by all means, there's no reason not to use memcached. If you do see/need the value of Membase though, I'd be happy to work with you directly to make it successful in your environment.
Please feel free to email me directly at perry -at- couchbase -dot- com if you're interested in continuing the conversation.
Thanks Eric.
Perry
Eric, what version of Membase is this?
Perry