I’ve recently upgraded to Couchbase 2.0, and compared to 1.7 and 1.8, it seems like persistence is becoming an issue.
My cluster has servers with 48G RAM and “only” 146G disks. Couchbase stores up to 50-100 million items, mostly very small keys, which represent about 5-10k GETs per second and 1.5-3k PUTs per second.
The default auto-compaction settings send the main bucket into an almost constant compaction loop. Raising the 30% threshold to 60% or more “fixes” this, but I then see the disk usage get dangerously high at times.
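(For anyone looking for where that threshold lives: it can be changed cluster-wide through the REST API. This is just a sketch, assuming Couchbase 2.x; `Administrator:password` and the 60% value are placeholders, and parameter names may differ between versions:)

```shell
# raise the cluster-wide database fragmentation threshold to 60%
curl -u Administrator:password -X POST \
  http://localhost:8091/controller/setAutoCompaction \
  -d 'databaseFragmentationThreshold[percentage]=60' \
  -d 'parallelDBAndViewCompaction=false'
```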
Last night, the disks of all nodes in the cluster became full, and hell broke loose…
What I saw shortly before it happened was that we had hit a very high number of items (close to 100 million), and that compaction was taking much longer to complete. I’m guessing that it wasn’t able to complete fast enough, and the disks filled up before it was done.
The RAM quota per node for the bucket is 40G, and it was far from being all used. On disk, though, 120G of Couchbase files existed, effectively filling it all up.
So my questions are:
Can I stop data from being persisted to disk at all? I really don’t need it: persistence is not a requirement in my case, as Couchbase is only used as a caching layer (a clustered Memcache, basically). From what I’ve seen here and there about 2.0, the answer is unfortunately “no”…
Can I easily change something so that the disk never gets full? Something like having it be managed more like the RAM?
Any tips are welcome!
I’ll be more than happy to provide any relevant details!
It’s definitely pure key/value. We use Couchbase as if it was a memcached cluster (through moxi, from web servers). No indexing, no XDCR.
We heavily use TTLs, setting many at less than 1 minute.
We don’t explicitly re-use most keys; they’re typically custom-created to give session-like behaviour in our application. They’re quite small, short-lived, and rarely if ever reused.
Also, regarding the problematic cluster setup itself: it was 4 nodes with 2 replicas and index replicas enabled. Right now I’ve switched to a much simpler 2-node setup with only 1 replica and index replicas disabled. It seems to be behaving better, but with auto-compaction set to kick in at 50% fragmentation, it still seems to be running most of the time.
The current disk usage I see is as follows:
Each node is around “12.5GB / 20GB” for “Data/Disk Usage”.
The single bucket reports “24.8GB / 37.5GB” for “Data/Disk Usage” and “16.2GB / 78.1GB” for “RAM/Quota Usage”.
The real Couchbase-related disk usage is this:
cache1 ~ # du -sh /opt/couchbase
cache2 ~ # du -sh /opt/couchbase
(which is nearly all /opt/couchbase/var/lib/couchbase/data)
So my understanding is that even though the items only use 16GB of RAM in total, the disk usage is over 40GB for the same set of data, even right after compaction finishes.
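As a sanity check on those numbers: the fragmentation percentage the auto-compaction trigger works from is roughly (disk size − data size) / disk size. Plugging in the bucket’s reported figures (a back-of-the-envelope sketch, not Couchbase’s exact internal formula):

```shell
# rough fragmentation estimate from the bucket's "Data/Disk Usage" figures
data_gb=24.8   # live data on disk
disk_gb=37.5   # total size of the couch files
frag=$(awk -v d="$data_gb" -v t="$disk_gb" 'BEGIN { printf "%.0f", (t - d) / t * 100 }')
echo "approx ${frag}% fragmented"
# → approx 34% fragmented
```

So at the default 30% threshold this bucket would already be due for another compaction pass almost as soon as one finishes.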
You don’t need index replicas if you haven’t defined any indexes.
Also, there is currently a bug (specific to TTL) which is why you are seeing growth in disk space. In 2.0 we moved to append-only storage and the persistence engine was completely rewritten. A fix is coming soon to purge expired items from disk.
Also, it’s not a problem if compaction is running all the time (as long as you have the I/O capacity and CPU).
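To illustrate why compaction matters so much in an append-only store: every mutation of a key is written as a new revision at the end of the file, and the stale revisions stay on disk until compaction rewrites the file with only the live data. A toy sketch of the idea (not Couchstore’s actual on-disk format):

```shell
# toy append-only store: 5 updates to one key append 5 revisions
f=$(mktemp)
for i in 1 2 3 4 5; do printf 'mykey=value-%s\n' "$i" >> "$f"; done
before=$(wc -l < "$f" | tr -d ' ')
# "compaction": rewrite the file, keeping only the latest revision
tail -n 1 "$f" > "$f.compact" && mv "$f.compact" "$f"
after=$(wc -l < "$f" | tr -d ' ')
echo "revisions before compaction: $before, after: $after"
rm -f "$f"
```

With short TTLs and no key reuse, nearly every write behaves like the stale revisions here, which is why the file grows so much faster than the live data set.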
Thanks for these clarifications!
Right now things should be stable, and the frequent compaction is keeping the disk usage at an acceptable level without impacting Couchbase’s overall performance, so I’ll wait for the next bugfix release.
But I’m quite sure that if I had to rebalance now, I would be in trouble again, so fingers crossed! (that, and I have another 2-node cluster on standby)
We use Couchbase for PHP sessions: a three-node cluster with one replica. Item expiration works as expected, but disk usage is always growing. Automatic compaction cleans up some disk space for a while, but overall usage only increases. Tonight I upgraded all nodes from 2.0 to 2.0.1, but there is no difference.
Before upgrade I tried to remove one node from cluster, delete all files from data directory and rejoin the cluster. It didn’t help.
Is there a way to free some disk space? As a workaround we could even flush all data from the bucket at night.
Yes, there is a way, and last I checked it wasn’t documented anywhere. You need to perform a different kind of compaction, which from what I understood seems to be “unsafe” at least where XDCR is concerned, though I have had no problems with it. It’s not available from the web interface; you need to run the following on any single cluster node:
curl -u <username>:<password> -X POST http://localhost:8091/pools/default/buckets/<bucket_name>/controller/unsafePurgeBucket
Of course, replace <username>, <password> and <bucket_name> with your own values.
This REST call purges some data that is not deleted by the standard processes. That data is metadata we keep in the database to optimize XDCR; this command simply purges it.
The forum is giving me a warning that this topic is very old, but the discussion does not seem to have been resolved, and it is an extremely urgent, current topic for me.
Without using the “unsafePurgeBucket” trick, will the disk usage ever level off on its own?
We recently replaced a single-node Couchbase 2.2.0 Memcached bucket with a Couchbase bucket. Our only reason for doing this was to work around the 1MB limit on the value size.
Our “documents” tend to be very small and short lived. I.e., a payload might be nothing but a boolean value, with a TTL in seconds. The active docs are 100% resident, about 500MB of data for a 3GB bucket, getting an average of 800 gets/sec and 150 sets/sec. The auto-compaction is set to the default of 30% and compaction events run frequently.
After two days in deployment, we are worrying about the disk usage, which has climbed to 3.5GB despite the data size being on the order of 500MB. The slope of the increase isn’t sustainable. Will Couchbase Server 2.2.0 level off or reverse this trend of extreme disk usage?
It would do a great deal for my level of comfort, if I could have some assurance that Couchbase buckets don’t continuously grow in size.
Thanks, and apologies for resurrecting such an old discussion.
I wish I could say with strong confidence when it will level off, and at what maximum disk size.
I was hoping that the “three day” event would bring a steep drop, like the “unsafePurgeBucket” does, but disk usage continues to have local ups and downs, while the overall usage continues to climb (now passing 5GB for a 500MB bucket.)
I’m pretty surprised by the disk usage requirement. I planned for 3X the bucket utilization size, and passed that a long time ago. The growth is surprisingly steep, about 1.5GB per day for a bucket that remains constant at about 500MB mem_used for 100% vb_active_resident_items_ratio. I keep expecting “compaction_daemon” to run on its own and yield the same effect as unsafePurgeBucket.
OK, I now believe that the “metadata purge interval” has to do with how long metadata has been stale, and not so much with the timing between purge events. Dropping the interval down to one day appears to have constrained the disk footprint to reasonable levels. I can also report that the huge initial purge had no detrimental effect on other processes, even though a huge amount of space was being reclaimed (on NTFS, on an SSD).
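In case it helps anyone else: the purge interval can also be set through the auto-compaction REST endpoint rather than the UI (a sketch, assuming Couchbase 2.x; `Administrator:password` is a placeholder, and purgeInterval is expressed in days):

```shell
# drop the metadata purge interval from the 3-day default to 1 day
curl -u Administrator:password -X POST \
  http://localhost:8091/controller/setAutoCompaction \
  -d 'purgeInterval=1' \
  -d 'parallelDBAndViewCompaction=false'
```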
We ended up using a Memcached bucket for the subset of items that were causing so much churn. It turns out to be a good solution for a case where the IO for massive deletion of data was dominating the load profile.
We have some data with long TTLs, but most of our keys have TTLs measured in a few seconds – and when you grow a bucket like that into the tens of gigabytes, it appears that huge amounts of resources are devoted to expiration/ejection/eviction.
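For anyone taking the same route, a memcached bucket can be created over the REST API roughly like this (a sketch; the name, quota, and credentials are placeholders, and your version may require slightly different parameters such as authType or proxyPort):

```shell
# create a RAM-only memcached bucket for the high-churn, short-TTL items
curl -u Administrator:password -X POST \
  http://localhost:8091/pools/default/buckets \
  -d 'name=session-cache' \
  -d 'bucketType=memcached' \
  -d 'ramQuotaMB=512' \
  -d 'authType=sasl' \
  -d 'saslPassword='
```

Since memcached buckets never touch disk, all of the compaction and purge-interval concerns above simply disappear for that subset of data.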