First of all, I would like to point out that 3.1.X is already end of life, and we would recommend upgrading to a newer version of Couchbase Server. For more information about end of life, please see the Support policy page. I would also be interested to know why the cluster was upgraded to Couchbase Server 3.1.3 and not 3.1.6.
With regard to the compaction issue, to help us debug it further, would it be possible to get a screenshot of the compaction settings from both the cluster-wide settings and the bucket settings?
Could you also answer the following questions:
When the problem happens, is it the Data or View index fragmentation that is over the threshold?
As a bit of background, this system is inherited (i.e. not written by me) as the back end to some mobile games written several years ago. As they are currently running well for the most part, I don’t plan on upgrading as long as things are stable. As far as I can tell, 3.1.3 was the last Community Edition available in the 3.x line, and moving to Enterprise would be cost-prohibitive given the revenue on these games. Anything involving an API change would be major work, so I haven’t looked at upgrading to newer lines until I have a chance to dig into the changes in moving to 4 or 5. The move to 3.1.3 was meant to pick up any bug fixes since 3.0.1.
The upgrade was performed using a swap rebalance (added the two new nodes in, rebalanced, then failed the old nodes).
Data definitely. Only one bucket uses views, and that will eventually go over as well.
Yes. I’ve set up a cron job using the REST API to periodically trigger compaction until I can get this resolved. Prior to that, I would use the button on the console to get things back down.
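For anyone wanting to reproduce the workaround, here is a minimal sketch of the kind of script such a cron job could run. It assumes the per-bucket `compactBucket` REST endpoint on the default admin port 8091 and HTTP Basic auth; the host name, bucket name, and credentials are placeholders, and the function names are hypothetical, not the script actually used here.

```python
import base64
import urllib.parse
import urllib.request


def build_compaction_request(host, bucket, user, password, port=8091):
    """Build (but do not send) a POST asking the cluster to compact one bucket."""
    # Assumed endpoint: POST /pools/default/buckets/<bucket>/controller/compactBucket
    path = "/pools/default/buckets/{}/controller/compactBucket".format(
        urllib.parse.quote(bucket, safe="")
    )
    url = "http://{}:{}{}".format(host, port, path)
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    # An empty body is enough; the endpoint takes no parameters in this sketch.
    request = urllib.request.Request(url, data=b"", method="POST")
    request.add_header("Authorization", "Basic " + token)
    return request


def trigger_compaction(host, bucket, user, password, port=8091, timeout=30):
    """Send the request and return the HTTP status code."""
    request = build_compaction_request(host, bucket, user, password, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A cron entry would then call `trigger_compaction` once per bucket on whatever schedule keeps fragmentation in check. Building the request separately from sending it makes the URL and headers easy to inspect without touching a live cluster.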
I’ve included a screenshot of the cluster settings. None of the buckets have “override the default compaction settings” checked, so there are no bucket-specific settings to show that I know of. The compaction that shows up in the screenshot is from the cron job I mentioned.
If you’d like me to pull up any settings via other REST calls or the CLI let me know.
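As an example of pulling the settings over REST rather than from a screenshot, the sketch below reads the cluster-wide auto-compaction settings. It assumes the `GET /settings/autoCompaction` endpoint and a response containing an `autoCompactionSettings` object with `databaseFragmentationThreshold` and `viewFragmentationThreshold` entries; the sample payload is illustrative only, not output from this cluster, and the helper names are hypothetical.

```python
import base64
import json
import urllib.request


def build_settings_request(host, user, password, port=8091):
    """Build (but do not send) a GET for the cluster-wide auto-compaction settings."""
    url = "http://{}:{}/settings/autoCompaction".format(host, port)
    credentials = "{}:{}".format(user, password).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", "Basic " + token)
    return request


def fragmentation_thresholds(payload):
    """Extract the data and view fragmentation percentages, or None if absent."""
    settings = payload.get("autoCompactionSettings") or {}
    result = {}
    for label, key in (("data", "databaseFragmentationThreshold"),
                       ("view", "viewFragmentationThreshold")):
        threshold = settings.get(key) or {}
        result[label] = threshold.get("percentage")
    return result


# Illustrative payload only; the real response shape is an assumption.
sample = json.loads(
    '{"autoCompactionSettings": {'
    '"parallelDBAndViewCompaction": false, '
    '"databaseFragmentationThreshold": {"percentage": 30}, '
    '"viewFragmentationThreshold": {"percentage": 30}}}'
)
thresholds = fragmentation_thresholds(sample)
```

Comparing these percentages against the fragmentation shown in the console would indicate whether auto-compaction should already have fired for the data files.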
After the environment almost melted down due to lost disk space that wasn’t being reclaimed through compaction, I took the cluster offline and did a clean reboot.
This fixed the auto-compaction issue, although I had to rebalance the original servers out and back in for Couchbase to clean up the “other” file data that was cluttering the disk. Usage went from over 400 GB down to 40 GB.
Anyhow, since I’m on legacy software, no need to waste your time looking at this further. I was hoping it was something stupid, and it turns out it kinda was - reboot the servers. I’m guessing it was in a weird state after being rebalanced in originally.