There are two simple Linux OS-level settings that I keep seeing set incorrectly on production systems. They are documented elsewhere, but they keep coming up, so a quick review here seems worthwhile. These are not super-secret settings or magic-bullet performance fixes, but on a production Couchbase node they should be set as described below and built into whatever system or process you use to bootstrap your Couchbase nodes. They help with memcached performance, rebalance performance and, in some cases, stability.
Please make sure you test these in a test environment before moving to production with them, obviously.
Swappiness should be turned off
This one is pretty straightforward if you know about the Linux virtual memory system. The swappiness level tells the virtual memory subsystem how aggressively it should swap to disk. The thing is, the system will try to swap items out of memory even when there is plenty of RAM available. The OS default is usually 60, which is a little aggressive IMO. You can see what value your system is set to by running the following command:
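The command itself does not appear in this copy of the post. A minimal sketch, assuming the standard procfs location for the setting; the helper name `show_swappiness` and its optional path argument are hypothetical and exist only so the read can be pointed at another file when trying it out:

```shell
# Print the current swappiness value.
# Defaults to the standard procfs path; an alternate path may be passed in.
show_swappiness() {
    cat "${1:-/proc/sys/vm/swappiness}"
}
```

Calling `show_swappiness` with no argument is the same as running `cat /proc/sys/vm/swappiness`; `sysctl vm.swappiness` reports the same setting.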
Couchbase is tuned to operate in memory as much as possible, so you can gain, or at minimum not lose, performance by changing the swappiness value to 0. In non-tech talk, this tells the virtual memory subsystem of the OS not to swap items from RAM to disk unless it really, really has to, and if you have sized your nodes correctly, swapping should not be needed. To set this, run the following with sudo, or just become root if you ride in the wild west.
# Set the value for the running system
sudo sh -c 'echo 0 > /proc/sys/vm/swappiness'
# Backup sysctl.conf
sudo cp -p /etc/sysctl.conf /etc/sysctl.conf.$(date +%Y%m%d-%H:%M)
# Set the value in /etc/sysctl.conf so it stays after reboot.
sudo sh -c 'echo "" >> /etc/sysctl.conf'
sudo sh -c 'echo "#Set swappiness to 0 to avoid swapping" >> /etc/sysctl.conf'
sudo sh -c 'echo "vm.swappiness = 0" >> /etc/sysctl.conf'
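The `echo ... >> /etc/sysctl.conf` lines above append a fresh entry every time they run, so a bootstrap process that runs more than once would leave duplicates. One possible idempotent variant is sketched below. The function name `persist_sysctl` is hypothetical, and the file, key and value are parameters so it can be tried on a scratch copy first:

```shell
# Set "key = value" in a sysctl-style file, replacing an existing entry
# instead of appending a duplicate. The file is backed up first.
persist_sysctl() {
    conf="$1"; key="$2"; value="$3"
    # Escape dots so a key like vm.swappiness is matched literally.
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')
    cp -p "$conf" "$conf.$(date +%Y%m%d-%H:%M)"
    if grep -Eq "^[[:space:]]*${key_re}[[:space:]]*=" "$conf"; then
        # Entry exists: rewrite it in place via a temporary file.
        tmp=$(mktemp)
        sed -E "s/^[[:space:]]*${key_re}[[:space:]]*=.*/${key} = ${value}/" "$conf" > "$tmp"
        cat "$tmp" > "$conf"
        rm -f "$tmp"
    else
        # Entry missing: append it with a short marker comment.
        printf '\n# Added by persist_sysctl\n%s = %s\n' "$key" "$value" >> "$conf"
    fi
}
```

On a real node this would be run as root, for example `persist_sysctl /etc/sysctl.conf vm.swappiness 0`, followed by `sysctl -p` to load the file without a reboot.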
Make sure the process that builds your OS images does this, or modify it so that it does. This is especially critical in public/private clouds, where it is so easy to bring up new instances. Make this part of your build process for a Couchbase node.
Disable Transparent Huge Pages (THP)
Starting in Red Hat Enterprise Linux (RHEL) version 6, which includes CentOS 6 and 7, a new default method of managing huge pages was implemented in the OS. Ubuntu has had this setting as well since 12.04, so it needs the same change. THP combines smaller memory pages into huge pages without the running processes knowing. The idea is to reduce the number of TLB lookups required and therefore increase performance. Basically, it adds a layer of abstraction that automates the management of huge pages. Couchbase Engineering has determined that under some conditions, Couchbase Server can be negatively impacted by severe page allocation delays when THP is enabled. Couchbase therefore recommends that THP be disabled on all Couchbase Server nodes.
Confirm if the OS settings need to be disabled
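Check the status of THP by issuing the following commands:
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
On some Red Hat releases or Red Hat variants, you might have to do this instead:
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag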
If, in one or both files, the value in square brackets (the active setting) is always, as in [always] madvise never, you need the procedure below.
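The same check can be scripted so it fits into a bootstrap or monitoring job. This is a rough sketch, not an official tool: the helper name `thp_active` is hypothetical, and it relies only on the kernel convention that the active value is the one shown in square brackets.

```shell
# Print the active THP setting from a sysfs file, i.e. the word in [brackets].
thp_active() {
    sed -n 's/.*\[\([a-z+]*\)].*/\1/p' "$1"
}

# Report every THP control file that exists on this machine.
for dir in /sys/kernel/mm/transparent_hugepage \
           /sys/kernel/mm/redhat_transparent_hugepage; do
    for name in enabled defrag; do
        f="$dir/$name"
        [ -r "$f" ] || continue
        state=$(thp_active "$f")
        if [ "$state" = "never" ]; then
            echo "$f: never (ok)"
        else
            echo "$f: $state (needs disabling)"
        fi
    done
done
```

The loop only reads the files, so it is safe to run on a live node; files that do not exist on a given distribution are skipped.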
Copy the Init Script
The init script is designed to make sure the changes are made around the same time as Couchbase is loaded on reboot.
#!/bin/sh
### BEGIN INIT INFO
# Provides: disable-thp
# Required-Start: $local_fs
# Required-Stop:
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Disable THP
# Description: disables Transparent Huge Pages (THP) on boot
### END INIT INFO

case "$1" in
  start)
    # Write 'never' to whichever THP sysfs directory this kernel exposes.
    if [ -d /sys/kernel/mm/transparent_hugepage ]; then
      echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
      echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag
    elif [ -d /sys/kernel/mm/redhat_transparent_hugepage ]; then
      echo 'never' > /sys/kernel/mm/redhat_transparent_hugepage/enabled
      echo 'never' > /sys/kernel/mm/redhat_transparent_hugepage/defrag
    else
      # No THP directory found, so there is nothing to do.
      # ('return' is only valid inside a function, so use 'exit'.)
      exit 0
    fi
    ;;
esac
How to Register the Code in the OS
Do the following:
Create a file with the above code
Chmod the file to be executable
Execute it so it takes effect right now
Make sure the init script starts at boot
Red Hat variants:
Ubuntu:
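The commands for these steps do not appear in this copy of the post. A minimal sketch follows, assuming the script is saved under the name disable-thp (taken from its Provides: header) in the SysV-style /etc/init.d directory. The helper name `install_init_script` is hypothetical, and the destination directory is a parameter so it can be tried against a scratch directory first:

```shell
# Copy the init script into place and make it executable in one step.
# dest_dir defaults to /etc/init.d.
install_init_script() {
    src="$1"
    dest_dir="${2:-/etc/init.d}"
    install -m 755 "$src" "$dest_dir/disable-thp"
}

# On a real node, as root:
#   install_init_script ./disable-thp
#   /etc/init.d/disable-thp start        # takes effect right now
#   chkconfig disable-thp on             # Red Hat variants: start at boot
#   update-rc.d disable-thp defaults     # Ubuntu: start at boot
```

The commented lines are left for you to run by hand because they need root and differ by distribution.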
Test the Process
Check the status of THP by issuing the following commands:
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
On some Red Hat releases or Red Hat variants, you might have to do this instead:
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
For both files, the value in square brackets (the active setting) should now be never, as in always madvise [never].
Note: There is a different way to do this, which you will find elsewhere, that edits /etc/grub.conf. My problem with it is that it would get blown away with each and every kernel update in the future. What I propose is easier to manage in the long run and easy to put into something like a Puppet module or Chef recipe that appends to the end of rc.local when you bootstrap a node.
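One caveat if you go the rc.local route: many stock rc.local files end with `exit 0`, and anything appended after that line never runs. The sketch below inserts the THP lines before the first `exit 0` instead. The function name `add_thp_to_rc_local` is hypothetical, and it assumes `exit 0` sits on a line of its own at the start of the line; a file laid out differently would need adjusting.

```shell
# Insert THP-disabling lines into an rc.local-style file, placing them
# before the first top-level "exit 0" so that they are actually executed.
add_thp_to_rc_local() {
    rc="$1"
    # Do nothing if an earlier run already added the lines.
    if grep -q 'transparent_hugepage' "$rc"; then
        return 0
    fi
    # The glob covers both transparent_hugepage and redhat_transparent_hugepage.
    snippet='for f in /sys/kernel/mm/*transparent_hugepage/enabled /sys/kernel/mm/*transparent_hugepage/defrag; do
    if [ -w "$f" ]; then echo never > "$f"; fi
done'
    # Line number of the first "exit 0" line, empty if there is none.
    line=$(sed -n '/^exit 0[[:space:]]*$/{=;q;}' "$rc")
    tmp=$(mktemp)
    if [ -n "$line" ]; then
        head -n "$((line - 1))" "$rc" > "$tmp"
        printf '%s\n' "$snippet" >> "$tmp"
        tail -n "+$line" "$rc" >> "$tmp"
    else
        cat "$rc" > "$tmp"
        printf '%s\n' "$snippet" >> "$tmp"
    fi
    cat "$tmp" > "$rc"
    rm -f "$tmp"
}
```

Running it a second time leaves the file unchanged, which makes it safe to call from a Puppet or Chef run that repeats.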
THP is a great feature for some things, but it causes problems with applications like Couchbase, and Couchbase is not alone in this. If you search the Internet for transparent huge pages, you will find multiple documented issues from other DB and application vendors. Until a fix is found, it is best to just turn THP off.
"sudo echo > file" doesn't do what you think it does.
"sudo sh -c 'echo > file'" (or echo | sudo tee file) does.
Good catch. Fixed. Thank you!
While I agree that swappiness should be low, you might want to run through some tests with the 2.6.32 kernel. There was a modification to how vm.swappiness=0 behaves, and I have seen reports of the kernel killing services like MySQL due to OOM, even when some of the RAM was still being used for file cache. I run with vm.swappiness=1 instead of 0, which still means the kernel shouldn't swap unless it really really has to… but is able to if it has to, versus killing off processes.
Cool. I shall take a look. Would you happen to have a link handy about the changes to 2.6.32 around this? If not, no worries. I will go search for it.
Not sure if you still need this, but I found a blog post that sums up the modification and the resulting changes in behavior: http://www.percona.com/blog/20…
In case that post goes away at some point, here is some of the info in the post:
Kernel code commit info:
commit fe35004fbf9eaf67482b074a2e032abb9c89b1dd
Author: Satoru Moriya <satoru.moriya@hds.com>
Date: Tue May 29 15:06:47 2012 -0700
mm: avoid swapping out with swappiness==0
RHEL changelog info:
* Mon Aug 27 2012 Jarod Wilson <jarod@redhat.com> [2.6.32-303.el6]
…
- [mm] avoid swapping out with swappiness==0 (Satoru Moriya) [787885]
I have been checking internally and there seems to be some debate on this one. The recommendation is still to set swappiness=0 from what I am hearing, though we do have customers that have gone to 1. I will dig deeper and do some more testing.
I hear ya.
It's mostly a "how do you want it to fail and how quickly" type of question I suppose, and not a situation people should encounter often. Hopefully anyone would have RAM usage related alarms going off well before the kernel faces the choice of swapping or killing Couchbase :)
Hi,
I followed your advice and my rc.local file looks like this:
————————————————————–
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.
[ -x /sbin/initctl ] && initctl emit --no-wait google-rc-local-has-run || true
exit 0
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag
fi
——————————————————————————-
Now I'm really green at Linux, but doesn't the exit 0 mean that it would never get to transparent_hugepage tests? Please help me understand/fix this.
Kind regards,
David