disk > memory problems
It would appear that the "disk > memory" features aren't exactly working.
Tested on Community & Enterprise, 1.6.5 & 126.96.36.199 - the results are all the same.
1) Create a bucket with a memory limit. (We are also setting the port)
2) Have an insert speed of more than ~ 100/sec and fill it.
3) Watch Rome burn.
(This is using the latest spymemcached with vbucket support, but I don't believe that has anything to do with it)
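Our load generator is spymemcached-based, but nothing about it is exotic - it's just a rate-limited set loop. A minimal Python sketch of its shape (the client below is a stub stand-in for illustration, not a real membase client; rates and value sizes are made up):

```python
import time

class StubClient:
    """Stand-in for the real spymemcached/membase client used in the test."""
    def __init__(self):
        self.ops = 0

    def set(self, key, value):
        # The real client would send this to the bucket; here we just count ops.
        self.ops += 1

def insert_at_rate(client, target_rate, duration_s):
    """Issue set() calls at roughly target_rate ops/sec for duration_s seconds."""
    interval = 1.0 / target_rate
    deadline = time.monotonic() + duration_s
    i = 0
    while time.monotonic() < deadline:
        client.set("key-%d" % i, b"x" * 1024)  # ~1 KB values (assumption)
        i += 1
        time.sleep(interval)  # crude pacing; the actual test ran ~10k ops/sec

    return i

client = StubClient()
sent = insert_at_rate(client, target_rate=1000, duration_s=0.1)
```

The failure reproduces at any rate well above ~100/sec; the loop itself is the only moving part on the client side.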
All joking aside: at any reasonable insert speed (around 10k/sec in my test), the thread that is supposed to be evicting items based on the high and low water marks appears to malfunction and eventually stops running altogether.
At first you will see large eviction events as memory usage crosses the mem_high_wat mark, which of course reduces the in-use memory, but the events get progressively smaller over time and come nowhere near the mem_low_wat setting. Given a constant insert rate, this causes the bucket to fill. Once the bucket reaches 100%, eviction events stop happening entirely and the server just throws temp OOM errors to the client. The insert rate drops to, say ... 5/sec (yes, five) and it stumbles along like that ... forever.

Stopping the inserts has no effect; the bucket will never work again. The only way we've found to get out of that state is to increase the bucket size so that the current amount of data ends up below the newly calculated mem_low_wat.
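For reference, the behavior we expected from the pager is the classic two-watermark scheme: once mem_used crosses mem_high_wat, a single eviction pass frees memory all the way down to mem_low_wat. A toy model of that expectation (all values and the eviction order are made up for illustration; this is not the actual ep-engine code):

```python
from collections import OrderedDict

class Bucket:
    """Toy model of the *expected* high/low watermark eviction behavior."""
    def __init__(self, mem_high_wat, mem_low_wat):
        self.high = mem_high_wat
        self.low = mem_low_wat
        self.items = OrderedDict()  # insertion order stands in for eviction order
        self.mem_used = 0

    def insert(self, key, size):
        self.items[key] = size
        self.mem_used += size
        if self.mem_used > self.high:
            self._evict_to_low_wat()

    def _evict_to_low_wat(self):
        # What *should* happen: each pager run frees memory all the way down
        # to mem_low_wat - not progressively less each time it fires.
        while self.mem_used > self.low and self.items:
            _, size = self.items.popitem(last=False)
            self.mem_used -= size

b = Bucket(mem_high_wat=1000, mem_low_wat=600)
for i in range(2000):
    b.insert(i, 1)
# Under a constant insert rate, mem_used oscillates between the watermarks
# indefinitely, which is exactly what the ~100/sec test shows and the
# 10k/sec test does not.
```

What we actually observe is each pass stopping earlier and earlier, as if the pager gives up partway and eventually stops being scheduled at all.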
Adjusting the mem_low_wat and mem_high_wat values does not change the eventual outcome. In fact, there also seems to be a problem with the math when doing so: for example, on a two-node cluster, using flushctl to set one of those values to 180000000 results in the setting changing to 322286400.
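To quantify how odd that stored value is (numbers straight from the test above; I have no idea what formula produces it):

```python
requested = 180_000_000
observed = 322_286_400

ratio = observed / requested
# ~1.79x - neither 1x (the requested value) nor 2x (a naive
# per-node doubling on a two-node cluster), so whatever scaling
# flushctl applies, it isn't a simple cluster-size multiple.
print(ratio)
```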
Performing the same steps, but with an insert rate of ~ 100/sec ... works. The number of items in each large eviction event doesn't decrease over time, and the resulting memory use decreases to the mem_low_wat mark. I have a test running against the community 1.6.5 that has been running for hours - each time the mem_high_wat mark is crossed, a major eviction occurs and life is good.
However, once I stopped inserting I could trigger the same behavior exhibited in the other test by reducing the mem_high_wat mark to below the current in-use memory. Eviction events started up, but each was progressively smaller than the last, never coming close to reducing mem_used to mem_low_wat. Again, this happened with no inserts or reads - the bucket was completely idle.
I am able to consistently replicate this behavior by completely removing Membase, doing a fresh install of any of the versions noted above, and performing the same steps as described.
- Brian Roach