Odd stats after switching to full eviction

robwalker · October 8, 2015, 8:32pm

Hi,

We’re seeing some odd stats in the Couchbase UI. We’ve just switched to full eviction, as we have a huge amount of data to store and a low number of reads.

Since we made the switch, we’ve seen very high reads per sec. That’s not necessarily a bad thing, but it stays high even when there are no incoming operations.

Also, the cache miss ratio spikes into the millions… of percent?? CPU is also quite high considering we’re currently doing only 50 ops/sec max (we’ve turned off the main source of data whilst we look at things more closely).
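For anyone puzzling over how a percentage can reach the millions, here is a minimal sketch of the arithmetic. It assumes the UI derives the cache miss ratio as disk fetches per second divided by get operations per second; that formula is an assumption, not something confirmed in this thread. The function name and the input figures are hypothetical and purely illustrative.

```python
def cache_miss_ratio(disk_fetches_per_sec, gets_per_sec):
    """Return disk fetches as a percentage of gets (assumed formula).

    Returns None when there are no gets, since the ratio is undefined.
    """
    if gets_per_sec == 0:
        return None
    return 100.0 * disk_fetches_per_sec / gets_per_sec


# Illustrative figures only: many disk fetches against very few client gets.
print(cache_miss_ratio(200_000, 50))  # 400000.0
print(cache_miss_ratio(200_000, 2))   # 10000000.0
```

Under that assumption, any disk fetches that are not triggered by client gets inflate the numerator while the denominator stays tiny, so the ratio is no longer bounded by 100%.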

Is this normal behavior?

robwalker · October 8, 2015, 8:38pm

I forgot to mention: even with 200k reads per sec, the nodes themselves report minimal actual disk I/O (0-50 KB/sec), checked with iotop. But the CPU on the nodes is indeed quite high.

robwalker · October 12, 2015, 9:15pm

Upgraded to version 4, and although the phantom reads seemed to go away at first, they’re now back.

robwalker · October 13, 2015, 2:40pm

Restarting the nodes gets rid of the high reads and CPU, but they slowly creep back, node by node.

cihangirb · October 13, 2015, 5:28pm

We are not seeing this on any of the other deployments.
You have a high number of connections, so my guess is that something else with retry logic is accessing the environment. You can run Wireshark to see for sure, but a quick test would be to drop your default bucket, create a new bucket with a password, and see if you still see the behavior.
Thanks
-cihan

robwalker · October 13, 2015, 7:45pm

Thanks for getting back to me, it’s much appreciated. As far as I know, all the connections are coming from one app server using the .NET libraries. I’ll run Wireshark to confirm, but a quick check in iftop doesn’t show anything out of the ordinary.

It’s encouraging that you’ve not seen this before; hopefully it’s something in our implementation. The only odd thing is that the behavior continues when we stop our app. Again, Wireshark will confirm any oddities. As it happens, we set a password on the bucket earlier today, and that hasn’t made a difference.

Here are some updated stats under load.

robwalker · October 14, 2015, 8:10am

I’m going to look at some tcpdump output now to check for anything odd.

Here are the stats for the last 24 hours.

robwalker · October 15, 2015, 4:08pm

A tcpdump revealed nothing out of the ordinary.

Since we’ve removed some old data, we’ve switched back to value ejection (at least for now). The problem went away immediately.

cihangirb · October 15, 2015, 5:26pm

Thanks Rob, could you give me the exact combination of OS version and client library you are using? I’ll look at this on the combination you have.

pvarley · October 16, 2015, 12:50am

Rob,

Are you using XDCR on this bucket?

robwalker · October 16, 2015, 11:19am

Hi,

No, we’re not using XDCR.

We’re running this on Ubuntu 12.04, and we’re using the .NET SDK v2.2.0.

Bart · March 22, 2017, 9:40am

Hello,

We are currently experiencing the exact same thing: a very high number of disk reads per sec (120k when there are just 5 ops per second) with a 100% cache miss ratio after enabling full ejection. Did you figure out how to resolve this?

We are using Couchbase CE 4.0.0 in a 5 node cluster without XDCR.

robwalker · March 22, 2017, 10:07am

Hi,
We actually revisited this problem recently. We upgraded to 4.1.1 and since then we haven’t seen the problem.
