I have a production cluster with 3 nodes running Couchbase 4.1 on virtualized Windows 2012 R2 servers. Each node has 8 CPUs and 16 GB of RAM. Couchbase is not using a dedicated disk for storage; it currently shares the operating system disk. The bucket I am using is a Couchbase bucket with 1 replica activated. I/O priority is set to low.
I am using atomic counters as a way to monitor revisions across several client servers. Each client has a local counter value, which it compares against the remote counter saved in Couchbase.
I am seeing high CPU and disk load. At 2,000 ops per second, CPU usage is typically between 50% and 60%, but disk I/O is at 100% most of the time. The counters are mostly read; typically about 100 new items are created per second. During this, the disk writes about 10 megabytes per second, which seems quite high for entities as basic as counters.
- I don’t understand how the 10 megabytes per second of write load is created. I can only explain it by the fact that I set the TTL on each counter operation, which then needs to be persisted to disk. The servers have a disk block size of 4096 bytes, which could also explain the 10 megabytes per second I see.
- The CPU load of 50-60% is quite high for only 2,000 ops per second, but could it be caused by the high I/O load?
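
As a rough sanity check on the block-size theory, here is a back-of-envelope calculation. It is only a sketch under a strong assumption that I have not verified: that every one of the 2,000 operations per second updates the TTL and that each such update ends up as one full 4096-byte block written to disk. It ignores replication, compaction, and any write batching Couchbase may do, and the function name is just for illustration.

```python
def estimated_write_rate(ops_per_second: int, block_size_bytes: int) -> int:
    """Bytes written per second if every operation rewrites one full disk block."""
    return ops_per_second * block_size_bytes


# Figures taken from the post above.
OPS_PER_SECOND = 2000
BLOCK_SIZE_BYTES = 4096

# Assumption: one block write per operation (not verified against Couchbase internals).
bytes_per_second = estimated_write_rate(OPS_PER_SECOND, BLOCK_SIZE_BYTES)
megabytes_per_second = bytes_per_second / 1_000_000

print(f"{megabytes_per_second:.1f} MB/s")  # 8.2 MB/s
```

That lands in the same range as the roughly 10 megabytes per second I observe, so the theory is at least plausible, but it does not prove that this is what is actually happening.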