How often is compaction running? Also, are you setting TTLs on documents? The Expiry Pager runs once per hour by default, but this typically doesn’t degrade performance.
Given everything you’re saying, it sounds like an issue we’ve not seen yet. Can you file one against Couchbase Server on our issue tracker? Please include a cbcollect_info for the nodes, which can be generated from the console.
And one more thing: I assume that key length is related to the periodic drops in OPS.
I tested by increasing the amount of data in the cluster and recorded the failure counts per 100 million.
100 million - no failures
400 million - no failures, but retries
850 million - many failures (10k)
==> added 8 nodes (total 16 nodes)
850 million - just 8 failures (failures decreased)
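To put the two 850 million runs side by side, here is a rough sketch that turns the counts above into failures per million. It assumes the figures are counts of stored items and that "10k" means roughly 10,000; neither is stated in the results, and the names `runs` and `failures_per_million` are only illustrative.

```python
# Figures taken from the results above: (label, items loaded, failures).
# Assumption: "10k" is treated as 10,000 and the sizes are item counts.
runs = [
    ("850M on 8 nodes", 850_000_000, 10_000),
    ("850M on 16 nodes", 850_000_000, 8),
]


def failures_per_million(items, failures):
    """Normalize a raw failure count to failures per one million items."""
    return failures / (items / 1_000_000)


rates = {label: failures_per_million(items, failures)
         for label, items, failures in runs}

for label, rate in rates.items():
    print(f"{label}: {rate:.4f} failures per million")
```

On these numbers, doubling the node count takes the rate from roughly twelve failures per million to well under one, which suggests per-node load matters more than the total data size.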
=========
Test Information
*** Server Information ***
nodes : 8
*** node spec ***
OS : Linux ( 2.6.32-358.6.2.el6.x86_64 ) 64 bit
CPU : Intel® Xeon® CPU E5-2420 0 @ 1.90GHz[6] * 2 N
RAM : 128GB(DDR3[1333 MHz] 16384 * 8)
DISK : [LSI MegaRAID SAS PCI Express ROMB [F/W: 3.340.05-2939] (1024MB)]
[-] 299.0 GB * 4
*** bucket spec ***
RAM Quota : 858GB
data size : 1.27 billion (1,270,000,000) (284GB, all data is in memory)
replicas : 1
disk io/optimization : Low
Auto Compaction : OFF
Flush : enabled
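As a quick sanity check on the bucket figures, this sketch works out the average footprint per item and how much of the RAM quota the data occupies. It assumes decimal gigabytes, that 1.27 billion is an item count, and that the 284GB figure is the total in-memory data size; the spec does not say whether that figure includes the replica copies.

```python
GB = 10**9  # assumption: decimal gigabytes; the spec does not say GB vs GiB

items = 1_270_000_000        # "data size" from the bucket spec, read as item count
data_bytes = 284 * GB        # reported in-memory data size
ram_quota_bytes = 858 * GB   # reported bucket RAM quota

bytes_per_item = data_bytes / items
quota_used = data_bytes / ram_quota_bytes

print(f"average footprint: {bytes_per_item:.1f} bytes per item")
print(f"share of RAM quota in use: {quota_used:.1%}")
```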