Periodic OPS drop on Couchbase Server 3.0.1 every hour

An OPS drop occurs periodically between the 44th and 49th minute of every hour.
Is there any configuration for a periodic job in Couchbase Server?

I am trying to find the reason for this periodic OPS drop at 44–50 minutes past every hour. What I have ruled out so far:

  • Network traffic (1Gbps)? => probably not (the Couchbase 2.2 server is fine)
  • Garbage collection on the WebApi (which uses the Couchbase client)? => not periodic
  • A periodic system batch job? => there is no crontab entry

Observed drop windows:

2014.12.19 09:44 ~ 09:50
2014.12.21 11:44 ~ 11:50

Couchbase 2.2 is fine (no OPS drop occurred) during the same window:
2014.12.21 11:44 ~ 11:50

How often is compaction running? Also, are you setting TTLs on documents? The Expiry Pager runs once per hour by default, but typically it doesn't cause a degradation in performance.
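For reference, a minimal sketch of what "setting a TTL" looks like with the Java SDK 2.0.x (host, bucket name, and document ID here are hypothetical):

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;

public class TtlExample {
    public static void main(String[] args) {
        Cluster cluster = CouchbaseCluster.create("127.0.0.1"); // hypothetical host
        Bucket bucket = cluster.openBucket("default");          // hypothetical bucket

        // A non-zero expiry (in seconds, here 1 hour) marks the document for
        // expiration; the hourly Expiry Pager is what actually purges expired keys.
        JsonDocument doc = JsonDocument.create("user::123", 3600,
                JsonObject.create().put("name", "test"));
        bucket.upsert(doc);

        cluster.disconnect();
    }
}
```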

Thx for your answer, @tgreenstein

I am not setting TTL’s on documents and I don’t use auto compaction option. :frowning:

Has anyone else experienced this issue on 3.0.1?

At 44 and 48 minutes past every hour, many timeouts occur on the Couchbase clients (see the sketch after the list below).
I use the Couchbase Java SDK v2.0.2, and I have two clusters whose data I join on the client side.

  • There’s no periodic throughput down like this on couchbase server 2.2
  • And there’s no batch job on all nodes.
  • It occurs 44min and 48min every hour exactly no
    matter with gc and request loads (it acts like alarm ;()
  • There’s no network effects like high traffic periodically.
  • I never set TTL on documents and never configure auto compaction option.
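For context, a minimal sketch of how my client-side lookup across the two clusters roughly works, assuming hypothetical hostnames, bucket names, and keys. It raises the KV timeout above the default and retries once, which is how the hourly stall surfaces as retries or timeouts (SDK 2.0.x API):

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.env.CouchbaseEnvironment;
import com.couchbase.client.java.env.DefaultCouchbaseEnvironment;

public class TwoClusterLookup {
    public static void main(String[] args) {
        // Raise the default 2.5s KV timeout so the periodic stall shows up
        // as slow responses instead of immediate timeouts.
        CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
                .kvTimeout(10000) // milliseconds
                .build();

        // Two separate clusters, joined at the application level.
        Cluster clusterA = CouchbaseCluster.create(env, "cluster-a-node1"); // hypothetical
        Cluster clusterB = CouchbaseCluster.create(env, "cluster-b-node1"); // hypothetical
        Bucket bucketA = clusterA.openBucket("bucketA");
        Bucket bucketB = clusterB.openBucket("bucketB");

        // Fetch from cluster A, then use one of its fields to look up cluster B;
        // retry(1) retries the async get once if it fails.
        JsonDocument left = bucketA.async().get("order::1").retry(1)
                .toBlocking().singleOrDefault(null);
        if (left != null) {
            String rightKey = "customer::" + left.content().getString("customerId");
            JsonDocument right = bucketB.async().get(rightKey).retry(1)
                    .toBlocking().singleOrDefault(null);
            System.out.println(left + " / " + right);
        }

        clusterA.disconnect();
        clusterB.disconnect();
    }
}
```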

Given everything you’re saying, it sounds like an issue we’ve not seen yet. Can you file one against Couchbase Server on our issue tracker? Please include a cbcollect_info for the nodes, which can be generated from the console.


@ingenthr

Happy new year !

I’ve filed this issue on http://issues.couchbase.com/browse/MB-13032.

:smile: I’m so sorry too late for update

I’ve found 2 factors about this issue.

  1. data size
  2. node count

One more thing: I assume that key length is also related to the periodic OPS drop.

I tested by loading increasing amounts of data into the cluster and recorded the failure counts at each size (a sketch of the test loop follows the results).

100 million - no failures
400 million - no failures, but some retries
850 million - many failures (10k)
==> added 8 nodes (16 nodes total)
850 million - only 8 failures (failures greatly reduced)
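A minimal sketch of this kind of load test (host, bucket name, key scheme, and document shape are all hypothetical; failure counting is simplified):

```java
import java.util.concurrent.TimeUnit;

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;

public class LoadTest {
    public static void main(String[] args) {
        Cluster cluster = CouchbaseCluster.create("node1"); // hypothetical host
        Bucket bucket = cluster.openBucket("testbucket");   // hypothetical bucket

        long total = 100_000_000L; // grow per test run: 100M, 400M, 850M ...
        long failures = 0;

        for (long i = 0; i < total; i++) {
            String key = "doc::" + i; // key length also varied in later tests
            try {
                bucket.upsert(JsonDocument.create(key,
                        JsonObject.create().put("seq", i)),
                        2500, TimeUnit.MILLISECONDS);
            } catch (RuntimeException e) {
                // Timeouts surface as RuntimeExceptions from the blocking API.
                failures++;
            }
        }
        System.out.println("failures: " + failures);
        cluster.disconnect();
    }
}
```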

=========
Test Information

*** Server Information ***
nodes: 8

*** Node spec ***
OS: Linux (2.6.32-358.6.2.el6.x86_64), 64-bit
CPU: Intel® Xeon® E5-2420 @ 1.90GHz (6 cores) * 2
RAM: 128GB (DDR3 1333MHz, 16384MB * 8)
DISK: LSI MegaRAID SAS PCI Express ROMB [F/W: 3.340.05-2939] (1024MB cache), 299.0GB * 4

*** Bucket spec ***
RAM quota: 858GB
data size: 1.27 billion documents (1,270,000,000), 284GB, all data resident in memory
replicas: 1
disk I/O optimization: Low
auto-compaction: OFF
flush: enabled