Hi all
####Error description:
I’m seeing a strange “error” in the error logs on my Couchbase servers. Roughly every 2 minutes (sometimes 3, sometimes 1), the node reports that it cannot retrieve the index status. Here are the original log entries:
[ns_server:error,2016-05-10T10:34:03.682+02:00,ns_1@SERVERIP:index_status_keeper_worker<0.363.0>:index_rest:get_json:45]Request to http://127.0.0.1:9102/getIndexStatus failed: {error,timeout}
[ns_server:error,2016-05-10T10:36:07.851+02:00,ns_1@SERVERIP:index_status_keeper_worker<0.363.0>:index_rest:get_json:45]Request to http://127.0.0.1:9102/getIndexStatus failed: {error,timeout}
[ns_server:error,2016-05-10T10:37:57.827+02:00,ns_1@SERVERIP:index_status_keeper_worker<0.363.0>:index_rest:get_json:45]Request to http://127.0.0.1:9102/getIndexStatus failed: {error,timeout}
[ns_server:error,2016-05-10T10:38:12.829+02:00,ns_1@SERVERIP:index_status_keeper_worker<0.363.0>:index_rest:get_json:45]Request to http://127.0.0.1:9102/getIndexStatus failed: {error,timeout}
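To check how regular these timeouts really are, the gaps between entries can be computed from the log itself. This is only a minimal sketch: the names `LINE_RE` and `timeout_gaps` are made up for illustration, and the regular expression assumes every entry has exactly the shape of the lines above.

```python
import re
from datetime import datetime

# Assumed line shape: "[ns_server:error,<ISO timestamp>,...]Request to <url> failed: {error,timeout}"
LINE_RE = re.compile(
    r"\[ns_server:error,(?P<ts>[^,]+),.*Request to (?P<url>\S+) failed: \{error,timeout\}"
)

def timeout_gaps(lines):
    """Return the gaps in seconds between consecutive getIndexStatus timeouts."""
    stamps = []
    for line in lines:
        m = LINE_RE.search(line)
        if m and m.group("url").endswith("/getIndexStatus"):
            # Timestamps look like 2016-05-10T10:34:03.682+02:00
            stamps.append(datetime.fromisoformat(m.group("ts")))
    # Pair each timestamp with its successor and take the difference
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Feeding it the lines of the error log (for example via `open(path)`) returns one gap per pair of consecutive timeout entries, which makes it easy to see whether the interval is fixed or drifts with load.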
When I curl this URL myself, it fails from time to time (with no real regularity) to retrieve the cluster-wide metadata from the index service.
Here is the error:
{"code":"error","error":"Fail to retrieve cluster-wide metadata from index service","failedNodes":["SERVERIP:9102"]}
Additionally, I found out that this happens every time the connection of the index service gets closed.
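The manual curl check can also be scripted so that failures can be lined up with the connection closes. The following is a rough sketch, not a definitive tool: `classify` and `poll_once` are hypothetical helper names, the 5 second timeout is arbitrary, and I am assuming that any response whose `code` is not `"error"` counts as healthy and that the endpoint is reachable without credentials from the node itself.

```python
import json
import urllib.request

URL = "http://127.0.0.1:9102/getIndexStatus"  # endpoint taken from the log entries

def classify(body):
    """Return (ok, details) for a getIndexStatus response body."""
    doc = json.loads(body)
    if doc.get("code") == "error":
        # Error responses list the nodes that could not be reached
        return False, doc.get("failedNodes", [])
    # Assumption: anything that is not an explicit error is treated as healthy
    return True, []

def poll_once(url=URL, timeout=5.0):
    """Fetch the index status once and classify the result."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify(resp.read().decode("utf-8"))
    except OSError as exc:  # covers URLError, refused connections and socket timeouts
        return False, ["request failed: %s" % exc]
```

Calling `poll_once()` in a loop with a short `time.sleep` and printing a timestamp next to each result gives a record that can be compared against the ns_server error log.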
####My Question:
Is this behavior normal or is there something wrong with my cluster?
####The Cluster:
4 Physical Servers (Dell PowerEdge R730)
Index, Data and Query service on each node
4.0.0-4051 Community Edition (build-4051)
3 Buckets (1 High, 1 Low and 1 Memcache)
####The Server:
Dell PowerEdge R730
115GB RAM (Couchbase Quota)
1 dedicated index disk (SSD)
RAID 1 data disk (7200 rpm)
####Network:
1 dedicated network interface for communication between nodes
1 dedicated network interface for external access (SDK, Webgui, etc)
####OS:
Linux SERVERNAME 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u3 (2016-01-17) x86_64 GNU/Linux
####High Bucket:
~2’000 ops/sec
~700’000’000 Items
####Low Bucket:
~1’000 ops/sec
~60’000’000 Items
####Memcached Bucket:
~30’0000 ops/sec
~2’500’000 Items
####Load:
We have the usual web traffic, which causes the load on our Couchbase cluster (almost nothing at night, with high peaks in the morning and evening).
####Misc:
The data disks have high I/O during the day
The avg. age of items is between 0 sec and 6000 sec (depending on the node and the hour)
The disk queue holds around 600’000 items during the day