Different number connection on my cluster

Alessandro.79 · February 23, 2017, 2:33pm

Hi,
I have 8 server nodes, and one of them (couchbase05) has a problem: it crashes continuously.
Below are the connection counts per port for each node:

host          8091   8092   8093   9100   9102   9999   11209   11210

couchbase01     95    323   5868     16     31     28     322    1598
couchbase02    129    307   5872     16     32     23     322    1600
couchbase03     98    303   5879     16     32     29     322    1620
couchbase04    144    306   5882     16     25     21     322    1592
couchbase05     89    236    238      8     31     15     322    2723
couchbase06    105    317   5612     16     34     21     322    1609
couchbase07    111    314   5895     16     28     20     322    1533
couchbase08     99    361   5725     16     33     21     324    1582

Is it normal that couchbase05 has roughly double the number of connections on port 11210 and half the number of connections on port 9100 compared with the other nodes?
Thanks
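
For readers who want to check a table like this programmatically, here is a minimal sketch that compares each node's count against the per-port median and flags large deviations. The counts are copied from the post; the function name `find_outliers` and the 45% tolerance are assumptions chosen for illustration, not a Couchbase recommendation.

```python
from statistics import median

PORTS = [8091, 8092, 8093, 9100, 9102, 9999, 11209, 11210]

# Connection counts per node, copied from the post (same order as PORTS).
COUNTS = {
    "couchbase01": [95, 323, 5868, 16, 31, 28, 322, 1598],
    "couchbase02": [129, 307, 5872, 16, 32, 23, 322, 1600],
    "couchbase03": [98, 303, 5879, 16, 32, 29, 322, 1620],
    "couchbase04": [144, 306, 5882, 16, 25, 21, 322, 1592],
    "couchbase05": [89, 236, 238, 8, 31, 15, 322, 2723],
    "couchbase06": [105, 317, 5612, 16, 34, 21, 322, 1609],
    "couchbase07": [111, 314, 5895, 16, 28, 20, 322, 1533],
    "couchbase08": [99, 361, 5725, 16, 33, 21, 324, 1582],
}


def find_outliers(counts, ports, tolerance=0.45):
    """Return (host, port, count, median) tuples where the count differs from
    the per-port median by more than `tolerance` (a fraction of the median).
    The default tolerance is an arbitrary, illustrative threshold."""
    outliers = []
    for i, port in enumerate(ports):
        mid = median(row[i] for row in counts.values())
        for host, row in counts.items():
            if mid and abs(row[i] - mid) / mid > tolerance:
                outliers.append((host, port, row[i], mid))
    return outliers


if __name__ == "__main__":
    for host, port, count, mid in find_outliers(COUNTS, PORTS):
        print(f"{host} port {port}: {count} connections (median {mid})")
```

With these numbers, only couchbase05 is flagged: on ports 11210 and 9100 as described, and also on port 8093, where it has 238 connections against a median of 5870.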

Alessandro.79 · February 23, 2017, 3:59pm

Further investigation revealed the problem: it seems that the node (couchbase05) does not contact the other nodes on port 9100…
Why?

Alessandro.79 · February 23, 2017, 4:14pm

And in my log I found this:

Service 'goxdcr' exited with status 1. Restarting. Messages: MetadataService 2017-02-23T12:41:46.072+01:00 [ERROR] metakv.ListAllChildren failed. path=/remoteCluster/, err=Get http://127.0.0.1:8091/_metakv/remoteCluster/: CBAuth database is stale. Was never updated yet., num_of_retry=2
MetadataService 2017-02-23T12:41:46.072+01:00 [ERROR] metakv.ListAllChildren failed. path=/remoteCluster/, err=Get http://127.0.0.1:8091/_metakv/remoteCluster/: CBAuth database is stale. Was never updated yet., num_of_retry=3
MetadataService 2017-02-23T12:41:46.072+01:00 [ERROR] metakv.ListAllChildren failed. path=/remoteCluster/, err=Get http://127.0.0.1:8091/_metakv/remoteCluster/: CBAuth database is stale. Was never updated yet., num_of_retry=4
RemoteClusterService 2017-02-23T12:41:46.072+01:00 [ERROR] Failed to get all entries, err=metakv failed for max number of retries = 5
[goport] 2017/02/23 12:41:46 /opt/couchbase/bin/goxdcr terminated: exit status 1

Could these problems be related?

WillGardella · February 23, 2017, 9:29pm

Hi Alessandro -
What version are you running? There is at least one known issue that looks like what you’re experiencing, and it was fixed in 4.5.0: MB-16568.

You should also be able to fail over that node and re-add it and it should recover.

Just as an aside, you seem to have a lot of connections open on port 8093 - are you sure your applications are not leaking connections? Usually, the number of connections on 11210 per node would be about the same as the number of client objects, and that would also be about the same as the number of connections you have on port 8093.

Good luck,
-Will
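
To keep an eye on the per-port counts discussed above, one option is to tally established connections by local port from `ss -tan` output on each node. The sketch below is only an illustration: the function name `count_established_by_port` and the sample addresses are made up, and it is not known whether the counts in the original post included only established connections.

```python
from collections import Counter


def count_established_by_port(ss_output, ports):
    """Count ESTAB connections per local port in `ss -tan`-style text.
    Only ports listed in `ports` are counted."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Expected columns: State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue
        port_text = fields[3].rsplit(":", 1)[-1]
        if port_text.isdigit() and int(port_text) in ports:
            counts[int(port_text)] += 1
    return counts


# Hypothetical sample output, for illustration only.
SAMPLE = """\
State  Recv-Q Send-Q Local Address:Port  Peer Address:Port
ESTAB  0      0      10.0.0.5:11210      10.0.0.21:53412
ESTAB  0      0      10.0.0.5:11210      10.0.0.22:40110
ESTAB  0      0      10.0.0.5:8093       10.0.0.21:53500
LISTEN 0      128    *:8091              *:*
ESTAB  0      0      10.0.0.5:22         10.0.0.9:60022
"""

if __name__ == "__main__":
    print(dict(count_established_by_port(SAMPLE, {8091, 8093, 11210})))
```

Sampling this periodically and comparing the 8093 and 11210 counts with the number of client objects the applications create would show whether connections keep accumulating.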

Alessandro.79 · February 24, 2017, 9:56am

Thank you for the reply.
We are currently using version 4.0.0-4051.
We are considering upgrading to 4.5. What would be the best procedure for a cluster of 8 machines?
Regarding the active connections, we are updating our service to open fewer of them, in order to lighten the load on the cluster.
Is the failover operation heavy in terms of resources and time?
Thank you

WillGardella · February 24, 2017, 11:18pm

On how to upgrade, I recommend taking a look at the information here: https://developer.couchbase.com/documentation/server/current/install/upgrade.html and also at the page called “Upgrade Options”. We recently rewrote that to be easier to understand, so I hope it helps you decide which option is right for you.
Failover should not be a heavy operation, but you need to rebalance to bring the nodes back in, and rebalance is a heavier operation. Depending on your production use case, an offline upgrade can be faster. Some of our big customers would rather take the downtime than perform rebalances while the system is online.

Best,
-Will

Han_Chris1 · June 22, 2020, 9:38am

Hi,

I have an issue with too many open connections on port 11210 (30150 / 30000).

How can I increase the open connection limit?
How can I clean up the connections?
