We are experiencing the issue when trying to connect to the cluster after updating the version of Java SDK.
The setup of the system is as follows:
We have a web application that is using Java SDK and a Couchbase cluster. In between we have a VIP (Virtual IP Address). We realise that isn’t ideal but we’re not able to change that immediately since VIP was mandated by Tech Ops. VIP is basically only there to reroute the initial request on application startup. That way we can make modifications on the cluster and ensure that when application starts it can find the cluster regardless of the actual nodes in the cluster and their IPs.
Prior to the issue we used JAVA SDK version 1.4.4. Our application would start and Java SDK would initiate a request on port 8091 to VIP. Please note that port 8091 is the only port open on VIP. VIP would reroute the request to one of the node cluster currently in use the cluster would respond to Java SDK. At that point Java SDK would discover all the nodes in the cluster and application would run fine. During up time if we would add, remove a node from the cluster Java SDK would update automatically and everything would run without the issue.
In the last sprint we updated the Java SDK to version 2.1.3. Our application would start and Java SDK would initiate a request on port 11210 to VIP. Since this port is not open the request would fail and Java SDK would throw an exception:
Caused by: java.lang.RuntimeException: java.util.concurrent.TimeoutException
at com.couchbase.client.java.util.Blocking.blockForSingle(Blocking.java:93)
at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:108)
at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:99)
at com.couchbase.client.java.CouchbaseCluster.openBucket(CouchbaseCluster.java:89)
No further request would be made on any port.
It appears the order in which port are being used has been changed between versions. Could somebody please confirm, or dispute, that the order in which ports are being used for cluster discovery has been changed between versions. Also could somebody please provide some advice on how we could resolve the issue. We are trying to understand the clients behavior, if we could open all those ports on the VIP would the client still then function correctly and at full performance?
The issue is happening on our production environment which we cannot use for testing out potential solutions since it will interfere with our products.