Using the Java SDK

Sorry for the delay in responding; I was on a week’s annual leave.

My test application is about as simple as it gets: given a comma-separated list of nodes, a username, a password, a bucket name and a document id, it connects, opens the bucket and retrieves the document. The build that uses version 2.7 of the Java SDK works; the build that uses version 3.0 does not. Neither explicitly uses the analytics service. I speculate that version 3.0 sees that the analytics service is enabled and tries to connect to it, hence the error messages for port 8095. However, it is not clear that this is what causes the application to fail with the UnambiguousTimeoutException (note that the version 3.0 application explicitly sets the cluster waitUntilReady timeout to 30s).

To clarify, I am not trying to connect from outside the OpenShift environment but from within it. Nonetheless, I agree that the configuration of the Couchbase container is incorrect (specifying only ports 8091-8094) and I am trying to get this fixed.

Thanks for pointing me to the sdk-doctor program. When I run it, it tells me that “All endpoints specified by my connection string were unreachable” (note that the hostname of the Couchbase container is “couchbase” and this is what I am passing to the two test programs I mention above):

$ ./sdk-doctor-linux diagnose -u xxxxxx -p xxxxxx couchbase://couchbase
|====================================================================|
|          ___ ___  _  __   ___   ___   ___ _____ ___  ___           |
|         / __|   \| |/ /__|   \ / _ \ / __|_   _/ _ \| _ \          |
|         \__ \ |) | ' <___| |) | (_) | (__  | || (_) |   /          |
|         |___/___/|_|\_\  |___/ \___/ \___| |_| \___/|_|_\          |
|                                                                    |
|====================================================================|

Note: Diagnostics can only provide accurate results when your cluster
 is in a stable state.  Active rebalancing and other cluster configuration
 changes can cause the output of the doctor to be inconsistent or in the
 worst cases, completely incorrect.

19:16:26.032 INFO ▶ Parsing connection string `couchbase://couchbase`
19:16:26.032 INFO ▶ Connection string was parsed as a potential DNS SRV record
19:16:26.036 INFO ▶ Connection string identifies the following CCCP endpoints:
19:16:26.037 INFO ▶   1. d9467e8c.couchbase.optima.svc.cluster.local:0
19:16:26.037 INFO ▶ Connection string identifies the following HTTP endpoints:
19:16:26.037 INFO ▶ Connection string specifies bucket ``
19:16:26.040 WARN ▶ The hostname specified in your connection string resolves both for SRV records, as well as A records.  This is not suggested as later DNS configuration changes could cause the wrong servers to be contacted
19:16:26.040 INFO ▶ Performing DNS lookup for host `d9467e8c.couchbase.optima.svc.cluster.local`
19:16:26.043 INFO ▶ Bootstrap host `d9467e8c.couchbase.optima.svc.cluster.local` refers to a server with the address `172.30.24.242`
19:16:26.045 INFO ▶ Attempting to connect to cluster via CCCP
19:16:26.045 INFO ▶ Attempting to fetch config via cccp from `d9467e8c.couchbase.optima.svc.cluster.local:0`
19:16:28.046 ERRO ▶ Failed to fetch configuration via cccp from `d9467e8c.couchbase.optima.svc.cluster.local:0` (error: dial tcp 172.30.24.242:0: i/o timeout)
19:16:28.046 INFO ▶ Not attempting HTTP (Terse), as the connection string does not support it
19:16:28.046 INFO ▶ Not attempting HTTP (Full), as the connection string does not support it
19:16:28.046 INFO ▶ Selected the following network type:
19:16:28.046 ERRO ▶ All endpoints specified by your connection string were unreachable, further cluster diagnostics are not possible
19:16:28.046 INFO ▶ Diagnostics completed

Summary:
[WARN] The hostname specified in your connection string resolves both for SRV records, as well as A records.  This is not suggested as later DNS configuration changes could cause the wrong servers to be contacted
[ERRO] Failed to fetch configuration via cccp from `d9467e8c.couchbase.optima.svc.cluster.local:0` (error: dial tcp 172.30.24.242:0: i/o timeout)
[ERRO] All endpoints specified by your connection string were unreachable, further cluster diagnostics are not possible

Found multiple issues, see listing above.

It is resolving the address correctly but then says it is attempting to “fetch config via cccp from :0” (i.e. port zero, which seems odd).

Thanks again for your help.

Based on this, I’m guessing you may have a malformed DNS SRV record. From wherever you were running sdk-doctor, try doing a lookup for SRV records for _couchbase._tcp.couchbase. My guess is you’ll get something back without the port number defined.
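If it helps, here is a rough sketch of that SRV lookup done from Java using only the JDK's built-in JNDI DNS provider, so it can run from the same pod as the test program. The class name SrvCheck and its helper methods are made up for illustration, and 11210 in the comments is only the usual default key-value port, not something taken from your cluster. I believe the JNDI provider does not apply resolver search domains, so you may need to pass the fully qualified name rather than the short one.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

public class SrvCheck {

    /** One SRV answer in its textual form: "priority weight port target". */
    static final class SrvRecord {
        final int priority;
        final int weight;
        final int port;
        final String target;

        SrvRecord(int priority, int weight, int port, String target) {
            this.priority = priority;
            this.weight = weight;
            this.port = port;
            this.target = target;
        }

        /** Port 0 (as seen in the sdk-doctor output) cannot be connected to. */
        boolean hasUsablePort() {
            return port > 0 && port <= 65535;
        }
    }

    /** Parses the record data string returned by the DNS provider. */
    static SrvRecord parse(String rdata) {
        String[] parts = rdata.trim().split("\\s+");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Unexpected SRV record: " + rdata);
        }
        return new SrvRecord(
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2]),
            parts[3]);
    }

    /** Queries the system's DNS servers for SRV records under the given name. */
    static List<SrvRecord> lookup(String name) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        DirContext ctx = new InitialDirContext(env);
        try {
            Attributes attrs = ctx.getAttributes(name, new String[] {"SRV"});
            Attribute srv = attrs.get("SRV");
            List<SrvRecord> records = new ArrayList<>();
            if (srv != null) {
                NamingEnumeration<?> values = srv.getAll();
                while (values.hasMore()) {
                    records.add(parse(values.next().toString()));
                }
            }
            return records;
        } finally {
            ctx.close();
        }
    }

    public static void main(String[] args) throws NamingException {
        // Default is the name suggested above; pass a fully qualified name if needed.
        String name = args.length > 0 ? args[0] : "_couchbase._tcp.couchbase";
        for (SrvRecord r : lookup(name)) {
            String verdict = r.hasUsablePort() ? "ok" : "NO USABLE PORT";
            System.out.println(r.target + ":" + r.port + " -> " + verdict);
        }
    }
}
```

A record that comes back with port 0 would match what sdk-doctor printed; a healthy record would normally carry the key-value port (11210 by default).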

See the docs on DNS SRV records.

Once ports 8095 and 8096 were opened, the SDK 3.0 client program worked.
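For anyone hitting the same thing, a quick way to confirm which service ports are actually reachable from inside the environment is a plain TCP connect test. This is only a sketch using the JDK: the class name PortCheck is made up, "couchbase" is the hostname from this thread, and the service names in the comments reflect my understanding of the default port assignments (11210 is added as the usual key-value port and was not mentioned above).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "couchbase";
        // Assumed defaults: 8091 manager, 8092 views, 8093 query, 8094 search,
        // 8095 analytics, 8096 eventing, 11210 key-value.
        int[] ports = {8091, 8092, 8093, 8094, 8095, 8096, 11210};
        for (int port : ports) {
            String state = isReachable(host, port, 2000) ? "open" : "unreachable";
            System.out.println(host + ":" + port + " " + state);
        }
    }
}
```

A successful connect only shows that the port is open, not that the service behind it is healthy, so treat this as a first check before looking at SDK logs.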

Hm. That’s interesting. Do you by chance have some debug logs from it working? From that sdk-doctor output, it sure looks like something is still misconfigured. But maybe there’s a narrow path where, because the Java SDK does its DNS lookups in a slightly different order, it works anyway.

Sure. Here is the log up to the point where the bucket is opened. couchbase.log.zip (5.4 KB)

Caused by: com.couchbase.client.core.error.RequestCanceledException: CarrierGlobalConfigRequest, Reason: TARGET_NODE_REMOVED {"cancelled":true,"completed":true,"coreId":"0x7e158a9700000001","idempotent":true,"lastDispatchedTo":"172.16.62.81","reason":"TARGET_NODE_REMOVED","requestId":2,"requestType":"CarrierGlobalConfigRequest","retried":11,"retryReasons":["ENDPOINT_NOT_AVAILABLE","NODE_NOT_AVAILABLE"],"service":{"opaque":"0x9","target":"172.16.62.81","type":"kv","vbucket":0},"timeoutMs":10000}
at com.couchbase.client.core.msg.BaseRequest.cancel(BaseRequest.java:186)
… 14 common frames omitted
DEBUG com.couchbase.config - [com.couchbase.config][GlobalConfigRetriedEvent][1000us] Failed to open global config {"coreId":"0x7e158a9700000001"}
com.couchbase.client.core.error.ConfigException: Caught exception while loading global config.
at com.couchbase.client.core.config.loader.GlobalLoader.lambda$load$3(GlobalLoader.java:72)
at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:94)
at reactor.core.publisher.FluxMap$MapSubscriber.onError(FluxMap.java:134)
at reactor.core.publisher.FluxMap$MapSubscriber.onError(FluxMap.java:134)
at reactor.core.publisher.FluxMap$MapSubscriber.onError(FluxMap.java:134)
at reactor.core.publisher.MonoIgnoreThen$ThenIgnoreMain.onError(MonoIgnoreThen.java:278)

Hi @SANTOSH
Please create a new forum post rather than reusing this 2-year-old one. And please include the smallest sample of code that reproduces the issue so we can see what you’re doing.