[JCBC-65] Client constructor blocks or deadlocks Created: 14/Jun/12 Updated: 17/Apr/13 |
|
| Status: | Reopened |
| Project: | Couchbase Java Client |
| Component/s: | library |
| Affects Version/s: | 1.0.2 |
| Fix Version/s: | 1.1-beta |
| Security Level: | Public |
| Type: | Bug | Priority: | Major |
| Reporter: | Martin Scott | Assignee: | Michael Nitschinger |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
OS: Windows 7 64bit
JDK: 1.6.0_31, also 1.6.0_33 (64-bit)
Couchbase enterprise edition running on 3 nodes, all Ubuntu 10.04 64-bit server (VMware images) |
||
| Attachments: |
|
| Description |
|
I am evaluating the Couchbase product and hit a brick wall immediately when running through the simple hello world example.
I have a 3-node cluster running Couchbase enterprise 1.8.2 on Ubuntu 10.04 64-bit VMware images. All three are running in VMware Player instances on Windows 7 64-bit. When I try to run the Main example on Windows 7 using Java 6 (64-bit), the code blocks somewhere in the Client constructor. The result is the logging below.

2012-06-14 14:07:46.313 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.150:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:46.316 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.151:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:46.319 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.152:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:59.843 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@24a4e2e3
2012-06-14 14:08:52.983 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@21ec6696
2012-06-14 14:08:52.987 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@27431340

I have also tried debugging, but the code blocks in the constructor at
client = new CouchbaseClient(uris, "default", "");
The program never completes.

This works fine in a Linux environment, with the following output received:

2012-06-14 04:58:50.693 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.150:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.703 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.151:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.708 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.152:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.830 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@1bc74f37
2012-06-14 04:58:50.834 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@3a21b220
2012-06-14 04:58:50.843 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@732b3d53
2012-06-14 04:58:51.135 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
Set Succeeded
Synchronous Get failed
Asynchronous Get Succeeded: Hello World!

Is there a JDK for Windows 7 or a configuration setting that can be used to prevent this? |
| Comments |
| Comment by Raghavan Srinivas [ 14/Jun/12 ] |
|
Thanks for giving the Java client library a spin.
Were you able to connect to a single Windows 7 node? I suspect it might be a firewall/networking issue; can you check with the netstat command (or the appropriate command on Windows 7)? You may also want to follow the instructions noted in http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-bestpractice-cloud.html Changing IP addresses might be a cause for this. Finally, a more detailed log would be useful if the network troubleshooting does not help. Please refer to http://www.couchbase.com/wiki/display/couchbase/Couchbase+Java+Client+Library for logging tips. |
| Comment by Martin Scott [ 14/Jun/12 ] |
|
Apologies, that should read:
Couchbase Version: 1.8.0 enterprise edition (build-55) |
| Comment by Martin Scott [ 14/Jun/12 ] |
| Detailed logging up to the point when the client hangs |
| Comment by Raghavan Srinivas [ 14/Jun/12 ] |
|
Thanks for the log. I took a real quick look.
Were you able to follow the steps in http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-bestpractice-cloud.html and use the IP address that you are able to connect to (via the admin console)? |
| Comment by Alex Ma [ 14/Jun/12 ] |
|
Hi Martin,
Can you verify connectivity from the JDK on your Windows box? This is what my connection code looks like:

// Connection details for Couchbase
List<URI> uris = new LinkedList<URI>();
uris.add(URI.create("http://10.4.2.3:8091/pools"));
CouchbaseClient client = null;
try {
    client = new CouchbaseClient(uris, "default", "");
} catch (Exception e) {
    System.err.println("except: connect: " + e.getMessage());
    System.exit(-1);
}

This will create a persistent connection to 8091 on 10.4.2.3 as well as connections to 11210 on every node in the cluster. After ssh'ing to 10.4.2.3 and running netstat, you should see something like what's below:

netstat -nat|grep 10.32.3.50
tcp 0 0 10.4.2.3:11210 10.32.3.50:65437 ESTABLISHED
tcp 0 0 10.4.2.3:8091 10.32.3.50:65442 ESTABLISHED
tcp 0 304 ::ffff:10.4.2.3:22 ::ffff:10.32.3.50:65516 ESTABLISHED

Can you confirm this in your environment?
Thanks
-Alex. |
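One way to check connectivity from the JDK itself, rather than from netstat alone, is a plain TCP probe of the two ports mentioned in this comment (8091 and 11210). The following is a minimal sketch and not part of the Couchbase client: the class name `PortCheck`, the helper `canConnect`, and the 3-second timeout are assumptions chosen for illustration. It uses only `java.net.Socket` and avoids try-with-resources so that it should also compile on JDK 1.6.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of the Couchbase client library.
class PortCheck {

    /** Returns true if a TCP connection to host:port is established within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, unknown hosts, and connect timeouts.
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if close fails.
            }
        }
    }

    public static void main(String[] args) {
        // Ports taken from this thread: 8091 (HTTP) and 11210 (data, one per node).
        int[] ports = {8091, 11210};
        for (String host : args) {
            for (int port : ports) {
                boolean ok = canConnect(host, port, 3000); // 3 s timeout is an assumption
                System.out.println(host + ":" + port + " -> " + (ok ? "reachable" : "NOT reachable"));
            }
        }
    }
}
```

With the addresses from the original report, it could be run as `java PortCheck 192.168.186.150 192.168.186.151 192.168.186.152`. A successful probe only shows that the TCP handshake completes from the JVM; it does not show that the server ever sends the response the client is waiting for.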
| Comment by Martin Scott [ 18/Jun/12 ] |
|
Hi, thanks for the responses.
Here is the netstat output from my Windows client:

TCP 192.168.186.1:139 0.0.0.0:0 LISTENING InHost
TCP 192.168.186.1:51008 192.168.186.150:22 ESTABLISHED InHost
TCP 192.168.186.1:53281 192.168.186.150:8091 TIME_WAIT InHost
TCP 192.168.186.1:53284 192.168.186.150:11210 ESTABLISHED InHost
TCP 192.168.186.1:53285 192.168.186.151:11210 ESTABLISHED InHost
TCP 192.168.186.1:53286 192.168.186.152:11210 ESTABLISHED InHost
TCP 192.168.186.1:53292 192.168.186.150:8091 ESTABLISHED InHost

and from the first node in the cluster, showing its connections with the client and the other nodes:

tcp 0 0 192.168.186.150:41317 192.168.186.150:11210 ESTABLISHED
tcp 0 0 192.168.186.150:35883 192.168.186.151:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.151:38013 ESTABLISHED
tcp 0 0 192.168.186.150:21100 192.168.186.152:46834 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.1:53284 ESTABLISHED
tcp 0 0 192.168.186.150:57559 192.168.186.151:22 TIME_WAIT
tcp 0 0 192.168.186.150:8091 192.168.186.1:53292 ESTABLISHED
tcp 0 48 192.168.186.150:22 192.168.186.1:51008 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.150:41317 ESTABLISHED
tcp 0 0 192.168.186.150:42433 192.168.186.152:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.150:56214 ESTABLISHED
tcp 0 0 192.168.186.150:56214 192.168.186.150:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.152:39222 ESTABLISHED
tcp 0 0 192.168.186.150:21100 192.168.186.151:60945 ESTABLISHED

I downloaded the source jars from the Maven repo, and debugging shows the client hanging at the getLatch().await() line below. There doesn't appear to be any thread calling the countDown method on the latch before or after this is called.

private ChannelFuture getReceivedFuture() {
    try {
        getLatch().await();
    } catch (InterruptedException ex) {
        finerLog("Getting received future has been interrupted.");
    }
    return receivedFuture;
}

Martin. |
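The hang reported above is an untimed `await()` on a latch that, according to the report, nothing counts down. A common defensive pattern for such waits is the timed overload `CountDownLatch.await(long, TimeUnit)`, which returns `false` instead of blocking forever. The sketch below illustrates that general pattern only; it is not the client's actual code or its eventual fix, and `BoundedLatchWait` and `awaitOrGiveUp` are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a bounded wait; not taken from the Couchbase client.
class BoundedLatchWait {

    /**
     * Waits for the latch to reach zero, but only up to the given timeout.
     * Returns true if the latch was released, false on timeout or interruption.
     */
    static boolean awaitOrGiveUp(CountDownLatch latch, long timeout, TimeUnit unit) {
        try {
            return latch.await(timeout, unit);
        } catch (InterruptedException ex) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // A latch nobody counts down, mirroring the situation described in the report.
        CountDownLatch neverReleased = new CountDownLatch(1);
        System.out.println("released=" + awaitOrGiveUp(neverReleased, 200, TimeUnit.MILLISECONDS));

        // A latch that has already been counted down returns immediately.
        CountDownLatch alreadyReleased = new CountDownLatch(1);
        alreadyReleased.countDown();
        System.out.println("released=" + awaitOrGiveUp(alreadyReleased, 200, TimeUnit.MILLISECONDS));
    }
}
```

With a bounded wait, the caller can turn a `false` result into an error or a retry, so a missing `countDown()` surfaces as a reported failure rather than a silent hang.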
| Comment by Martin Scott [ 18/Jun/12 ] |
|
The stack trace where the client blocks
Thread [main] (Stepping)
    BucketUpdateResponseHandler.getReceivedFuture() line: 147
    BucketUpdateResponseHandler.getLastResponse() line: 127
    BucketMonitor.startMonitor() line: 183
    ConfigurationProviderHTTP.subscribe(String, Reconfigurable) line: 243
    CouchbaseClient.<init>(CouchbaseConnectionFactory, boolean) line: 158
    CouchbaseClient.<init>(CouchbaseConnectionFactory) line: 125
    CouchbaseClient.<init>(List<URI>, String, String) line: 77
    Main.main(String[]) line: 67 |
| Comment by sean diamond [ 23/Jan/13 ] |
|
I am having this exact same problem, using Windows 7 64-bit and trying to connect to Ubuntu.
I am using a 32-bit OS on Linux and Couchbase Server 2.0. I am also using the latest Java client, version 1.1. Same issue as described above. The only workaround is to not use Windows: if my Java client is running on Linux, it works with no issues; it just deadlocks on the Windows machine. |
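For anyone who has to keep running the client on Windows while this is open, one general way to stop a blocking call from hanging the calling thread is to run it on a worker thread and wait for the result with a timeout. The sketch below shows that pattern with a stand-in task; it does not fix the underlying hang. `BoundedCall` and `callWithTimeout` are hypothetical names, and the sleeping task stands in for the call to `new CouchbaseClient(uris, "default", "")`.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; a generic pattern, not an API of the Couchbase client.
class BoundedCall {

    /** Runs the task on a daemon worker thread; returns null if it does not finish in time. */
    static <T> T callWithTimeout(Callable<T> task, long timeout, TimeUnit unit) {
        ExecutorService executor = Executors.newSingleThreadExecutor(new ThreadFactory() {
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r, "bounded-call");
                t.setDaemon(true); // a stuck worker must not keep the JVM alive
                return t;
            }
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker; it may or may not react
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        } catch (ExecutionException e) {
            // The task itself failed; surface its cause to the caller.
            throw new RuntimeException(e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a constructor call that never returns.
        Callable<String> slowConstructor = new Callable<String>() {
            public String call() throws Exception {
                Thread.sleep(60000);
                return "client";
            }
        };
        String client = callWithTimeout(slowConstructor, 500, TimeUnit.MILLISECONDS);
        System.out.println(client == null ? "gave up waiting" : "connected");
    }
}
```

This only bounds how long the caller waits. Whether the abandoned worker thread and any sockets it opened are cleaned up depends on how the blocked code reacts to interruption, so treat it as a diagnostic aid rather than a workaround for production use.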
| Comment by Tug Grall [ 17/Apr/13 ] |
|
I am reopening the issue as we are seeing this error again in some environments:
- Yuval - http://www.couchbase.com/issues/browse/JCBC-65 ... Let me know if you prefer me to create a new issue for 1.1.x |