[JCBC-65] Client constructor blocks or deadlocks Created: 14/Jun/12  Updated: 19/Dec/13  Resolved: 19/Dec/13

Status: Resolved
Project: Couchbase Java Client
Component/s: Core
Affects Version/s: 1.0.2
Fix Version/s: .next
Security Level: Public

Type: Bug Priority: Major
Reporter: Martin Scott Assignee: Michael Nitschinger
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: OS: Windows 7 64bit
JDK: 1.6.0_31 also 1.6.0_33 64 bit
Couchbase enterprise edition running on 3 nodes all Ubuntu 10.04 64bit server (VMware images)

Attachments: Text File log.txt    

 Description   
I am evaluating the couchbase product and hit a brick wall immediately when running through the simple hello world example.

I have a 3 node cluster running couchbase enterprise 1.8.2 on ubuntu 10.04 64 bit VMware images. All three are running in VMWare player instances on Windows 7 64bit.

When I try to run the Main example on Windows 7 using Java6 (64 bit) the code blocks somewhere in the Client constructor. The result is the logging below.


2012-06-14 14:07:46.313 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.150:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:46.316 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.151:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:46.319 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.152:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:59.843 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@24a4e2e3
2012-06-14 14:08:52.983 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@21ec6696
2012-06-14 14:08:52.987 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@27431340

I have also tried debugging but the code blocks in the constructor at

client = new CouchbaseClient(uris, "default", "");

The program never completes.

This works fine in a Linux environment with the following output received

2012-06-14 04:58:50.693 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.150:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.703 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.151:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.708 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.152:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.830 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@1bc74f37
2012-06-14 04:58:50.834 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@3a21b220
2012-06-14 04:58:50.843 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@732b3d53
2012-06-14 04:58:51.135 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
Set Succeeded
Synchronous Get failed
Asynchronous Get Succeeded: Hello World!

Is there a JDK for windows 7 or a configuration setting that can be used to prevent this?
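
While the cause is being investigated, the blocking constructor call can be wrapped so that the program fails with a timeout rather than hanging forever. The following is only a minimal sketch using java.util.concurrent. The class name BoundedCall and the method callWithTimeout are hypothetical helpers and are not part of the Couchbase client.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of the Couchbase client library.
public class BoundedCall {

    /**
     * Runs the task on a daemon worker thread and waits at most timeoutMillis.
     * Throws TimeoutException if the task has not finished in time.
     */
    public static <T> T callWithTimeout(Callable<T> task, long timeoutMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor(new ThreadFactory() {
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r, "bounded-call");
                // Daemon thread: a task that never returns will not keep the JVM alive.
                t.setDaemon(true);
                return t;
            }
        });
        try {
            Future<T> future = executor.submit(task);
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Request an interrupt; the task only stops if it responds to interruption.
                future.cancel(true);
                throw e;
            }
        } finally {
            executor.shutdownNow();
        }
    }
}
```

To use it with the example from this report, pass a Callable<CouchbaseClient> whose call() method returns new CouchbaseClient(uris, "default", ""), together with a timeout such as 30000 milliseconds. This does not fix the hang. It only turns it into a TimeoutException that can be logged.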

 Comments   
Comment by Raghavan Srinivas (Inactive) [ 14/Jun/12 ]
Thanks for giving the Java client library a spin.

Were you able to connect to a single Windows 7 node? I suspect it might be a firewall/networking issue. Can you check the connections with the netstat command (or the appropriate command on Windows 7)?

You may also want to follow the instructions noted in

http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-bestpractice-cloud.html

Changing IP addresses might be a cause for this.

Finally, a more detailed log would be useful, if the network troubleshooting does not help.

Please refer to

http://www.couchbase.com/wiki/display/couchbase/Couchbase+Java+Client+Library

for logging tips.
Comment by Martin Scott [ 14/Jun/12 ]
Apologies, that should read

Couchbase Version: 1.8.0 enterprise edition (build-55)
Comment by Martin Scott [ 14/Jun/12 ]
Detailed logging up to the point when the client hangs
Comment by Raghavan Srinivas (Inactive) [ 14/Jun/12 ]
Thanks for the Log. I took a real quick look.

Were you able to follow the steps in

http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-bestpractice-cloud.html

and use the ip address that you are able to connect to (via the admin console)?
Comment by Alex Ma [ 14/Jun/12 ]
Hi Martin,

Can you verify connectivity from the JDK on your windows box?

This is what my connection code looks like:
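// Imports needed by this snippet
import java.net.URI;
import java.util.LinkedList;
import java.util.List;
import com.couchbase.client.CouchbaseClient;

// Connection details for Couchbase
List<URI> uris = new LinkedList<URI>();
uris.add(URI.create("http://10.4.2.3:8091/pools"));

CouchbaseClient client = null;
try {
    // Connects to the cluster through the "default" bucket with an empty password
    client = new CouchbaseClient(uris, "default", "");
} catch (Exception e) {
    System.err.println("except: connect: " + e.getMessage());
    System.exit(-1);
}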


This will create a persistent connection to 8091 on 10.4.2.3 as well as connections to 11210 on every node in the cluster.

After ssh'ing to 10.4.2.3 and running netstat, you should see something like what's below:


netstat -nat|grep 10.32.3.50
tcp 0 0 10.4.2.3:11210 10.32.3.50:65437 ESTABLISHED
tcp 0 0 10.4.2.3:8091 10.32.3.50:65442 ESTABLISHED
tcp 0 304 ::ffff:10.4.2.3:22 ::ffff:10.32.3.50:65516 ESTABLISHED


Can you confirm this in your environment?

Thanks

-Alex.
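
One way to verify connectivity from the JDK itself, independent of the Couchbase client, is a plain TCP connect with a timeout. The following is a minimal sketch. The class name PortCheck is hypothetical, the node addresses are the ones from the log output in this report, and the ports are the 8091 and 11210 ports mentioned above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper, not part of the Couchbase client library.
public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if close fails.
            }
        }
    }

    public static void main(String[] args) {
        // Addresses taken from the log output in this report; adjust for your cluster.
        String[] hosts = {"192.168.186.150", "192.168.186.151", "192.168.186.152"};
        int[] ports = {8091, 11210};
        for (String host : hosts) {
            for (int port : ports) {
                System.out.println(host + ":" + port
                        + " reachable=" + canConnect(host, port, 3000));
            }
        }
    }
}
```

A successful connect only shows that the port accepts TCP connections from the Windows JVM. It says nothing about whether the streaming response on 8091 is delivered.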
Comment by Martin Scott [ 18/Jun/12 ]
Hi, thanks for the responses.

Here is the netstat output from my Windows client

  TCP 192.168.186.1:139 0.0.0.0:0 LISTENING InHost
  TCP 192.168.186.1:51008 192.168.186.150:22 ESTABLISHED InHost
  TCP 192.168.186.1:53281 192.168.186.150:8091 TIME_WAIT InHost
  TCP 192.168.186.1:53284 192.168.186.150:11210 ESTABLISHED InHost
  TCP 192.168.186.1:53285 192.168.186.151:11210 ESTABLISHED InHost
  TCP 192.168.186.1:53286 192.168.186.152:11210 ESTABLISHED InHost
  TCP 192.168.186.1:53292 192.168.186.150:8091 ESTABLISHED InHost

and from the first node in the cluster with the client and other nodes.

tcp 0 0 192.168.186.150:41317 192.168.186.150:11210 ESTABLISHED
tcp 0 0 192.168.186.150:35883 192.168.186.151:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.151:38013 ESTABLISHED
tcp 0 0 192.168.186.150:21100 192.168.186.152:46834 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.1:53284 ESTABLISHED
tcp 0 0 192.168.186.150:57559 192.168.186.151:22 TIME_WAIT
tcp 0 0 192.168.186.150:8091 192.168.186.1:53292 ESTABLISHED
tcp 0 48 192.168.186.150:22 192.168.186.1:51008 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.150:41317 ESTABLISHED
tcp 0 0 192.168.186.150:42433 192.168.186.152:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.150:56214 ESTABLISHED
tcp 0 0 192.168.186.150:56214 192.168.186.150:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.152:39222 ESTABLISHED
tcp 0 0 192.168.186.150:21100 192.168.186.151:60945 ESTABLISHED


I downloaded the source jars from the maven repo and debugging shows the client hanging at the getLatch().await() line below. There doesn't appear to be any thread calling the countDown method on the latch before or after this is called.

  private ChannelFuture getReceivedFuture() {
    try {
      getLatch().await();
    } catch (InterruptedException ex) {
      finerLog("Getting received future has been interrupted.");
    }
    return receivedFuture;
  }


Martin.
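
The behaviour described above can be illustrated in isolation. A CountDownLatch.await() without a timeout blocks until some other thread calls countDown(), so if that call never happens the waiting thread never returns. The following is a minimal standalone sketch, not the client's actual code. The class name LatchWaitDemo and the method awaitResponse are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Standalone illustration; not taken from the Couchbase client source.
public class LatchWaitDemo {

    /**
     * Waits on the latch for at most timeoutMillis.
     * Returns true if the latch reached zero, false if the wait timed out.
     */
    public static boolean awaitResponse(CountDownLatch latch, long timeoutMillis)
            throws InterruptedException {
        return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // Case 1: nobody counts down. A plain await() would block forever here.
        CountDownLatch neverReleased = new CountDownLatch(1);
        System.out.println("released=" + awaitResponse(neverReleased, 500));

        // Case 2: another thread, standing in for a response handler, counts down.
        final CountDownLatch released = new CountDownLatch(1);
        new Thread(new Runnable() {
            public void run() {
                released.countDown();
            }
        }).start();
        System.out.println("released=" + awaitResponse(released, 5000));
    }
}
```

The timed await(long, TimeUnit) variant returns false on timeout, which lets the caller report a missing response instead of hanging.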
Comment by Martin Scott [ 18/Jun/12 ]
The stack trace where the client blocks

Thread [main] (Stepping)
BucketUpdateResponseHandler.getReceivedFuture() line: 147
BucketUpdateResponseHandler.getLastResponse() line: 127
BucketMonitor.startMonitor() line: 183
ConfigurationProviderHTTP.subscribe(String, Reconfigurable) line: 243
CouchbaseClient.<init>(CouchbaseConnectionFactory, boolean) line: 158
CouchbaseClient.<init>(CouchbaseConnectionFactory) line: 125
CouchbaseClient.<init>(List<URI>, String, String) line: 77
Main.main(String[]) line: 67
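
To see what the other threads are doing while main is blocked, for example whether any thread exists that could release the latch, a dump of all thread stacks helps. The following is a minimal sketch using only the JDK. The class name ThreadDump is hypothetical.

```java
import java.util.Map;

// Hypothetical diagnostic helper, not part of the Couchbase client library.
public class ThreadDump {

    /** Returns the name, state and stack trace of every live thread in this JVM. */
    public static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append("Thread [").append(thread.getName())
              .append("] state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

Because main itself is blocked in the constructor, dumpAllThreads() would have to be called from a separate watchdog thread started beforehand. Alternatively, the jstack tool shipped with the JDK produces a similar dump from outside the process.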

Comment by sean diamond [ 23/Jan/13 ]
I am having this exact same problem. Using windows 7 64 bit trying to connect to ubuntu.
I am using 32 bit os on linux and couchbase server 2.0.

I am also using the latest Java client, version 1.1.

Same Issue as described below.
The only workaround is to not use Windows: if my Java client is running on Linux, it works with no issues; it just deadlocks on the Windows machine.
Comment by Tug Grall (Inactive) [ 17/Apr/13 ]
I am reopening the issue as we are seeing this error again in some environments:
- Yuval
- http://www.couchbase.com/issues/browse/JCBC-65
...

Let me know if you prefer me to create a new issue for 1.1.x
Comment by Michael Nitschinger [ 29/May/13 ]
Getting it onto the bugfix release train, although I'm not sure if we will get it into 1.1.7.
Comment by Michael Nitschinger [ 19/Dec/13 ]
We haven't seen this again in a long time. We also fixed issues along the way and have more fixes coming in the upcoming 1.3.

For anyone stumbling upon this when running 1.2* or 1.3*, please open a new issue with more context. Thanks!
Generated at Wed Apr 16 06:13:37 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.