[JCBC-28] refactor the entire cluster stream connection Created: 03/Apr/12  Updated: 31/Jan/13

Status: Reopened
Project: Couchbase Java Client
Component/s: library
Affects Version/s: None
Fix Version/s: 1.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Matt Ingenthron Assignee: Michael Nitschinger
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Because of the codebase's legacy, the handling of the Bucket and Configuration is rather odd. It used to exist outside the client to serve a different purpose. At that time, not changing the client internals was desirable.

Fast forwarding to now, the internals should be updated so that the NodeLocator or the connection abstracts away many of the configuration details.




[JCBC-117] mention that OperationFuture.get(tmo) changes state when timeout has been reached Created: 24/Sep/12  Updated: 12/Jun/13

Status: Reopened
Project: Couchbase Java Client
Component/s: docs
Affects Version/s: 1.0.3
Fix Version/s: 1.1.8
Security Level: Public

Type: Improvement Priority: Major
Reporter: Mark Nunberg Assignee: Michael Nitschinger
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Dependency
blocks JCBC-114 Command Futures never receive results... Reopened

 Description   
get(tmo) should not change the underlying state of the command to being timed out. It should simply respond with a TimeoutException but allow the command to continue.

Specifically, when the arg-tmo (timeout passed as an argument) expires, the underlying command is marked as timed out. For example, if one waits for 50ms on the command and a response has not been received within that time, the command is now dead ('TIMEDOUT', or similar) and waiting again will not help.

It is understandable that some code might rely on the old behavior, so at the very least, this should be documented in 'BIG RED LETTERS' in the get(tmo) method.
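
For illustration, the requested behavior matches what the standard java.util.concurrent futures already do. The sketch below uses a plain CompletableFuture as a stand-in (an assumption; it is not the client's OperationFuture, and the class and method names are hypothetical): a timed get() that expires throws TimeoutException, the future stays pending, and a later wait still succeeds once the response arrives.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedGetSketch {
    /**
     * Waits 50 ms on the future. Returns true if the wait timed out and the
     * future is still neither done nor cancelled afterwards.
     */
    static boolean timeoutLeavesFuturePending(CompletableFuture<String> future) {
        try {
            future.get(50, TimeUnit.MILLISECONDS);
            return false; // a result arrived within 50 ms
        } catch (TimeoutException expected) {
            // Only the wait expired; the future itself is untouched.
            return !future.isDone() && !future.isCancelled();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> future = new CompletableFuture<>();
        System.out.println("pending after timeout: " + timeoutLeavesFuturePending(future));
        future.complete("Hello World!"); // the late response arrives
        System.out.println("second wait returns: " + future.join());
    }
}
```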

 Comments   
Comment by Matt Ingenthron [ 05/Oct/12 ]
Please explain further.
Comment by Michael Nitschinger [ 16/Oct/12 ]
Hey Mark,

Can you explain in more detail what you want to see changed? When the timeout argument expires, what should happen to the operation?

Thanks,
Michael
Comment by Matt Ingenthron [ 24/Oct/12 ]
As currently designed, the client uses get() to determine timeout. This is not going to change at the moment. There's no other appropriate place internal to the client to check for this timeout of the operation at the moment.
Comment by Mark Nunberg [ 24/Oct/12 ]
Moving this as a documentation bug
Comment by Matt Ingenthron [ 06/Nov/12 ]
Michael, I'd like you to give this one a shot as your first docs bug. I'll help you with this as needed.




[JCBC-114] Command Futures never receive results after rebalance-out (or other sorts of topology/network changes) Created: 17/Sep/12  Updated: 12/Jun/13

Status: Reopened
Project: Couchbase Java Client
Component/s: docs
Affects Version/s: 1.0.3
Fix Version/s: 1.1.8
Security Level: Public

Type: Bug Priority: Major
Reporter: Mark Nunberg Assignee: Michael Nitschinger
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Dependency
depends on JCBC-117 mention that OperationFuture.get(tmo)... Reopened

 Comments   
Comment by Mark Nunberg [ 03/Oct/12 ]
This is a real blocker, and seems to be related to a few vbuckets. This issue is preventing me from properly measuring command durations.
Comment by Farshid Ghods [ 03/Oct/12 ]
Matt/Rags,

This issue is a blocker for executing more integration tests on the Java SDK. Are there workarounds to avoid this use case, or is a fix on the way?
Please assign this back to Mark if more information or logs are needed for this issue.
Comment by Matt Ingenthron [ 04/Oct/12 ]
Please have a look at this.
Comment by Mark Nunberg [ 05/Oct/12 ]
Michael,

I would not try this test manually. The use case in more detail is as follows:

- Single CouchbaseClient object
- 20 user threads: 10 setting and 10 getting the same sorts of key-value pairs
- Operations are done asynchronously. They are submitted into a queue which is then checked periodically for isDone/isCancelled.
- 4 node cluster. Nodes are removed, connections are broken

The issue is that those polling methods never return true unless the results are retrieved synchronously (i.e. ft.get()), which is actually an accidental detail.
Comment by Matt Ingenthron [ 24/Oct/12 ]
We looked at this pretty closely today. The issue here is that the client as designed relies on the get() from the caller to trigger the timeout. An operation will, somewhat correctly, never transition to isDone() or isCancelled() unless someone cares to use it.

The scenario that was likely in play over the WAN here is that the request was in flight to the server while the config was in flight down to the client. It arrives at the server, but is never responded to. Since the get() is never called, it'll never time out and transition to the canceled state.

We recommend you change the test code to use the queue more like a queue and just get() each one. Iterating through the queue is a bit funny in the first place, but if using the get() on the Future objects, you'll still have asynchronous behavior and much of the time the get() will be returning since the data is already there.
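
A minimal sketch of the recommended pattern, using only java.util.concurrent types as stand-ins (an assumption; the class name, method name, and outcome markers are hypothetical and this is not the test code discussed here): each Future is taken off the head of the queue and waited on with a bounded get(), so an operation that never receives a response surfaces as a timeout instead of sitting in the queue forever.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class FutureQueueDrain {
    /**
     * Polls futures off the queue in order and waits on each with a bounded get().
     * Returns one entry per future: its value, or a marker describing why no value came back.
     */
    static List<String> drain(Queue<Future<String>> pending, long timeoutMs) {
        List<String> outcomes = new ArrayList<>();
        Future<String> next;
        while ((next = pending.poll()) != null) {
            try {
                outcomes.add(next.get(timeoutMs, TimeUnit.MILLISECONDS));
            } catch (TimeoutException e) {
                outcomes.add("TIMEOUT");
            } catch (CancellationException e) {
                outcomes.add("CANCELLED");
            } catch (ExecutionException e) {
                outcomes.add("FAILED");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop draining, keep the interrupt flag
                outcomes.add("INTERRUPTED");
                break;
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        Queue<Future<String>> pending = new ConcurrentLinkedQueue<>();
        pending.add(CompletableFuture.completedFuture("value-1")); // response already there
        pending.add(new CompletableFuture<String>());              // never answered
        System.out.println(drain(pending, 50));
    }
}
```

The submitting threads stay asynchronous; only the consumer of the queue blocks, and most get() calls return at once because the data is already there.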
Comment by Matt Ingenthron [ 24/Oct/12 ]
This behavior should be better documented, both in the javadoc and in the API reference.




[JCBC-65] Client constructor blocks or deadlocks Created: 14/Jun/12  Updated: 12/Jun/13

Status: Reopened
Project: Couchbase Java Client
Component/s: library
Affects Version/s: 1.0.2
Fix Version/s: 1.1.8
Security Level: Public

Type: Bug Priority: Major
Reporter: Martin Scott Assignee: Michael Nitschinger
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: OS: Windows 7 64bit
JDK: 1.6.0_31 also 1.6.0_33 64 bit
Couchbase enterprise edition running on 3 nodes all Ubuntu 10.04 64bit server (VMware images)

Attachments: Text File log.txt    

 Description   
I am evaluating the couchbase product and hit a brick wall immediately when running through the simple hello world example.

I have a 3 node cluster running couchbase enterprise 1.8.2 on ubuntu 10.04 64 bit VMware images. All three are running in VMWare player instances on Windows 7 64bit.

When I try to run the Main example on Windows 7 using Java6 (64 bit) the code blocks somewhere in the Client constructor. The result is the logging below.


2012-06-14 14:07:46.313 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.150:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:46.316 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.151:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:46.319 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.152:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:59.843 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@24a4e2e3
2012-06-14 14:08:52.983 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@21ec6696
2012-06-14 14:08:52.987 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@27431340

I have also tried debugging but the code blocks in the constructor at

client = new CouchbaseClient(uris, "default", "");

The program never completes.

This works fine in a Linux environment with the following output received

2012-06-14 04:58:50.693 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.150:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.703 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.151:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.708 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.152:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.830 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@1bc74f37
2012-06-14 04:58:50.834 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@3a21b220
2012-06-14 04:58:50.843 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@732b3d53
2012-06-14 04:58:51.135 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
Set Succeeded
Synchronous Get failed
Asynchronous Get Succeeded: Hello World!

Is there a JDK for windows 7 or a configuration setting that can be used to prevent this?

 Comments   
Comment by Raghavan Srinivas [ 14/Jun/12 ]
Thanks for giving the Java client library a spin.

Were you able to connect to a single Windows 7 node? I suspect it might be a firewall/networking issue. Can you check with the netstat command (or the appropriate command on Windows 7)?

You may also want to follow the instructions noted in

http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-bestpractice-cloud.html

Changing IP addresses might be a cause for this.

Finally, a more detailed log would be useful, if the network troubleshooting does not help.

Please refer to

http://www.couchbase.com/wiki/display/couchbase/Couchbase+Java+Client+Library

for logging tips.
Comment by Martin Scott [ 14/Jun/12 ]
Apologies, that should read

Couchbase Version: 1.8.0 enterprise edition (build-55)
Comment by Martin Scott [ 14/Jun/12 ]
Detailed logging up to the point when the client hangs
Comment by Raghavan Srinivas [ 14/Jun/12 ]
Thanks for the Log. I took a real quick look.

Were you able to follow the steps in

http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-bestpractice-cloud.html

and use the ip address that you are able to connect to (via the admin console)?
Comment by Alex Ma [ 14/Jun/12 ]
Hi Martin,

Can you verify connectivity from the JDK on your windows box?

This is what my connection code looks like:
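import java.net.URI;
import java.util.LinkedList;
import java.util.List;
import com.couchbase.client.CouchbaseClient;

// Connection details for Couchbase
List<URI> uris = new LinkedList<URI>();
uris.add(URI.create("http://10.4.2.3:8091/pools"));

CouchbaseClient client = null;
try {
    // Connect to the "default" bucket with an empty password
    client = new CouchbaseClient(uris, "default", "");
} catch (Exception e) {
    System.err.println("except: connect: " + e.getMessage());
    System.exit(-1);
}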


This will create a persistent connection to 8091 on 10.4.2.3 as well as connections to 11210 on every node in the cluster.

After ssh'ing to 10.4.2.3 and running netstat, you should see something like what's below:


netstat -nat|grep 10.32.3.50
tcp 0 0 10.4.2.3:11210 10.32.3.50:65437 ESTABLISHED
tcp 0 0 10.4.2.3:8091 10.32.3.50:65442 ESTABLISHED
tcp 0 304 ::ffff:10.4.2.3:22 ::ffff:10.32.3.50:65516 ESTABLISHED


Can you confirm this in your environment?

Thanks

-Alex.
Comment by Martin Scott [ 18/Jun/12 ]
Hi, thanks for the responses.

Here is the netstat output from my Windows client

  TCP 192.168.186.1:139 0.0.0.0:0 LISTENING InHost
  TCP 192.168.186.1:51008 192.168.186.150:22 ESTABLISHED InHost
  TCP 192.168.186.1:53281 192.168.186.150:8091 TIME_WAIT InHost
  TCP 192.168.186.1:53284 192.168.186.150:11210 ESTABLISHED InHost
  TCP 192.168.186.1:53285 192.168.186.151:11210 ESTABLISHED InHost
  TCP 192.168.186.1:53286 192.168.186.152:11210 ESTABLISHED InHost
  TCP 192.168.186.1:53292 192.168.186.150:8091 ESTABLISHED InHost

and from the first node in the cluster with the client and other nodes.

tcp 0 0 192.168.186.150:41317 192.168.186.150:11210 ESTABLISHED
tcp 0 0 192.168.186.150:35883 192.168.186.151:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.151:38013 ESTABLISHED
tcp 0 0 192.168.186.150:21100 192.168.186.152:46834 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.1:53284 ESTABLISHED
tcp 0 0 192.168.186.150:57559 192.168.186.151:22 TIME_WAIT
tcp 0 0 192.168.186.150:8091 192.168.186.1:53292 ESTABLISHED
tcp 0 48 192.168.186.150:22 192.168.186.1:51008 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.150:41317 ESTABLISHED
tcp 0 0 192.168.186.150:42433 192.168.186.152:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.150:56214 ESTABLISHED
tcp 0 0 192.168.186.150:56214 192.168.186.150:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.152:39222 ESTABLISHED
tcp 0 0 192.168.186.150:21100 192.168.186.151:60945 ESTABLISHED


I downloaded the source jars from the maven repo and debugging shows the client hanging at the getLatch().await() line below. There doesn't appear to be any thread calling the countDown method on the latch before or after this is called.

  private ChannelFuture getReceivedFuture() {
    try {
      getLatch().await();
    } catch (InterruptedException ex) {
      finerLog("Getting received future has been interrupted.");
    }
    return receivedFuture;
  }


Martin.
Comment by Martin Scott [ 18/Jun/12 ]
The stack trace where the client blocks

Thread [main] (Stepping)
BucketUpdateResponseHandler.getReceivedFuture() line: 147
BucketUpdateResponseHandler.getLastResponse() line: 127
BucketMonitor.startMonitor() line: 183
ConfigurationProviderHTTP.subscribe(String, Reconfigurable) line: 243
CouchbaseClient.<init>(CouchbaseConnectionFactory, boolean) line: 158
CouchbaseClient.<init>(CouchbaseConnectionFactory) line: 125
CouchbaseClient.<init>(List<URI>, String, String) line: 77
Main.main(String[]) line: 67
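
For illustration only: the trace above ends in a CountDownLatch.await() with no time limit, so a countDown() that never happens blocks the constructor forever. A minimal sketch of a bounded wait using the standard await(timeout, unit) overload follows. The class and method names are hypothetical, and this is not the client's actual fix.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class BoundedLatchWait {
    /**
     * Waits on the latch for at most timeoutMs milliseconds.
     * Returns true if the latch reached zero, false on timeout or interruption.
     */
    static boolean awaitResponse(CountDownLatch latch, long timeoutMs) {
        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        CountDownLatch neverCountedDown = new CountDownLatch(1);
        if (!awaitResponse(neverCountedDown, 100)) {
            System.err.println("No response within 100 ms; failing instead of hanging.");
        }
    }
}
```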

Comment by sean diamond [ 23/Jan/13 ]
I am having this exact same problem, using Windows 7 64-bit and trying to connect to Ubuntu.
I am using a 32-bit OS on Linux and Couchbase Server 2.0.

I am also using the latest Java client version, 1.1.

Same issue as described in this ticket.
The only workaround is to not use Windows. If my Java client is running on Linux, it works with no issues; it just deadlocks on the Windows machine.
Comment by Tug Grall [ 17/Apr/13 ]
I am reopening the issue as we see this error again in some environments:
- Yuval
- http://www.couchbase.com/issues/browse/JCBC-65
...

Let me know if you prefer me to create a new issue for 1.1.x
Comment by Michael Nitschinger [ 29/May/13 ]
Getting it onto the bugfix release train, although I'm not sure we will get it into 1.1.7.




[JCBC-11] Need more unit tests for couchbase-client Created: 03/Feb/12  Updated: 12/Jun/13

Status: Reopened
Project: Couchbase Java Client
Component/s: library
Affects Version/s: 1.0.1
Fix Version/s: 1.1.8
Security Level: Public

Type: Improvement Priority: Major
Reporter: Raghavan Srinivas Assignee: Michael Nitschinger
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
        CouchbaseConnectionFactoryBuilder cfb = new CouchbaseConnectionFactoryBuilder();
        cfb.setOpTimeout(10000); // wait up to 10 seconds for an operation to succeed
        cfb.setOpQueueMaxBlockTime(5000); // wait up to 5 seconds when trying to enqueue an operation
        

For example, the settings above will fill up the funnel (the blocking queue) to the rim, but not overflow (immediately time out). Once it's at the rim, it'll have to wait until at least one operation flows out before it can add another operation. This *will* slow down the callers (their async calls will actually block on this internal queue), but that's okay in a bulk loader.

However, there are no unit tests in couchbase-client to test this.
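
As a rough sketch of the behavior such a test would need to cover, the funnel can be modeled with a standard bounded BlockingQueue. This is an assumption for illustration: the class and method names are hypothetical and this is not the client's internal queue. An enqueue succeeds while there is room, blocks for at most the maximum block time when the queue is full, and succeeds again once an operation flows out.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class FunnelSketch {
    /** Tries to enqueue an operation, blocking for at most maxBlockMs when the queue is full. */
    static boolean enqueue(BlockingQueue<String> queue, String op, long maxBlockMs) {
        try {
            return queue.offer(op, maxBlockMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(2); // the "funnel", capacity 2
        System.out.println(enqueue(queue, "op-1", 50)); // room available
        System.out.println(enqueue(queue, "op-2", 50)); // now at the rim
        System.out.println(enqueue(queue, "op-3", 50)); // full: blocks up to 50 ms, then gives up
        queue.poll();                                   // one operation flows out
        System.out.println(enqueue(queue, "op-3", 50)); // room again
    }
}
```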





[JCBC-18] NPE if hostnames in server bootstrap list are mixed case Created: 12/Mar/12  Updated: 12/Jun/13

Status: Reopened
Project: Couchbase Java Client
Component/s: library
Affects Version/s: 1.1-dp4
Fix Version/s: 1.1.8
Security Level: Public

Type: Bug Priority: Major
Reporter: Matt Ingenthron Assignee: Matt Ingenthron
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
A user described a scenario where using mixed case in their URIs led to an NPE. This is from the map lookup, since what the couchbase cluster sends us differs from what the user entered, I think.

See: http://www.couchbase.com/forums/thread/java-client-101-exception-using-couchbaseclient-servlet-filter
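
If the map lookup is indeed the cause (the description above is itself a guess), the failure mode can be reproduced with a plain HashMap. The sketch below is an assumption for illustration: the host names, the map, and the normalize helper are all hypothetical and not taken from the client. A key stored in one case is not found when looked up in another, and the null result would throw an NPE as soon as it is dereferenced or unboxed. Normalizing both sides with a locale-independent lowercase avoids that.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class HostnameLookup {
    /** Normalizes a hostname so that map lookups ignore case. */
    static String normalize(String host) {
        return host.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Map<String, Integer> portsByHost = new HashMap<>();
        portsByHost.put(normalize("cb1.example.com"), 11210); // name as sent by the cluster

        String userEntered = "CB1.Example.com";                  // name as typed by the user
        Integer raw = portsByHost.get(userEntered);              // no match: null
        Integer fixed = portsByHost.get(normalize(userEntered)); // match after normalizing
        System.out.println("raw lookup: " + raw + ", normalized lookup: " + fixed);
    }
}
```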




[JCBC-61] Expose returned CAS value in CASResponse when available from binary protocol Created: 06/Jun/12  Updated: 12/Jun/13

Status: Reopened
Project: Couchbase Java Client
Component/s: library
Affects Version/s: 1.1dp
Fix Version/s: 1.1.8
Security Level: Public

Type: Improvement Priority: Major
Reporter: Perry Krug Assignee: Michael Nitschinger
Resolution: Unresolved Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Customer request to add the capability to retrieve the new CAS value after a cas() operation, to avoid a subsequent gets().

 Comments   
Comment by Michael Nitschinger [ 29/Nov/12 ]
This may already be implemented, need to check.




[JCBC-15] add showtype-options to documentation Created: 23/Feb/12  Updated: 12/Jun/13

Status: Reopened
Project: Couchbase Java Client
Component/s: docs
Affects Version/s: None
Fix Version/s: 1.1.8
Security Level: Public

Type: Improvement Priority: Major
Reporter: Matt Ingenthron Assignee: Michael Nitschinger
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Many of the Java docs should show the type in the options summary.

 Comments   
Comment by Michael Nitschinger [ 15/Nov/12 ]
Can you give me a quick example of what you mean? Reassign it back to me then and I'll fix it!
Comment by Matt Ingenthron [ 15/Nov/12 ]
If you look at the docs, there are many places where we have types that are returned, but we don't sufficiently describe those types. For example:
http://www.couchbase.com/docs/couchbase-sdk-java-1.1/couchbase-sdk-java-set-add.html#table-couchbase-sdk_java_add

It mentions the OperationFuture, but nowhere really tells how to use it (to my knowledge).

You should be able to work with MC on the right way to fix these.




Generated at Wed Jun 19 05:55:25 CDT 2013 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.