[JCBC-28] refactor the entire cluster stream connection Created: 03/Apr/12 Updated: 31/Jan/13 |
|
| Status: | Reopened |
| Project: | Couchbase Java Client |
| Component/s: | library |
| Affects Version/s: | None |
| Fix Version/s: | 1.2 |
| Security Level: | Public |
| Type: | Improvement | Priority: | Major |
| Reporter: | Matt Ingenthron | Assignee: | Michael Nitschinger |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
Because of the codebase's legacy, the handling of the Bucket and Configuration is rather odd. It used to exist outside the client to serve a different purpose. At that time, not changing the client internals was desirable.
Fast forwarding to now, the internals should be updated to have the NodeLocator or the connection abstract away much of the configuration details. |
[JCBC-117] mention that OperationFuture.get(tmo) changes state when timeout has been reached Created: 24/Sep/12 Updated: 12/Jun/13 |
|
| Status: | Reopened |
| Project: | Couchbase Java Client |
| Component/s: | docs |
| Affects Version/s: | 1.0.3 |
| Fix Version/s: | 1.1.8 |
| Security Level: | Public |
| Type: | Improvement | Priority: | Major |
| Reporter: | Mark Nunberg | Assignee: | Michael Nitschinger |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Description |
|
get(tmo) should not change the underlying state of the command to being timed out. It should simply respond with a TimeoutException but allow the command to continue.
Specifically, when the arg-tmo (timeout passed as an argument) expires, the underlying command is marked as timed out. For example, if one waits for 50ms on the command and a response has not been received within that time, the command is now dead ('TIMEDOUT', or similar) and waiting again will not help. It is understandable that some code might rely on the old behavior, so at the very least, this should be documented in 'BIG RED LETTERS' in the get(tmo) method. |
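For comparison, here is a minimal sketch of the behavior requested above, using only the plain JDK `CompletableFuture` as a stand-in. It is not the client's `OperationFuture`; the class and method names are hypothetical. The timed wait throws `TimeoutException`, but the future itself stays pending and can still complete afterwards.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration with a JDK future, not the Couchbase client API.
class TimedGetSketch {
    // Waits 50 ms on a pending future, then lets the "response" arrive late.
    // Returns "<timedOut>:<isCancelled>:<value>".
    static String waitTwice() {
        CompletableFuture<String> op = new CompletableFuture<>();
        boolean timedOut = false;
        try {
            op.get(50, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Only this wait gave up; the future is still pending.
            timedOut = true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        // Simulate the server response arriving after the first wait expired.
        op.complete("Hello World!");
        String value = op.join();
        return timedOut + ":" + op.isCancelled() + ":" + value;
    }
}
```

With the behavior described in this issue, the second wait would instead fail because the operation has already been marked as timed out.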
| Comments |
| Comment by Matt Ingenthron [ 05/Oct/12 ] |
| Please explain further. |
| Comment by Michael Nitschinger [ 16/Oct/12 ] |
|
Hey Mark, can you explain in more detail what you want to see changed? When the timeout passed as the argument expires, what should happen to the operation? Thanks, Michael |
| Comment by Matt Ingenthron [ 24/Oct/12 ] |
| As currently designed, the client uses get() to determine timeout. This is not going to change at the moment. There's no other appropriate place internal to the client to check for this timeout of the operation at the moment. |
| Comment by Mark Nunberg [ 24/Oct/12 ] |
| Moving this as a documentation bug |
| Comment by Matt Ingenthron [ 06/Nov/12 ] |
| Michael, I'd like you to give this one a shot as your first docs bug; I'll help you with this as needed. |
[JCBC-114] Command Futures never receive results after rebalance-out (or other sorts of topology/network changes) Created: 17/Sep/12 Updated: 12/Jun/13 |
|
| Status: | Reopened |
| Project: | Couchbase Java Client |
| Component/s: | docs |
| Affects Version/s: | 1.0.3 |
| Fix Version/s: | 1.1.8 |
| Security Level: | Public |
| Type: | Bug | Priority: | Major |
| Reporter: | Mark Nunberg | Assignee: | Michael Nitschinger |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Comments |
| Comment by Mark Nunberg [ 03/Oct/12 ] |
| This is a real blocker, and seems to be related to a few vbuckets. This issue is preventing me from properly measuring command durations |
| Comment by Farshid Ghods [ 03/Oct/12 ] |
|
Matt/Rags,
This issue is a blocker for executing more integration tests on the Java SDK. Are there workarounds to avoid this use case, or is a fix on the way? Please assign this back to Mark if more information or logs are needed for this issue. |
| Comment by Matt Ingenthron [ 04/Oct/12 ] |
| Please have a look at this. |
| Comment by Mark Nunberg [ 05/Oct/12 ] |
|
Michael,
I would not try this test manually. The use case in more detail is as follows:
- Single CouchbaseClient object
- 20 user threads. 10 setting and 10 getting the same sorts of kv
- Operations are done asynchronously. They are submitted into a queue which is then checked periodically for isDone/isCancelled.
- 4 node cluster. Nodes are removed, connections are broken
The issue is those polling methods never returning true, unless they are retrieved synchronously (i.e. ft.get()), which is actually an accidental detail |
| Comment by Matt Ingenthron [ 24/Oct/12 ] |
|
We looked at this pretty closely today. The issue here is that the client as designed relies on the get() from the caller to trigger the timeout. An operation will, somewhat correctly, never transition to isDone() or isCancelled() unless someone cares to use it. The scenario that was likely in play over the WAN here is that the request was in flight to the server while the config was in flight down to the client. It arrives at the server, but is never responded to. Since the get() is never called, it'll never time out and transition to the canceled state. We recommend you change the test code to use the queue more like a queue and just get() each one. Iterating through the queue is a bit funny in the first place, but if using the get() on the Future objects, you'll still have asynchronous behavior and much of the time the get() will be returning since the data is already there. |
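The recommendation above (treat the queue as a queue and call get() on each future) could look roughly like the following sketch. It uses plain JDK futures as stand-ins for the client's futures; the class name, method names, and the 50 ms timeout are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration with JDK futures, not the Couchbase client API.
class DrainFuturesSketch {
    // Takes futures off the queue in order and waits on each with a timeout,
    // instead of polling isDone()/isCancelled(). Returns {succeeded, failed}.
    static int[] drain(Queue<Future<String>> pending, long timeoutMs) {
        int ok = 0;
        int failed = 0;
        Future<String> f;
        while ((f = pending.poll()) != null) {
            try {
                f.get(timeoutMs, TimeUnit.MILLISECONDS);
                ok++;
            } catch (TimeoutException | ExecutionException | CancellationException e) {
                f.cancel(true); // give up on this operation
                failed++;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                failed++;
                break; // stop draining once the thread is interrupted
            }
        }
        return new int[] {ok, failed};
    }

    // Two operations that already have results and one that never gets a response.
    static int[] demo() {
        Queue<Future<String>> pending = new ArrayDeque<>();
        pending.add(CompletableFuture.completedFuture("a"));
        pending.add(new CompletableFuture<String>());
        pending.add(CompletableFuture.completedFuture("b"));
        return drain(pending, 50);
    }
}
```

Futures whose data has already arrived return from get() immediately, so the submission side stays asynchronous while every operation is still guaranteed to reach a final state.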
| Comment by Matt Ingenthron [ 24/Oct/12 ] |
| This behavior should be better documented, both in the javadoc and in the API reference. |
[JCBC-65] Client constructor blocks or deadlocks Created: 14/Jun/12 Updated: 12/Jun/13 |
|
| Status: | Reopened |
| Project: | Couchbase Java Client |
| Component/s: | library |
| Affects Version/s: | 1.0.2 |
| Fix Version/s: | 1.1.8 |
| Security Level: | Public |
| Type: | Bug | Priority: | Major |
| Reporter: | Martin Scott | Assignee: | Michael Nitschinger |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
OS: Windows 7 64bit
JDK: 1.6.0_31 also 1.6.0_33 64 bit Couchbase enterprise edition running on 3 nodes all Ubuntu 10.04 64bit server (VMware images) |
||
| Attachments: |
|
| Description |
|
I am evaluating the couchbase product and hit a brick wall immediately when running through the simple hello world example.
I have a 3 node cluster running couchbase enterprise 1.8.2 on ubuntu 10.04 64 bit VMware images. All three are running in VMWare player instances on Windows 7 64bit. When I try to run the Main example on Windows 7 using Java6 (64 bit) the code blocks somewhere in the Client constructor. The result is the logging below.
2012-06-14 14:07:46.313 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.150:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:46.316 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.151:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:46.319 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.152:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 14:07:59.843 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@24a4e2e3
2012-06-14 14:08:52.983 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@21ec6696
2012-06-14 14:08:52.987 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@27431340
I have also tried debugging but the code blocks in the constructor at
client = new CouchbaseClient(uris, "default", "");
The program never completes.
This works fine in a Linux environment with the following output received
2012-06-14 04:58:50.693 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.150:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.703 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.151:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.708 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.186.152:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2012-06-14 04:58:50.830 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@1bc74f37
2012-06-14 04:58:50.834 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@3a21b220
2012-06-14 04:58:50.843 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@732b3d53
2012-06-14 04:58:51.135 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
Set Succeeded
Synchronous Get failed
Asynchronous Get Succeeded: Hello World!
Is there a JDK for windows 7 or a configuration setting that can be used to prevent this? |
| Comments |
| Comment by Raghavan Srinivas [ 14/Jun/12 ] |
|
Thanks for giving the Java client library a spin.
Were you able to connect to a single Windows 7 node? I suspect it might be a firewall/networking issue; can you check with the netstat command (or the appropriate command on Windows 7)? You may also want to follow the instructions noted in http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-bestpractice-cloud.html Changing IP addresses might be a cause for this. Finally, a more detailed log would be useful if the network troubleshooting does not help. Please refer to http://www.couchbase.com/wiki/display/couchbase/Couchbase+Java+Client+Library for logging tips. |
| Comment by Martin Scott [ 14/Jun/12 ] |
|
Apologies, that should read
Couchbase Version: 1.8.0 enterprise edition (build-55) |
| Comment by Martin Scott [ 14/Jun/12 ] |
| Detailed logging up to the point when the client hangs |
| Comment by Raghavan Srinivas [ 14/Jun/12 ] |
|
Thanks for the Log. I took a real quick look.
Were you able to follow the steps in http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-bestpractice-cloud.html and use the ip address that you are able to connect to (via the admin console)? |
| Comment by Alex Ma [ 14/Jun/12 ] |
|
Hi Martin,
Can you verify connectivity from the JDK on your windows box? This is what my connection code looks like:
// Connection details for Couchbase
List<URI> uris = new LinkedList<URI>();
uris.add(URI.create("http://10.4.2.3:8091/pools"));
CouchbaseClient client = null;
try {
    client = new CouchbaseClient(uris, "default", "");
} catch (Exception e) {
    System.err.println("except: connect: " + e.getMessage());
    System.exit(-1);
}
This will create a persistent connection to 8091 on 10.4.2.3 as well as connections to 11210 on every node in the cluster. ssh'ing to 10.4.2.3 and running netstat - you should see something like what's below:
netstat -nat|grep 10.32.3.50
tcp 0 0 10.4.2.3:11210 10.32.3.50:65437 ESTABLISHED
tcp 0 0 10.4.2.3:8091 10.32.3.50:65442 ESTABLISHED
tcp 0 304 ::ffff:10.4.2.3:22 ::ffff:10.32.3.50:65516 ESTABLISHED
Can you confirm this in your environment? Thanks -Alex. |
| Comment by Martin Scott [ 18/Jun/12 ] |
|
Hi, thanks for the responses.
Here is the netstat output from my Windows client
TCP 192.168.186.1:139 0.0.0.0:0 LISTENING InHost
TCP 192.168.186.1:51008 192.168.186.150:22 ESTABLISHED InHost
TCP 192.168.186.1:53281 192.168.186.150:8091 TIME_WAIT InHost
TCP 192.168.186.1:53284 192.168.186.150:11210 ESTABLISHED InHost
TCP 192.168.186.1:53285 192.168.186.151:11210 ESTABLISHED InHost
TCP 192.168.186.1:53286 192.168.186.152:11210 ESTABLISHED InHost
TCP 192.168.186.1:53292 192.168.186.150:8091 ESTABLISHED InHost
and from the first node in the cluster with the client and other nodes.
tcp 0 0 192.168.186.150:41317 192.168.186.150:11210 ESTABLISHED
tcp 0 0 192.168.186.150:35883 192.168.186.151:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.151:38013 ESTABLISHED
tcp 0 0 192.168.186.150:21100 192.168.186.152:46834 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.1:53284 ESTABLISHED
tcp 0 0 192.168.186.150:57559 192.168.186.151:22 TIME_WAIT
tcp 0 0 192.168.186.150:8091 192.168.186.1:53292 ESTABLISHED
tcp 0 48 192.168.186.150:22 192.168.186.1:51008 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.150:41317 ESTABLISHED
tcp 0 0 192.168.186.150:42433 192.168.186.152:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.150:56214 ESTABLISHED
tcp 0 0 192.168.186.150:56214 192.168.186.150:11210 ESTABLISHED
tcp 0 0 192.168.186.150:11210 192.168.186.152:39222 ESTABLISHED
tcp 0 0 192.168.186.150:21100 192.168.186.151:60945 ESTABLISHED
I downloaded the source jars from the maven repo and debugging shows the client hanging at the getLatch().await() line below. There doesn't appear to be any thread calling the countDown method on the latch before or after this is called.
private ChannelFuture getReceivedFuture() {
    try {
        getLatch().await();
    } catch (InterruptedException ex) {
        finerLog("Getting received future has been interrupted.");
    }
    return receivedFuture;
}
Martin. |
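To illustrate why the getLatch().await() call quoted above can hang forever: an unbounded await() only returns once some thread calls countDown(). The sketch below shows the bounded alternative from the JDK, `CountDownLatch.await(timeout, unit)`, which returns false when nobody released the latch in time. The class and method names are hypothetical, and this is not presented as the fix the client adopted.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not code from the Couchbase client.
class BoundedLatchWaitSketch {
    // Returns true if the latch was released within timeoutMs,
    // false on timeout or if the waiting thread was interrupted.
    static boolean awaitResponse(CountDownLatch latch, long timeoutMs) {
        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }
}
```

A caller that gets false back can log the condition or fail the constructor with an exception instead of blocking indefinitely.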
| Comment by Martin Scott [ 18/Jun/12 ] |
|
The stack trace where the client blocks
Thread [main] (Stepping)
BucketUpdateResponseHandler.getReceivedFuture() line: 147
BucketUpdateResponseHandler.getLastResponse() line: 127
BucketMonitor.startMonitor() line: 183
ConfigurationProviderHTTP.subscribe(String, Reconfigurable) line: 243
CouchbaseClient.<init>(CouchbaseConnectionFactory, boolean) line: 158
CouchbaseClient.<init>(CouchbaseConnectionFactory) line: 125
CouchbaseClient.<init>(List<URI>, String, String) line: 77
Main.main(String[]) line: 67 |
| Comment by sean diamond [ 23/Jan/13 ] |
|
I am having this exact same problem. Using windows 7 64 bit trying to connect to ubuntu.
I am using a 32 bit OS on Linux and Couchbase Server 2.0. I am also using the latest Java client, version 1.1. Same issue as described below. The only workaround is to not use Windows: if my Java client is running on Linux it works with no issues; it just deadlocks on the Windows machine. |
| Comment by Tug Grall [ 17/Apr/13 ] |
|
I am reopening the issue as we see this error again in some environments:
- Yuval - http://www.couchbase.com/issues/browse/JCBC-65 ... Let me know if you prefer me to create a new issue for 1.1.x |
| Comment by Michael Nitschinger [ 29/May/13 ] |
| Getting it onto the bugfix release train, although I'm not sure if we get it into 1.1.7 |
[JCBC-11] Need more unit tests for couchbase-client Created: 03/Feb/12 Updated: 12/Jun/13 |
|
| Status: | Reopened |
| Project: | Couchbase Java Client |
| Component/s: | library |
| Affects Version/s: | 1.0.1 |
| Fix Version/s: | 1.1.8 |
| Security Level: | Public |
| Type: | Improvement | Priority: | Major |
| Reporter: | Raghavan Srinivas | Assignee: | Michael Nitschinger |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
CouchbaseConnectionFactoryBuilder cfb = new CouchbaseConnectionFactoryBuilder();
cfb.setOpTimeout(10000); // wait up to 10 seconds for an operation to succeed
cfb.setOpQueueMaxBlockTime(5000); // wait up to 5 seconds when trying to enqueue an operation
This configuration, for example, will fill up the funnel (the blocking queue) to the rim, but not overflow (immediately time out). Once it's at the rim, it'll have to wait until at least one operation flows out to add another operation. This *will* slow down the callers (their async calls will actually block on this internal queue), but that's okay in a bulk loader. However, there are no unit tests in couchbase-client to test this. |
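The funnel behavior described above can be sketched with a plain JDK `ArrayBlockingQueue`, which may also be a starting point for the kind of unit test this issue asks for. That the client's internal queue behaves exactly like this is an assumption; the capacity of 2, the 50 ms block time, and all names here are made up for the illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the client's operation queue.
class BoundedEnqueueSketch {
    // Returns {first, second, overflow, afterDrain} enqueue results.
    static boolean[] demo() {
        BlockingQueue<String> funnel = new ArrayBlockingQueue<>(2);
        try {
            // The first two operations fit, filling the funnel to the rim.
            boolean first = funnel.offer("op-1", 50, TimeUnit.MILLISECONDS);
            boolean second = funnel.offer("op-2", 50, TimeUnit.MILLISECONDS);
            // The queue is full: the caller blocks for the max block time, then gives up.
            boolean overflow = funnel.offer("op-3", 50, TimeUnit.MILLISECONDS);
            // One operation flows out, so the next enqueue succeeds again.
            funnel.poll();
            boolean afterDrain = funnel.offer("op-3", 50, TimeUnit.MILLISECONDS);
            return new boolean[] {first, second, overflow, afterDrain};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new boolean[0];
        }
    }
}
```

A real test for couchbase-client would exercise the client's own queue through the builder settings shown above rather than this stand-in.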
[JCBC-18] NPE if hostnames in server bootstrap list are mixed case Created: 12/Mar/12 Updated: 12/Jun/13 |
|
| Status: | Reopened |
| Project: | Couchbase Java Client |
| Component/s: | library |
| Affects Version/s: | 1.1-dp4 |
| Fix Version/s: | 1.1.8 |
| Security Level: | Public |
| Type: | Bug | Priority: | Major |
| Reporter: | Matt Ingenthron | Assignee: | Matt Ingenthron |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
A user described a scenario where using mixed case in their URIs led to an NPE. This is from the map lookup, since what the couchbase cluster sends us is different than what the user entered, I think.
See: http://www.couchbase.com/forums/thread/java-client-101-exception-using-couchbaseclient-servlet-filter |
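The suspected failure mode can be reproduced with a plain `HashMap`. This is a sketch of the general problem only; the host names, the map contents, and the idea of lower-casing both sides are assumptions, not the client's actual code or its eventual fix.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of a case-sensitive host lookup.
class HostLookupSketch {
    // Normalizes a host name so lookups do not depend on letter case.
    static String normalize(String host) {
        return host.toLowerCase(Locale.ROOT);
    }

    // Returns {rawLookupFound, normalizedLookupFound}.
    static boolean[] demo() {
        // Keys as the cluster might report them (all lower case here).
        Map<String, Integer> nodeIndex = new HashMap<>();
        nodeIndex.put(normalize("node1.example.com"), 0);

        // Host name as the user typed it in the bootstrap list.
        String userHost = "Node1.Example.COM";

        // Raw lookup misses and yields null; dereferencing or unboxing it would throw an NPE.
        Integer raw = nodeIndex.get(userHost);
        // Normalized lookup finds the entry.
        Integer normalized = nodeIndex.get(normalize(userHost));
        return new boolean[] {raw != null, normalized != null};
    }
}
```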
[JCBC-61] Expose returned CAS value in CASResponse when available from binary protocol Created: 06/Jun/12 Updated: 12/Jun/13 |
|
| Status: | Reopened |
| Project: | Couchbase Java Client |
| Component/s: | library |
| Affects Version/s: | 1.1dp |
| Fix Version/s: | 1.1.8 |
| Security Level: | Public |
| Type: | Improvement | Priority: | Major |
| Reporter: | Perry Krug | Assignee: | Michael Nitschinger |
| Resolution: | Unresolved | Votes: | 2 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
Customer request to add the capability to retrieve new cas value after a cas() operation to avoid a subsequent gets()
|
| Comments |
| Comment by Michael Nitschinger [ 29/Nov/12 ] |
| This may already be implemented, need to check. |
[JCBC-15] add showtype-options to documentation Created: 23/Feb/12 Updated: 12/Jun/13 |
|
| Status: | Reopened |
| Project: | Couchbase Java Client |
| Component/s: | docs |
| Affects Version/s: | None |
| Fix Version/s: | 1.1.8 |
| Security Level: | Public |
| Type: | Improvement | Priority: | Major |
| Reporter: | Matt Ingenthron | Assignee: | Michael Nitschinger |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
Many of the Java docs should show the type in the options summary.
|
| Comments |
| Comment by Michael Nitschinger [ 15/Nov/12 ] |
| Can you give me a quick example on what you mean? Reassign it back to me then and I'll fix it! |
| Comment by Matt Ingenthron [ 15/Nov/12 ] |
|
If you look at the docs, there are many places where we have types that are returned, but we don't sufficiently describe those types. For example:
http://www.couchbase.com/docs/couchbase-sdk-java-1.1/couchbase-sdk-java-set-add.html#table-couchbase-sdk_java_add It mentions the OperationFuture, but nowhere really tells how to use it (to my knowledge). You should be able to work with MC on the right way to fix these. |