[JCBC-134] resubscriber IllegalArgumentException during topology changes Created: 19/Oct/12 Updated: 23/Jan/13 Resolved: 23/Jan/13 |
|
| Status: | Resolved |
| Project: | Couchbase Java Client |
| Component/s: | library |
| Affects Version/s: | 1.0, 1.1.1 |
| Fix Version/s: | 1.1.1 |
| Security Level: | Public |
| Type: | Bug | Priority: | Critical |
| Reporter: | Mark Nunberg | Assignee: | Michael Nitschinger |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Got this during a swap-rebalance. I don't have any other information relevant to reproducing it for now (FWIW, I've run this many times and this was the first time I saw it, though I then saw it again on the very next run).
During the second run, the cluster rebalance is actually hanging. |
||
| Attachments: |
|
||||
| Issue Links: |
|
||||
| Description |
|
Exception in thread "couchbase cluster resubscriber - running" java.lang.IllegalArgumentException: Bucket name cannot be null and must never be re-set to a new object.
    at com.couchbase.client.vbucket.ConfigurationProviderHTTP.subscribe(ConfigurationProviderHTTP.java:240)
    at com.couchbase.client.vbucket.ConfigurationProviderHTTP.finishResubscribe(ConfigurationProviderHTTP.java:215)
    at com.couchbase.client.CouchbaseConnectionFactory$Resubscriber.run(CouchbaseConnectionFactory.java:322)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Unfortunately I don't have a whole lot more insight into what's happening, but the stack trace might be helpful to examine.. assigning to myself until I have more info.. |
| Comments |
| Comment by Michael Nitschinger [ 08/Nov/12 ] |
|
Hi Mark,
can you elaborate a bit more on what is going on during the test? This particular exception can come up when the Java SDK tries to subscribe to a new node after the old connection has been closed. Knowing the scenario should help us connect it to what is happening at runtime in the Java SDK. Thanks, Michael |
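To illustrate the failure mode described in this comment, here is a minimal sketch of a subscribe guard that rejects a null or changed bucket name. It is not the SDK's code: `ResubscribeSketch`, `SketchProvider`, and `tryResubscribe` are hypothetical names, and the guard condition is an assumption inferred only from the exception message in the stack trace above.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ResubscribeSketch {

    /** Hypothetical stand-in for a configuration provider; NOT the SDK's class. */
    static class SketchProvider {
        // Bucket name remembered from the first successful subscription.
        private final AtomicReference<String> subscribedBucket = new AtomicReference<>();

        void subscribe(String bucketName) {
            String current = subscribedBucket.get();
            // Assumed guard: reject null, and reject a name that differs
            // from the one already subscribed.
            if (bucketName == null || (current != null && !current.equals(bucketName))) {
                throw new IllegalArgumentException(
                    "Bucket name cannot be null and must never be re-set to a new object.");
            }
            subscribedBucket.set(bucketName);
        }
    }

    /** Returns true if the subscription was accepted, false if the guard rejected it. */
    static boolean tryResubscribe(SketchProvider provider, String bucketName) {
        try {
            provider.subscribe(bucketName);
            return true;
        } catch (IllegalArgumentException e) {
            // Catching here keeps the calling thread alive instead of letting
            // the exception escape, as it does in the reported stack trace.
            System.err.println("resubscribe rejected: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        SketchProvider provider = new SketchProvider();
        System.out.println(tryResubscribe(provider, "default")); // accepted
        // Simulates a resubscribe where the bucket name was lost,
        // for example because the old connection was already closed.
        System.out.println(tryResubscribe(provider, null));      // rejected
    }
}
```

In this sketch the rejected call returns `false` rather than killing the thread. Whether the actual fix takes that approach is not stated in this ticket.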
| Comment by Mark Nunberg [ 12/Dec/12 ] |
| I've run the Java SDKD tests several times already and cannot reproduce this. Will re-open it if I see it again. |
| Comment by Mark Nunberg [ 21/Jan/13 ] |
|
Seen again at: http://review.couchbase.org/#/c/24092/ |
| Comment by Mark Nunberg [ 21/Jan/13 ] |
| As I mentioned in the bug, I closed it because I hadn't seen this error again. Just now, both Michael and I encountered it while running the SDKD tests. |
| Comment by Michael Nitschinger [ 22/Jan/13 ] |
|
Attaching the logs from some test runs of http://review.couchbase.org/#/c/24092 changeset 4 (prefixed with daschl-4-).
Deepti is currently running the changeset against the brun cluster, expect some info in 2 hours. |
| Comment by Michael Nitschinger [ 22/Jan/13 ] |
| ./stester -i 20devcluster.ini --service ALL --svcaction RESTART --num_nodes 3 --no_fo 1 -c failover.Once --dsw_timeres 1 -d -o restart.log -C 127.0.0.1:8050 |
| Comment by Michael Nitschinger [ 22/Jan/13 ] |
| ./stester -i 20devcluster.ini -c rebalance.Once --mode out --rbcount 2 --dsw_timeres 1 -d -o rebealance_two_nodes.log -C 127.0.0.1:8050 |
| Comment by Deepti Dawar [ 22/Jan/13 ] |
|
Attaching the functional test results. This was run against a local 2.0.0 node. Pass rate is better this time - 92%. |
| Comment by Deepti Dawar [ 22/Jan/13 ] |
|
For the Hybrid tests, failures are still occurring.
Attaching the intermittent log. |
| Comment by Deepti Dawar [ 22/Jan/13 ] |
|
The error that seems to be problematic in the unit test logs is this one:
'Timeout occurred. Please note the time in the report does not reflect the time until the timeout. junit.framework.AssertionFailedError: Timeout occurred. Please note the time in the report does not reflect the time until the timeout.' Most of the failures are due to timeouts. Note that these tests were run against a local cluster, so such problems should not be occurring. |
| Comment by Michael Nitschinger [ 23/Jan/13 ] |
| Merged in today, right before the 1.1.1 release. |