Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- Affects Version/s: 1.0, 1.1.1
- Fix Version/s: 1.1.1
- Component/s: library
- Security Level: Public
- Labels: None
- Environment: Got this during a swap-rebalance. I don't have any other relevant information to reproduce it for now (FWIW, I've run this many times and this is the first time I'm seeing it, though I've just seen it again on the very next run). During the second run, the cluster rebalance is actually hanging.
Description
Exception in thread "couchbase cluster resubscriber - running" java.lang.IllegalArgumentException: Bucket name cannot be null and must never be re-set to a new object.
at com.couchbase.client.vbucket.ConfigurationProviderHTTP.subscribe(ConfigurationProviderHTTP.java:240)
at com.couchbase.client.vbucket.ConfigurationProviderHTTP.finishResubscribe(ConfigurationProviderHTTP.java:215)
at com.couchbase.client.CouchbaseConnectionFactory$Resubscriber.run(CouchbaseConnectionFactory.java:322)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:679)
Unfortunately, I don't have much more insight into what's happening, but the stack trace might be helpful to examine. Assigning to myself until I have more info.
-
- log2.txt.bz2
- 21/Jan/13 12:48 PM
- 54 kB
- Mark Nunberg
- junit.zip
- 22/Jan/13 6:36 AM
- 354 kB
- Deepti Dawar
-
- junit/all-tests.html 44 kB
- junit/allclasses-frame.html 4 kB
- junit/alltests-errors.html 4 kB
- junit/alltests-fails.html 4 kB
- junit/com/.../0_ClusterManagerTest.html 39 kB
- junit/com/.../10_ViewNodeTest-err.html 0.5 kB
- junit/com/.../client/10_ViewNodeTest.html 38 kB
- junit/com/.../client/11_ViewTest-err.html 39 kB
- junit/com/.../client/11_ViewTest-errors.html 39 kB
- junit/com/.../client/11_ViewTest-fails.html 39 kB
- junit/com/.../client/11_ViewTest-out.html 0.3 kB
- junit/com/.../client/11_ViewTest.html 44 kB
- junit/.../1_CouchbaseClientMemcachedBucketTest-err.html 0.9 kB
- junit/.../1_CouchbaseClientMemcachedBucketTest.html 38 kB
- junit/.../2_CouchbaseClientTest-errors.html 37 kB
- junit/com/.../2_CouchbaseClientTest.html 37 kB
- junit/.../3_CouchbaseConnectionFactoryBuilderTest-err.html 0.5 kB
- junit/.../3_CouchbaseConnectionFactoryBuilderTest.html 39 kB
- junit/com/.../client/4_FlushTest-err.html 2 kB
- junit/com/.../client/4_FlushTest.html 38 kB
- junit/com/.../5_PaginatorTest-err.html 5 kB
- junit/com/.../5_PaginatorTest-errors.html 40 kB
- junit/com/.../5_PaginatorTest-out.html 0.3 kB
- junit/com/.../client/5_PaginatorTest.html 40 kB
- junit/com/.../6_SpatialViewTest-err.html 5 kB
- junit/com/.../6_SpatialViewTest-fails.html 39 kB
- junit/com/.../6_SpatialViewTest-out.html 0.3 kB
- junit/com/.../client/6_SpatialViewTest.html 39 kB
- junit/com/.../client/7_TapTest-err.html 5 kB
- junit/com/.../client/7_TapTest.html 38 kB
-
- daschl-4-restart.log
- 22/Jan/13 2:19 AM
- 255 kB
- Michael Nitschinger
-
- daschl-4-rebealance_two_nodes.log
- 22/Jan/13 2:31 AM
- 1.74 MB
- Michael Nitschinger
Activity
Mark Nunberg
added a comment -
I've run the Java SDKD tests several times already and cannot reproduce this. I will reopen it if I see it again.
Mark Nunberg
added a comment -
Seen again at:
http://review.couchbase.org/#/c/24092/
Mark Nunberg
added a comment -
As I mentioned in the bug, I closed it because I had not seen this error again. Just now, both Michael and I encountered it while running the SDKD tests.
Michael Nitschinger
added a comment -
Attaching the logs for http://review.couchbase.org/#/c/24092 changeset 4 (prefixed with daschl-4-) on some test runs.
Deepti is currently running the changeset against the brun cluster, expect some info in 2 hours.
Michael Nitschinger
added a comment -
./stester -i 20devcluster.ini --service ALL --svcaction RESTART --num_nodes 3 --no_fo 1 -c failover.Once --dsw_timeres 1 -d -o restart.log -C 127.0.0.1:8050
Michael Nitschinger
added a comment -
./stester -i 20devcluster.ini -c rebalance.Once --mode out --rbcount 2 --dsw_timeres 1 -d -o rebealance_two_nodes.log -C 127.0.0.1:8050
Deepti Dawar
added a comment -
Attaching the functional test results.
This was run against a local 2.0.0 node.
Pass rate is better this time - 92%.
Deepti Dawar
added a comment -
For the Hybrid tests - failures are still coming.
Attaching the intermittent log.
Deepti Dawar
added a comment - edited
The error that seems to be problematic in the unit test logs is this one:
'Timeout occurred. Please note the time in the report does not reflect the time until the timeout.
junit.framework.AssertionFailedError: Timeout occurred. Please note the time in the report does not reflect the time until the timeout.'
Most of the issues are due to timeouts.
Note that these tests were run against a local cluster, so such problems should not be occurring.
Michael Nitschinger
added a comment -
Merged in today, right before the 1.1.1 release.
Can you elaborate a bit more on what's going on during the test? This particular exception can come up when the Java SDK tries to subscribe to a new node when the old connection is closed. I think the scenario should give us a connection to what is happening at runtime in the Java SDK.
Thanks,
Michael
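
For readers following the thread, here is a minimal sketch of the kind of guard that could produce the exception in the stack trace when a resubscribe arrives with a missing or changed bucket name. It is an illustration under stated assumptions, not the SDK's actual code: `ResubscribeSketch` and `SubscriptionGuard` are hypothetical names, and only the exception message is copied from the trace in the description.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical illustration only; this is NOT ConfigurationProviderHTTP. */
final class ResubscribeSketch {

    // Message text copied from the stack trace in this ticket.
    static final String MESSAGE =
        "Bucket name cannot be null and must never be re-set to a new object.";

    /** Remembers the first bucket name and rejects null or a different name later. */
    static final class SubscriptionGuard {
        private final AtomicReference<String> bucket = new AtomicReference<>();

        void subscribe(String bucketName) {
            if (bucketName == null) {
                throw new IllegalArgumentException(MESSAGE);
            }
            // The first caller sets the name; later calls leave it unchanged.
            bucket.compareAndSet(null, bucketName);
            if (!bucket.get().equals(bucketName)) {
                throw new IllegalArgumentException(MESSAGE);
            }
        }
    }

    public static void main(String[] args) {
        SubscriptionGuard guard = new SubscriptionGuard();
        guard.subscribe("default"); // initial subscribe
        guard.subscribe("default"); // resubscribe with the same name is accepted
        try {
            // Assumed failure scenario: the name is lost while the old connection closes.
            guard.subscribe(null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If a guard like this throws inside a background resubscriber thread and nothing catches it, the thread ends without resubscribing. That would be consistent with the hanging rebalance reported above, though the ticket does not confirm that this is the cause.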