[JCBC-142] Observe Tests show that something is wrong in the observe impl Created: 08/Nov/12  Updated: 03/Dec/12  Resolved: 27/Nov/12

Status: Closed
Project: Couchbase Java Client
Component/s: Core
Affects Version/s: 1.1-dp4
Fix Version/s: 1.1-beta
Security Level: Public

Type: Bug Priority: Blocker
Reporter: Michael Nitschinger Assignee: Mike Wiederhold
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File observe-test.php    

 Description   
The newly added observe view tests show that the full result set is sometimes returned and sometimes not. This correlates strongly with the number of sets done in a given timeframe, so I suspect the current observe implementation has a bug somewhere.

Also, the observe test inside the CouchbaseClient sometimes fails, which may be related to the same issue.

 Comments   
Comment by Michael Nitschinger [ 09/Nov/12 ]
This is the test to reproduce it: https://github.com/couchbase/couchbase-java-client/commit/df5b6a53bbd61ca8daf64d56919e79d81870355e
(you may have to increase the number of sets to make sure the disk queue takes some time to get flushed)
Comment by Michael Nitschinger [ 09/Nov/12 ]
Please reproduce with another client and assign it back to me if it appears to be a client library issue.

thanks!
Comment by Mark Nunberg [ 09/Nov/12 ]
I've tried to replicate this in PHP, but without success.

It sounds like the observe operation is failing and therefore the view is returning bad results. I'm not familiar with the Java API, but the set+persist wouldn't throw an exception if it fails - you'd need to check the future.getStatus().isSuccess() or something.
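
A minimal sketch of the status check described above, in Java. It is self-contained and does not use the client library: `OpStatus` is a hypothetical stand-in for whatever `future.getStatus()` returns, and `failedKeys` is an illustrative helper, not part of the client. The point is only that a set+persist which fails quietly has to be detected by inspecting each status, since no exception is raised.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ObserveStatusCheck {
    // Hypothetical stand-in for the status object of a set+persist future.
    static final class OpStatus {
        private final boolean success;
        private final String message;

        OpStatus(boolean success, String message) {
            this.success = success;
            this.message = message;
        }

        boolean isSuccess() { return success; }
        String getMessage() { return message; }
    }

    // Returns the keys whose set+persist did not report success.
    static List<String> failedKeys(Map<String, OpStatus> results) {
        List<String> failed = new ArrayList<String>();
        for (Map.Entry<String, OpStatus> entry : results.entrySet()) {
            if (!entry.getValue().isSuccess()) {
                failed.add(entry.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Simulated outcomes; a real test would collect one status per set.
        Map<String, OpStatus> results = new LinkedHashMap<String, OpStatus>();
        results.put("key1", new OpStatus(true, "OK"));
        results.put("key2", new OpStatus(false, "observe timed out"));

        List<String> failed = failedKeys(results);
        if (!failed.isEmpty()) {
            System.out.println("Not persisted, view may be incomplete: " + failed);
        }
    }
}
```

A test built this way can fail early with the list of unpersisted keys instead of failing later on a short view result.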

I'm pasting a very ad-hoc test that I've written for PHP (more comments will be inline with that)
Comment by Mark Nunberg [ 09/Nov/12 ]
Update:

I've re-run the tests with a two node cluster. I see similar behavior. This looks like a server bug.
Comment by Mark Nunberg [ 09/Nov/12 ]
Michael, can you modify the test code to check for observe exceptions (and, in general, make it behave more like the PHP code)?

This way we can get confirmation from both clients and file a server bug.

Failed observe does not throw an exception, as per

https://github.com/couchbase/couchbase-java-client/blob/df5b6a53bbd61ca8daf64d56919e79d81870355e/src/main/java/com/couchbase/client/CouchbaseClient.java#L909

(From the same revision linked to in the description).
Comment by Mark Nunberg [ 09/Nov/12 ]
I've actually revised the tests to work the way I needed them to.. (for some reason the observe in java is slower than I had hoped for, so I ended up making my own threaded contraption to solve this...) -- might this be a separate bug?

anyway.. I've observed duplicate behavior:

Basically, much of the time the stale test fails, returning *exactly* half of the keys in the view.

Maybe my cluster config is funky, but this is doubtful..

Anyway, we'll file a cluster bug with 100% certainty that this is not a client issue.

btw, I'd actually advocate keeping the threaded contraption there (in the commit I accidentally saved it as a single worker, might want to bump it up)..

Placing load on the server (i.e. by using multiple setter threads) seems to highlight this issue.. and I have a feeling it's a lag/race condition sort of thing.

https://github.com/mnunberg/couchbase-java-client/commit/3d788ab9d3a88c1dc20717c4dd110e3a8bb5f5bc
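
For illustration, the multi-threaded setter idea mentioned above could look roughly like the following. This is not the code from the linked commit. `PersistingStore` and `runLoad` are hypothetical names, the in-memory map stands in for the cluster, and the worker and key counts are arbitrary. The sketch only shows the shape: several workers issue sets concurrently, and the number of sets that reported success is counted so it can be compared with the row count the view returns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ThreadedSetterSketch {
    // Hypothetical stand-in for "set this key and wait until it is persisted".
    interface PersistingStore {
        boolean setAndPersist(String key, String value);
    }

    // Issues `total` sets across `workers` threads; returns how many reported success.
    static int runLoad(PersistingStore store, int workers, int total) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Boolean>> pending = new ArrayList<Future<Boolean>>();
            for (int i = 0; i < total; i++) {
                final String key = "key-" + i;
                pending.add(pool.submit(() -> store.setAndPersist(key, "value")));
            }
            int ok = 0;
            for (Future<Boolean> f : pending) {
                try {
                    if (f.get(30, TimeUnit.SECONDS)) {
                        ok++;
                    }
                } catch (ExecutionException | TimeoutException e) {
                    // A set that threw or timed out counts as not persisted.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            return ok;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for the cluster; every set "persists" immediately.
        Map<String, String> stored = new ConcurrentHashMap<String, String>();
        int ok = runLoad((k, v) -> { stored.put(k, v); return true; }, 4, 500);
        System.out.println("successful sets: " + ok + ", stored keys: " + stored.size());
    }
}
```

Comparing the success count with the number of rows returned by the view separates sets that never persisted from rows that are missing despite a successful persist.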
Comment by Mark Nunberg [ 09/Nov/12 ]
java.lang.AssertionError: expected:<500> but was:<180>
at org.junit.Assert.fail(Assert.java:91)
at org.junit.Assert.failNotEquals(Assert.java:645)
at org.junit.Assert.assertEquals(Assert.java:126)
at org.junit.Assert.assertEquals(Assert.java:470)
at org.junit.Assert.assertEquals(Assert.java:454)
at com.couchbase.client.ViewTest.testObserveWithStaleFalse(ViewTest.java:839)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
Comment by Matt Ingenthron [ 13/Nov/12 ]
Note that this one is being worked by the server team, since it looks like a server issue. No action needed here at the moment.
Comment by Michael Nitschinger [ 14/Nov/12 ]
Is there a ticket we can link to?
Comment by Matt Ingenthron [ 19/Nov/12 ]
Mike had been in here earlier today and thinks he knows where the issue is, so passing assignment to him.
Comment by Michael Nitschinger [ 21/Nov/12 ]
Fixed and pushed to master, will be available in dp5!
Comment by Michael Nitschinger [ 21/Nov/12 ]
Looks like this is still not solved; from time to time the test still shows missing documents!
Comment by Michael Nitschinger [ 27/Nov/12 ]
Test case was flawed, now fixed and pushed.