Java SDK v3 deficiencies

This thread / stream of consciousness has been super helpful…

Current issue…

SELECT * from BUCKET use keys [KEY] where str <> $condition
behaves very differently from:
SELECT data.* from BUCKET data use keys [KEY] where str <> $condition

The first case won’t deserialize because each row in the response is wrapped in an object keyed by the bucket name:

com.couchbase.client.core.error.DecodingFailureException: Deserialization of content into target class com.xxxx.couch.CouchbaseN1qlHelperTest$TestEntity failed; encoded = {"data":{"str":"CouchbaseN1qlHelperTestretrieveList1","type":null}}

The second case deserializes to a list of entities easily for us.

@ingenthr @daschl @david.nault ^ I don’t know if you guys get notified so… I’ll be annoying. :)

I think the behavior there is expected. Maybe there’s something about the issue I’m not picking up.

N1QL has specific functionality to do this ‘unwrapping’ in the projection, but in the generic case it will wrap the projected result. This shows some of the discontinuity between how N1QL treats buckets (it calls them a keyspace) and how the rest of the system treats buckets (as a unit of resource allocation: memory, filesystem). Good news is there is work underway (collections) that should make this better. I know it’s not necessarily intuitive that you need to do a SELECT data.*, but that is part of how N1QL works so well with flexible schema data.

If there’s something more that’s an issue and I’m not picking up on it, please set me on track.

@unhuman that is totally expected. Our Jackson decoder can only take a JSON blob and decode it into your entity; there is no “magic” behind the scenes. If your JSON blob does not match up with the entity, it can’t work. You need to tell N1QL to unwrap it if you are doing a SELECT *. If you select individual fields, N1QL will return them directly.
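To make the difference concrete, here is a small sketch of the two row shapes using plain `java.util.Map`s, with no SDK calls involved. `RowShapes` and `unwrap` are made-up names for illustration only, and the key `data` is taken from the encoded row in the exception above. In practice you would let N1QL do the unwrapping with `SELECT data.*` rather than doing it client-side.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: models the JSON rows as Maps, it does not talk to Couchbase.
class RowShapes {

    // Turns a row shaped like {"<alias>": {...fields...}} into {...fields...}.
    @SuppressWarnings("unchecked")
    static Map<String, Object> unwrap(Map<String, Object> row, String alias) {
        Object inner = row.get(alias);
        if (!(inner instanceof Map)) {
            throw new IllegalArgumentException("row is not wrapped in '" + alias + "'");
        }
        return (Map<String, Object>) inner;
    }

    public static void main(String[] args) {
        // The entity fields, as they appear in the failing row above.
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("str", "CouchbaseN1qlHelperTestretrieveList1");
        fields.put("type", null);

        // Shape of one row from SELECT *: the fields sit under the keyspace name.
        Map<String, Object> wrapped = new LinkedHashMap<>();
        wrapped.put("data", fields);

        // Shape of one row from SELECT data.*: the fields are at the top level.
        Map<String, Object> flat = unwrap(wrapped, "data");

        System.out.println(wrapped.keySet()); // prints [data]
        System.out.println(flat.keySet());    // prints [str, type]
    }
}
```

A decoder that maps top-level keys to entity fields finds `str` and `type` only in the second shape, which is why the first query fails to deserialize.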

@daschl @ingenthr @graham.pople @david.nault @Richard_Smedley

Would it be possible to make all the *Options classes cloneable? For my use case, let’s say devs want to define default behavior once and then update a CAS or some other value in the Options per call. Since there’s no way to copy the Options, we’d have to do all that work ourselves and could miss changes when the SDK is updated.

@unhuman that’s a good idea. One approach would be to make the builders immutable so that they always return a new structure. This would be the safest, but it would also generate more garbage. The other option would be to have an explicit copy method, but that’s a little harder to discover. I think we need to do some benchmarking and see how intelligent the JIT is in optimizing the immutable builder away.
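For illustration, a rough sketch of what the immutable variant could look like. `WriteSettings`, `withCas` and `withTimeout` are hypothetical names invented here, not SDK classes or methods, and the 2500 ms default is just the `kvMs` value that shows up in the logs later in this thread.

```java
import java.time.Duration;

// Hypothetical options class: every "setter" returns a new instance,
// so a shared default can never be changed by one caller.
final class WriteSettings {
    private final long cas;
    private final Duration timeout;

    private WriteSettings(long cas, Duration timeout) {
        this.cas = cas;
        this.timeout = timeout;
    }

    static WriteSettings defaults() {
        return new WriteSettings(0L, Duration.ofMillis(2500));
    }

    // Copies all fields and replaces only the CAS.
    WriteSettings withCas(long newCas) {
        return new WriteSettings(newCas, this.timeout);
    }

    // Copies all fields and replaces only the timeout.
    WriteSettings withTimeout(Duration newTimeout) {
        return new WriteSettings(this.cas, newTimeout);
    }

    long cas() {
        return cas;
    }

    Duration timeout() {
        return timeout;
    }

    public static void main(String[] args) {
        // Team-wide default, defined once.
        WriteSettings shared = WriteSettings.defaults().withTimeout(Duration.ofSeconds(5));

        // Per-call variation; 'shared' is left untouched.
        WriteSettings perCall = shared.withCas(42L);

        System.out.println(shared.cas());      // prints 0
        System.out.println(perCall.cas());     // prints 42
        System.out.println(perCall.timeout()); // prints PT5S
    }
}
```

The cost is one short-lived object per `with...` call, which is the extra garbage mentioned above. An explicit copy method would avoid that, at the price of mutable instances that can still be shared by accident.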

We use https://immutables.github.io :)

@daschl
I think the last bit I have that’s blocking us is that all of our regression tests depend on testcontainers.

I found this:

Any idea when that’ll go in? Once that’s there, I can start trying to push 3.0.x through the organization.

Much appreciated!

@unhuman unfortunately that’s out of my/our hands, I think the current thought is “next release”, but I don’t know when it will happen. I hope soon!

@daschl It went out today! Looks like the sample / test here: https://github.com/daschl/testcontainers-java/commit/059d9b5807fb0a9d5f784e44d4defeb7b6f03f06 is still set up for the old client. Do you have any docs / can you update: https://www.testcontainers.org/modules/databases/couchbase/ ?

EDIT: Oohhh… there’s nothing special anymore! You just use the client to set up all the other bits. Fantastic!

@daschl

Having some problems staying connected. I see this message from Couchbase:

21:35:11.506 [cb-events] INFO  com.couchbase.core - [com.couchbase.core][CoreCreatedEvent] {"clientVersion":"3.0.3","clientGitHash":"e55f7d43","coreVersion":"2.0.4","coreGitHash":"e55f7d43","userAgent":"couchbase-java/3.0.3 (Mac OS X 10.15.4 x86_64; OpenJDK 64-Bit Server VM 11.0.6+10)","maxNumRequestsInRetry":32768,"ioEnvironment":{"nativeIoEnabled":true,"eventLoopThreadCount":8,"eventLoopGroups":["KQueueEventLoopGroup"]},"ioConfig":{"captureTraffic":[],"mutationTokensEnabled":true,"networkResolution":"auto","dnsSrvEnabled":true,"tcpKeepAlivesEnabled":true,"tcpKeepAliveTimeMs":60000,"configPollIntervalMs":2500,"kvCircuitBreakerConfig":"disabled","queryCircuitBreakerConfig":"disabled","viewCircuitBreakerConfig":"disabled","searchCircuitBreakerConfig":"disabled","analyticsCircuitBreakerConfig":"disabled","managerCircuitBreakerConfig":"disabled","numKvConnections":1,"maxHttpConnections":12,"idleHttpConnectionTimeoutMs":30000,"configIdleRedialTimeoutMs":300000},"compressionConfig":{"enabled":true,"minRatio":0.83,"minSize":32},"securityConfig":{"tlsEnabled":false,"nativeTlsEnabled":true,"hasTrustCertificates":false,"trustManagerFactory":null},"timeoutConfig":{"kvMs":30000000,"kvDurableMs":10000,"managementMs":75000,"queryMs":10000000,"viewMs":75000,"searchMs":75000,"analyticsMs":75000,"connectMs":10000000,"disconnectMs":10000},"loggerConfig":{"customLogger":null,"fallbackToConsole":false,"disableSlf4j":false,"loggerName":"CouchbaseLogger","diagnosticContextEnabled":false},"orphanReporterConfig":{"emitIntervalMs":10000,"sampleSize":10,"queueLength":1024},"retryStrategy":"BestEffortRetryStrategy","requestTracer":"OwnedSupplier"} {"alternateIdentifier":"external","coreId":"0x75189e0100000003"}
21:35:11.507 [cb-events] INFO  com.couchbase.node - [com.couchbase.node][NodeConnectedEvent] Node connected {"coreId":"0x75189e0100000003","managerPort":"33638","remote":"localhost"}
21:35:11.508 [cb-events] INFO  com.couchbase.node - [com.couchbase.node][NodeConnectedEvent] Node connected {"alternateRemote":"localhost","coreId":"0x75189e0100000003","managerPort":"8091","remote":"172.17.0.3"}
21:35:11.508 [cb-events] INFO  com.couchbase.node - [com.couchbase.node][NodeDisconnectedEvent][404us] Node disconnected {"coreId":"0x75189e0100000003","managerPort":"33638","remote":"localhost"}

Looks like the client works for a bit and then starts failing. I suspect it’s because the node thinks it’s now running on 8091 and not 33638. It also seems to swap the remote from 172.17.0.3 to localhost.

The testcontainer Couchbase UI, of course, remains responsive on 33638.

Or, is this me?

@unhuman is this with the 3.0.3 release?

You should be able to connect with something like

Cluster cluster = Cluster.connect(
  couchbaseContainer.getConnectionString(),
  couchbaseContainer.getUsername(),
  couchbaseContainer.getPassword()
);

@daschl Yep, I’m on 3.0.3!

My connection is done like this:

ClusterOptions clusterOptions = ClusterOptions.clusterOptions(REST_API_USER_NAME, REST_API_PASSWORD);
Cluster cluster = Cluster.connect(Collections.singleton(seedNode), clusterOptions);

But after some time (a few seconds), it appears a new cluster configuration happens and my suite of tests just gets hung up.

I’ll try to create a demo…

Working on this. My simplest example seems to work just fine…

But our actual tests hang up somewhere when using testcontainers but work w/o issue with local Couchbase. I’ll try to see if I can reduce that to an example, but it’s a bit more complicated than the contrived case.

@daschl Yeah, I’ve got a fair degree of confidence that it’s a problem in the client. I suspect some sort of timing issue. As highlighted above, I get the reconnection messages 100% of the time associated with the test that fails. Which test fails varies from run to run (a failure usually happens, but sometimes not at all). I’m unable to reproduce it with a simple test class.

Edit: Removed problem that I was seeing with some queries on old clients. This was caused by a new initialization pattern and I missed enabling mutation tokens, which those tests require. Removed because there is no strikethrough and I wanted to not confuse this topic.

@daschl I have created some tests that reproduce this issue. They’re attached here.
CB303TestcontainersFailure.zip (27.0 KB)
Note that this logs to console and seems to echo the same/similar symptom I detailed above:

INFO: [com.couchbase.node][NodeDisconnectedEvent][7825us] Node disconnected {"coreId":"0x1feeeb7500000003","managerPort":"32895","remote":"localhost"}
Apr 24, 2020 3:29:36 PM com.couchbase.client.core.cnc.LoggingEventConsumer$Slf4JLogger info
INFO: [com.couchbase.core][BucketOpenedEvent][10s] Opened bucket "data1" {"alternateIdentifier":"external","coreId":"0x1feeeb7500000003"}
Apr 24, 2020 3:29:36 PM com.couchbase.client.core.cnc.LoggingEventConsumer$Slf4JLogger info
INFO: [com.couchbase.core][CoreCreatedEvent] {"clientVersion":"3.0.3","clientGitHash":"e55f7d43","coreVersion":"2.0.4","coreGitHash":"e55f7d43","userAgent":"couchbase-java/3.0.3 (Mac OS X 10.15.4 x86_64; OpenJDK 64-Bit Server VM 11.0.6+10)","maxNumRequestsInRetry":32768,"ioEnvironment":{"nativeIoEnabled":true,"eventLoopThreadCount":8,"eventLoopGroups":["KQueueEventLoopGroup"]},"ioConfig":{"captureTraffic":[],"mutationTokensEnabled":true,"networkResolution":"auto","dnsSrvEnabled":true,"tcpKeepAlivesEnabled":true,"tcpKeepAliveTimeMs":60000,"configPollIntervalMs":2500,"kvCircuitBreakerConfig":"disabled","queryCircuitBreakerConfig":"disabled","viewCircuitBreakerConfig":"disabled","searchCircuitBreakerConfig":"disabled","analyticsCircuitBreakerConfig":"disabled","managerCircuitBreakerConfig":"disabled","numKvConnections":1,"maxHttpConnections":12,"idleHttpConnectionTimeoutMs":30000,"configIdleRedialTimeoutMs":300000},"compressionConfig":{"enabled":true,"minRatio":0.83,"minSize":32},"securityConfig":{"tlsEnabled":false,"nativeTlsEnabled":true,"hasTrustCertificates":false,"trustManagerFactory":null},"timeoutConfig":{"kvMs":2500,"kvDurableMs":10000,"managementMs":75000,"queryMs":75000,"viewMs":75000,"searchMs":75000,"analyticsMs":75000,"connectMs":10000,"disconnectMs":10000},"loggerConfig":{"customLogger":null,"fallbackToConsole":false,"disableSlf4j":false,"loggerName":"CouchbaseLogger","diagnosticContextEnabled":false},"orphanReporterConfig":{"emitIntervalMs":10000,"sampleSize":10,"queueLength":1024},"retryStrategy":"BestEffortRetryStrategy","requestTracer":"OwnedSupplier"} {"alternateIdentifier":"external","coreId":"0x1feeeb7500000004"}
Apr 24, 2020 3:29:36 PM com.couchbase.client.core.cnc.LoggingEventConsumer$Slf4JLogger info
INFO: [com.couchbase.node][NodeConnectedEvent] Node connected {"coreId":"0x1feeeb7500000004","managerPort":"32895","remote":"localhost"}
Apr 24, 2020 3:29:36 PM com.couchbase.client.core.cnc.LoggingEventConsumer$Slf4JLogger info
INFO: [com.couchbase.node][NodeConnectedEvent] Node connected {"alternateRemote":"localhost","coreId":"0x1feeeb7500000004","managerPort":"8091","remote":"172.17.0.3"}
Apr 24, 2020 3:29:36 PM com.couchbase.client.core.cnc.LoggingEventConsumer$Slf4JLogger info
INFO: [com.couchbase.node][NodeDisconnectedEvent][580us] Node disconnected {"coreId":"0x1feeeb7500000004","managerPort":"32895","remote":"localhost"}

I run this on my Mac (16-inch, 2019) 10.15.4. I do not know (yet) if other OSes have a similar issue…

It’s important to note that this problem does not happen if there’s a local Couchbase running - the test has “smarts” at the beginning to see if there’s a local Couchbase and connects to that if available, otherwise it will spin up and configure a testcontainer.

To use local Couchbase, you have to have a bucket named data1 with a user data1 with password password and full bucket access to data1 bucket.
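For anyone who doesn’t want to open the zip, the “is there a local Couchbase?” check could look roughly like the sketch below. This is not the code from the attachment: `LocalProbe` and `isListening` are names made up here, and probing port 8091 assumes the default manager port seen in the logs above. An open port only tells you that something is listening there, not that it is a correctly configured Couchbase.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: decides between a local server and a testcontainer
// by checking whether anything accepts TCP connections on the given port.
class LocalProbe {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat as "not available".
            return false;
        }
    }

    public static void main(String[] args) {
        boolean local = isListening("localhost", 8091, 500);
        System.out.println(local ? "use local Couchbase" : "start a testcontainer");
    }
}
```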

There are 2 sets of tests: one uses some “goofy” inheritance that we use for our tests (the inheritedTests package), and the other (selfcontainedTests) seems to fail less often.

Having this work is critical for us, as this is how we manage our tests in our CI/CD pipeline.

Why is it that insert, upsert, and delete do not return the Document? I read the documentation:
Thus, the get method does not return a Document but a GetResult instead, and the upsert does not return a Document but a MutationResult. Each of those results only contains the field that the specific method can actually return, making it impossible to accidentally try to access the expiry on the Document after a mutation, for example.

But I want to return the created, updated or deleted document in the response, and now I have no way to get it without making a separate retrieve call. Why?

@david.nault

@abhideepchakravarty the SDK exposes exactly what is returned from the server on each operation. A mutation does not return the content, only metadata (like the new CAS). If you fetch a document through get, it will return the content as part of the get result.

Note that since you just created / updated the document, you have the content in scope that you sent into the API so you can return it to your caller. The server is not returning any new content or anything, so that should be fine. Note that a deleted document cannot be fetched at the same time. What specific use case do you have that you cannot do with the API?

In the 2.7.2 version I used to get the document back along with the CAS. That was helpful in keeping the returned document ready for edit on the same page where it was getting created. Now to get the CAS, I will have to make one extra call. For now, as a workaround, I am removing the edit option from the screen on that page because, as you said, I am returning the created object in the scope.

Hi @abhideepchakravarty

Now to get the CAS, I will have to make one extra call
An extra call isn’t required, as you get the CAS back in the MutationResult from the mutation.
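A minimal sketch of how the two pieces can be combined: keep the content you already have in scope and pair it with the CAS that comes back from the mutation. `SaveAndEcho` and `Saved` are made-up names, and the mutation is passed in as a function so the example runs without a cluster. With SDK 3 that function could be something like `(id, doc) -> collection.upsert(id, doc).cas()`, assuming a `collection` you have already opened.

```java
import java.util.function.ToLongBiFunction;

// Hypothetical helper: returns what the caller sent plus the CAS from the mutation,
// so no follow-up get is needed.
class SaveAndEcho {

    // Simple holder for the document the caller sent and the CAS the server assigned.
    static final class Saved<T> {
        final String id;
        final T content;
        final long cas;

        Saved(String id, T content, long cas) {
            this.id = id;
            this.content = content;
            this.cas = cas;
        }
    }

    // 'mutation' stands in for the real write and must return the new CAS.
    static <T> Saved<T> save(String id, T content, ToLongBiFunction<String, T> mutation) {
        long cas = mutation.applyAsLong(id, content);
        return new Saved<>(id, content, cas);
    }

    public static void main(String[] args) {
        // Fake mutation for the demo; a real one would call the SDK.
        ToLongBiFunction<String, String> fakeUpsert = (id, doc) -> 1234L;

        Saved<String> saved = save("user::1", "{\"name\":\"a\"}", fakeUpsert);
        System.out.println(saved.id + " cas=" + saved.cas); // prints user::1 cas=1234
    }
}
```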