Transaction Java library called from Scala may throw java.lang.InternalError and get stalled

java.lang.InternalError: Malformed class name
	at java.lang.Class.getSimpleName(Class.java:1330)
	at com.couchbase.transactions.error.internal.ErrorWrapper.convertToErrorWrapperIfNeeded(ErrorWrapper.java:114)
	at com.couchbase.transactions.TransactionsReactive.lambda$null$33(TransactionsReactive.java:577)
	at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onError(FluxOnErrorResume.java:88)
	at reactor.core.publisher.FluxSubscribeOnCallable$CallableSubscribeOnSubscription.run(FluxSubscribeOnCallable.java:228)
	at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:68)
	at reactor.core.scheduler.SchedulerTask.call(SchedulerTask.java:28)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Some Scala classes may not have a simple name, therefore limiting usability from Scala.

The following exception C has been thrown from the transaction body:

object A {
  object B
    case class C(message: String) extends Exception(message)
  }
}

Class name: redacted.canonical.name.A$B$C
The java.lang.InternalError: Malformed class name refers to class C.

Thanks @zoltan.zvara, that seems a pretty clear-cut bug. Iā€™ve raised https://issues.couchbase.com/browse/TXNJ-237 for it, and Iā€™ll get it fixed for next release 1.1 (currently scheduled for next month).

Note, this next release 1.1 is going to require the upcoming release of Couchbase Server 6.6 - it will not be compatible with 6.5.0 or 6.5.1 (and active development on transactions will continue only on this 6.6-compatible branch). Weā€™re taking this highly unusual step of a breaking change to enable some very useful new functionality, which we have not been able to get working in a backwards-compatible way with 6.5.X. Specifically, currently inserts in a transaction have to be ā€œstagedā€ as empty documents, whereas in 6.6 we have new functionality to allow them to be created as hidden documents. This prevents any part of the transaction being visible before the commit point, as it should be with our Read Committed isolation level.
Just wanted to give you a heads-up on this.

2 Likes

Ps if I can find time Iā€™ll aim to do some basic testing with calling the transactions library from Scala too, to check for any other stuff like this.

1 Like

Thanks @graham.pople!

Iā€™m aware of the workaround of inserts in a transaction. We are going to upgrade to Couchbase 6.6 right away.

To give you some more information on this: on our Bamboo CI, tests that include transactions with the above error, never finish and we see logs like:

07-Jul-2020 14:36:24	2020-07-07 14:36:24 WARN  config:523 - [com.couchbase.config][CollectionMapRefreshFailedEvent][10s] Collection Map refresh failed: FAILED {"coreId":"0xef0e39dc00000001"}
07-Jul-2020 14:36:24	com.couchbase.client.core.error.UnambiguousTimeoutException: GetCollectionManifestRequest {"cancelled":true,"completed":true,"coreId":"0xef0e39dc00000001","idempotent":true,"lastChannelId":"EF0E39DC00000001/000000005E64119D","lastDispatchedFrom":"10.233.85.221:52076","lastDispatchedTo":"couchbase-default-development-couchbase-0.couchbase-default-development-couchbase.development.svc.sigma:11210","reason":"TIMEOUT","requestId":110,"requestType":"GetCollectionManifestRequest","retried":29,"retryReasons":["KV_ERROR_MAP_INDICATED"],"service":{"bucket":"default","collection":"_default","opaque":"0x75","scope":"_default","type":"kv"},"timeoutMs":10000,"timings":{"dispatchMicros":319}}
07-Jul-2020 14:36:24	        at com.couchbase.client.core.msg.BaseRequest.cancel(BaseRequest.java:163)
07-Jul-2020 14:36:24	        at com.couchbase.client.core.Timer.lambda$register$2(Timer.java:157)
07-Jul-2020 14:36:24	        at com.couchbase.client.core.deps.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:672)
07-Jul-2020 14:36:24	        at com.couchbase.client.core.deps.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:747)
07-Jul-2020 14:36:24	        at com.couchbase.client.core.deps.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:472)
07-Jul-2020 14:36:24	        at com.couchbase.client.core.deps.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
07-Jul-2020 14:36:24	        at java.lang.Thread.run(Thread.java:748)
07-Jul-2020 14:36:24	2020-07-07 14:36:24 INFO  io:493 - [com.couchbase.io][UnknownResponseStatusReceivedEvent] Received an unknown response status: 0x89 {"bucket":"default","coreId":"0xef0e39dc00000001","local":"/10.233.85.221:52076","remote":"couchbase-default-development-couchbase-0.couchbase-default-development-couchbase.development.svc.sigma/10.233.89.96:11210"}
07-Jul-2020 14:36:24	2020-07-07 14:36:24 WARN  config:523 - [com.couchbase.config][CollectionMapRefreshFailedEvent][10s] Collection Map refresh failed: FAILED {"coreId":"0xef0e39dc00000001"}
07-Jul-2020 14:36:24	com.couchbase.client.core.error.UnambiguousTimeoutException: GetCollectionManifestRequest {"cancelled":true,"completed":true,"coreId":"0xef0e39dc00000001","idempotent":true,"lastChannelId":"EF0E39DC00000001/000000005E64119D","lastDispatchedFrom":"10.233.85.221:52076","lastDispatchedTo":"couchbase-default-development-couchbase-0.couchbase-default-development-couchbase.development.svc.sigma:11210","reason":"TIMEOUT","requestId":123,"requestType":"GetCollectionManifestRequest","retried":29,"retryReasons":["KV_ERROR_MAP_INDICATED"],"service":{"bucket":"default","collection":"_default","opaque":"0x7f","scope":"_default","type":"kv"},"timeoutMs":10000,"timings":{"dispatchMicros":329}}
07-Jul-2020 14:36:24	        at com.couchbase.client.core.msg.BaseRequest.cancel(BaseRequest.java:163)
07-Jul-2020 14:36:24	        at com.couchbase.client.core.Timer.lambda$register$2(Timer.java:157)
07-Jul-2020 14:36:24	        at com.couchbase.client.core.deps.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:672)
07-Jul-2020 14:36:24	        at com.couchbase.client.core.deps.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:747)
07-Jul-2020 14:36:24	        at com.couchbase.client.core.deps.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:472)
07-Jul-2020 14:36:24	        at com.couchbase.client.core.deps.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
07-Jul-2020 14:36:24	        at java.lang.Thread.run(Thread.java:748)
07-Jul-2020 14:36:41	2020-07-07 14:36:41 WARN  endpoint:513 - [com.couchbase.endpoint][UnexpectedEndpointDisconnectedEvent] The remote side disconnected the endpoint unexpectedly {"circuitBreaker":"DISABLED","coreId":"0xef0e39dc00000002","local":"10.233.85.221:34500","remote":"couchbase-default-development-couchbase-1.couchbase-default-development-couchbase.development.svc.sigma:8093","type":"QUERY"}
07-Jul-2020 14:36:41	2020-07-07 14:36:41 WARN  endpoint:513 - [com.couchbase.endpoint][UnexpectedEndpointDisconnectedEvent] The remote side disconnected the endpoint unexpectedly {"circuitBreaker":"DISABLED","coreId":"0xef0e39dc00000002","local":"10.233.85.221:42068","remote":"couchbase-default-development-couchbase-0.couchbase-default-development-couchbase.development.svc.sigma:8093","type":"QUERY"}
07-Jul-2020 14:36:45	2020-07-07 14:36:45 WARN  endpoint:513 - [com.couchbase.endpoint][UnexpectedEndpointDisconnectedEvent] The remote side disconnected the endpoint unexpectedly {"circuitBreaker":"DISABLED","coreId":"0xef0e39dc00000001","local":"10.233.85.221:42128","remote":"couchbase-default-development-couchbase-0.couchbase-default-development-couchbase.development.svc.sigma:8093","type":"QUERY"}
07-Jul-2020 14:36:45	2020-07-07 14:36:45 WARN  endpoint:513 - [com.couchbase.endpoint][UnexpectedEndpointDisconnectedEvent] The remote side disconnected the endpoint unexpectedly {"circuitBreaker":"DISABLED","coreId":"0xef0e39dc00000001","local":"10.233.85.221:34502","remote":"couchbase-default-development-couchbase-1.couchbase-default-development-couchbase.development.svc.sigma:8093","type":"QUERY"}
07-Jul-2020 14:37:11	2020-07-07 14:37:11 INFO  stats:493 - [com.couchbase.transactions.cleanup.stats][TransactionCleanupEndRunEvent] start={default/_default/clientId=d4df1,index=0,numClients=1,ATRs={checking=1024,total=1024},runLength=60000millis},end={taken=59781millis,ATRs={read=1024,errored=0,present=149},entries={expired=0,failed=0,total=1},maxEntries=1}
07-Jul-2020 14:38:11	2020-07-07 14:38:11 INFO  stats:493 - [com.couchbase.transactions.cleanup.stats][TransactionCleanupEndRunEvent] start={default/_default/clientId=d4df1,index=0,numClients=1,ATRs={checking=1024,total=1024},runLength=60000millis},end={taken=59648millis,ATRs={read=1024,errored=0,present=149},entries={expired=0,failed=0,total=0},maxEntries=0}
07-Jul-2020 14:39:10	2020-07-07 14:39:10 INFO  stats:493 - [com.couchbase.transactions.cleanup.stats][TransactionCleanupEndRunEvent] start={default/_default/clientId=d4df1,index=0,numClients=1,ATRs={checking=1024,total=1024},runLength=60000millis},end={taken=59628millis,ATRs={read=1024,errored=0,present=149},entries={expired=0,failed=0,total=0},maxEntries=0}
07-Jul-2020 14:40:10	2020-07-07 14:40:10 INFO  stats:493 - [com.couchbase.transactions.cleanup.stats][TransactionCleanupEndRunEvent] start={default/_default/clientId=d4df1,index=0,numClients=1,ATRs={checking=1024,total=1024},runLength=60000millis},end={taken=59616millis,ATRs={read=1024,errored=0,present=149},entries={expired=0,failed=0,total=0},maxEntries=0}
07-Jul-2020 14:41:10	2020-07-07 14:41:10 INFO  stats:493 - [com.couchbase.transactions.cleanup.stats][TransactionCleanupEndRunEvent] start={default/_default/clientId=d4df1,index=0,numClients=1,ATRs={checking=1024,total=1024},runLength=60000millis},end={taken=59603millis,ATRs={read=1024,errored=0,present=149},entries={expired=0,failed=0,total=0},maxEntries=0}
07-Jul-2020 14:42:09	2020-07-07 14:42:09 INFO  stats:493 - [com.couchbase.transactions.cleanup.stats][TransactionCleanupEndRunEvent] start={default/_default/clientId=d4df1,index=0,numClients=1,ATRs={checking=1024,total=1024},runLength=60000millis},end={taken=59573millis,ATRs={read=1024,errored=0,present=149},entries={expired=0,failed=0,total=0},maxEntries=0}
07-Jul-2020 14:43:09	2020-07-07 14:43:09 INFO  stats:493 - [com.couchbase.transactions.cleanup.stats][TransactionCleanupEndRunEvent] start={default/_default/clientId=d4df1,index=0,numClients=1,ATRs={checking=1024,total=1024},runLength=60000millis},end={taken=59558millis,ATRs={read=1024,errored=0,present=149},entries={expired=0,failed=0,total=0},maxEntries=0}

One additional thing: it would be nice to be able to ask the API to directly throw out a certain type of exceptions when a transaction fails. We use a similar idom currently:

[...]
    try {
      transactions.run((t: AttemptContext) => {
        transactionID = Some(t.transactionId())
        result = Some(f(t))
      })
    } catch {
      case f: TransactionFailed =>
        if (t(f.getCause)) {
          throw f.getCause // we need the cause
        } else {
          throw f
        }
    } finally {
      transactionID.foreach {
        cleanupTransaction
      }
    }
[...]

Getting a TransactionFailed useful when there is something went wrong with the transaction itself, but when business objects and other services throw an exception, that is useful if thrown directly. This is just a questions of convenience.

Thanks,
ZoltƔn

@zoltan.zvara the transaction should timeout within its configured expiry window (defaults to 15 seconds). I suspect that since this bug is happening inside the bowels of the error-handling code itself, itā€™s somehow bypassing the expiry checks.

Iā€™ve now done some basic transactions-0n-Scala testing and somewhat annoyingly it passed without problem. So while I can fix the particular issue youā€™ve raised, I canā€™t guarantee Iā€™ve caught everything without adding a Scala performer for our centralised tests (which I now want to, but is quite a bit of work). So please let me know if you hit anything else while using it this way ā€˜in angerā€™, and Iā€™ll get it fixed.

On the feature suggestion, I think thatā€™s quite nice. What about if I provided throw new TransactionReraiseWrapper(new MyCustomAppException()), and that would raise MyCustomAppException as the final exception?

Btw, the TransactionFailed contains a TransactionResult, which contains a transactionId(), so no need to save it inside the lambda.

(Also out of interest what is cleanupTransaction doing? All cleanup and rollback should be automatically performed by the transactions library. Usually before control is returned to the application, but on occasion by an asynchronous background thread kicked off by Transactions.create).

@graham.pople Thanks, I will let you know if we hit anything else.

Our basic use-case is the following: the end-user calls an API which then translates to an internal API doing complex things. For example, handling subscription requests for a particular service.
Handling subscriptions in a web application can be cumbersome. A few business objects stored in separate documents may require an update: the previous subscription object, the new subscription object, the user object, the transaction object that covers the subscription with a payment. At least 4.

During the subscription request handling, many things can go wrong. We listen to at least 10 different exceptions, all of which are meaningful and useful to upstream callers. Not all of them are organized into a single parent exception. Therefore, currently, we let the user provide a guard that returns a boolean, to whether unpack the cause from the failed transaction or just let the TransactionException to be thrown out. For us, the throw new TransactionReraiseWrapper(new MyCustomAppException()) would still require to be unpacked and thrown away, since different exceptions and handled at different levels. (Maybe a bad design from our part?)

We encapsulate the Couchbase Scala Client API and the Couchcbase Java Transaction API for convenience. For example this is how our Transaction API looks like:

trait Operator[Identifier <: Stringifiable,
               Entity <: Unique[Identifier]]
extends Logger {
  def remove(key: Identifier): Unit
  def remove(entity: Entity): Unit
  def replace(entity: Entity): Entity
  def insert(entity: Entity): Entity
  def get(key: Identifier): Option[Entity]
  def need(key: Identifier): Entity
}

Here, the user of the API does not have to maintain and provide the TransactionGetResult to replace an entity with replace(entity: Entity). The API will take care of providing that, dynamically, and that is why we have to clean our API from these TransactionGetResults, because we manage them dynamically, transparently. This is not only convenient but removes the maintenance burden from the user to carry the TransactionGetResult along with function calls when the replace is to be called much much later downstream. TransactionGetResults within the API are bookkeeped using the transactionId, that is why we save in a variable.

I hope this makes sense, but let me know if not. These are just first breadth ideas on how we encapsulate the Couchbase API.

@graham.pople I wanted to check the progress on the issue https://issues.couchbase.com/browse/TXNJ-237 but I have to rights to view that specific issue or project.

Do you have an update in mind?

Cheers,
ZoltƔn

@zoltan.zvara oh, Iā€™m not sure why thatā€™s the case, let me check on that.

The ticket is resolved and the fix will be released in the forthcoming 1.1.0 release, expected to release around 10th August.

1 Like

There is a 1.0.7 released. Is that it?

Hey @zoltan.zvara

I was literally about to update this post that we have now launched both Couchbase Server 6.6 and Transactions for Java 1.1.0:

https://docs.couchbase.com/java-sdk/3.0/project-docs/distributed-transactions-java-release-notes.html

1 Like