The v2 of the java client leaks RxJava threads on shutdown

this PR https://github.com/ReactiveX/RxJava/pull/3149 is merged! :smiley:

After that, these computation/RxJava threads should start stopping cleanly, shouldn’t they?

I’ll report my experience soon, once I have switched between the latest version and rxjava-1.0.15-SNAPSHOT.jar.

Thank you!

AFAIK there should be a shutdown() method to be called on Schedulers at the time you shut the environment down.

I think this is best left to the user’s care, since the application code could continue using RxJava even though the Cluster has been shut down.

Please report back with your findings; I hope this solves the issue once and for all :wink:

So I should just call Schedulers.shutdown() before CouchbaseEnvironment.shutdown(), shouldn’t I?
If that is all there is to it, it didn’t work at all.

Basically my steps are:

bucket.close();
Cluster.disconnect();
Schedulers.shutdown();
CouchbaseEnvironment.shutdown()

Any tips?

I’ll be back if I have any news!

I think Schedulers.shutdown() should be called last but yeah that should do the trick.
However, stopping the Rx threads attempts to be graceful, so it should be given maybe a few hundred milliseconds to complete.

What did you observe exactly?

note: the CouchbaseEnvironment only needs to be shut down if you instantiated it yourself
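
The grace period mentioned above can be checked rather than guessed. The following is a minimal JDK-only sketch, not part of the SDK or of RxJava: it polls the live threads until none match a set of name prefixes or a timeout elapses. The class name `ThreadDrain`, its methods, and the `"Rx"` prefix are assumptions for illustration; the `"cb-"` prefix comes from the thread names reported in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class ThreadDrain {

    /** Names of live threads whose name starts with any of the given prefixes. */
    public static List<String> liveThreads(String... prefixes) {
        List<String> names = new ArrayList<>();
        // getAllStackTraces() returns a snapshot of the currently live threads.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            for (String prefix : prefixes) {
                if (t.getName().startsWith(prefix)) {
                    names.add(t.getName());
                    break;
                }
            }
        }
        return names;
    }

    /**
     * Polls until no matching thread is alive or the timeout elapses.
     * Returns true if all matching threads are gone, false on timeout
     * or if the calling thread is interrupted while waiting.
     */
    public static boolean awaitGone(long timeoutMillis, String... prefixes) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!liveThreads(prefixes).isEmpty()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for an SDK thread that needs about 100 ms to wind down.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException ignored) {
                // nothing to clean up in this demo
            }
        }, "cb-computations-demo");
        worker.start();

        System.out.println("still running: " + liveThreads("cb-", "Rx"));
        System.out.println("all gone: " + awaitGone(2000, "cb-", "Rx"));
    }
}
```

In a web application, a call such as `ThreadDrain.awaitGone(500, "cb-", "Rx")` could be placed at the very end of the shutdown sequence, so the container only checks for leftover threads after they have had a chance to exit.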

ok. Here we go.

I swapped the order of cbenv.shutdown and Schedulers.shutdown, and we get the same behavior in either order.

Now the Rx threads no longer show up as memory leaks, but we still have these threads from CB:

9:21:22.022 [http-nio-8086-exec-25] DEBUG c.c.client.core.RequestHandler - Starting reconfiguration.
09:21:22.022 [cb-core-3-1] DEBUG c.c.c.c.config.ConfigurationProvider - Received signal for outdated configuration.
09:21:22.022 [cb-core-3-1] DEBUG c.c.c.c.config.ConfigurationProvider - Received signal for outdated configuration.
09:21:22.022 [http-nio-8086-exec-25] DEBUG c.c.client.core.RequestHandler - No node found in config, disconnecting all nodes.
09:21:22.024 [http-nio-8086-exec-25] DEBUG c.c.c.c.config.ConfigurationProvider - Closing all open buckets
Oct 07, 2015 9:21:22 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/mauricio] appears to have started a thread named [cb-computations-1] but has failed to stop it. This is very likely to create a memory leak.
Oct 07, 2015 9:21:22 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/mauricio] appears to have started a thread named [cb-computations-2] but has failed to stop it. This is very likely to create a memory leak.
Oct 07, 2015 9:21:22 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/mauricio] appears to have started a thread named [cb-computations-3] but has failed to stop it. This is very likely to create a memory leak.
Oct 07, 2015 9:21:22 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/mauricio] appears to have started a thread named [cb-computations-4] but has failed to stop it. This is very likely to create a memory leak.
Oct 07, 2015 9:21:22 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/mauricio] appears to have started a thread named [cb-io-1-1] but has failed to stop it. This is very likely to create a memory leak.
Oct 07, 2015 9:21:22 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/mauricio] appears to have started a thread named [cb-io-1-2] but has failed to stop it. This is very likely to create a memory leak.
Oct 07, 2015 9:21:22 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/mauricio] appears to have started a thread named [cb-io-1-3] but has failed to stop it. This is very likely to create a memory leak.
Oct 07, 2015 9:21:22 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/mauricio] appears to have started a thread named [threadDeathWatcher-4-1] but has failed to stop it. This is very likely to create a memory leak.
Oct 07, 2015 9:21:22 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/mauricio] appears to have started a thread named [cb-io-1-4] but has failed to stop it. This is very likely to create a memory leak.

looks like the SDK-managed threads are not closed (but RxJava threads are) :frowning:

in @farrault’s case cb-io-xxx and cb-core-xxx were already stopping gracefully before the 2.1.4 patch and cb-computation-xxx were also stopped after the patch.

can you both share your platform (which container is used, etc…) and how you build your couchbase environment/cluster and the method you use to shut it down, for comparison?

@farrault could you also test the behavior with RxJava 1.0.15-SNAPSHOT to see if it resolves the issue in your case?

Platform:
Ubuntu 14.04.3 LTS (GNU/Linux 3.13.0-62-generic x86_64)
java version "1.8.0_51"
Java™ SE Runtime Environment (build 1.8.0_51-b16)
Java HotSpot™ 64-Bit Server VM (build 25.51-b03, mixed mode)
Tomcat7
rxjava-1.0.15-SNAPSHOT.jar
couchbase-core-io-1.1.4.jar
couchbase-java-client-2.1.4.jar

env = DefaultCouchbaseEnvironment // create
        .builder()
        .kvTimeout(CB_KEY_VALUE_TIMEOUT) // 20000
        .connectTimeout(CB_CONNECT_TIMEOUT) // 20000
        .disconnectTimeout(CB_DISCONNECT_TIMEOUT) // 200000
        .build();

couchCluster = CouchbaseCluster.create(env, argNodes);


//shutdown
bucket.close();
Cluster.disconnect();
Schedulers.shutdown();
CouchbaseEnvironment.shutdown();

@mcarvalho are you using a framework like Spring or something similar, where you call the shutdown code in a special method/hook?

@mcarvalho are you using a framework like Spring or something similar, …
@simonbasle Actually, no. Our stack is basically built on Servlet 3.0 for some components and Struts 1 for others. We don’t run under any container like Spring or similar products.

where you call the shutdown code in a special method/hook?
I have a simple class that implements ServletContextListener, and the shutdown process happens inside

public void contextDestroyed(ServletContextEvent arg0)

Let me know if I can help with more information.
Regards,
Mauricio

I’ve opened another ticket, https://issues.couchbase.com/browse/JVMCBC-251, because I found room for improvement.

There is a slight subtlety with shutdown though: CouchbaseEnvironment.shutdown() returns an Observable<Boolean> and as such it needs to be subscribed to. So your code should be:

//shutdown
//bucket.close(); //this will be called when disconnecting the cluster
cluster.disconnect();
//note: as soon as the env was created by you, you must call shutdown() on it
//here we trigger subscription and wait for termination by blocking on the Observable
couchbaseEnvironment.shutdown().toBlocking().single();
Schedulers.shutdown(); //reordered, last

I think that with this modification, things should be far better.
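
The subscription point can be illustrated without the SDK. The sketch below is a JDK-only analogy, not the SDK’s implementation: a `FutureTask` stands in for the returned Observable, `run()` for subscribing, and `get()` for `toBlocking().single()`. The class `LazyShutdownDemo` and its `shutdown(...)` method are hypothetical names for this illustration.

```java
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicBoolean;

public class LazyShutdownDemo {

    /**
     * Returns a description of the shutdown work without performing it.
     * The flag stands in for resources (threads, sockets) to be released.
     */
    public static FutureTask<Boolean> shutdown(AtomicBoolean stopped) {
        return new FutureTask<>(() -> {
            stopped.set(true);
            return true;
        });
    }

    public static void main(String[] args) throws Exception {
        AtomicBoolean stopped = new AtomicBoolean(false);

        // Calling shutdown() and discarding the result performs no work.
        shutdown(stopped);
        System.out.println("after bare call: stopped=" + stopped.get());

        // Triggering the task and blocking on its result does the work.
        FutureTask<Boolean> task = shutdown(stopped);
        task.run();              // analogous to subscribing
        boolean ok = task.get(); // analogous to toBlocking().single()
        System.out.println("after run+get: stopped=" + stopped.get() + ", ok=" + ok);
    }
}
```

This prints `stopped=false` after the bare call and `stopped=true, ok=true` after the task has been run, which mirrors the symptom reported earlier: a shutdown call whose result is discarded leaves the threads running.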

Improvements on that front have been submitted to master.

:arrow_right_hook: prefer using upcoming 2.2.1 release (snapshot can be built from current master) and upgrade to RxJava 1.0.15 as soon as it comes out in order to be able to call rx.Schedulers.shutdown() :slight_smile:

:warning: don’t forget to call toBlocking().single() (or at least subscribe()) after an environment.shutdown()

Note: The improvements have been partially backported to release11 branch for inclusion in the upcoming 2.1.5 release, but improvements for RxJava and Netty threads couldn’t be backported, so it’s mainly clarity of code, integration tests and logs that have been backported.

Really good! We’re on the right track.
I’ve built the latest CB dependencies from the master branch, and the last remaining memory leak is this threadDeathWatcher:

13:42:04.218 [cb-io-1-4] DEBUG c.c.c.d.i.n.buffer.PoolThreadCache - Freed 21 thread-local buffer(s) from thread: cb-io-1-4
13:42:04.218 [cb-io-1-3] DEBUG c.c.c.d.i.n.buffer.PoolThreadCache - Freed 10 thread-local buffer(s) from thread: cb-io-1-3
13:42:04.218 [cb-io-1-2] DEBUG c.c.c.d.i.n.buffer.PoolThreadCache - Freed 16 thread-local buffer(s) from thread: cb-io-1-2
13:42:04.219 [cb-io-1-1] DEBUG c.c.c.d.i.n.buffer.PoolThreadCache - Freed 13 thread-local buffer(s) from thread: cb-io-1-1
SEVERE: The web application [/mauricio] appears to have started a thread named [threadDeathWatcher-4-1] but has failed to stop it. This is very likely to create a memory leak.

Is there any specific reason why the improvements for Netty threads couldn’t be backported?

Thanks,
Mauricio

Just FYI, after building the Java client from the master branch and using the 2.2.1 release, I started to get this error during some operations:

Exception in thread "cb-computations-4" java.lang.IllegalStateException: Fatal Exception thrown on Scheduler.Worker thread.
at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:62)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Caused by: java.lang.NoSuchMethodError: com.couchbase.client.core.message.kv.UpsertResponse.mutationToken()Lcom/couchbase/client/core/message/kv/MutationToken;
at com.couchbase.client.java.CouchbaseAsyncBucket$16.call(CouchbaseAsyncBucket.java:501)
at com.couchbase.client.java.CouchbaseAsyncBucket$16.call(CouchbaseAsyncBucket.java:493)
at rx.internal.operators.OperatorMap$1.onNext(OperatorMap.java:54)
at rx.observers.Subscribers$5.onNext(Subscribers.java:234)
at rx.subjects.SubjectSubscriptionManager$SubjectObserver.onNext(SubjectSubscriptionManager.java:222)
at rx.subjects.AsyncSubject.onCompleted(AsyncSubject.java:101)
at com.couchbase.client.core.endpoint.AbstractGenericHandler$1.call(AbstractGenericHandler.java:199)
at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55)
... 7 more

@mcarvalho it looks like there is something wrong with your classpath: mutation tokens were added in 2.2.0 / 1.2.0, and for some reason they are not found.

mmh did you also build the core-io project (couchbase-jvm-core) from master? and if so, can you make sure the core-io version and the corresponding dependency match in both projects’ pom.xml?
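
A mismatch like this can be confirmed at runtime with plain reflection. The following is a minimal JDK-only diagnostic sketch, assuming nothing about the SDK beyond the class and method names that appear in the stack trace above; `ClasspathCheck` and its methods are hypothetical helpers. It reports whether a class exposes a given no-argument method and which jar or directory a class was loaded from.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /** Where the class was loaded from, or a placeholder for JDK/bootstrap classes. */
    public static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap/unknown)";
        }
        return src.getLocation().toString();
    }

    /**
     * True if the named class can be loaded and has a public no-argument
     * method with the given name (declared or inherited).
     */
    public static boolean hasNoArgMethod(String className, String methodName) {
        try {
            // Load without running static initializers; we only inspect the class.
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            c.getMethod(methodName);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class and method names taken from the NoSuchMethodError in the trace.
        String cls = "com.couchbase.client.core.message.kv.UpsertResponse";
        System.out.println("mutationToken() present: " + hasNoArgMethod(cls, "mutationToken"));
        try {
            Class<?> c = Class.forName(cls, false, ClasspathCheck.class.getClassLoader());
            System.out.println("loaded from: " + locationOf(c));
        } catch (ClassNotFoundException | LinkageError e) {
            System.out.println(cls + " is not on the classpath");
        }
    }
}
```

Run inside the affected application, the reported location shows which core-io jar actually wins on the classpath, which is usually enough to spot a stale artifact.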

I suspected as much, and that’s exactly it! I didn’t clone the jvm-core project.
Thank you for the advice.

Hi,
We were facing the issue with leaking Rx threads and have managed to eliminate it by running 2.1.4 with the RxJava snapshot (1.0.15) and calling rx.Schedulers.shutdown().

Is there any open ticket within Couchbase regarding this issue, e.g. for updating the RxJava version as soon as it’s available?

yes there was a ticket for the threads leaking, see post #31 above…
we’ll upgrade to RxJava 1.0.15 (or above) in the 2.2.x bugfix release that follows the RxJava release, hopefully next one.

Thanks for the fast response.

We noticed the ticket, but were confused that it was closed and that the fix version was not set to 2.x.

yeah there was also a kind of “parent” change https://issues.couchbase.com/browse/JCBC-773
the 251 change is in the core project, so it has core-io numbering scheme… it also only targets SDK threads…

note that you don’t necessarily have to wait for us to bump the minimum dependency of the SDK to RxJava 1.0.15 once it is out (this may take some time), you can use it explicitly as soon as it is officially out.
