Xamarin.iOS Crash On Resume

Hey guys,

Been stuck on this issue for a while now. If we lock and unlock the iPad, our app crashes due to some thread worker exception. After symbolicating the crash log from the device we get the following exception trace on the crashing thread:

0x00000001038b4e30 mono_invoke_unhandled_exception_hook + 21581360 (exception.c:1084)
0x000000010390feb0 mono_thread_internal_unhandled_exception + 21954224 (threads.c:5091)
0x000000010390a59c worker_callback + 21931420 (threadpool.c:0)
0x0000000103908a80 worker_thread + 21924480 (threadpool-worker-default.c:477)
0x00000001039104bc start_wrapper + 21955772 (threads.c:829)

and from the device’s console I can see this comes from a Mono.Net.Security.MobileAuthenticationStream.Close() call which has a bunch of WebSocketSharp output.

We thought at first this could just be a Xamarin thing, but I have a fork of the project with all of the Couchbase Lite code taken out and I can’t get this crash to happen, which prompted me to look further down the stacktrace and I see
WebSocketSharp.WebSocket.releaseResources() in
couchbase-lite-net-build/1.4/iOS/couchbase-lite-net/vendor/websocket-sharp/websocket-sharp/WebSocket.cs
is what eventually led to that exception.

Is there any special care we should take when suspending and resuming the app? We have tried stopping and starting the replication as well as closing and opening the database on suspend/resume respectively, and neither really seemed to help.

We ran across this thread that sounded similar but doesn’t seem to have a resolution:

Setup and specs:

  • iOS 11
  • Couchbase Lite 1.4, using SQLite for storage
  • Xamarin.Forms 2.3.5.235-pre2

Any help or ideas of things to try would be greatly appreciated.

Thanks!

You should be stopping your replications when going into the background, but I imagine that failure to do that would result in a different symptom. The tough part about dealing with this kind of stuff is that unless it is reproducible here there isn’t a way to fix it. Could you provide the full managed stack trace?

I believe we have tried stopping and starting the replication objects, but it’s possible we did that wrong.

Just to clarify, if we have 2 databases where database A has a pull replicator and database B has push and pull replicators, we should be manually stopping all three replicators (just using pusher.stop() etc.), correct? And to restart, do we just call the .start() method on each replicator, or will we need to recreate any authenticator objects or the replicators themselves? We are currently just using basic auth.

And yeah sure, I can upload the full stack trace. Is there a preferred way to upload these? It’s a bit long so I don’t want to blow up the post by pasting it in.

Thanks for the quick response!

Yes, you should be stopping all the replicators using stop(). And start() should restart them.
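In case it helps, here is a rough sketch of how that could be wired into the Xamarin.Forms lifecycle. It assumes the app keeps its replicators in a list (the `_replications` field and the `Track` method are made-up names for illustration) and that `OnSleep`/`OnResume` fire when the device is locked and unlocked. In the .NET API the methods are spelled `Stop()` and `Start()`.

```csharp
using System.Collections.Generic;
using Couchbase.Lite;
using Xamarin.Forms;

public class App : Application
{
    // Hypothetical field: every push/pull Replication the app has created,
    // e.g. one pull for database A plus a push and a pull for database B.
    private readonly List<Replication> _replications = new List<Replication>();

    // Hypothetical helper: register a replication so it is paused and resumed with the app.
    public void Track(Replication replication)
    {
        _replications.Add(replication);
    }

    protected override void OnSleep()
    {
        // Stop every replicator before the app is suspended.
        foreach (var replication in _replications)
        {
            replication.Stop();
        }
    }

    protected override void OnResume()
    {
        // Restart the same replicator objects when the app comes back.
        foreach (var replication in _replications)
        {
            replication.Start();
        }
    }
}
```

This reuses the existing replicator objects rather than recreating them, which is the simplest thing to try first.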

I don’t know what @borrrden prefers, but one option would be for you to create a gist with the traces and point us to that.

Thanks for the suggestion!

Full stack trace is in this gist:

Let me know if you have trouble viewing it. I’m not sure how the iPad console logging works - it looks like the stack trace is repeated a few times with slight variations, so I just included all of them.

Pretty much this exact stack trace has been reported upstream…back in February…with no response…

I might have to do what the person who filed this did and modify the library to just swallow this exception.

Ah, thanks for pointing me in that direction.

I’m sure you guys are plenty busy with CB Lite 2.0 work, but does this mean you might patch this into 1.4? And if so, when might we expect something like that to make its way into a public release? Unfortunately we are stuck using pre-2.0 CB Lite due to .NET Standard issues, and our business requirements mean we need to use SSL which is the root problem here.

If we end up having to modify the WebSocket-Sharp code ourselves to catch that exception, do you think there would be any concern about memory leaks if the SslStream is not being properly disposed?

We’re very close to using Couchbase in our production-ready solutions, and this is one of the last few things really keeping us back in our evaluation, so I appreciate you helping us out so quickly!

We actually just released 1.4.1 today, but as you can imagine it has been frozen for a few weeks at this point, so a fix for this will not be in it. The focus is now back on 2.0, and any 1.x-related changes will sit in the backlog for a while (especially for .NET, where just one person is running the whole thing). There is also an ongoing discussion on our side about when to put out patches, since our Server product has an established model that we are not currently following (see their Enterprise Edition vs Community Edition summary). I think modifying the section in question yourself, starting from a tagged commit of Couchbase, is going to be the easiest and least disruptive way to get this fix (unless you are already paying for Couchbase, in which case you should raise the issue through the support channels, where it will be handled differently than a report through the forums).

As for your concern about memory leaks, I would expect just the reverse: if you catch the exception and allow the logic to continue, I think memory leaks would be less likely. They aren’t that likely to begin with because of the finalizer and the garbage-collected environment, unless you mean reference cycles that keep managed objects alive. (Memory leaks, to me, are at the native level.)
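To illustrate the "catch and continue" idea, here is a minimal sketch of a helper you could call from the cleanup path instead of disposing the stream directly. This is not the actual websocket-sharp code; `DisposeQuietly` and its `log` parameter are hypothetical names, and catching every exception type is an assumption you may want to narrow to the one you see in the trace.

```csharp
using System;

internal static class SafeDisposeExtensions
{
    // Hypothetical helper: disposes a resource and reports any failure
    // to an optional callback instead of letting it propagate.
    public static void DisposeQuietly(this IDisposable resource, Action<Exception> log = null)
    {
        if (resource == null)
        {
            return;
        }

        try
        {
            resource.Dispose();
        }
        catch (Exception e)
        {
            // Swallow the exception so the remaining cleanup steps still run.
            // Narrowing this to the specific exception type would be safer.
            log?.Invoke(e);
        }
    }
}
```

With something like this, a failure while closing the SSL stream would be logged and the rest of the cleanup (such as closing the TCP client) would still run, rather than the exception going unhandled on the thread pool worker.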


Awesome, thanks for all the feedback!