Advice on order of operations during application shutdown

I’ve got a weird error in syncing that occurs during application shutdown.
It’s a little complex to describe, but here goes…

I have an iPad and Mac running my software, syncing to a remote server.
Both have the same document opened. Both have Change Listeners on queries that monitor that document.

All is good. I make some changes to the document on the iOS side, but don’t save yet.

Then I shut the iPad app down (swipe up to show all apps, swipe up on my app). The following sequence occurs on the iOS side, which all looks good as far as I can tell:

  • App delegate gets the OnResignActivation
  • Document manager is asked to save its current state (which it does) using db.Save(doc)
  • App then runs a “WaitForSyncIdle”, where it waits for 250ms, then waits until the sync status is idle.
  • App stops sync
  • App completes shutdown - AppDelegate.WillTerminate.
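For reference, the “WaitForSyncIdle” step is roughly this. It’s a sketch of my own helper: `currentSyncStatus` is a field my replicator change listener keeps up to date, and `SyncStatus` is an enum in my app, not an SDK type; the 5-second cap is an assumed safety timeout.

```csharp
// Wait an initial period (to let the save kick off replication), then poll
// until the replicator status recorded by my change listener reaches Idle.
// A hard timeout keeps shutdown from hanging if sync never goes idle.
private void WaitForSyncIdle(int initial_wait_ms)
{
    Thread.Sleep(initial_wait_ms);
    var give_up_at = DateTimeOffset.Now.AddSeconds(5);  // assumed safety cap
    while (currentSyncStatus != SyncStatus.Idle && DateTimeOffset.Now < give_up_at)
        Thread.Sleep(10);
}
```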

The weird thing happens on the Mac side. About 1 second after iOS does a save, the Mac does the following:

  • App receives the document changed notification
  • App loads the document using db.GetDocument()

The problem is that it receives an old copy of the document - in this case about 50 seconds old and missing some changes.

I then restart the iOS app. Its copy of the document is the latest, it seems to resync and things usually fix themselves up.

A slight variation on this is where the Mac does not receive the notification about the change (I’m guessing the iOS side shuts down before the sync can happen). Then I restart the iOS app, and the following happens in quick succession:

  • Mac receives notification about document change. Gets an old copy of the document.
  • Mac receives another notification about document change. Gets a new copy of the document.

So it seems that having a sync pending or in progress during app shutdown creates something strange with old and new document versions.

This makes me believe that I am not shutting the sync engine down properly, or my “WaitForSyncIdle” is faulty.

Here is how I start the sync engine:

            var auth = new BasicAuthenticator(id, secure_server_password);
            var collection = database.GetCollection("_default");
            var collection_configuration = new CollectionConfiguration
            {
                ConflictResolver = new SyncConflictResolver()
            };
            var sync_config = new ReplicatorConfiguration(target)
            {
                ReplicatorType = ReplicatorType.PushAndPull,
                Continuous = true,
                Authenticator = auth,
            };
            sync_config.AddCollection(collection, collection_configuration);
            replicator = new Replicator(sync_config);
            listenerToken = replicator.AddChangeListener(SyncStatusUpdate);

And here is how I stop it:

            if (replicator == null)
                return;
            try
            {
                replicator.Stop();
                //  replicator doesn't actually stop immediately. We will wait for a few ms if wait_ms is passed in > 0
                if (wait_ms > 0)
                {
                    var wait_until = DateTimeOffset.Now.AddMilliseconds(wait_ms);
                    while (currentSyncStatus != SyncStatus.NotStarted && wait_until > DateTimeOffset.Now)
                        Thread.Sleep(10);
                }
            }
            catch (Exception e)
            {
                log.Error(e, "Failed to stop sync agent");
            }
            replicator = null;

My stop function is called as follows:

            WaitForSyncIdle(250);      // wait an initial 250ms, then wait for idle state.
            StopSyncAgent(250);       // stop sync agent, waiting 250ms for it to complete.
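Put together, the shutdown path in my AppDelegate looks roughly like this (a sketch: `documentManager.SaveCurrentState` and the two helpers are my own methods, while `OnResignActivation` is the Xamarin.iOS AppDelegate override):

```csharp
public override void OnResignActivation(UIApplication application)
{
    // Save the document manager's current state (ends in db.Save(doc)),
    // then wait for replication to go idle before stopping it.
    documentManager.SaveCurrentState();   // my own method
    WaitForSyncIdle(250);                 // wait 250ms, then wait for idle
    StopSyncAgent(250);                   // stop, waiting up to 250ms
}
```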

Is there some extra work I need to do when stopping replication during application shutdown that is needed to stop the extra document sync with an old copy of the document?

Many thanks.

This is something too nuanced to diagnose from just a description of what is happening. This seems unexpected enough to file a bug report with logs to look into though.

I’ve been experimenting, and if I add a 2 second delay after the db.Save in OnResignActivation, then the problem goes away for small docs - the sync properly happens and the remote side gets the updated document. For larger docs the problem comes back.

I’ll have to relearn how to do DB logs… it’s been a long time since I needed to.
I’ll get back to you about the bug report once I have decent logs.

More detail on this, and a workaround.

Background. I have a large document. The document is being modified on multiple endpoints, and as those changes are made, I save small change objects, which are synced. So all endpoints have the doc, and everyone’s changes, and therefore show the same result.

When one endpoint quits (or stops modifying the document), it removes its changes and saves the whole document. All endpoints are able to show the same results all the time.

From the logs, what seems to happen is this at Endpoint A:

  • Endpoint A quits
  • Endpoint A deletes all changes
  • Endpoint A saves the large document.
  • From the db/sync logs, it looks like the changes sync, then the sync manager gets a stop message, but it still starts to sync the large doc. I’m not sure.

then at Endpoint B

  • Endpoint B is notified of the changes disappearing.
  • Endpoint B gets a notification that the document changed. When it loads the full document… the only thing there is an old version. So it thinks that Endpoint A saved an old version and winds its changes back.

Perhaps my code should handle this case better (delete your changes, save old version). Fortunately, I’ve found 3 workarounds for now:

  • when the app quits, and after the final save, wait a few seconds. The app has already moved to the background, so the user doesn’t notice. This works for smaller documents, but not really large documents.
  • shut syncing off before doing the last save. Remote endpoints will miss the last changes, but that’s ok, and they catch up the next time the app starts, and this is all a rare occurrence.
  • perform the “delete changes” and “save document” in a Batch/Transaction. This seems to cause them to sync together. I haven’t tested with really large documents yet.
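For anyone hitting the same thing, the batch/transaction workaround uses Couchbase Lite’s `Database.InBatch`. Roughly (the change-document bookkeeping here is my app’s, not the SDK’s — `change_doc_ids` and `full_doc` are my own data):

```csharp
// Delete the per-edit change documents and save the full document inside one
// batch, so the deletions and the save replicate together rather than the
// remote side seeing an intermediate (old) revision.
database.InBatch(() =>
{
    foreach (var id in change_doc_ids)          // IDs of my pending change objects
    {
        var change_doc = database.GetDocument(id);
        if (change_doc != null)
            database.Delete(change_doc);
    }
    database.Save(full_doc);                    // the large document (a MutableDocument)
});
```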

So I’ve implemented the last two approaches together, and the problem seems to go away.

Great that you found a way that works. If you can still reproduce it with logs or a reproduction case, that would be ideal in case there is an actual underlying issue, but if not then that’s fine too.
