Continuous pull replicator stops after network error in Couchbase Lite 2.8.x Android

Hi,

A continuous PULL replicator stops after a network error in Couchbase Lite 2.8.x on Android.
Today, I tested 2.8.5 and 2.8.4.

Steps:
1 / launch the app
2 / wait for the replicator to reach the IDLE state
3 / turn off the network

Logs:

2021-04-07 11:57:48.300 2414-2477/com.mapotempo.testcouchbase E/CouchbaseLite/NETWORK: {N8litecore4repl12C4SocketImplE#1} WebSocket closed abnormally (reason=Unknown error 10)
2021-04-07 11:57:48.302 2414-2454/com.mapotempo.testcouchbase E/CouchbaseLite/REPLICATOR: {Repl#2} Got LiteCore error: LiteCore error 26 "unexpected exception"
2021-04-07 11:57:48.309 2414-2414/com.mapotempo.testcouchbase I/Replicator: Error code ::  26
2021-04-07 11:57:48.309 2414-2414/com.mapotempo.testcouchbase I/Replicator: Status ::  STOPPED

Code:

public class MainActivity extends AppCompatActivity
{
    private static final String TAG = "Replicator";
    private Database mDatabase;
    private Replicator mReplicator;

    @Override
    protected void onCreate(Bundle savedInstanceState)
    {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        CouchbaseLite.init(getApplicationContext());

        DatabaseConfiguration config = new DatabaseConfiguration();

        config.setDirectory(getApplicationContext().getFilesDir().getAbsolutePath());

        try
        {
            mDatabase = new Database("test", config);

            Endpoint targetEndpoint = new URLEndpoint(new URI("URL"));
            ReplicatorConfiguration replConfig = new ReplicatorConfiguration(mDatabase, targetEndpoint);
            replConfig.setReplicatorType(ReplicatorConfiguration.ReplicatorType.PULL);
            replConfig.setContinuous(true);
            replConfig.setAuthenticator(
                new BasicAuthenticator("LOGIN", "PASSWORD".toCharArray()
                )
            );
            mReplicator = new Replicator(replConfig);

            mReplicator.addChangeListener(change -> {
                if (change.getStatus().getError() != null) {
                    Log.i(TAG, "Error code ::  " + change.getStatus().getError().getCode());
                }
                Log.i(TAG, "Status ::  " + change.getStatus().getActivityLevel().toString());
            });

            mReplicator.start();

        } catch (CouchbaseLiteException | URISyntaxException exception)
        {
            exception.printStackTrace();
            finish();
        }
    }

    @Override
    protected void onDestroy()
    {
        try
        {
            if (mDatabase != null) {
                mDatabase.close();
            }
        } catch (CouchbaseLiteException exception)
        {
            exception.printStackTrace();
        }
        super.onDestroy();
    }
}

Tested on Java; Android 10; SM-A202F
Best regards


In Couchbase Lite 2.7.1 it works: the replicator goes to the OFFLINE state instead of stopping.

2021-04-07 12:30:44.964 14926-14926/com.mapotempo.testcouchbase I/REPLICATOR: Status ::  CONNECTING
2021-04-07 12:30:45.522 14926-14926/com.mapotempo.testcouchbase I/REPLICATOR: Status ::  BUSY
2021-04-07 12:30:45.650 14926-14926/com.mapotempo.testcouchbase I/REPLICATOR: Status ::  IDLE
2021-04-07 12:30:50.794 14926-14992/com.mapotempo.testcouchbase E/CouchbaseLite/REPLICATOR: {Repl#5} Got LiteCore error: WebSocket error 1001 "WebSocket connection closed by peer"
2021-04-07 12:30:50.818 14926-14926/com.mapotempo.testcouchbase I/REPLICATOR: Error code ::  11001
2021-04-07 12:30:50.818 14926-14926/com.mapotempo.testcouchbase I/REPLICATOR: Status ::  BUSY
2021-04-07 12:30:50.906 14926-14926/com.mapotempo.testcouchbase I/REPLICATOR: Error code ::  11001
2021-04-07 12:30:50.906 14926-14926/com.mapotempo.testcouchbase I/REPLICATOR: Status ::  OFFLINE

@Giallon Hmmm… interesting.

Can you give me a bit more of the log? Those 4 lines are not enough to identify the location of the bug. Can you set the log level to at least VERBOSE and provide logs for the whole connection process?

In particular, I need the log line (v2.8.5) that looks like this:
I/CouchbaseLite/NETWORK: WebSocket CLOSED …

… and I do believe that this is probably a bug.

FWIW, here is what I believe is happening:
When the remote is disconnected, OkHttp, the package we use for websocket communication, reports the error code that it gets from the OS. The OS uses POSIX codes which, perhaps surprisingly, are platform dependent! In the process of correcting how some failures are reported, I inadvertently made Android (where we can identify the POSIX code) report connection failure as an unknown failure type. Core, which manages the replicator, considers an unknown failure unrecoverable.
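
To make the failure mode above concrete, here is a minimal sketch of the kind of classification involved. It is not Couchbase Lite code: the class name, the `Kind` enum, and the choice of which exceptions count as transient are all assumptions made for illustration. The point is only that a network drop which falls through to the "unknown" bucket will stop a continuous replicator instead of sending it OFFLINE.

```java
import java.net.SocketException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLHandshakeException;

// Hypothetical illustration only; none of these names exist in Couchbase Lite.
final class CloseReasonSketch {
    enum Kind { TRANSIENT, PERMANENT, UNKNOWN }

    // Maps the exception reported when a socket closes to a coarse failure kind.
    static Kind classify(Throwable error) {
        // Assumption: a failed TLS handshake points at certificates or protocol
        // settings, so retrying is unlikely to help.
        if (error instanceof SSLHandshakeException) {
            return Kind.PERMANENT;
        }
        // Assumption: these usually mean "the network went away".
        // ConnectException is covered because it extends SocketException.
        if (error instanceof UnknownHostException
                || error instanceof SocketTimeoutException
                || error instanceof SocketException
                || error instanceof SSLException) {
            return Kind.TRANSIENT;
        }
        // Anything unrecognized is reported as unknown.
        return Kind.UNKNOWN;
    }

    // Only transient failures are worth a retry; unknown ones are treated as fatal.
    static boolean shouldRetry(Kind kind) {
        return kind == Kind.TRANSIENT;
    }
}
```

Under these assumptions, the `SSLException` ("Software caused connection abort") seen when the network is switched off would be classified as transient, whereas reporting it as unknown makes the failure look unrecoverable.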

Thanks for your answer, @blake.meike.

Following your instructions, here are the logs:

2021-04-07 19:38:56.469 12680-12680/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: {N8litecore4blip10ConnectionE#3}==> N8litecore4blip10ConnectionE ->wss://mobile.fleet.beta.mapotempo.com:4984/db/_blipsync @0x76cb38b110
2021-04-07 19:38:56.469 12680-12680/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: {N8litecore4blip10ConnectionE#3} Opening connection...
2021-04-07 19:38:56.473 12680-12836/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: {N8litecore4repl12C4SocketImplE#6}==> N8litecore4repl12C4SocketImplE wss://mobile.fleet.beta.mapotempo.com:4984/db/_blipsync @0x7762b87430
2021-04-07 19:38:56.473 12680-12836/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: {N8litecore4repl12C4SocketImplE#6} Connecting...
2021-04-07 19:38:57.716 12680-12882/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: {N8litecore4repl12C4SocketImplE#6} Got HTTP response (status 101)
2021-04-07 19:38:57.718 12680-12882/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: {N8litecore4repl12C4SocketImplE#6} Connected!
2021-04-07 19:38:57.718 12680-12882/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: {N8litecore4blip10ConnectionE#3} Connected!
2021-04-07 19:38:57.718 12680-12882/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: WebSocket CONNECTED!
2021-04-07 19:38:57.728 12680-12835/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: {N8litecore4blip6BLIPIOE#7}==> N8litecore4blip6BLIPIOE ->wss://mobile.fleet.beta.mapotempo.com:4984/db/_blipsync @0x76cb4840c8
2021-04-07 19:39:09.281 12680-12882/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: WebSocket CLOSED
    javax.net.ssl.SSLException: Read error: ssl=0x76d1ed8e48: I/O error during system call, Software caused connection abort
        at com.android.org.conscrypt.NativeCrypto.SSL_read(Native Method)
        at com.android.org.conscrypt.NativeSsl.read(NativeSsl.java:411)
        at com.android.org.conscrypt.ConscryptFileDescriptorSocket$SSLInputStream.read(ConscryptFileDescriptorSocket.java:583)
        at okio.Okio$2.read(Okio.java:140)
        at okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
        at okio.RealBufferedSource.request(RealBufferedSource.java:72)
        at okio.RealBufferedSource.require(RealBufferedSource.java:65)
        at okio.RealBufferedSource.readByte(RealBufferedSource.java:78)
        at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:117)
        at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:101)
        at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:273)
        at okhttp3.internal.ws.RealWebSocket$1.onResponse(RealWebSocket.java:209)
        at okhttp3.RealCall$AsyncCall.execute(RealCall.java:174)
        at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
        at java.lang.Thread.run(Thread.java:919)
2021-04-07 19:39:09.282 12680-12882/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: {N8litecore4repl12C4SocketImplE#6} sent 135 bytes, rcvd 121, in 11.563 sec (12/sec, 10/sec)
2021-04-07 19:39:09.283 12680-12829/com.mapotempo.testcouchbase I/CouchbaseLite/NETWORK: {N8litecore4blip10ConnectionE#3} Closed with Unknown error 10: unexpected exception

Best regards


Awesome, @Giallon. I’ve filed

https://issues.couchbase.com/browse/CBL-1841

to track the issue.

Thank you @blake.meike

@Giallon : I have not been able to reproduce this bug. I think I know what is going on, and I think I have fixed it.

If you have a chance to try commit 010dddbcf413ec72 (Update to Lithium Alpha) or later in the repo couchbase-lite-java-ce-root.git (this is the Community edition), that would be great.

If you need the EE version to test, please get in touch with your rep. I’d be happy to get you on the beta-tester’s list.

@Giallon I’ve made a number of changes to address this and similar bugs. If you can try the Lithium beta, I’d be really interested to know whether it fixes your problem.

Hi @blake.meike,

I can try it after December 20.

This also sounds exactly like the issue we have been having with 2.8.6 Enterprise Edition. Whenever we turn on flight mode in the emulator, we get an "Error code :: 26: unexpected exception" from com.couchbase.lite.AbstractReplicator.updateStatus, and the replicator never recovers, even when the user goes back online.

We tried upgrading to 3.0.0-beta02, which contains this fix: CBL-1841: attempt to provide Core with more meaningful failure status · couchbase/couchbase-lite-java-common@7d6de7b · GitHub. However, upon closing the internet connection by selecting flight mode in the emulator, we get the following exception. The replicator does not start again when the connection is re-established.

2021-12-09 13:51:29.251 20816-21006/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: {Repl#83} now idle
2021-12-09 13:51:29.251 20816-21006/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: {C4Replicator#79} State: idle, progress=100.00%
2021-12-09 13:51:29.252 20816-21006/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: [JAVA] ReplicatorListener.statusChanged, repl: Replicator{@0x3c4012d(<*>),Database{@0xc8329, name='skedulo-v3'} => URLEndpoint{url=wss://test-couchbase-sgw-us1.test.skl.io/skedulo-mobile-edge/}}, status: C4ReplicatorStatus{level=3,completed=562,total=562,#docs=50,domain=0,code=0,info=0}
2021-12-09 13:51:29.252 20816-20919/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: [JAVA] status changed: (0, 0) @C4ReplicatorStatus{level=3,completed=562,total=562,#docs=50,domain=0,code=0,info=0} for Replicator{@0x3c4012d(<*>),Database{@0xc8329, name='skedulo-v3'} => URLEndpoint{url=wss://test-couchbase-sgw-us1.test.skl.io/skedulo-mobile-edge/}}
2021-12-09 13:51:29.252 20816-20919/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: [JAVA] State changed BUSY => IDLE(562/562): null for Replicator{@0x3c4012d(<*>),Database{@0xc8329, name='skedulo-v3'} => URLEndpoint{url=wss://test-couchbase-sgw-us1.test.skl.io/skedulo-mobile-edge/}}
2021-12-09 13:51:29.253 20816-21154/com.skedulo.app.v3 D/CouchbaseDatabaseSourceImpl: Replication status: Status{activityLevel=IDLE, progress=Progress{completed=562, total=562}, error=null}. Pending: 2
2021-12-09 13:51:40.535 20816-21012/com.skedulo.app.v3 I/CouchbaseLite/NETWORK: [JAVA] WebSocket CLOSED with error
    javax.net.ssl.SSLException: Read error: ssl=0xba6ef278: I/O error during system call, Software caused connection abort
        at com.android.org.conscrypt.NativeCrypto.SSL_read(Native Method)
        at com.android.org.conscrypt.NativeSsl.read(NativeSsl.java:411)
        at com.android.org.conscrypt.ConscryptFileDescriptorSocket$SSLInputStream.read(ConscryptFileDescriptorSocket.java:549)
        at okio.InputStreamSource.read(JvmOkio.kt:93)
        at okio.AsyncTimeout$source$1.read(AsyncTimeout.kt:125)
        at okio.RealBufferedSource.request(RealBufferedSource.kt:206)
        at okio.RealBufferedSource.require(RealBufferedSource.kt:199)
        at okio.RealBufferedSource.readByte(RealBufferedSource.kt:209)
        at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.kt:119)
        at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.kt:102)
        at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.kt:293)
        at okhttp3.internal.ws.RealWebSocket$connect$1.onResponse(RealWebSocket.kt:195)
        at okhttp3.internal.connection.RealCall$AsyncCall.run(RealCall.kt:519)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
        at java.lang.Thread.run(Thread.java:919)
2021-12-09 13:51:40.535 20816-21012/com.skedulo.app.v3 E/CouchbaseLite/NETWORK: {N8litecore4repl12C4SocketImplE#84} WebSocket closed abnormally (reason=WebSocket/HTTP status 1015)
2021-12-09 13:51:40.535 20816-21012/com.skedulo.app.v3 I/CouchbaseLite/NETWORK: {N8litecore4repl12C4SocketImplE#84} sent 6697 bytes, rcvd 8201, in 24.485 sec (274/sec, 335/sec)
2021-12-09 13:51:40.536 20816-21008/com.skedulo.app.v3 I/CouchbaseLite/NETWORK: {N8litecore4blip10ConnectionE#82} Closed with WebSocket/HTTP status 1015: javax.net.ssl.SSLException: Read error: ssl=0xba6ef278: I/O error during system call, Software caused connection abort
2021-12-09 13:51:40.538 20816-21008/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: {Repl#83} Connection closed with WebSocket/HTTP status 1015: "javax.net.ssl.SSLException: Read error: ssl=0xba6ef278: I/O error during system call, Software caused connection abort" (state=2)
2021-12-09 13:51:40.546 20816-21008/com.skedulo.app.v3 E/CouchbaseLite/REPLICATOR: {Repl#83} Got LiteCore error: WebSocket error 1015, "javax.net.ssl.SSLException: Read error: ssl=0xba6ef278: I/O error during system call, Software caused connection abort"
2021-12-09 13:51:40.549 20816-21008/com.skedulo.app.v3 V/CouchbaseLite/REPLICATOR: {Repl#83} progress +0/+0, 0 docs -- now 562 / 562, 50 docs
2021-12-09 13:51:40.553 20816-21008/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: {Repl#83} activityLevel=stopped: connectionState=-1, savingChkpt=0
2021-12-09 13:51:40.555 20816-21008/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: {Repl#83} now stopped
2021-12-09 13:51:40.560 20816-21007/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: {Repl#83} documentEnded 69b4677e-13cd-4940-b778-dd175e81ed53 2-d23aed76d07b9d5ad7a367f55ec9718dd5ba462b flags=00 (6/409)
2021-12-09 13:51:40.560 20816-21007/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: {Repl#83} documentEnded dc62c865-7695-4926-9f40-3bce5464135d 2-eadb4d90a8dc132381e7d2a366fcbb2430eec88d flags=03 (6/409)
2021-12-09 13:51:40.560 20816-21007/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: {N8litecore4repl6PusherE#86} activityLevel=stopped: pendingResponseCount=0, caughtUp=1, changeLists=0, revsInFlight=0, blobsInFlight=0, awaitingReply=0, revsToSend=0, pushingDocs=0, pendingSequences=2
2021-12-09 13:51:40.560 20816-21007/com.skedulo.app.v3 V/CouchbaseLite/REPLICATOR: {N8litecore4repl6PusherE#86} now stopped
2021-12-09 13:51:40.560 20816-21007/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: {N8litecore4repl6PullerE#88} activityLevel=stopped: pendingResponseCount=0, _caughtUp=1, _pendingRevMessages=0, _activeIncomingRevs=0
2021-12-09 13:51:40.560 20816-21007/com.skedulo.app.v3 V/CouchbaseLite/REPLICATOR: {N8litecore4repl6PullerE#88} now stopped
2021-12-09 13:51:40.560 20816-21007/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: BLIP sent 113 msgs (6697 bytes), rcvd 66 msgs (8201 bytes) in 24.510 sec. Max outbox depth was 1, avg 1.00
2021-12-09 13:51:40.604 20816-21008/com.skedulo.app.v3 E/CouchbaseLite/REPLICATOR: {C4Replicator#79} State: stopped, progress=100.00%, error=WebSocket error 1015, "javax.net.ssl.SSLException: Read error: ssl=0xba6ef278: I/O error during system call, Software caused connection abort"
2021-12-09 13:51:40.609 20816-21008/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: [JAVA] ReplicatorListener.statusChanged, repl: Replicator{@0x3c4012d(<*>),Database{@0xc8329, name='skedulo-v3'} => URLEndpoint{url=wss://test-couchbase-sgw-us1.test.skl.io/skedulo-mobile-edge/}}, status: C4ReplicatorStatus{level=0,completed=562,total=562,#docs=50,domain=6,code=1015,info=2}
2021-12-09 13:51:40.616 20816-20967/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: [JAVA] status changed: (0, 0) @C4ReplicatorStatus{level=0,completed=562,total=562,#docs=50,domain=6,code=1015,info=2} for Replicator{@0x3c4012d(<*>),Database{@0xc8329, name='skedulo-v3'} => URLEndpoint{url=wss://test-couchbase-sgw-us1.test.skl.io/skedulo-mobile-edge/}}
2021-12-09 13:51:40.616 20816-20967/com.skedulo.app.v3 I/CouchbaseLite/REPLICATOR: [JAVA] State changed IDLE => STOPPED(562/562): CouchbaseLiteException{CouchbaseLite,11015,'javax.net.ssl.SSLException: Read error: ssl=0xba6ef278: I/O error during system call, Software caused connection abort
       (CouchbaseLite Android v3.0.0-180@31 (EE/release, Commit/96562e33aa@1225da0762f2 Core/3.0.0 (180) at 2021-10-14T23:23:16.155267Z) on Java; Android 10; Android SDK built for x86)'} for Replicator{@0x3c4012d(<*>),Database{@0xc8329, name='skedulo-v3'} => URLEndpoint{url=wss://test-couchbase-sgw-us1.test.skl.io/skedulo-mobile-edge/}}
2021-12-09 13:51:40.617 20816-21154/com.skedulo.app.v3 V/CouchbaseLite/REPLICATOR: {C4Replicator#79} Replicator not yet created, saving progress level value for later...
2021-12-09 13:51:40.626 20816-21154/com.skedulo.app.v3 D/CouchbaseDatabaseSourceImpl: Replication status: Status{activityLevel=STOPPED, progress=Progress{completed=562, total=562}, error=CouchbaseLiteException{CouchbaseLite,11015,'javax.net.ssl.SSLException: Read error: ssl=0xba6ef278: I/O error during system call, Software caused connection abort
       (CouchbaseLite Android v3.0.0-180@31 (EE/release, Commit/96562e33aa@1225da0762f2 Core/3.0.0 (180) at 2021-10-14T23:23:16.155267Z) on Java; Android 10; Android SDK built for x86)'}}. Pending: 3
2021-12-09 13:51:40.632 20816-21154/com.skedulo.app.v3 E/CouchbaseDatabaseSourceImpl: Error code :: 11015: javax.net.ssl.SSLException: Read error: ssl=0xba6ef278: I/O error during system call, Software caused connection abort
       (CouchbaseLite Android v3.0.0-180@31 (EE/release, Commit/96562e33aa@1225da0762f2 Core/3.0.0 (180) at 2021-10-14T23:23:16.155267Z) on Java; Android 10; Android SDK built for x86)
    CouchbaseLiteException{CouchbaseLite,11015,'javax.net.ssl.SSLException: Read error: ssl=0xba6ef278: I/O error during system call, Software caused connection abort
       (CouchbaseLite Android v3.0.0-180@31 (EE/release, Commit/96562e33aa@1225da0762f2 Core/3.0.0 (180) at 2021-10-14T23:23:16.155267Z) on Java; Android 10; Android SDK built for x86)'}
        at com.couchbase.lite.ReplicatorStatus.<init>(ReplicatorStatus.java:74)
        at com.couchbase.lite.AbstractReplicator.updateStatus(AbstractReplicator.java:622)
        at com.couchbase.lite.AbstractReplicator.c4StatusChanged(AbstractReplicator.java:505)
        at com.couchbase.lite.ReplicatorListener.lambda$statusChanged$0(ReplicatorListener.java:52)
        at com.couchbase.lite.ReplicatorListener$$ExternalSyntheticLambda0.run(Unknown Source:4)
        at com.couchbase.lite.internal.exec.InstrumentedTask.run(InstrumentedTask.java:60)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
        at java.lang.Thread.run(Thread.java:919)

Happy new year @blake.meike ,

We tried the latest 3.0.0 Beta of CB Lite but now get the error above. Any other ideas? Does Sync Gateway also need to be updated to 3.0.0 Beta? Our Sync Gateway version is 2.8.x when testing with CB Lite 3.0.0 Beta.

@Nims: I’m afraid I’m not going to be a lot of help, here. Okhttp and Okio are, as you probably know, standard libraries for handling network connections. Conscrypt is an OSS library that Android uses to manage TLS. Conscrypt is called, in several places, by reflection, so it is subject to odd runtime failures.

TLS libs are usually pretty tight-lipped about failures, because they don’t want to provide too much information about exactly what is going on, thus providing an avenue for attack.

I will be reviewing our Android platform’s handling of sockets for our next release. I am aware that there are things we can do better, and I am working on improvements.

That is probably cold comfort to you for the immediate future. I strongly suspect that the error you are seeing is a symptom of some kind of TLS failure. If you have a way to watch your network (Wireshark, Charles, or something like that), that would probably be the best way to identify the problem.

I’ve created https://issues.couchbase.com/browse/CBL-2731 to track this problem.

Please feel free to update that ticket with any additional information.

Thanks for the update @blake.meike. I just noticed that on the ticket you created, the affected version is 2.8.5, but the I/O error is occurring in 3.0.0 Beta.

@Nims Whoops! Thanks for catching that. Fixing now.

Hi @Nims
You can schedule a task that checks the replicator status and restarts the replicator if it has stopped with an error.
Of course, you can also do this with a change listener, but the replicator sometimes does not like having its state changed too quickly.

Hi @Giallon, thanks. That is exactly what we ended up doing. We monitor the network connection for changes; if the replicator is stopped or offline when the connection comes back online, we start the replicator again. This seems to be working well so far.

We suspect the replicator is trying to use the old SSL connection after going offline, which fails - so we need to create a new SSL connection (which presumably reinitializing the replicator does).
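
For anyone wanting to try the same workaround, the restart decision can be sketched in plain Java as follows. This is a sketch under assumptions, not the code we run: the class, the `Level` enum (which only mirrors the state names seen in this thread), and the minimum-interval guard are hypothetical. The guard reflects the remark above that the replicator does not like being changed too quickly, and a STOPPED state without an error is assumed to be a deliberate stop.

```java
// Hypothetical helper; it does not depend on Couchbase Lite.
final class ReplicatorWatchdogSketch {
    // Mirrors the replicator state names seen in the logs in this thread.
    enum Level { STOPPED, OFFLINE, CONNECTING, IDLE, BUSY }

    private final long minIntervalMillis;
    private long lastRestartMillis;
    private boolean restartedBefore;

    ReplicatorWatchdogSketch(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // Returns true when the caller should start the replicator again.
    boolean shouldRestart(Level level, boolean hasError, boolean networkAvailable, long nowMillis) {
        // Nothing to gain from restarting while the device is still offline.
        if (!networkAvailable) {
            return false;
        }
        // Assumption: STOPPED without an error means the app stopped it on purpose.
        boolean needsRestart = level == Level.OFFLINE || (level == Level.STOPPED && hasError);
        if (!needsRestart) {
            return false;
        }
        // Do not restart more often than the configured minimum interval.
        if (restartedBefore && nowMillis - lastRestartMillis < minIntervalMillis) {
            return false;
        }
        restartedBefore = true;
        lastRestartMillis = nowMillis;
        return true;
    }
}
```

The method would be called from a scheduled task or from a network-connectivity callback, passing the latest state reported by the replicator's change listener; when it returns true, the app calls start() on the replicator.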

Gentlemen… I’m determined to address this. A couple bits of information, for you:

  1. If the replicator is OFFLINE, it will retry. In 3.0 you can control the frequency and number of retries. Also, in Android, if the replicator is OFFLINE and the connectivity state of the device changes (it goes from no network connection, to connected), the replicator will retry immediately.
  2. The replicator should post state changes almost immediately. If it is not doing so, I’d look for some kind of thread deadlock.
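
As an illustration of what "frequency and number of retries" means in item 1, here is a generic capped exponential backoff schedule. It is a sketch of the general technique only: the class, the method, its parameters, and the doubling rule are assumptions for this example and are not taken from the Couchbase Lite implementation.

```java
// Hypothetical illustration of a bounded retry schedule; not Couchbase Lite code.
final class RetryScheduleSketch {
    // Returns the wait in seconds before retry number `attempt` (1-based),
    // or -1 when the attempt number is outside the allowed budget.
    static long delaySeconds(int attempt, int maxAttempts, long maxWaitSeconds) {
        if (attempt < 1 || attempt > maxAttempts) {
            return -1;
        }
        // Double the wait on every attempt: 2, 4, 8, ... seconds.
        // The shift is clamped so the value cannot overflow.
        long delay = 1L << Math.min(attempt, 30);
        // Never wait longer than the configured ceiling.
        return Math.min(delay, maxWaitSeconds);
    }
}
```

With a budget of 10 attempts and a 300-second ceiling, the waits under these assumptions grow from 2 seconds up to the ceiling, and an eleventh attempt is refused.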

Any information you can give me would be most welcome. Post it directly to the ticket if you can. If not, post it here and I will copy it to the ticket (https://issues.couchbase.com/browse/CBL-1841).

Not sure if it’s the same root cause, but there is a similar error reported here:

Is there any update on this? We are facing the same problem. The log file is attached.
Couchbase Logs_New.txt.zip (7.8 KB)

We have faced this in 3.0 EE Android with SG 2.8.3.

FYI @blake.meike

@pankaj.sharma : why do you say that this is the same problem? In your log I see two issues:

Attempt to invoke virtual method 'boolean com.couchbase.lite.internal.core.C4DatabaseChange.isExternal()' on a null object reference

and

java.lang.NullPointerException: Null reference used for synchronization (monitor-enter)

Neither of those is related to replication.

The first one is a bug. I’ve created https://issues.couchbase.com/browse/CBL-2994 to track it. It will be fixed in the next point-release of 3.0.

The second one is an annoyance that should not affect the behavior of your app in any way. I am researching the problem and have filed a bug against the Android OS. I’m trying to repro the issue in a small test case for the bug.