LiteC replication occasionally fails with POSIX error 5 "Input/output error"

Hello all,

I’ve recently started seeing a new issue with my replications. This is the driver code I’m running:

#include <chrono>
#include <iostream>
#include <thread>
#include "cbl/CouchbaseLite.h"  // LiteC header; the path may differ depending on your LiteC version and include setup

int main() {
    CBLError error;

    // Open (or create) the local database.
    CBLDatabaseConfiguration configDB = {"", kCBLDatabase_Create};
    CBLDatabase* db = CBLDatabase_Open("wss-pull", &configDB, &error);

    // Configure a one-shot pull replication over TLS.
    CBLReplicatorConfiguration config = {};
    CBLReplicator* repl = nullptr;
    config.database = db;
    config.replicatorType = kCBLReplicatorTypePull;
    config.endpoint = CBLEndpoint_NewWithURL("wss://api.xxx.xxx.com/xxx");
    config.authenticator = CBLAuth_NewBasic("xxx", "xxx");
    repl = CBLReplicator_New(&config, &error);
    CBLReplicator_Start(repl);

    // Poll every 100 ms until the replicator reports that it has stopped.
    CBLReplicatorStatus status;
    while ((status = CBLReplicator_Status(repl)).activity != kCBLReplicatorStopped) {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }

    // Print the final activity and the error as (domain/code).
    std::cerr << "Finished with activity=" << status.activity
              << ", error=(" << status.error.domain << "/" << status.error.code << ")\n";
    return 0;
}

While most replications complete successfully, occasionally one fails with the following log:

14:23:50.620610| [DB]: {Shared#1}==> class litecore::DataFile::Shared 0000021D72F4E920 @0000021D72F4E920
14:23:50.621638| [DB]: {Shared#1} instantiated on Y:\xxx\host\build-apr16-Win-Converted-Release\bin\wss-pull.cblite2\db.sqlite3
14:23:50.621979| [DB]: {DB#2}==> class litecore::SQLiteDataFile .\wss-pull.cblite2\db.sqlite3 @0000021D72F4FFB0
14:23:50.622121| [DB]: {DB#2} Opening database
14:23:50.653035| Seeding the mbedTLS random number generator...
14:23:50.655177| [Actor]: Starting Scheduler<0000021D72FA0F50> with 10 threads
14:23:50.657802| [DB]: {DB#3}==> class litecore::SQLiteDataFile .\wss-pull.cblite2\db.sqlite3 @0000021D72FA1A00
14:23:50.657995| [DB]: {DB#3} Opening database
14:23:50.683318| [DB]: {DB#4}==> class litecore::SQLiteDataFile .\wss-pull.cblite2\db.sqlite3 @0000021D72FE5F00
14:23:50.683574| [DB]: {DB#4} Opening database
14:23:50.692127| [Sync]: {Repl#5}==> class litecore::repl::Replicator .\wss-pull.cblite2\ ->wss://api.xxx.xxx.com:443/xxx/_blipsync @0000021D73017F08
14:23:50.692346| [Sync]: {Repl#5} Pull=one-shot, Options={{auth:{password:"********", type:"Basic", username:"xxx"}}}
14:23:50.692582| [Sync]: {C4Replicator#6}==> class c4Internal::C4RemoteReplicator 0000021D72FD7130 @0000021D72FD7130
14:23:50.692722| [Sync]: {C4Replicator#6} Starting Replicator {Repl#5}
14:23:50.694507| [Sync]: {Repl#5} Scanning for pre-existing conflicts...
14:23:50.694624| [Sync]: {Repl#5} Found 0 conflicted docs in 0.001 sec
14:23:50.697373| [Sync]: {Repl#5} No local checkpoint 'cp-mfZF2f61VKy61f0gHy0v5j9xkHQ='
14:23:51.523185| [Sync]: {Repl#5} Connected!
14:23:51.523297| [Sync]: {Repl#5} now busy
14:23:51.523368| [Sync]: {C4Replicator#6} State: busy, progress=0.00%
14:23:51.523491| [Sync]: {Puller#7}==> class litecore::repl::Puller ->wss://api.xxx.xxx.com:443/xxx/_blipsync @0000021D7303A278
14:23:51.523732| [Sync]: {Puller#7} Starting pull from remote seq 
14:23:51.616838| [Sync]: {Repl#5} No remote checkpoint 'cp-mfZF2f61VKy61f0gHy0v5j9xkHQ='
14:23:51.621753| [Sync]: {RevFinder#8}==> class litecore::repl::RevFinder ->wss://api.xxx.xxx.com:443/xxx/_blipsync @0000021D7303A6C8
14:23:51.622058| [Sync]: {RevFinder#8} Received 26 changes (seq '207'..'290')
14:23:51.622596| [Sync]: {RevFinder#8} Responded to 'changes' REQ#1 w/request for 26 revs in 0.000250 sec
14:23:51.722513| [Sync]: {Puller#7} Caught up with remote changes
14:23:51.724482| [Sync]: {C4Replicator#6} State: busy, progress=0.00%
14:23:51.744679| [DB]: {DB#9}==> class litecore::SQLiteDataFile .\wss-pull.cblite2\db.sqlite3 @0000021D73181A80
14:23:51.745005| [DB]: {DB#9} Opening database
14:23:51.753288| [Sync]: {Inserter#10}==> class litecore::repl::Inserter ->wss://api.xxx.xxx.com:443/xxx/_blipsync @0000021D730F2DB8
14:23:51.753586| [Sync]: {Inserter#10} Inserted   9 revs in   9.64ms (  934/sec) of which 99.8% was commit
14:23:51.816551| [WS] ERROR: mbedTLS(C): mbedtls_ssl_fetch_input() returned -76 (-0x004c)
14:23:51.816747| [WS] ERROR: mbedTLS(C): ssl_get_next_record() returned -76 (-0x004c)
14:23:51.816863| [WS] ERROR: mbedTLS(C): mbedtls_ssl_read_record() returned -76 (-0x004c)
14:23:51.816987| [WS] WARNING: ClientSocket got POSIX error 5 "Input/output error"
14:23:51.817283| [WS]: {BuiltInWebSocket#11}==> class litecore::websocket::BuiltInWebSocket wss://api.xxx.xxx.com:443/xxx/_blipsync @0000021D72FD7600
14:23:51.817513| [WS] ERROR: {BuiltInWebSocket#11} Unexpected or unclean socket disconnect! (reason=errno 5)
14:23:51.817887| [Sync]: {Repl#5} Connection closed with errno 5: "Input/output error" (state=2)
14:23:51.818031| [Sync] ERROR: {Repl#5} Got LiteCore error: POSIX error 5 "Input/output error"
14:23:51.835840| [Sync]: {Inserter#10} Inserted   8 revs in   2.31ms ( 3457/sec) of which 99.2% was commit
14:23:51.924636| [Sync] ERROR: {C4Replicator#6} State: busy, progress=65.38%, error=POSIX error 5 "Input/output error"
14:23:51.924849| [Sync] WARNING: No listener to receive error from CBLReplicator 0000021D72FD6F00: POSIX error 5 "Input/output error"
14:23:51.925031| [Sync]: {Repl#5} now stopped
14:23:51.925095| [Sync]: BLIP sent 3 msgs (168 bytes), rcvd 19 msgs (11945 bytes) in 0.402 sec. Max outbox depth was 1, avg 1.00
14:23:51.926914| [DB]: {DB#4} Closing database
14:23:51.927264| [DB]: {DB#9} Closing database
14:23:51.927412| [Sync] ERROR: {C4Replicator#6} State: stopped, progress=65.38%, error=POSIX error 5 "Input/output error"
14:23:51.927584| [Sync] WARNING: No listener to receive error from CBLReplicator 0000021D72FD6F00: POSIX error 5 "Input/output error"
Finished with activity=0, error=(2/5)

More info:
This has only been seen on Windows and over TLS/WSS connections. I compiled with CMake.
I’ve gotten this result both on Windows running in a VM on a Mac and on a native Windows machine. The DB being transferred is not particularly large, and system resources are not being over-utilized.
I’m using very recent versions of LiteC/Core, but not the very latest.
I’d say this interruption happens around 10% of the time for me. Rerunning the same program normally completes the replication of the DB.
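
Since a rerun normally succeeds, one possible stopgap until the root cause is found is to retry on this specific error. The following is only a minimal sketch, not anything taken from LiteC: `RunResult`, `isTransientIoError`, and `runWithRetry` are hypothetical names, and `runOnce` stands in for one full replication run like the driver above, returning the domain and code from `status.error`. It assumes that domain 2 with code 5 (the pair printed at the end of the log) is the only error worth retrying.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical result type: mirrors the (domain/code) pair printed by the driver.
// A domain of 0 is taken to mean "no error".
struct RunResult {
    int domain;
    int code;
};

// Assumption: domain 2 / code 5 is the intermittent failure shown in the log
// (reported there as POSIX error 5 "Input/output error").
inline bool isTransientIoError(const RunResult& r) {
    return r.domain == 2 && r.code == 5;
}

// Calls runOnce until it returns anything other than the transient error,
// or until maxAttempts runs have been made. Returns the last result.
// If attemptsUsed is non-null, it receives the number of runs performed.
inline RunResult runWithRetry(const std::function<RunResult()>& runOnce,
                              int maxAttempts,
                              std::chrono::milliseconds delay,
                              int* attemptsUsed = nullptr) {
    if (maxAttempts < 1) maxAttempts = 1;  // always run at least once
    RunResult result{0, 0};
    int attempt = 0;
    while (attempt < maxAttempts) {
        ++attempt;
        result = runOnce();
        if (!isTransientIoError(result)) break;  // success or a different error
        if (attempt < maxAttempts) std::this_thread::sleep_for(delay);
    }
    if (attemptsUsed) *attemptsUsed = attempt;
    return result;
}
```

In practice `runOnce` would wrap the create/start/poll sequence from the driver. Any other error is returned immediately, so a retry like this would not mask unrelated failures.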

Thanks in advance.

Have you filed an issue on GitHub?

I had not. Done now; I filed it on the LiteCore repo.