"Endpoint not reachable"

Hi,

I’m looking for some advice as to how I can track my problem down - I’ve broken something and I’m not sure where :-(.

I have a C# app that runs on iOS and the Mac (lots of shared code, especially the data management).

The apps talk to a sync gateway on (dbserver.xyz.org:4984), and to a Web/REST service (appserver.xyz.org:5000). This was working and fully functional a few days ago.

I decided it was time to switch everything to TLS and to turn off the App Transport Security (ATS) workaround. I made a CA key/cert and a wildcard server key/cert/chain (*.xyz.org). I’ve installed the CA cert on the iPhone simulator and on my Mac. I’m confident the certs are good because I can browse to the SSL Web/REST server from the Mac and iPhone and don’t get any cert warnings, plus “openssl verify” validates the chain (I’ve spent waaaay too long working with openssl and TLS in the past, so my openssl-fu is pretty good).

The Mac app is happy - it can sync to https://dbserver.xyz.org:4984 and fires REST requests at https://appserver.xyz.org:5000 perfectly. My Python service tester is also happy.

The iPhone app on the simulator is less happy - it can connect to the REST service via TLS, but it can’t seem to connect to the sync service. This is shared code - so both apps are running exactly the same code to create the replication objects.

Logs
I’ve turned on Couchbase logging - and here’s what I see:

INFO) SYNC (Replication): [8] 2018-3-27 02:03:47.010+13:00 Attempting to start puller (a38ce72d-6e09-4962-baa8-1c62cd82a222)
INFO) SYNC (Replication): [8] 2018-3-27 02:03:47.011+13:00 Attempting to start pusher (3b7fd8ca-db2f-4ed0-be9e-cd88ad0482ee)
INFO) SYNC (NetworkReachabilityManager): [8] 2018-3-27 02:03:47.029+13:00 Didn't get successful connection to http://dbserver.xyz.org:4984/private
INFO) SYNC (NetworkReachabilityManager): [8] 2018-3-27 02:03:47.030+13:00 Didn't get successful connection to http://dbserver.xyz.org:4984/private
INFO) SYNC (Replication): [8] 2018-3-27 02:03:47.030+13:00 Remote endpoint is not reachable, going offline...
INFO) SYNC (Replication): [8] 2018-3-27 02:03:47.030+13:00 Remote endpoint is not reachable, going offline...
System.NullReferenceException: Object reference not set to an instance of an object
    at Couchbase.Lite.NetworkReachabilityManager.CanReach (Couchbase.Lite.Internal.RemoteSession session, System.String remoteUri, System.TimeSpan timeout) [0x0000c] in /Users/jenkins/jenkins/workspace/couchbase-lite-net-build/1.4.1/iOS/couchbase-lite-net/src/Couchbase.Lite.Shared/Manager/NetworkReachabilityManager.cs:56 

Just to check there wasn’t something weird happening with name resolution, I wrote some code to check names/IPs, and whether the port is open, just before starting the sync. I also checked ports that I knew would be closed, to be sure my port checker was working. The results were as expected:

dbserver.xyz.org resolves to 192.168.80.127		(this is correct - running on VM with local IP).
appserver.xyz.org resolves to 127.0.0.1			(this is correct - test REST server running locally)
dbserver.xyz.org:4984 is open			(std sync port)
dbserver.xyz.org:4986 is closed			(correct - not used)
appserver.xyz.org:5000 is open			(REST service)
appserver.xyz.org:5002 is closed		(correct - not used)
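
For anyone wanting to repeat the port check, it can be sketched roughly as below. This is a minimal sketch and not the actual CheckPortOpen helper from the app: the PortCheck class, the IsPortOpen name and the 2 second default timeout are all assumptions made for illustration.

```csharp
using System;
using System.Net.Sockets;

public static class PortCheck
{
    // Hypothetical helper (not the app's CheckPortOpen): returns true if a TCP
    // connection to host:port succeeds within timeoutMs milliseconds.
    public static bool IsPortOpen(string host, int port, int timeoutMs = 2000)
    {
        try
        {
            using (var client = new TcpClient())
            {
                // ConnectAsync lets us bound the wait instead of blocking
                // for the full OS-level connect timeout.
                var connect = client.ConnectAsync(host, port);
                return connect.Wait(timeoutMs) && client.Connected;
            }
        }
        catch (Exception)
        {
            // Connection refused, DNS failure, etc. all count as "closed".
            return false;
        }
    }
}
```

A call such as PortCheck.IsPortOpen("dbserver.xyz.org", 4984) only shows that the TCP port accepts connections. It says nothing about TLS or about what the sync library does before it connects, which is why an open port and a failing replicator can coexist.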

Lastly I turned off TLS (and added the ATS workaround). Again, the Mac is happy and the iOS app fails to connect to the sync server, but still connects to my App/REST service.

In Summary:

  • Mac and iOS share Sync code.
  • Mac works fine over TLS and HTTP to REST Service and Sync Service
  • iOS works over TLS and HTTP to REST Service, fails to connect to Sync Service over HTTP and TLS.
  • iOS app can open the sync gateway port… but the Couchbase library cannot.

So the problem is not just with the TLS connection - I’ve broken something somewhere.
Lastly, I also checked git - there have been no changes to the shared code that starts the sync engine.

My questions:
a) Is there something that could impact the reachability code?
b) Is the System.NullReferenceException relevant/important?

Thanks for advice as to how I can track this down.
Paul

My last post was far too long, so I thought I’d show the sync code separately.

public void StartSyncAgent()
{
    log.Trace("StartSyncAgent");
    if (id == null || server_password == null)
    {
        log.Error("Authentication credentials required for sync agent");
        throw new BadPasswordException();
    }
    //  Debug - check name resolution and server access
    var ips = Dns.GetHostAddresses("dbserver.xyz.org");
    foreach (var ip in ips)
    {
        log.Debug("dbserver.xyz.org resolves to {0}".Fmt(ip.ToString()));
    }
    ips = Dns.GetHostAddresses("appserver.xyz.org");
    foreach (var ip in ips)
    {
        log.Debug("appserver.xyz.org resolves to {0}".Fmt(ip.ToString()));
    }
    CheckPortOpen("dbserver.xyz.org", 4984);
    CheckPortOpen("dbserver.xyz.org", 4986);
    CheckPortOpen("appserver.xyz.org", 5000);
    CheckPortOpen("appserver.xyz.org", 5002);

    var config = AppConfig.Instance;
    var url = new Uri(config.sync_url);    // "http://dbserver.xyz.org:4984/private"
    push = DB.Instance.database.CreatePushReplication(url);
    pull = DB.Instance.database.CreatePullReplication(url);
    var auth = AuthenticatorFactory.CreateBasicAuthenticator(id, server_password);
    push.Authenticator = auth;
    pull.Authenticator = auth;
    push.Continuous = true;
    pull.Continuous = true;

    var user = UserManager.Instance.user;
    var channels = user.GetChannels();
    log.Debug("Start Sync: Channels now {0}", string.Join(", ", channels.ToArray()));
    pull.Channels = channels.ToArray(); 
    push.Changed += SyncStatusUpdate;    
    pull.Changed += SyncStatusUpdate;
    push.Start();
    pull.Start();
    currentSyncStatus = SyncStatus.Idle;
}

Are you setting WebRequest.DefaultProxy to null anywhere in your code base? That is where the exception is coming from and the library is mistaking that as an invalid connection. Setting it to a non-null value should fix that, but it should be non-null by default.

Hi @borrrden,

Thanks for getting back to me. I have not changed “DefaultProxy” (searched all code in projects + libraries).

Further… I can do direct REST queries to dbserver.xyz.org:4984 - that works just fine. Here is some code I used to save a document (DBProperties is just a dict):

    public DBProperties RESTSaveDoc(DBProperties props)
    {
        log.Trace("RESTSaveDoc");
        if (id == null || server_password == null)
        {
            log.Error("Authentication information required to create document");
            throw new BadPasswordException();
        }
        var config = AppConfig.Instance;
        var docId = (string)props["_id"];
        var client = new RestClient(config.sync_url);
        client.Authenticator = new HttpBasicAuthenticator(id, server_password);
        var req = new RestRequest(docId, Method.PUT);
        var json = Newtonsoft.Json.JsonConvert.SerializeObject(props);
        //var json = req.JsonSerializer.Serialize(props);           DO NOT USE req serializer - it cannot deal with string arrays (like channel)
        req.RequestFormat = DataFormat.Json;
        req.AddHeader("Content-Type", "application/json");
        req.AddParameter("application/json", json, ParameterType.RequestBody);

        req.Timeout = 3000;
        log.Debug("About to insert a new record");
        var response = client.Execute(req);
        if (response.StatusCode != System.Net.HttpStatusCode.OK && response.StatusCode != System.Net.HttpStatusCode.Created)
        {
            log.Error("Failed to create document: {0}, {1}, {2}", response.ErrorMessage, response.ErrorException, response.StatusCode);
            log.Error("RestClient {0}", client.ToString());
            log.Error("Rest Request {0}", req.ToString());

            if (response.StatusCode == System.Net.HttpStatusCode.Conflict)
                throw new BadDataException();
            if (response.StatusCode == System.Net.HttpStatusCode.Forbidden)
                throw new PermissionDeniedException();
            if (response.StatusCode == System.Net.HttpStatusCode.NotFound)
                throw new MissingIDException();
            throw new Exception();
        }
        var content = response.Content;
        log.Debug("Got content {0}", content);
        return JsonConvert.DeserializeObject<DBProperties>(content);
    }

One other thing that has changed and could cause an impact.

When implementing SSL - I added the following to my /etc/hosts file:
192.168.80.127 dbserver.xyz.org
127.0.0.1 appserver.xyz.org

That way I can use FQDNs for communications and the FQDNs match the SSL certs (which are wildcards). I checked name resolution in the simulator app (which is supposed to use host networking) and the mac app. In both cases name resolution worked correctly. Is there something weird happening with name resolution in the code near “NetworkReachabilityManager.cs:56”?

The library doesn’t do any name resolution manually, but an entry in the hosts file might affect the default proxy maybe? Do things work if you use an endpoint that is not in the hosts file?

Removed the endpoints from the hosts file and actually registered them at our domain registrar (we own the “xyz.org” domain). No change.

Just prior to creating the push/pull replicators, I checked the System.Net.WebRequest.DefaultWebProxy value (not DefaultProxy) - it’s null.

  • Is there another WebRequest object I should be looking at?

  • Do you mean DefaultWebProxy?

  • I’ve not used the DefaultWebProxy, but the docs say that Proxy should be set as follows:

    To specify that no proxy should be used, set the Proxy property to the proxy
    instance returned by the GlobalProxySelection.GetEmptyWebProxy method

How can I get to the WebRequest object for the push or pull replicators? I can’t see it in the property list.

Yes, I meant DefaultWebProxy. That is a static .NET property and it happens to be used exactly on the line in question in the network reachability manager. You set it directly, as in WebRequest.DefaultWebProxy = ...

@borrrden
Ok that’s given me something to look for. Thank you.

DefaultWebProxy is non-null during application startup on iOS.
But at the time when I’m starting the replication, it has become null.
Some call somewhere has had an unintended side effect.
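
A simple way to narrow down where this happens is to record the proxy state at a few checkpoints. The sketch below is illustrative only: the ProxyProbe class, the Describe method and the checkpoint labels are made-up names, and the result would be passed to whatever logger the app already uses.

```csharp
using System;
using System.Net;

public static class ProxyProbe
{
    // Hypothetical helper: reports whether the process-wide default proxy is
    // currently null, tagged with a caller-supplied checkpoint label.
    public static string Describe(string checkpoint)
    {
        IWebProxy proxy = WebRequest.DefaultWebProxy;
        string state = proxy == null ? "null" : proxy.GetType().Name;
        return string.Format("[{0}] DefaultWebProxy is {1}", checkpoint, state);
    }
}
```

Logging ProxyProbe.Describe("startup"), then again after each network call made before replication starts, shows which call leaves the value null.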

Found the line… I’m using RestSharp to make calls to my app server, which happens before replication starts. RestSharp is trashing DefaultWebProxy.

It appears RestSharp (prior to 106.2.1) has code that has caused issues:

http.Proxy = this.Proxy ?? WebRequest.GetSystemWebProxy();
IWebProxy defaultWebProxy = WebRequest.DefaultWebProxy;
WebRequest.DefaultWebProxy = http.Proxy;

(https://github.com/restsharp/RestSharp/issues/1066)

After 106.2.1, that line changed to:

WebRequest.DefaultWebProxy = http.Proxy;

If I read that right - that line is a bug. http.Proxy can be null. I have tried RestSharp 106.2.0 and 106.2.1, both exhibit the bug.

So I reset DefaultWebProxy after my REST call:

WebRequest.DefaultWebProxy = WebRequest.GetSystemWebProxy();

And now syncing works. Grumble… I’m going to have to add that line throughout my REST library for now until I can get the RestSharp people to look at this.
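
Rather than repeating that line after every call, the reset can be wrapped in one place. What follows is a minimal sketch under my own assumptions: ProxyGuard and Run are hypothetical names, and it is not thread-safe, because DefaultWebProxy is process-wide and concurrent calls could interleave.

```csharp
using System;
using System.Net;

public static class ProxyGuard
{
    // Hypothetical wrapper: runs a REST call and then restores the global
    // default proxy, so later code never sees a null left behind by the call.
    public static T Run<T>(Func<T> restCall)
    {
        // Remember whatever was configured before the call.
        IWebProxy saved = WebRequest.DefaultWebProxy;
        try
        {
            return restCall();
        }
        finally
        {
            // Restore the previous value; if that was already null,
            // fall back to the system proxy.
            WebRequest.DefaultWebProxy = saved ?? WebRequest.GetSystemWebProxy();
        }
    }
}
```

In RESTSaveDoc the call would then become var response = ProxyGuard.Run(() => client.Execute(req)); and the reset also runs if the REST call throws.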

Thanks for your help @borrrden.

Paul.

It’s not impossible for it to be null, I just didn’t know that at the time I wrote this part of 1.x. The bug is in Couchbase Lite, I was just suggesting a workaround. I need to fix this in 1.x to handle the null proxy case.

Ok - but I also think that nullifying the DefaultWebProxy isn’t good behaviour on the part of RestSharp.

Cheers
Paul.