Deploying Sync Gateway behind Azure App Gateway - BLIP Sync

Hi,

I’m currently test driving the new Couchbase Mobile 2 developer previews.

So, I deployed two Sync Gateways with the current 1.5 developer preview version. Everything for enabling BLIP is configured correctly. These Sync Gateways are configured to be private to the internal VLAN. To allow access, there is an Azure App Gateway (OSI Layer 7 load balancer) configured to do SSL offloading and load balancing. When I now try to sync using the current developer preview of Couchbase Lite, I see strange behavior. The client is able to start the connection, upgrade to WebSockets, and then send the first command to Sync Gateway. In the Sync Gateway log you can see that Sync Gateway is processing the request and fetching the required data. The response, however, gets lost and never arrives at the client. Instead, 30-60 seconds after connecting, the client detects a timeout and reconnects. Then the cycle starts again with the same result.

To check what the problem really is, I then created an Nginx cluster for SSL offloading and load balancing between the Sync Gateway instances. With that configuration the sync works like a charm! So this problem has to be something in conjunction with the Azure App Gateway.

So my question: do you have any experience running BLIP sync behind an Azure App Gateway?

I don’t, but this is strange because BLIP is layered on top of WebSockets, so everything it sends is just (binary) WebSocket messages. As far as the gateway is concerned, it’s just a regular WebSocket connection.
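To make the point above concrete, here is a minimal sketch of how a single unmasked (server-to-client) binary WebSocket frame is laid out according to RFC 6455. The function name `encode_binary_frame` is made up for illustration and is not part of Sync Gateway or Couchbase Lite; the payload stands in for whatever BLIP puts inside the message, which a proxy should treat as opaque bytes.

```python
import struct

OPCODE_BINARY = 0x2  # RFC 6455 opcode for a binary data frame


def encode_binary_frame(payload: bytes, fin: bool = True) -> bytes:
    """Encode one unmasked binary WebSocket frame (hypothetical helper)."""
    first_byte = (0x80 if fin else 0x00) | OPCODE_BINARY
    length = len(payload)
    if length < 126:
        # Short payloads: length fits in the 7-bit field.
        header = struct.pack("!BB", first_byte, length)
    elif length < 65536:
        # Marker 126: a 16-bit big-endian length follows.
        header = struct.pack("!BBH", first_byte, 126, length)
    else:
        # Marker 127: a 64-bit big-endian length follows.
        header = struct.pack("!BBQ", first_byte, 127, length)
    return header + payload


frame = encode_binary_frame(bytes(300))
print(frame[:4].hex())  # 827e012c
```

A 300-byte payload therefore travels as a 4-byte header followed by 300 bytes the gateway has no reason to inspect.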

Is it possible for you to do some packet sniffing, e.g. with Wireshark? It would be interesting to see what is being sent from SG to the gateway, and from the gateway back to the client. (Of course you’d have to disable SSL for this.)

I finally had the opportunity to sniff the packets without an SSL connection. I attached the Wireshark file to this post. (Don’t worry, the authentication username and password have already been changed :wink: )

I hope you can extract the necessary information from this dump.

blip-azure.pcapng.zip (15.6 KB)

Sorry for reviving an old thread, but were you ever able to resolve this? From what I can tell it looks like some TCP segments are getting cut off.

Wow! Resurrected!
We’ve done a lot of work on SG since this thread started. I would be really interested to hear if you were able to reproduce it with a recent version.

Hi Blake,

We’re trying to run Sync Gateway 2.5 behind the Azure App Gateway. The connection successfully upgrades to a WebSocket, but the message that sends the revisions to sync seems to be getting truncated. I doubt this is a Couchbase Sync Gateway issue, but we’re still trying to figure it out, so I was hoping that the original poster had found something. I’ll post back here if we figure it out.
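One way to check for this kind of truncation in a capture is to compare the length a WebSocket frame header declares with the number of bytes that actually arrived. The sketch below assumes you have already reassembled the TCP stream (individual TCP segments do not line up with frame boundaries) and that `data` starts at the beginning of a frame. The helper names are hypothetical, and the header layout follows RFC 6455.

```python
import struct
from typing import Optional


def declared_frame_length(data: bytes) -> Optional[int]:
    """Return header + payload size declared by the frame header,
    or None if even the header is incomplete."""
    if len(data) < 2:
        return None
    masked = bool(data[1] & 0x80)   # client-to-server frames carry a mask key
    length = data[1] & 0x7F
    offset = 2
    if length == 126:               # 16-bit extended length
        if len(data) < 4:
            return None
        (length,) = struct.unpack("!H", data[2:4])
        offset = 4
    elif length == 127:             # 64-bit extended length
        if len(data) < 10:
            return None
        (length,) = struct.unpack("!Q", data[2:10])
        offset = 10
    if masked:
        offset += 4                 # 4-byte masking key precedes the payload
    return offset + length


def is_truncated(data: bytes) -> bool:
    """True if fewer bytes are present than the frame header promises."""
    total = declared_frame_length(data)
    return total is None or len(data) < total


full = b"\x82\x7e\x01\x2c" + bytes(300)   # complete 300-byte binary frame
print(is_truncated(full), is_truncated(full[:200]))  # False True
```

Running this on both sides of the gateway (SG to gateway, gateway to client) would show on which hop the bytes go missing.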

Sorry. I originally replied to the wrong message.

I think this might help:

Thanks for the information. We’re moving away from using the azure app gateway for now since a resolution wasn’t provided there either.

Hi,

Our solution was also to move away from Azure App Gateway. Our Sync Gateway is now deployed behind an nginx proxy (Kubernetes ingress). With this setup we have not had any issue like this one.

We never completely found out what was causing this issue, so migrating to nginx was the easiest solution.

Thanks for the info, gents. Much appreciated.

Hello,

We are having the same issue, with the same setup.
I see here: https://github.com/couchbase/couchbase-lite-core/issues/503

that you changed both constants (kDefaultFrameSize and kBigFrameSize) in BLIPConnection.cc to 4096 bytes, and that replication then works fine.

Is that right?

romain CALIX
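For readers wondering what lowering the frame size to 4096 bytes means in practice, here is a rough sketch of splitting one message body into chunks of at most 4096 bytes. This is not the LiteCore code from BLIPConnection.cc; `FRAME_SIZE` and `split_into_frames` are hypothetical names, and real BLIP frames also carry their own header fields, which this sketch leaves out.

```python
from typing import List

FRAME_SIZE = 4096  # value discussed in the thread; the constant name is hypothetical


def split_into_frames(body: bytes, frame_size: int = FRAME_SIZE) -> List[bytes]:
    """Split a message body into consecutive chunks of at most frame_size bytes."""
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    return [body[i:i + frame_size] for i in range(0, len(body), frame_size)]


chunks = split_into_frames(bytes(10000))
print([len(c) for c in chunks])  # [4096, 4096, 1808]
```

With a smaller frame size, the same message simply crosses the gateway as more, smaller WebSocket messages, which may be why it avoids whatever limit is cutting off the larger ones.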