Sync gateway raised the error "WS: c:[5abf1836] ERROR decompressing frame: inputLen=4090, remaining=0, output=0, error=unexpected EOF" in k8s

Hello All,

I am deploying Couchbase Server (6.0 Community Edition) and Sync Gateway (2.6.1 Community Edition) to a k8s environment. Couchbase Server and Sync Gateway are running in the same Docker container, behind a web server (Traefik). However, Sync Gateway reports an “unexpected EOF” error when saving data (JSON format). I am able to save some JSON data without issue, but there is one specific case that causes this error.

I didn’t see this issue when I deployed Couchbase Server and Sync Gateway to AWS ECS. Everything works fine in that environment.

So I am wondering whether anyone else is facing the same issue.
Any advice is appreciated!

Below is the error log, extracted from sg_info.log:
2020-07-30T20:41:56.847Z [INF] HTTP: #893: GET /offline/_blipsync (as GUEST)
2020-07-30T20:41:56.847Z [INF] HTTP+: #893: --> 101 [117966a2] Upgraded to BLIP+WebSocket protocol (as GUEST) (0.0 ms)
2020-07-30T20:41:56.847Z [INF] WS: c:[117966a2] Start BLIP/Websocket handler
2020-07-30T20:41:56.944Z [INF] SyncMsg: c:[117966a2] #1: Type:getCheckpoint Client:cp-ISneQUdMy4rgoNEoIPSaCA8b8/8=
2020-07-30T20:41:57.032Z [INF] SyncMsg: c:[117966a2] #2: Type:proposeChanges #Changes: 1
2020-07-30T20:41:57.069Z [INF] SyncMsg: c:[117966a2] #3: Type:subChanges Since:35 Continuous:true Filter:sync_gateway/bychannel Channels:ABC
2020-07-30T20:41:57.069Z [INF] Sync: c:[117966a2] Sending changes since 35
2020-07-30T20:41:57.069Z [INF] Changes: c:[117966a2] MultiChangesFeed(channels: {ABC}, options: {Since:35 Limit:0 Conflicts:false IncludeDocs:false Wait:true Continuous:true Terminator:0xc000e6ca20 HeartbeatMs:0 TimeoutMs:0 ActiveOnly:false Ctx:context.Background.WithValue(base.LogContextKey{}, base.LogContext{CorrelationID:"#893"}).WithValue(base.LogContextKey{}, base.LogContext{CorrelationID:"[117966a2]"})}) …
2020-07-30T20:41:57.069Z [INF] Sync: c:[117966a2] Sent all changes to client
2020-07-30T20:41:57.250Z [INF] WS: c:[117966a2] ERROR decompressing frame: inputLen=4090, remaining=0, output=0, error=unexpected EOF

2020-07-30T20:41:57.251Z [INF] WS: c:[117966a2] Error receiving frame MSG#4~: unexpected EOF. Raw frame = <>
2020-07-30T20:41:57.251Z [INF] WS: c:[117966a2] Error: parseLoop closing socket due to error: unexpected EOF
2020-07-30T20:41:57.251Z [INF] WS: c:[117966a2] BLIP/Websocket Handler exited: unexpected EOF
2020-07-30T20:41:57.251Z [INF] HTTP: c:[117966a2] #893:     --> BLIP+WebSocket connection error: unexpected EOF
2020-07-30T20:41:57.251Z [INF] HTTP: c:[117966a2] #893:    --> BLIP+WebSocket connection closed
2020-07-30T20:41:57.251Z [INF] Changes: c:[117966a2] MultiChangesFeed done

Sounds like an issue with the load balancer. I would recommend reviewing the Traefik guide and applying these general guidelines.

On an unrelated note,

the couchbase server and sync gateway are running in the same docker container and behind a web server - Traefik

Same Docker container??? So you created a custom image with Couchbase Server and Sync Gateway? Or did you mean “same pod”? Even that is not a recommended practice: Sync Gateway and Couchbase Server have different sizing and scaling requirements, so they should not share a pod.

Thanks Priya for the answer.

I set them up in the same pod. I don’t need scaling for Couchbase in my case; I just need one instance each of Sync Gateway and Couchbase Server.

I will take a look at Traefik and the load balancer to see if I can solve this issue. Will let you know soon.

Curious: why are you using Kubernetes? It seems like a pretty trivial one-pod deployment of Server and Sync Gateway.

Hi Priya,

Sorry for the late response.

Curious: why are you using Kubernetes? It seems like a pretty trivial one-pod deployment of Server and Sync Gateway.

Yes, I am trying to deploy Couchbase Server and Sync Gateway to Kubernetes to see whether these services work fine in this environment.
This is a trivial setup; if it works, I will adopt it. Any recommendations regarding this deployment are much appreciated!

BTW, I checked the LB keep-alive timeout, and it is already set to 1 hour. The one remaining thing to try is changing the keep-alive timeout of Traefik, but this is not recommended because it would impact other applications.
Is there an alternative way to change the keep-alive timeout in Traefik without updating the Traefik configuration directly?

Or is it possible to change the heartbeat timeout (in Couchbase Lite / Sync Gateway)?

Tuan

The recommendation is to deploy Couchbase Server and Sync Gateway in separate pods, preferably scheduled on separate nodes.

Regarding the EOF issue, did you try accessing Sync Gateway via some other load balancer?

Thanks Priya for the recommendation, I will check on this…

Yes, I did try deploying to AWS ECS using an AWS load balancer, and it was working fine.

Regarding the EOF issue, I tried reducing the payload size per request, and the issue went away.
There is a similar issue described here: Websocket response are not complete · Issue #4446 · traefik/traefik · GitHub. It seems the issue is related to WebSocket.

Any comments on that issue link?