Errors in the custom serializer after a rebalance (?)

Hey all,

I’m experiencing an issue with our Couchbase cluster that I cannot fully explain yet. I’ll describe what I’ve learnt so far; it’d be awesome if anyone has an idea what might be happening.

Since last night our logs have been full of errors saying that deserialization in our custom serializer failed. (We’re using a custom JSON-GZip serializer.) The error happened when we were trying to add an item to CB, which was weird to me, since I didn’t understand why Deserialize was even called when we were inserting something.

The call stack was the following:

Exception: System.IO.InvalidDataException: The magic number in GZip header is not correct. Make sure you are passing in a GZip stream.
   at System.IO.Compression.GZipDecoder.ReadHeader(InputBuffer input)
   at System.IO.Compression.Inflater.Decode()
   at System.IO.Compression.Inflater.Inflate(Byte[] bytes, Int32 offset, Int32 length)
   at System.IO.Compression.DeflateStream.Read(Byte[] array, Int32 offset, Int32 count)
   at System.IO.Stream.InternalCopyTo(Stream destination, Int32 bufferSize)
   at Haddock.DistributedCacheClient.Couchbase.JsonAndGZipSerializer.Decompress(Stream compressed)
   at Haddock.DistributedCacheClient.Couchbase.JsonAndGZipSerializer.Deserialize[T](Byte[] buffer, Int32 offset, Int32 length)
   at Couchbase.Core.Transcoders.DefaultTranscoder.Decode[T](Byte[] buffer, Int32 offset, Int32 length, Flags flags, OperationCode opcode)
   at Couchbase.IO.Operations.OperationBase.GetConfig()
   at Couchbase.Core.Buckets.CallbackFactory.<>c__DisplayClass3_0`1.<<CompletedFuncWithRetryForCouchbase>b__0>d.MoveNext()

Here Haddock.DistributedCacheClient.Couchbase.JsonAndGZipSerializer is our custom serializer, and the exception simply says that the GZip decompression failed. (This error can happen if we’re trying to decompress a stream that is not a GZip stream, but again, I don’t understand why Decompress was even called on our serializer in the first place.)
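
For illustration, the "magic number" check that fails in the stack trace can be reproduced in a few lines. This is only a sketch, not the code of our actual serializer: the class name GzipGuard and both method names are hypothetical, and it assumes the payload is either a complete GZip stream or uncompressed bytes. A GZip stream always begins with the two bytes 0x1F 0x8B, so a serializer could look at them before trying to decompress:

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not part of the Couchbase SDK or of our serializer.
public static class GzipGuard
{
    // Returns true when the segment starts with the GZip magic bytes 0x1F 0x8B.
    public static bool LooksLikeGzip(byte[] buffer, int offset, int length)
    {
        return buffer != null
            && offset >= 0
            && length >= 2
            && offset + length <= buffer.Length
            && buffer[offset] == 0x1F
            && buffer[offset + 1] == 0x8B;
    }

    // Decompresses the segment when it looks like GZip;
    // otherwise returns a copy of the raw bytes unchanged.
    public static byte[] DecompressIfGzip(byte[] buffer, int offset, int length)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        if (offset < 0 || length < 0 || offset + length > buffer.Length)
            throw new ArgumentOutOfRangeException(nameof(length));

        if (!LooksLikeGzip(buffer, offset, length))
        {
            var raw = new byte[length];
            Buffer.BlockCopy(buffer, offset, raw, 0, length);
            return raw;
        }

        using (var input = new MemoryStream(buffer, offset, length))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

With a guard like this, a body that was never compressed (plain JSON, for example) would be passed through as-is instead of throwing InvalidDataException. Whether silently accepting such input is desirable depends on the application.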

I looked up the call stack in the SDK; it happens here:

if (result.IsNmv())
{
    var config = op.GetConfig();
    if (config != null)
    {
        // ...
    }
}

Where op.GetConfig() does this:

config = Transcoder.Decode<BucketConfig>(Data.ToArray(), offset, length, new Flags
{
    Compression = Compression.None,
    DataFormat = DataFormat.Json,
    TypeCode = TypeCode.Object
}, OperationCode);

So to me it seems like the server responded with a VBucketBelongsToAnotherServer, and the SDK tried to deserialize the response body with our custom serializer, which failed.

I checked the logs in the management console, and around the time the errors started happening I saw this log message:

Haven't heard from a higher priority node or a master, so I'm taking over.

Does this mean that a rebalance happened, and that’s why we received a VBucketBelongsToAnotherServer response? In that case, what is the response body? Is it the new configuration? Does it make sense that the SDK tries to use the custom serializer to read it?
To me that seems wrong, since there is no way the body was serialized on the server with the serializer we configured on the client.

Or am I on the wrong track, and is this caused by something else?

Hi @markvincze

I believe your assumption is correct. VBucketBelongsToAnotherServer can occur during a rebalance as a VBucket that was located on server A is now on server B. The code does indeed try to read the response body as an updated cluster configuration (which would tell the client the key that failed is now located on server B).

I also agree that the client should probably not use the serializer that is configured for application use, and instead should use the default serializer so we can interpret the response correctly.

I have raised NCBC-1453 to investigate further.

Can you please confirm your server version and OS along with the client version?

Thanks for raising the bug.

Hi @MikeGoldsmith,

Thanks for the quick response.

Client SDK: CouchbaseNetClient 2.4.5
Server: 4.0.0-4051 Community Edition (build-4051)