After a restart of Couchbase, some partitions seem to fail to reconnect

Hi David!

Thanks a lot for the new release!

These newer versions do indeed seem to solve the issue where it would lose the vbuckets completely. However, I am now getting these exceptions in the log that I can't recall seeing with the 3.3.2-SNAPSHOT version — although I now get them even with that self-built version. It looks like this (lots of these):

09:30:36.496 [nioEventLoopGroup-3-13] WARN c.c.client.dcp.conductor.Conductor - Error during Partition Move for partition 989
com.couchbase.client.dcp.error.RollbackException: null
at com.couchbase.client.dcp.conductor.DcpChannel$6$1.operationComplete(DcpChannel.java:368)
at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:481)
at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
at com.couchbase.client.deps.io.netty.util.concurrent.DefaultPromise.setSuccess(DefaultPromise.java:95)
at com.couchbase.client.dcp.transport.netty.DcpMessageHandler.channelRead(DcpMessageHandler.java:317)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at com.couchbase.client.deps.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at com.couchbase.client.deps.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
at com.couchbase.client.deps.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:299)
at com.couchbase.client.deps.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:415)
at com.couchbase.client.deps.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at com.couchbase.client.deps.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1302)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at com.couchbase.client.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at com.couchbase.client.deps.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at com.couchbase.client.deps.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:135)
at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:646)
at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:581)
at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:498)
at com.couchbase.client.deps.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460)
at com.couchbase.client.deps.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at com.couchbase.client.deps.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:745)
09:30:36.496 [nioEventLoopGroup-3-13] WARN com.couchbase.client.dcp.Client - Received rollback for vbucket 989 to seqno 1
09:30:36.496 [nioEventLoopGroup-3-13] WARN c.c.connect.kafka.CouchbaseReader - Rolling back partition 989 to seqno 1
09:30:36.496 [nioEventLoopGroup-3-13] INFO com.couchbase.client.dcp.Client - Stopping to Stream for 1 partitions
09:30:36.496 [nioEventLoopGroup-3-13] INFO com.couchbase.client.dcp.Client - Starting to Stream for 1 partitions

After this I get many of these (which seems normal):

09:30:36.497 [nioEventLoopGroup-3-13] INFO c.c.connect.kafka.CouchbaseReader - Rollback for partition 124 complete
09:30:36.497 [nioEventLoopGroup-3-13] INFO c.c.connect.kafka.CouchbaseReader - Rollback for partition 113 complete
09:30:36.497 [nioEventLoopGroup-3-13] INFO c.c.connect.kafka.CouchbaseReader - Rollback for partition 134 complete
09:30:36.497 [nioEventLoopGroup-3-13] INFO c.c.connect.kafka.CouchbaseReader - Rollback for partition 98 complete
09:30:36.497 [nioEventLoopGroup-3-13] INFO c.c.connect.kafka.CouchbaseReader - Rollback for partition 132 complete
09:30:36.497 [nioEventLoopGroup-3-13] INFO c.c.connect.kafka.CouchbaseReader - Rollback for partition 142 complete
09:30:36.497 [nioEventLoopGroup-3-13] INFO c.c.connect.kafka.CouchbaseReader - Rollback for partition 106 complete
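For anyone reading along, the rollback these logs describe can be sketched conceptually like this — a simplified, self-contained illustration of the idea, not the actual connector or DCP client code (the class and method names here are made up):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the rollback behaviour visible in the log: when the server
// tells the client to roll a vbucket back to a sequence number, the
// client discards any locally tracked state past that seqno and
// restarts the stream from it.
public class RollbackSketch {

    // Hypothetical helper: drop every tracked seqno above the rollback
    // point and return the seqno the stream should resume from.
    static long rollback(List<Long> seenSeqnos, long rollbackSeqno) {
        seenSeqnos.removeIf(s -> s > rollbackSeqno);
        return rollbackSeqno;
    }

    public static void main(String[] args) {
        List<Long> seen = new ArrayList<>(List.of(1L, 2L, 3L, 4L, 5L));
        // Mirrors "Rolling back partition 989 to seqno 1" from the log above.
        long resumeFrom = rollback(seen, 1L);
        System.out.println("resume from seqno " + resumeFrom + ", kept " + seen);
    }
}
```

So a rollback to seqno 1 (as in the log) essentially means "forget almost everything for that partition and restream from the start", which matches the burst of "Rollback for partition … complete" lines that follow.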

Everything does, however, still seem to recover, so there aren't really any issues — I just thought I'd share this with you.

Regards,
Martin