Required changes from version 6.6 to 7.0?

Yeah, I’m sure you’re right. So it should have been deselected on the list of updates… So I should have rephrased that more precisely :slight_smile:

But once the first server was updated we had to update the other to have the same version in the cluster.

@daschl, I have now implemented a mechanism that only forces the QueryScanConsistency.REQUEST_PLUS for the same type where it was set - with a 5 minutes timeout (so ignore if not used within 5 minutes).

Are there any other timing settings I ought to change (e.g. on the collection)? … just before I put the changes into production :wink:

@jda well let’s come back to the original issue so that we don’t run too deep.

Caused by: com.couchbase.client.core.error.RequestCanceledException: QueryRequest, Reason: NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT) {"cancelled":true,"completed":true,"coreId":"0x3df54ed500000009","idempotent":false,"lastDispatchedFrom":"172.23.130.47:6688","lastDispatchedTo":"db1.aqua.dtu.dk:8093","reason":"NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT)","requestId":55375650,"requestType":"QueryRequest","retried":0,"service":{"operationId":"a3279e05-eea7-4580-98cf-80b292cabc39","statement":"SELECT t2.*, t1.* FROM data AS t1 LEFT JOIN data AS t2 ON 'Catch:Private:'||t1.`key` = META(t2).id AND t2.type='Private' WHERE t1.type='Catch' AND (t1.visibility='2')","type":"query"},"timeoutMs":75000,"timings":{"totalMicros":6843033}}

If you look at this, CHANNEL_CLOSED_WHILE_IN_FLIGHT tells us that it is not caused by a timeout, but rather that while the query request got sent the socket got closed. So the investigation needs to focus on why the socket got closed.

You can also run wireshark/tcpdump and look at the sockets and see what happens to them outside of your application server when the issue comes up again. A socket being closed is a bit different from the timeouts.

Also note that in the error context above, you c an see that the socket got closed while the operation was already over 6 seconds “old” (6843033 microseconds). That is way more than the couple milliseconds when run by hand apparently.

@daschl I put my changes into production. And I log when I force the use of the QueryScanConsistency.REQUEST_PLUS. It has not been set - but I now have the problem again…

I also changed the timeout slightly:

timeout.kvTimeout=100s

These are the stack traces I get (with my application code referenced):

8/5/21 4:14 PM: Exception Thrown
Context Path: /fangst.nsf
Page Name: /index.xsp
javax.faces.el.EvaluationException: javax.faces.el.EvaluationException: Error getting property 'count' from bean of type dk.dtu.aqua.catchlog.bean.BoastCatchesBean: com.couchbase.client.core.error.RequestCanceledException: QueryRequest, Reason: NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT) {"cancelled":true,"completed":true,"coreId":"0xe7460baf00000004","idempotent":false,"lastDispatchedFrom":"172.23.130.47:57596","lastDispatchedTo":"db2.aqua.dtu.dk:8093","reason":"NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT)","requestId":1719161,"requestType":"QueryRequest","retried":0,"service":{"operationId":"27145317-8b99-40c1-9f2b-f35e30044258","statement":"SELECT t2.*, t1.* FROM data AS t1 LEFT JOIN data AS t2 ON 'Catch:Private:'||t1.`key` = META(t2).id AND t2.type='Private' WHERE t1.type='Catch' AND (t1.visibility='2')","type":"query"},"timeoutMs":75000}
	at com.sun.faces.el.ValueBindingImpl.getValue(ValueBindingImpl.java:184)
	at com.sun.faces.el.ValueBindingImpl.getValue(ValueBindingImpl.java:134)
	at com.ibm.xsp.page.compiled.ExpressionEvaluatorImpl.getBindingValue(ExpressionEvaluatorImpl.java:151)
	at xsp.BoastCatches$BoastCatchesPage.createPanel2(BoastCatches.java:188)
	at xsp.BoastCatches$BoastCatchesPage.createComponent(BoastCatches.java:114)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.buildComponent(AbstractCompiledPage.java:265)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:115)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.component.UIIncludeComposite.buildPageContents(UIIncludeComposite.java:692)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:340)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createTree(AbstractCompiledPage.java:256)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.addComponent(AbstractCompiledPage.java:389)
	at com.ibm.xsp.component.UIIncludeComposite.buildContents(UIIncludeComposite.java:453)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.component.UIIncludeComposite.buildPageContents(UIIncludeComposite.java:692)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:340)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createTree(AbstractCompiledPage.java:256)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.addComponent(AbstractCompiledPage.java:389)
	at com.ibm.xsp.component.UIIncludeComposite.buildContents(UIIncludeComposite.java:453)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildFacet(CompiledComponentBuilder.java:205)
	at com.ibm.xsp.component.UICallback.buildContents(UICallback.java:180)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:345)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:345)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildFacet(CompiledComponentBuilder.java:205)
	at com.ibm.xsp.component.UICallback.buildContents(UICallback.java:180)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:345)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:345)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.component.UIIncludeComposite.buildPageContents(UIIncludeComposite.java:692)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:340)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createTree(AbstractCompiledPage.java:256)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.addComponent(AbstractCompiledPage.java:389)
	at com.ibm.xsp.component.UIIncludeComposite.buildContents(UIIncludeComposite.java:453)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.component.UIIncludeComposite.buildPageContents(UIIncludeComposite.java:692)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:340)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createTree(AbstractCompiledPage.java:256)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.addComponent(AbstractCompiledPage.java:389)
	at com.ibm.xsp.component.UIIncludeComposite.buildContents(UIIncludeComposite.java:453)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.component.UIViewRootEx.buildContents(UIViewRootEx.java:1649)
	at com.ibm.xsp.component.UIViewRootEx2.buildContents(UIViewRootEx2.java:247)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createTree(AbstractCompiledPage.java:256)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createViewRoot(AbstractCompiledPage.java:167)
	at com.ibm.xsp.application.ViewHandlerExImpl._createViewRoot(ViewHandlerExImpl.java:521)
	at com.ibm.xsp.application.ViewHandlerExImpl.createViewRoot(ViewHandlerExImpl.java:567)
	at com.ibm.xsp.application.ViewHandlerExImpl.doCreateView(ViewHandlerExImpl.java:142)
	at com.ibm.xsp.application.ViewHandlerEx.createView(ViewHandlerEx.java:90)
	at dk.dtu.aqua.catchlog.control.AppViewHandler.createView(AppViewHandler.java:190)
	at com.ibm.xsp.webapp.FacesServlet.serviceView(FacesServlet.java:250)
	at com.ibm.xsp.webapp.FacesServletEx.serviceView(FacesServletEx.java:157)
	at com.ibm.xsp.webapp.FacesServlet.service(FacesServlet.java:159)
	at com.ibm.xsp.webapp.FacesServletEx.service(FacesServletEx.java:138)
	at com.ibm.xsp.webapp.DesignerFacesServlet.service(DesignerFacesServlet.java:103)
	at com.ibm.designer.runtime.domino.adapter.ComponentModule.invokeServlet(ComponentModule.java:588)
	at com.ibm.domino.xsp.module.nsf.NSFComponentModule.invokeServlet(NSFComponentModule.java:1335)
	at com.ibm.designer.runtime.domino.adapter.ComponentModule$AdapterInvoker.invokeServlet(ComponentModule.java:865)
	at com.ibm.designer.runtime.domino.adapter.ComponentModule$ServletInvoker.doService(ComponentModule.java:808)
	at com.ibm.designer.runtime.domino.adapter.ComponentModule.doService(ComponentModule.java:577)
	at com.ibm.domino.xsp.module.nsf.NSFComponentModule.doService(NSFComponentModule.java:1319)
	at com.ibm.domino.xsp.module.nsf.NSFService.doServiceInternal(NSFService.java:662)
	at com.ibm.domino.xsp.module.nsf.NSFService.doService(NSFService.java:482)
	at com.ibm.designer.runtime.domino.adapter.LCDEnvironment.doService(LCDEnvironment.java:357)
	at com.ibm.designer.runtime.domino.adapter.LCDEnvironment.service(LCDEnvironment.java:313)
	at com.ibm.domino.xsp.bridge.http.engine.XspCmdManager.service(XspCmdManager.java:272)
Caused by: javax.faces.el.EvaluationException: Error getting property 'count' from bean of type dk.dtu.aqua.catchlog.bean.BoastCatchesBean: com.couchbase.client.core.error.RequestCanceledException: QueryRequest, Reason: NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT) {"cancelled":true,"completed":true,"coreId":"0xe7460baf00000004","idempotent":false,"lastDispatchedFrom":"172.23.130.47:57596","lastDispatchedTo":"db2.aqua.dtu.dk:8093","reason":"NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT)","requestId":1719161,"requestType":"QueryRequest","retried":0,"service":{"operationId":"27145317-8b99-40c1-9f2b-f35e30044258","statement":"SELECT t2.*, t1.* FROM data AS t1 LEFT JOIN data AS t2 ON 'Catch:Private:'||t1.`key` = META(t2).id AND t2.type='Private' WHERE t1.type='Catch' AND (t1.visibility='2')","type":"query"},"timeoutMs":75000}
	at com.sun.faces.el.PropertyResolverImpl.getValue(PropertyResolverImpl.java:104)
	at com.ibm.xsp.el.PropertyResolverImpl.getValue(PropertyResolverImpl.java:144)
	at com.sun.faces.el.impl.ArraySuffix.evaluate(ArraySuffix.java:182)
	at com.sun.faces.el.impl.ComplexValue.evaluate(ComplexValue.java:163)
	at com.sun.faces.el.impl.BinaryOperatorExpression.evaluate(BinaryOperatorExpression.java:175)
	at com.sun.faces.el.impl.ExpressionEvaluatorImpl.evaluate(ExpressionEvaluatorImpl.java:257)
	at com.sun.faces.el.ValueBindingImpl.getValue(ValueBindingImpl.java:150)
	... 83 more
Caused by: com.couchbase.client.core.error.RequestCanceledException: QueryRequest, Reason: NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT) {"cancelled":true,"completed":true,"coreId":"0xe7460baf00000004","idempotent":false,"lastDispatchedFrom":"172.23.130.47:57596","lastDispatchedTo":"db2.aqua.dtu.dk:8093","reason":"NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT)","requestId":1719161,"requestType":"QueryRequest","retried":0,"service":{"operationId":"27145317-8b99-40c1-9f2b-f35e30044258","statement":"SELECT t2.*, t1.* FROM data AS t1 LEFT JOIN data AS t2 ON 'Catch:Private:'||t1.`key` = META(t2).id AND t2.type='Private' WHERE t1.type='Catch' AND (t1.visibility='2')","type":"query"},"timeoutMs":75000,"timings":{"totalMicros":6893452}}
	at com.couchbase.client.java.AsyncUtils.block(AsyncUtils.java:51)
	at com.couchbase.client.java.Cluster.query(Cluster.java:393)
	at dk.dtu.aqua.catchlog.base.BaseCouchbaseDAO.getAllJoinedSubFromDb(BaseCouchbaseDAO.java:596)
	at dk.dtu.aqua.catchlog.dao.CouchbaseCatchDAO.getCatches(CouchbaseCatchDAO.java:146)
	at dk.dtu.aqua.catchlog.dao.CouchbaseCatchDAO.getAllBoastCatches(CouchbaseCatchDAO.java:159)
	at dk.dtu.aqua.catchlog.dao.facade.CatchCRUDFacade.getAllBoastCatches(CatchCRUDFacade.java:104)
	at dk.dtu.aqua.catchlog.dao.facade.CatchCRUDFacade.getAllBoastCatches(CatchCRUDFacade.java:100)
	at dk.dtu.aqua.catchlog.bean.DataBean.getAllBoastCatches(DataBean.java:477)
	at dk.dtu.aqua.catchlog.bean.BoastCatchesBean.getAllCatches(BoastCatchesBean.java:21)
	at dk.dtu.aqua.catchlog.bean.BoastCatchesBean.getRecentCatches(BoastCatchesBean.java:39)
	at dk.dtu.aqua.catchlog.bean.BoastCatchesBean.getRecentCatches(BoastCatchesBean.java:34)
	at dk.dtu.aqua.catchlog.bean.BoastCatchesBean.getCount(BoastCatchesBean.java:30)
	at sun.reflect.GeneratedMethodAccessor443.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
	at java.lang.reflect.Method.invoke(Method.java:508)
	at com.sun.faces.el.PropertyResolverImpl.getValue(PropertyResolverImpl.java:96)
	... 89 more
	Suppressed: java.lang.Exception: The above exception was originally thrown by another thread at the following location.
		at com.couchbase.client.core.msg.BaseRequest.cancel(BaseRequest.java:172)
		at com.couchbase.client.core.retry.RetryOrchestrator.lambda$maybeRetry$0(RetryOrchestrator.java:79)
		at com.couchbase.client.core.retry.RetryOrchestrator$$Lambda$388.00000000C4447700.accept(Unknown Source)
		at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:771)
		at java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:789)
		at java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2151)
		at com.couchbase.client.core.retry.RetryOrchestrator.maybeRetry(RetryOrchestrator.java:62)
		at com.couchbase.client.core.io.netty.chunk.ChunkedMessageHandler.channelInactive(ChunkedMessageHandler.java:238)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
		at com.couchbase.client.core.deps.io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:418)
		at com.couchbase.client.core.deps.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:389)
		at com.couchbase.client.core.deps.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:354)
		at com.couchbase.client.core.deps.io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:311)
		at com.couchbase.client.core.deps.io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:221)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
		at com.couchbase.client.core.deps.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
		at com.couchbase.client.core.deps.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:819)
		at com.couchbase.client.core.deps.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
		at com.couchbase.client.core.deps.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
		at com.couchbase.client.core.deps.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384)
		at com.couchbase.client.core.deps.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
		at com.couchbase.client.core.deps.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
		at com.couchbase.client.core.deps.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
		at java.lang.Thread.run(Thread.java:811)
8/5/21 4:14 PM: Exception Thrown
Context Path: /fangst.nsf
Page Name: /index.xsp
javax.faces.el.EvaluationException: javax.faces.el.EvaluationException: Error getting property 'count' from bean of type dk.dtu.aqua.catchlog.bean.BoastCatchesBean: com.couchbase.client.core.error.RequestCanceledException: QueryRequest, Reason: NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT) {"cancelled":true,"completed":true,"coreId":"0xe7460baf00000004","idempotent":false,"lastDispatchedFrom":"172.23.130.47:39098","lastDispatchedTo":"db1.aqua.dtu.dk:8093","reason":"NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT)","requestId":1719234,"requestType":"QueryRequest","retried":0,"service":{"operationId":"224eb019-71ea-408e-96c8-f97e36098027","statement":"SELECT t2.*, t1.* FROM data AS t1 LEFT JOIN data AS t2 ON 'Catch:Private:'||t1.`key` = META(t2).id AND t2.type='Private' WHERE t1.type='Catch' AND (t1.visibility='2')","type":"query"},"timeoutMs":75000}
	at com.sun.faces.el.ValueBindingImpl.getValue(ValueBindingImpl.java:184)
	at com.sun.faces.el.ValueBindingImpl.getValue(ValueBindingImpl.java:134)
	at com.ibm.xsp.page.compiled.ExpressionEvaluatorImpl.getBindingValue(ExpressionEvaluatorImpl.java:151)
	at xsp.BoastCatches$BoastCatchesPage.createPanel2(BoastCatches.java:188)
	at xsp.BoastCatches$BoastCatchesPage.createComponent(BoastCatches.java:114)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.buildComponent(AbstractCompiledPage.java:265)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:115)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.component.UIIncludeComposite.buildPageContents(UIIncludeComposite.java:692)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:340)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createTree(AbstractCompiledPage.java:256)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.addComponent(AbstractCompiledPage.java:389)
	at com.ibm.xsp.component.UIIncludeComposite.buildContents(UIIncludeComposite.java:453)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.component.UIIncludeComposite.buildPageContents(UIIncludeComposite.java:692)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:340)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createTree(AbstractCompiledPage.java:256)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.addComponent(AbstractCompiledPage.java:389)
	at com.ibm.xsp.component.UIIncludeComposite.buildContents(UIIncludeComposite.java:453)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildFacet(CompiledComponentBuilder.java:205)
	at com.ibm.xsp.component.UICallback.buildContents(UICallback.java:180)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:345)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:345)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildFacet(CompiledComponentBuilder.java:205)
	at com.ibm.xsp.component.UICallback.buildContents(UICallback.java:180)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:345)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:345)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.component.UIIncludeComposite.buildPageContents(UIIncludeComposite.java:692)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:340)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createTree(AbstractCompiledPage.java:256)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.addComponent(AbstractCompiledPage.java:389)
	at com.ibm.xsp.component.UIIncludeComposite.buildContents(UIIncludeComposite.java:453)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.component.UIIncludeComposite.buildPageContents(UIIncludeComposite.java:692)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:340)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createTree(AbstractCompiledPage.java:256)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.addComponent(AbstractCompiledPage.java:389)
	at com.ibm.xsp.component.UIIncludeComposite.buildContents(UIIncludeComposite.java:453)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildChildren(CompiledComponentBuilder.java:123)
	at com.ibm.xsp.page.compiled.CompiledComponentBuilder.buildAll(CompiledComponentBuilder.java:84)
	at com.ibm.xsp.component.UIViewRootEx.buildContents(UIViewRootEx.java:1649)
	at com.ibm.xsp.component.UIViewRootEx2.buildContents(UIViewRootEx2.java:247)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.initComponent(AbstractCompiledPage.java:334)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createTree(AbstractCompiledPage.java:256)
	at com.ibm.xsp.page.compiled.AbstractCompiledPage.createViewRoot(AbstractCompiledPage.java:167)
	at com.ibm.xsp.application.ViewHandlerExImpl._createViewRoot(ViewHandlerExImpl.java:521)
	at com.ibm.xsp.application.ViewHandlerExImpl.createViewRoot(ViewHandlerExImpl.java:567)
	at com.ibm.xsp.application.ViewHandlerExImpl.doCreateView(ViewHandlerExImpl.java:142)
	at com.ibm.xsp.application.ViewHandlerEx.createView(ViewHandlerEx.java:90)
	at dk.dtu.aqua.catchlog.control.AppViewHandler.createView(AppViewHandler.java:190)
	at com.ibm.xsp.webapp.FacesServlet.serviceView(FacesServlet.java:250)
	at com.ibm.xsp.webapp.FacesServletEx.serviceView(FacesServletEx.java:157)
	at com.ibm.xsp.webapp.FacesServlet.service(FacesServlet.java:159)
	at com.ibm.xsp.webapp.FacesServletEx.service(FacesServletEx.java:138)
	at com.ibm.xsp.webapp.DesignerFacesServlet.service(DesignerFacesServlet.java:103)
	at com.ibm.designer.runtime.domino.adapter.ComponentModule.invokeServlet(ComponentModule.java:588)
	at com.ibm.domino.xsp.module.nsf.NSFComponentModule.invokeServlet(NSFComponentModule.java:1335)
	at com.ibm.designer.runtime.domino.adapter.ComponentModule$AdapterInvoker.invokeServlet(ComponentModule.java:865)
	at com.ibm.designer.runtime.domino.adapter.ComponentModule$ServletInvoker.doService(ComponentModule.java:808)
	at com.ibm.designer.runtime.domino.adapter.ComponentModule.doService(ComponentModule.java:577)
	at com.ibm.domino.xsp.module.nsf.NSFComponentModule.doService(NSFComponentModule.java:1319)
	at com.ibm.domino.xsp.module.nsf.NSFService.doServiceInternal(NSFService.java:662)
	at com.ibm.domino.xsp.module.nsf.NSFService.doService(NSFService.java:482)
	at com.ibm.designer.runtime.domino.adapter.LCDEnvironment.doService(LCDEnvironment.java:357)
	at com.ibm.designer.runtime.domino.adapter.LCDEnvironment.service(LCDEnvironment.java:313)
	at com.ibm.domino.xsp.bridge.http.engine.XspCmdManager.service(XspCmdManager.java:272)
Caused by: javax.faces.el.EvaluationException: Error getting property 'count' from bean of type dk.dtu.aqua.catchlog.bean.BoastCatchesBean: com.couchbase.client.core.error.RequestCanceledException: QueryRequest, Reason: NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT) {"cancelled":true,"completed":true,"coreId":"0xe7460baf00000004","idempotent":false,"lastDispatchedFrom":"172.23.130.47:39098","lastDispatchedTo":"db1.aqua.dtu.dk:8093","reason":"NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT)","requestId":1719234,"requestType":"QueryRequest","retried":0,"service":{"operationId":"224eb019-71ea-408e-96c8-f97e36098027","statement":"SELECT t2.*, t1.* FROM data AS t1 LEFT JOIN data AS t2 ON 'Catch:Private:'||t1.`key` = META(t2).id AND t2.type='Private' WHERE t1.type='Catch' AND (t1.visibility='2')","type":"query"},"timeoutMs":75000}
	at com.sun.faces.el.PropertyResolverImpl.getValue(PropertyResolverImpl.java:104)
	at com.ibm.xsp.el.PropertyResolverImpl.getValue(PropertyResolverImpl.java:144)
	at com.sun.faces.el.impl.ArraySuffix.evaluate(ArraySuffix.java:182)
	at com.sun.faces.el.impl.ComplexValue.evaluate(ComplexValue.java:163)
	at com.sun.faces.el.impl.BinaryOperatorExpression.evaluate(BinaryOperatorExpression.java:175)
	at com.sun.faces.el.impl.ExpressionEvaluatorImpl.evaluate(ExpressionEvaluatorImpl.java:257)
	at com.sun.faces.el.ValueBindingImpl.getValue(ValueBindingImpl.java:150)
	... 83 more
Caused by: com.couchbase.client.core.error.RequestCanceledException: QueryRequest, Reason: NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT) {"cancelled":true,"completed":true,"coreId":"0xe7460baf00000004","idempotent":false,"lastDispatchedFrom":"172.23.130.47:39098","lastDispatchedTo":"db1.aqua.dtu.dk:8093","reason":"NO_MORE_RETRIES (CHANNEL_CLOSED_WHILE_IN_FLIGHT)","requestId":1719234,"requestType":"QueryRequest","retried":0,"service":{"operationId":"224eb019-71ea-408e-96c8-f97e36098027","statement":"SELECT t2.*, t1.* FROM data AS t1 LEFT JOIN data AS t2 ON 'Catch:Private:'||t1.`key` = META(t2).id AND t2.type='Private' WHERE t1.type='Catch' AND (t1.visibility='2')","type":"query"},"timeoutMs":75000,"timings":{"totalMicros":2518515}}
	at com.couchbase.client.java.AsyncUtils.block(AsyncUtils.java:51)
	at com.couchbase.client.java.Cluster.query(Cluster.java:393)
	at dk.dtu.aqua.catchlog.base.BaseCouchbaseDAO.getAllJoinedSubFromDb(BaseCouchbaseDAO.java:596)
	at dk.dtu.aqua.catchlog.dao.CouchbaseCatchDAO.getCatches(CouchbaseCatchDAO.java:146)
	at dk.dtu.aqua.catchlog.dao.CouchbaseCatchDAO.getAllBoastCatches(CouchbaseCatchDAO.java:159)
	at dk.dtu.aqua.catchlog.dao.facade.CatchCRUDFacade.getAllBoastCatches(CatchCRUDFacade.java:104)
	at dk.dtu.aqua.catchlog.dao.facade.CatchCRUDFacade.getAllBoastCatches(CatchCRUDFacade.java:100)
	at dk.dtu.aqua.catchlog.bean.DataBean.getAllBoastCatches(DataBean.java:477)
	at dk.dtu.aqua.catchlog.bean.BoastCatchesBean.getAllCatches(BoastCatchesBean.java:21)
	at dk.dtu.aqua.catchlog.bean.BoastCatchesBean.getRecentCatches(BoastCatchesBean.java:39)
	at dk.dtu.aqua.catchlog.bean.BoastCatchesBean.getRecentCatches(BoastCatchesBean.java:34)
	at dk.dtu.aqua.catchlog.bean.BoastCatchesBean.getCount(BoastCatchesBean.java:30)
	at sun.reflect.GeneratedMethodAccessor443.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:55)
	at java.lang.reflect.Method.invoke(Method.java:508)
	at com.sun.faces.el.PropertyResolverImpl.getValue(PropertyResolverImpl.java:96)
	... 89 more
	Suppressed: java.lang.Exception: The above exception was originally thrown by another thread at the following location.
		at com.couchbase.client.core.msg.BaseRequest.cancel(BaseRequest.java:172)
		at com.couchbase.client.core.retry.RetryOrchestrator.lambda$maybeRetry$0(RetryOrchestrator.java:79)
		at com.couchbase.client.core.retry.RetryOrchestrator$$Lambda$388.00000000C4447700.accept(Unknown Source)
		at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:771)
		at java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:789)
		at java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2151)
		at com.couchbase.client.core.retry.RetryOrchestrator.maybeRetry(RetryOrchestrator.java:62)
		at com.couchbase.client.core.io.netty.chunk.ChunkedMessageHandler.channelInactive(ChunkedMessageHandler.java:238)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
		at com.couchbase.client.core.deps.io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:418)
		at com.couchbase.client.core.deps.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:389)
		at com.couchbase.client.core.deps.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:354)
		at com.couchbase.client.core.deps.io.netty.handler.codec.http.HttpClientCodec$Decoder.channelInactive(HttpClientCodec.java:311)
		at com.couchbase.client.core.deps.io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:221)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248)
		at com.couchbase.client.core.deps.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241)
		at 
:
:
:

It is not straightforward to get a logger configured to get the output from the Java SDK. But I suppose I could catch the exception and write some interesting information from that? Or from the query itself? I’m just not sure what we need to further troubleshoot this…?

@jda in addition, can you collet a cbcollectinfo and share that somehow with us? Since it might contain sensitive data you can send me the link to it via direct message as well.

Well also see my message above. A timeout will not help you there at all since it is about sockets being closed, not ops timing out. Logs are crucial, and to be honest client side logs from the SDK is what I’d consider best practice to have the slightest chance of troubleshooting…

Thanks - I’ll do that…

But will have to run out of the office now - so will get back with that a little later. For both servers?

Yeah for both ideally. Looking at your error context again, the exact same query failed, this time after around 2.5s.

Sorry, didn’t see this prior to responding before…

That is a good point. The servers are all VMs on the same ESXi host… so shouldn’t be too “far” from one to the other…

Will get back with what I find.

I have now started tcpdump on the application server (two windows - one for each database server) and on each of the database servers.

To limit the amount of data I use this from the application server:

tcpdump -i any -nn dst 172.23.130.49

where the IP address is for one of the database servers. And similarly on the database servers I have limited output to that coming from the application server:

tcpdump -i any -nn src 172.23.130.47

Will this provide us the necessary information? On the database server we will not get traffic between the two database servers, the sync.gateway and my SSH shell :slight_smile:

Links to dbcollect_info files on its way in a private message

I have sent you a link to the uploaded tcpdump log files.

I’m not sure what I’m looking for. From App server to DB2 I see a special pattern with many requests like:

10:08:29.688986 IP 172.23.130.47.56348 > 172.23.130.50.11210: Flags [P.], seq 22858774:22858814, ack 332883274, win 15055, options [nop,nop,TS val 503884021 ecr 502488840], length 40
10:08:29.690661 IP 172.23.130.47.56348 > 172.23.130.50.11210: Flags [P.], seq 22858814:22858854, ack 332883511, win 15055, options [nop,nop,TS val 503884022 ecr 502488841], length 40

and

10:08:36.443601 IP 172.23.130.47.61712 > 172.23.130.50.8093: Flags [.], ack 634506, win 1268, options [nop,nop,TS val 503890775 ecr 502495595], length 0
10:08:36.444220 IP 172.23.130.47.61712 > 172.23.130.50.8093: Flags [.], ack 646916, win 1268, options [nop,nop,TS val 503890776 ecr 502495595], length 0

That just seems kind of “different” from other times of the output!

As for the logs from the SDK I’m working on it…

@daschl I just might have a way to enable the slf4j logger… Are we interested in the com.couchbase.client.java package?

I’m trying to configure the use of SLF4J via LogBack - and have added this line to logback.xml to try and enable logging from the SDK:

	<logger name="com.couchbase.client.java" level="TRACE" />

Is this what we need?

Hi @daschl

I think I have found out how to add slf4j to the solution. However, the couchbase classes seem to ignore any settings I set in the logback.xml. Can you confirm this? If so - where is the file that the SDK is logging to?

I haven’t put this in production until I’m certain about this and any further changes that I may need…

I would suggest logging “com.couchbase”, so that it also includes the core components.

Also please note that the SDK is neither ignoring nor looking at logback.xml etc - this is all the other side of the slf4j facade. The SDK logs into SLF4J if found on the classpath, and which implementation you use for logging (i.e. logback) is completely up to you.

@jda looking at the logs, we can see that something is really closing off client connections, but the SDK is not expecting it.

You can see this by grepping for

WARN com.couchbase.endpoint - [com.couchbase.endpoint][UnexpectedEndpointDisconnectedEvent] The remote side disconnected the endpoint unexpectedly

Now there is other noise in there, which I think is related: Writing into the channel failed unexpectedly … which can happen if the channel gets closed at the same time, but in this case the op will be retried.

So I think there are two follow up actions for you, and I’ll see if the cbcollects show something.

  • You need to figure out why those sockets are being closed upon the client. could be the jvm, could be the network etc.
  • I think you can band-aid your immediate problem by setting the “readonly” option to true for your query request. Since it is a select only (do not! do this for n1ql upsert, insert into etc) it will be retried even in the case above since it is considered to be idempotent. This will not fix the underlying issue but should give you more time to investigate properly.

I’ll let you know once I find something in the cbcollect

I looked through the infos provided and it looks like in addition to the client experiencing socket closes, the servers in between are also experiencing “write: connection reset by peer from” … which you can find in the ns_server.query.log for example on both machines.

This really starts to point towards the network, and something cutting off connections.

@daschl Thanks for the valuable input!

A general network issue could also explain why we don’t see any of this in the test environment (though it has a less activity). All of the servers with the trouble are on the same VMware ESXi server - so there really shouldn’t be any physical reason why they experience this. I have no direct access to that host but will get a hold on the people responsible for it and see what they can tell me.

If I understand what we know now correctly, then this has nothing to do with version 6.6 to 7.0 - so is more likely a coincidence that we experience the issue around the same time. Do you also agree on this?

@jda since I don’t know the state of your app before the upgrade (including logs to compare), it is impossible for me to say. The cbcollectinfo also reveals other issues that can / should be addressed over time, but those are probably out of scope for the immediate issue at hand. For example your db2 is in swap, which we do not recommend. And it looks like it has both servers installed? couchbase-server-community-6.6.0-7909.x86_64, couchbase-server-community-7.0.0-5302.x86_64) - which db1 does not.

Also on db2 one interface reports packets dropped, another network issue indication:

[warn] Interface ‘ens32’ (removedip) failures: RX:2863 / TX:0 - Details:
- RX packets:308,593,167 errors:0
dropped:2,863 overruns:0 frame:0

@daschl yes it is strange that two servers that start out being built exactly the same way (only name differs) over time starts to have differences.

I saw a message in connection with yum that there was a faulty registration - so I’ll try and troubleshoot that also (if I remember correctly it was something about cleaning the repo… - so didn’t seem important when being on holiday and working via mobile data :innocent: )… But only version 7.0.0 should be installed (only one couchbase under /opt/ and the web dashboard also reports 7.0.0).

When you say that db2 is in swap what exactly do you mean? That there is a difference in configuration (there shouldn’t be) or that it is running out of available memory and swapping? Looking at the servers right now they seem to be running with close to the same amount of memory being used…

@jda both nodes are running in swap, which we do not recommend since it causes performance problems and makes it hard to trouble shoot. At the time of cbcollect capture, both had a vm.swappiness of 30 configured (we recommend 0) and both used around 3% of swap.

Also, both machines have THP enabled (transparent huge pages), where our recommendation is to set it to never or madvise.

I’d recommend to check out Deployment Guidelines | Couchbase Docs, it really helps with the long time performance of your server cluster.

@jda more findings - there is something else going on on your esx host.

If you check your memcached.log file, you can see that it’s complaining about jumping epoch (pretty much 200 seconds on point) very often - so something is off with your system clock. maybe your virtual host is sleeping and/or not being scheduled properly… but I’m not an expert on causes why that could jump.

In addition to that, it might be unrelated but still not good - kv engine reports slow ops … you can see that tasks that should be quick are taking a longer time. i.e.:

2021-08-06T08:50:06.418234+02:00 WARNING (No Engine) Slow runtime for 'Updating stat snapshot on disk' on thread WriterPool3: 496 ms
2021-08-06T08:51:06.938891+02:00 WARNING (No Engine) Slow runtime for 'Updating stat snapshot on disk' on thread WriterPool3: 508 ms
2021-08-06T08:52:07.466170+02:00 WARNING (No Engine) Slow runtime for 'Updating stat snapshot on disk' on thread WriterPool3: 515 ms
2021-08-06T08:53:07.997783+02:00 WARNING (No Engine) Slow runtime for 'Updating stat snapshot on disk' on thread WriterPool3: 519 ms
2021-08-06T08:54:08.503049+02:00 WARNING (No Engine) Slow runtime for 'Updating stat snapshot on disk' on thread WriterPool3: 491 ms
2021-08-06T08:55:08.979337+02:00 WARNING (No Engine) Slow runtime for 'Updating stat snapshot on disk' on thread WriterPool3: 476 ms

So I have a suspicion that something is not right with your vms. But then again, I’m not an expert in that area and at this point I’m probably out of my troubleshooting comfort zone so your mileage will vary.