Java SDK cannot access Query node in Couchbase 5.0

@subhashni The new nodes being rebalanced in are data-only nodes. We have 8 new nodes coming in and 8 data nodes going out (swap rebalance). The cluster also has 3 index nodes and 2 query nodes. We wound up needing to pause the rebalance because it was taking too long (days…). We will continue the rebalance over the weekend and might need to drop views in order to complete it. However, even after the rebalance has stopped, all of the Java clients are constantly spamming logs with these warning messages (~200K log events per minute across our infrastructure now). Our cluster currently has 16 data nodes, 3 index nodes, and 2 query nodes since we’re in between stages of rebalancing…

It appears that these warnings will continue until we complete the rebalance. Let me know once we have a fix in the Java SDK for this.

@daschl It does not recover unless the rebalance completes. As far as we can tell, the applications appear to be fine other than the spammy logs, but the warn messages are a bit concerning because it seems like something is fundamentally wrong with how the Java SDK behaves during a rebalance. We’ve never had good experiences with the Java SDK while performing maintenance on clusters. This particular cluster has > 1TB of data, 8 buckets, 50 design documents (views), and several N1QL indexes. Performing a rebalance on a cluster of this size takes days, so it’s critical that the client SDKs behave properly during these windows.

@bryan in this case it looks like the server sends us a config with the KV engine already enabled on the new node, but the node is not ready to serve the bucket yet. The reason the workload is not impacted is that no partitions are active on that node yet, but when the client connects, the server rejects the connection since it doesn’t know about that bucket yet.

I’ll check with the team if we can improve the situation on the client side while working to improve this on the server as well.
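To make the behavior described above concrete, here is a toy model of it. This is only a sketch: `RebalanceWarningSketch`, `Node`, `tryConnect`, and the bucket name `orders` are hypothetical names made up for illustration and are not part of the Couchbase Java SDK or server. It assumes a node that is advertised as KV-enabled in the config but has not loaded the bucket yet, so every connection attempt is rejected and logged, while no partitions depend on that node.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types exist in the Couchbase Java SDK.
public class RebalanceWarningSketch {

    /** A node as it might appear in a cluster config (hypothetical shape). */
    static final class Node {
        final String host;
        final boolean kvAdvertised;   // the config says the KV service is enabled
        final int activePartitions;   // partitions currently owned by this node
        final Set<String> readyBuckets = new HashSet<>(); // buckets it can actually serve

        Node(String host, boolean kvAdvertised, int activePartitions) {
            this.host = host;
            this.kvAdvertised = kvAdvertised;
            this.activePartitions = activePartitions;
        }
    }

    /** Simulates one client attempt to open a KV connection for a bucket. */
    static boolean tryConnect(Node node, String bucket, List<String> warnings) {
        if (!node.kvAdvertised) {
            return false; // the client would not try this node at all
        }
        if (!node.readyBuckets.contains(bucket)) {
            // The node does not know the bucket yet, so it rejects the connection.
            warnings.add("WARN: " + node.host + " rejected bucket '" + bucket + "'");
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Node incoming = new Node("new-data-node", true, 0);
        List<String> warnings = new ArrayList<>();

        // While the rebalance is paused, each reconnect attempt fails and logs again.
        for (int attempt = 0; attempt < 3; attempt++) {
            tryConnect(incoming, "orders", warnings);
        }
        System.out.println(warnings.size() + " warnings, active partitions: "
                + incoming.activePartitions);

        // Once the node can serve the bucket, the connection succeeds and the warnings stop.
        incoming.readyBuckets.add("orders");
        System.out.println("connected after bucket ready: "
                + tryConnect(incoming, "orders", warnings));
    }
}
```

In this model the warnings repeat for as long as the node stays in the "advertised but not ready" state, which matches the observation that they only stop once the rebalance completes.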

Thank you! Keep us posted and we can re-test on a newer version of the SDK.

We have this issue too. Is this warning being fixed?

@daschl this continues to cause problems for us… Any updates from your end? Fixes in clients we haven’t upgraded to yet?

@unhuman you might be running into https://issues.couchbase.com/browse/JVMCBC-564 or a side effect of it. Can you upgrade to 2.6.2 or 2.7.0 and check if it helps?
