SASL auth exception on default bucket when one node goes down in 2 node cluster

I am testing a 2-node cluster and am having problems generating workloads when one server is down but not failed over. Even when I connect to the server that is still up and running, I get an error. Only after I go to the live server and fail over the down node am I able to connect and send data to Couchbase.

Scenario:
1) Both servers (node1, node2) are alive and cbworkloadgen is functioning normally
2) Shut down the couchbase-server service on node2
3) cbworkloadgen fails to connect with the following:

[node1]# /opt/couchbase/bin/tools/cbworkloadgen -n node1:8091 -r .95 -i 100000 -s 100 -t 1 -u Administrator -p password

s0 error: CBSink.connect() for send: error: SASL auth exception: node2:11210, user: default
s0 error: async operation: error: SASL auth exception: node2:11210, user: default on sink: http://node1:8091(default@N/A-0)
error: SASL auth exception: node2:11210, user: default

4) Execute failover of node2 on node1 UI
5) cbworkloadgen functions normally
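
One way to confirm the "down but not failed over" state from step 3 might be to look at how the cluster itself reports its nodes. The sketch below assumes the REST response from http://node1:8091/pools/default lists each node with `hostname`, `status`, and `clusterMembership` fields. The sample data is hand-written for illustration, not captured output, and `down_but_not_failed_over` is a hypothetical helper name.

```python
def down_but_not_failed_over(pool_info):
    """Return hostnames of nodes that are not healthy but are still active members."""
    return [
        node["hostname"]
        for node in pool_info.get("nodes", [])
        if node.get("status") != "healthy"
        and node.get("clusterMembership") == "active"
    ]


# Hand-written sample shaped like a /pools/default response (an assumption,
# not real output from this cluster).
sample = {
    "nodes": [
        {"hostname": "node1:8091", "status": "healthy",
         "clusterMembership": "active"},
        {"hostname": "node2:8091", "status": "unhealthy",
         "clusterMembership": "active"},
    ]
}

print(down_but_not_failed_over(sample))
```

If node2 shows up in that list, it is down but the cluster still treats it as an active member, which matches the window between steps 2 and 4.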

Auto-failover only works with 3 or more nodes, so I could add another node and hope the auto-failover happens fast enough, but we don't have another server allocated yet.

Is this normal behavior when a node goes down?

1 Answer

Hi there,

I believe this is more a problem with cbworkloadgen.

Your application using a client library should be able to catch that sort of exception and move on…
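
As a rough sketch of what "catch and move on" could look like in application code: the example below retries a write a few times and then gives up on that key instead of aborting the whole run. `TransientNodeError` and the `store` callable are hypothetical stand-ins, not part of any Couchbase SDK; a real client library raises its own exception types.

```python
import time


class TransientNodeError(Exception):
    """Hypothetical stand-in for the error a client raises when a node is unreachable."""


def write_with_retry(store, key, value, attempts=3, delay=0.0):
    """Call store(key, value) up to `attempts` times.

    Returns True on the first success, False if every attempt failed.
    """
    for _ in range(attempts):
        try:
            store(key, value)
            return True
        except TransientNodeError:
            # The node owning this key may be down; wait, then try again.
            time.sleep(delay)
    return False


def make_flaky_store(failures):
    """Build a fake store that fails `failures` times before succeeding."""
    state = {"left": failures}

    def store(key, value):
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientNodeError(key)

    return store


print(write_with_retry(make_flaky_store(2), "k1", "v1"))
print(write_with_retry(make_flaky_store(5), "k2", "v2"))
```

Keys whose owning node is still up succeed on the first try; keys owned by the down node keep failing until it is failed over, so the caller can log them and continue.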

In response to your final question: yes, it's expected that you wouldn't be able to connect to a server that is down.

I hope this helps!

Thanks for the reply, but the problem is that I cannot connect to the server that is still up.

As you can see from my command line, I'm specifying node1 for -n, which is the server that is still up and running.

When I specify node2 instead, I get the expected "connection refused".

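The issue there could be the following:

When the node is down but not yet failed over, the vBucket map still assigns its share of the vBuckets to it (typically about half in a balanced 2-node cluster). cbworkloadgen fetches that map from node1 and then tries to connect to node2:11210 for those vBuckets, which is the address in your error. This is because Couchbase is "CP" in CAP terms: each vBucket has a single active copy, and the replica on node1 is not promoted until node2 is failed over.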
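
To make the vBucket map point concrete, here is a toy model, not Couchbase's actual implementation. It assumes 1024 vBuckets, a plain CRC32-modulo key hash (real clients use a CRC32-based hash with extra bit manipulation), and a round-robin layout of active vBuckets. All function names are made up for illustration.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed vBucket count for this toy model


def vbucket_for(key):
    """Simplified key hash: CRC32 of the key modulo the vBucket count."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS


def build_map(nodes):
    """Toy layout: assign active vBuckets round-robin across the nodes."""
    return [nodes[vb % len(nodes)] for vb in range(NUM_VBUCKETS)]


def fail_over(vbucket_map, failed, survivor):
    """Model failover: vBuckets owned by `failed` become active on `survivor`."""
    return [survivor if owner == failed else owner for owner in vbucket_map]


def owner_of(key, vbucket_map):
    """Node a client would contact for this key under the given map."""
    return vbucket_map[vbucket_for(key)]


keys = ["key-%d" % i for i in range(1000)]
before = build_map(["node1", "node2"])
after = fail_over(before, "node2", "node1")

on_node2_before = sum(owner_of(k, before) == "node2" for k in keys)
on_node2_after = sum(owner_of(k, after) == "node2" for k in keys)
print(on_node2_before, on_node2_after)
```

Before failover, a share of the keys still route to node2 even though the client bootstrapped from node1. After failover, none do, which lines up with cbworkloadgen working again at step 5.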