Unable to rebalance Membase Cluster
Hi Folks,
We have a Membase cluster with 4 nodes (all of them running Windows Server 2008 R2 64-bit).
For testing purposes we stopped the Membase service on 3 of these nodes. After starting the service again on all of them, the rebalance operation failed with the following error:
"Rebalance exited with reason {wait_for_memcached_failed,"bucketname",['ns_1@x.x.x.13','ns_1@x.x.x.11']}"
After several unsuccessful attempts, we removed these nodes from the cluster, and the remaining ones rebalanced properly.
Then we tried to add one of the problematic nodes back and got the following error (the same error occurred when trying to add the second one):
Server error during processing: ["web request failed",
 {path,"/controller/addNode"},
 {type,exit},
 {what,
  {timeout,
   {gen_server,call,
    [ns_cluster,
     {add_node,"x.x.x.13",8091,
      {"user","pass"}},
     30000]}}},
 {trace,
  [{gen_server,call,3},
   {ns_cluster,add_node,3},
   {menelaus_web,handle_add_node,1},
   {menelaus_web,loop,3},
   {mochiweb_http,headers,5},
   {proc_lib,init_p_do_apply,3}]}]
Joining the cluster from the .13 node itself (the inverse process) did work and the node was added, but the rebalance failed again:
"Rebalance exited with reason {wait_for_memcached_failed,"bucketname",['ns_1@x.x.x.13']}"
We repeated these operations several times without success, and the cluster currently has only 2 of its 4 nodes working.
Our Membase version is 1.7.2r-20-g6604356.
What could be the root cause of this behavior? How would you recommend we proceed?
Thanks in advance!
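In case it helps with the diagnosis, below is a minimal sketch of how each node's reported state could be collected from the REST interface on port 8091. It assumes that GET /pools/default returns a JSON document with a "nodes" list whose entries carry "hostname", "status" and "clusterMembership" fields; the function names are ours and purely illustrative, and the host and credentials are the same placeholders used in the error above.

```python
import base64
import json
from urllib.request import Request, urlopen


def summarize_nodes(pool_info):
    """Return (hostname, status, clusterMembership) for each node entry.

    Assumes pool_info is the parsed JSON from /pools/default; fields that
    are missing are reported as None instead of raising an error.
    """
    return [
        (node.get("hostname"), node.get("status"), node.get("clusterMembership"))
        for node in pool_info.get("nodes", [])
    ]


def fetch_pool_info(host, user, password, port=8091):
    """Fetch /pools/default from one node using HTTP basic authentication."""
    url = "http://%s:%d/pools/default" % (host, port)
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    request = Request(url, headers={"Authorization": "Basic " + token.decode("ascii")})
    with urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (placeholders, not real values):
#   info = fetch_pool_info("x.x.x.11", "user", "pass")
#   for hostname, status, membership in summarize_nodes(info):
#       print(hostname, status, membership)
```

Running this against one of the two working nodes and against .13 should show whether both sides agree on which nodes are members of the cluster.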
Ale