Can't remove servers, rebalance or run any terminal commands
I was running a cluster of 4 servers, then shut down 3 of them. Then I tried adding 2 new servers to the cluster. However, I cannot remove the 3 down servers, nor can I rebalance -- the buttons are disabled. Here's a screenshot of my web console. (I'm running Ubuntu 11.04.)
What do I do? I can't seem to do anything at this point. I thought I'd try to fix things from the command line, but when I run membase from the command line, I get the following output:
The maximum number of open files for the membase user is set too low.
It must be at least 10240. Normally this can be increased by adding
the following lines to /etc/security/limits.conf:

membase soft nofile <value>
membase hard nofile <value>

Where <value> is greater than 10240.
I've tried doing what the output suggests, but it makes no difference.
How do I fix things? Is there a way I can completely kill the membase service?
Thanks!
I tried pressing the Failover button. A modal window opened and displayed a "Loading" indicator. I let it sit there for 5 minutes, but nothing happened -- it just stayed on the loading window. (I tried this several times, from different servers, and it did the same thing each time.)
What I originally posted is the exact output from membase. It doesn't matter which commands/options I pass -- it always displays this output:
PROMPT> membase
The maximum number of open files for the membase user is set too low.
It must be at least 10240. Normally this can be increased by adding
the following lines to /etc/security/limits.conf:

membase soft nofile <value>
membase hard nofile <value>

Where <value> is greater than 10240.
Kyle, that error message is happening because the user that you are running membase as does not have a high enough ulimit. The error message is a bit misleading though. The server expects to be run as the 'membase' user (via the /etc/init.d/membase-server script), and that's why the message says to add the "membase" user to that configuration file. If you replace 'membase' with the name of the user you are running it as, that message should go away.
Either way though, it really doesn't matter, as the number of open files just controls the maximum number of connections the server can accept. (There was a bug in previous versions which would cause memcached to crash if this value was too low; that has been resolved.)
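For anyone hitting the same message, here is a minimal Python sketch (not part of Membase) that checks the open-file limit of the current user and prints the limits.conf lines with that user's name substituted in, as suggested above. The function names are made up for illustration; the 10240 threshold and the line format come from the error message itself.

```python
import getpass
import resource

REQUIRED_NOFILE = 10240  # threshold quoted in the Membase error message


def limits_conf_lines(user, value):
    """Build the two /etc/security/limits.conf lines for the given user."""
    if value <= REQUIRED_NOFILE:
        # The message asks for a value greater than 10240.
        raise ValueError("value must be greater than %d" % REQUIRED_NOFILE)
    return [
        "%s soft nofile %d" % (user, value),
        "%s hard nofile %d" % (user, value),
    ]


def nofile_is_sufficient(soft_limit, required=REQUIRED_NOFILE):
    """True if the soft open-file limit is at least the required value."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required


if __name__ == "__main__":
    # Soft and hard limits for the process (and user) running this script.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft nofile limit:", soft, "hard nofile limit:", hard)
    if not nofile_is_sufficient(soft):
        print("Limit is too low; consider adding to /etc/security/limits.conf:")
        for line in limits_conf_lines(getpass.getuser(), 20000):
            print(line)
```

Run it as the same user that starts membase. Changes to limits.conf normally only take effect for new login sessions, so log out and back in before checking again.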
As far as the failover issues go, I'll try to reproduce that here and get back to you.
Perry
Kyle, I was unable to reproduce the issues that you're having in my own tests:
-I set up a 4 node cluster
-I turned off membase-server on 3 of them
-I added a 5th node...the rebalance failed as expected
-The failover buttons are very responsive and once I failed over the 3 downed nodes, I was able to rebalance the cluster.
Please make sure that you are using the latest version (1.7) and not any of the developer previews that we had previously released. If you are able to reproduce this again, I'd like to take a look at the logs on those various nodes.
Thanks
Perry
Thanks again for the reply, Perry.
Your step #2 isn't exactly accurate: when I encountered the problem, I didn't elegantly turn off membase-server. I actually shut down the machine entirely (running on Amazon EC2, I just terminated the instances).
Perhaps the inelegant method of killing those membase servers caused the problems? Also, I believe I was using 1.6, but I'll double check. I'll be sure to experiment if I run into this issue again, thanks for your assistance.
You may be right, the difference in #2 is subtle but definitely a difference.
I would highly suggest trying this again with 1.7 as we made a number of improvements to support exactly this eventuality.
Perry
You can't rebalance when nodes are down...there's no way for us to pull the data off in order to redistribute it.
If you have enough replica copies, you can fail over the 3 nodes and then you'll be able to rebalance. (Even if you don't have replica copies you'll be able to fail over and rebalance, but that would technically result in data loss, although you can recover that as well by using mbrestore.)
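The order of operations described here can be sketched in a few lines of Python. This is only an illustration of the reasoning, not Membase code: the function name is hypothetical, and treating "enough replica copies" as "no more nodes down than there are replica copies" is a simplifying assumption.

```python
def recovery_plan(node_is_up, replica_copies):
    """Return (steps, data_loss_possible) for a cluster with down nodes.

    node_is_up: dict mapping a node name to True (up) or False (down).
    replica_copies: number of replica copies configured (assumed input).
    """
    down = sorted(name for name, up in node_is_up.items() if not up)
    # Down nodes must be failed over first; rebalance cannot pull data off them.
    steps = ["fail over " + name for name in down]
    steps.append("rebalance")
    # Assumption: data may be lost when more nodes are down than replicas exist.
    data_loss_possible = len(down) > replica_copies
    if data_loss_possible:
        steps.append("restore missing data with mbrestore")
    return steps, data_loss_possible


if __name__ == "__main__":
    cluster = {"node1": True, "node2": False, "node3": False, "node4": False}
    steps, loss = recovery_plan(cluster, replica_copies=1)
    for step in steps:
        print(step)
    print("data loss possible:", loss)
```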
As a separate issue, I don't think you should be getting that output from running it manually. Can you show me exactly what you're running and how that output is being presented?
Hope that helps
Perry