Adding a bucket causes issues
First let me say: Kudos on Membase. Love the UI, and everything about it!
I installed 1.6.5 Community on two separate machines over the past few days. Both are 32-bit Windows Server 2003 VMs with a couple of gigs of RAM.
The one machine works great. I have the default bucket and a new bucket. My code is using the non-default bucket in that environment and is very happy with it. There are no errors in the logs and it's purring away.
The other machine seems to work great when ONLY the default bucket exists. But as soon as the new bucket is added, the entire thing goes south. With the single default bucket, I can telnet into the server on port 11211, run "stats", and get a listing of those statistics as one would expect. Once the new bucket is added, I start seeing issues in the Log area of the UI, and the telnet command no longer works (I get "SERVER_ERROR proxy failed to write downstream" or something along those lines). If I uninstall and reinstall with only the default bucket, it works fine.
I was also seeing errors earlier in the Log about sqlite and disk I/O errors. I don't know whether they are related to the same problem. I also get exception windows on the desktop of that machine, and entries (both Informational and Error) in the Application Event Log. Here is a sample of those errors:
Reporting queued error: faulting application memcached.exe, version 0.0.0.0, faulting module ep.so, version 0.0.0.0, fault address 0x000a2e4c.
Reporting queued error: faulting application memcached.exe, version 0.0.0.0, faulting module unknown, version 0.0.0.0, fault address 0xc0840000.
I'm guessing there is some configuration issue on the machine that I am not aware of that is the root cause of this problem. Any idea what that may be?
Other troubleshooting ideas?
Perry
As far as perms go on that directory: Machine\Administrators has Full Control, CREATOR OWNER has Special Perms (Create Files/Write Files & Create Folders/Append Data), SYSTEM has Full Control, and Machine\Users has Read & Execute, List Folder Contents, Read, and Special (Create Files/Write Files & Create Folders/Append Data).
As far as firewalls go, the Windows Firewall built into the OS is not running, so that's not an option.
I will create a bucket on this machine and post the log errors I am seeing.
This server has been running very well today with 1 bucket on it. There are no events in the Log to report. This server is intended to be stand-alone (for now), so I'm not sure a firewall is even a possible issue. The servers do sit behind firewalls, but I am not familiar with that layout.
Thanks again. I will post logs later.
The first error occurred 9 seconds after creating the bucket:
(IP Address edited)
Control connection to memcached on 'ns_1@xxx.xxx.xxx.xxx' disconnected: {{badmatch,{error,timeout}},
[{mc_client_binary,stats_recv,4},
{mc_client_binary,stats,4},
{ns_memcached,handle_call,3},
{gen_server,handle_msg,5},
{proc_lib,init_p_do_apply,3}]}
Module Code: ns_memcached004
Now when I telnet to the box on port 11211, I receive "SERVER_ERROR proxy write to downstream".
I think I'll need to look at the full logs...can you send them to perry -at- couchbase -dot- com?
Thanks for the logs.
It appears that there's possibly something wrong with the disk on that node. I see lots of errors related to:
memcached<0.615.0>: Failed to create database: disk I/O error
I'm trying to figure out why the default bucket would be okay, but there's definitely something wrong. It looks like you've configured Membase to use: d:/Membase/data/ns_1/ as the data directory...can you check the permissions on that as well? I'm actually not seeing any permissions errors, so it's possible there is something lower level. Is that disk drive full? It would also be good to test whether this works successfully by using the default data directory (not changing it in the setup screen).
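One way to run these checks outside of Membase is a small script like the sketch below. It looks at three things in a directory: free space, whether a plain file can be written and synced, and whether a SQLite database can be created there. This is not how Membase itself opens its files. The function name check_data_dir, the probe file name, and the use of WAL journal mode are all assumptions made for this example. Run it as the same account the service runs under, otherwise the permission result does not tell you much.

```python
import os
import shutil
import sqlite3
import tempfile


def check_data_dir(path: str) -> dict:
    """Probe a directory for free space, file writes, and SQLite creation."""
    result = {
        "exists": os.path.isdir(path),
        "free_bytes": None,
        "file_write": False,
        "sqlite_create": False,
        "error": None,
    }
    if not result["exists"]:
        return result

    result["free_bytes"] = shutil.disk_usage(path).free
    try:
        # 1. Plain file write, flushed to disk, removed again on close.
        with tempfile.NamedTemporaryFile(dir=path) as fh:
            fh.write(b"probe")
            fh.flush()
            os.fsync(fh.fileno())
        result["file_write"] = True

        # 2. SQLite database creation (file name is hypothetical).
        db_path = os.path.join(path, "probe_check.db")
        conn = sqlite3.connect(db_path)
        try:
            # WAL mode is an assumption; drop this line to test the default journal.
            conn.execute("PRAGMA journal_mode=WAL")
            conn.execute("CREATE TABLE IF NOT EXISTS probe (k TEXT PRIMARY KEY, v BLOB)")
            conn.execute("INSERT OR REPLACE INTO probe VALUES (?, ?)", ("k", b"v"))
            conn.commit()
            result["sqlite_create"] = True
        finally:
            conn.close()
            # Clean up the probe database and any WAL side files.
            for suffix in ("", "-wal", "-shm"):
                try:
                    os.remove(db_path + suffix)
                except OSError:
                    pass
    except (OSError, sqlite3.Error) as exc:
        result["error"] = repr(exc)
    return result
```

For example, check_data_dir("d:/Membase/data/ns_1") returns a dict. If "file_write" is true but "sqlite_create" is false, the "error" entry may help narrow down whether the failure is specific to SQLite on that disk.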
Perry
I'll uninstall and try it on the C: drive and see if that helps. There is plenty of space.
Nope, didn't help. Still received errors. Keep in mind these are VMs. Don't know if that'll make a difference to you or not.
It shouldn't make a difference, but there's definitely a problem with us trying to create/open the database (and likely something out of our control).
More troubleshooting...
-What user is the Membase service running as?
-Can you allow unlimited permissions to the data directory?
Perry
The service is running under the "System Account" while the c:\program files\membase\server\data directory has the following permissions
Administrators: Full Control
Power Users have Modify/read & exec/List folder/read/write perms
SYSTEM has full control
USERS has read & exec/list folder/read perms
I haven't done anything yet with this server today...but as of right now, when I visit it on port 8091 via browser to view the UI I get
"Error 324 (net::ERR_EMPTY_RESPONSE): Unknown Error."
From your logs, you've configured the data directory to be d:/Membase/data/ns_1/ right?
What are the permissions on that?
I sent you 2 different log files. One was a few days old and there had been uninstalls/reinstalls done so it may be old.
Right at the moment there is just the c:\program files\membase\server\data directory. There is also the "bucket" directory, set up as c:\membase, which has the .mb, .mb-shm and .mb-wal files in it.
Yes, sorry for any confusion...
The d:\membase directory is what I am concerned with at the moment...
From the logs, it seems that we are having a very hard time gathering stats. Considering everything is local, this may point to problems with the system's responsiveness...
Judging from the fact that you have another server working just fine, would it be possible to start completely fresh with a new VM? I suspect that there's something quite wrong with this one...
Perry
I have it in someone's queue to look at it and see if there are issues. I have since installed it on a 2-server cluster in our production environment and it seems to be working just fine.
Just to further comment on the issue. The production 2 server cluster that is running was having issues with more than one bucket in our environment. I removed the non-Default bucket, and it worked fine over the entire weekend. Friday while the bucket was running it was a mess. So I don't know what is going on there.
Thanks for the update...can you be a little bit clearer about where you are now?
So at the moment we have a single-machine "cluster" in the dev environment that is working like a champ. The single-server test version has never worked worth a darn. And the 2-machine cluster I installed in the production environment is unstable, much like our test environment, and is exhibiting some of the same behaviors.
My next step is to spin up a couple of 2-VM clusters, one in dev and one in test, to see if a fresh Windows 2008 install will help us out.
If that doesn't work, I don't know what we'll do next.
Sounds good...everything I've seen points to problems with the underlying systems. It would be great if you could work with us to figure out what it is so that we can either add it to our release notes or improve the product to better handle this environment.
Perry
If anything, this post will hopefully help somebody in a similar situation in the future. I'd be more than willing to help out any way I can. This seems like a great product and I'm anxious to make it work for us. I'd like to know the underlying cause of our problem as well.
I'll be happy to look deeper into the logs (unfortunately those faults don't tell me much), but I think your assumption about the special configuration on that machine is correct.
The two main things to check are the permissions on the data directory (by default C:\Program Files\Membase\Server\data\ns_1, unless you've changed it) and any firewall that is in place.
I would actually first look at the firewall...can you turn it off completely and see if that stabilizes things?
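To compare behaviour before and after turning the firewall off, a simple TCP connect probe can help. The sketch below only checks whether a port accepts connections. It does not verify that the service behind it is healthy. The function name probe_port is made up for this example, and the ports in the usage note (8091 for the web UI, 11211 for the memcached-compatible port) are the ones mentioned in this thread.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Return 'open', 'refused', 'timeout', or 'error: ...' for a TCP connect attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all; often a sign that packets are being dropped.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        return f"error: {exc}"
```

Running probe_port("127.0.0.1", 8091) and probe_port("127.0.0.1", 11211) on the server itself, and then from another machine, gives a rough picture. A "timeout" from the remote side only would point toward filtering in between, while "refused" locally would suggest the service is not listening at all.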
Perry