CRASH: Exit status 74
We love the concept of membase. It really does seem like a scalable way of doing things. We are evaluating it to see whether it is ready for an enterprise deployment. However, we are getting a ton of these errors:
Port server vbucketmigrator on node 'ns_1@192.168.8.222' exited with status 74. Restarting. Messages: An error occured on the downstream connection..
Downstream connection closed.. shutdown upstream
Had 290 pending messages at exit.
There are 3 nodes in the cluster. They are identical, each with 16GB of RAM and a 500GB SAS disk. My gut is telling me this is not really enterprise ready yet. Is this an error I should take seriously? I'd love to hear feedback.
Can you give me a screenshot of the more detailed stats screen of each bucket?
"Looks" like severe over-provisioning, but it might also be something else.
Can you also send over the output of "/opt/membase/bin/ep_engine/management/stats :11210 all [bucket_password]"? Run it against each server and for each bucket on each server.
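For a cluster with several nodes and buckets, the "each server, each bucket" loop can be scripted. The following is only a minimal sketch: the tool path and the "all" argument are taken from the command quoted above, while the function names, the output file naming, and the idea that the node address goes in front of ":11210" are assumptions, as are any host and bucket names you pass in. How the tool maps the password to a bucket is not shown in this thread, so check that against your install.

```python
import subprocess

# Path as quoted in the post above; adjust if your install differs.
STATS_TOOL = "/opt/membase/bin/ep_engine/management/stats"


def build_stats_commands(hosts, bucket_passwords, port=11210):
    """Return one (host, bucket, argv) tuple per server/bucket pair.

    Assumption: the node address is placed directly before ":11210".
    bucket_passwords maps a bucket name to its password ("" if none).
    """
    commands = []
    for host in hosts:
        for bucket, password in bucket_passwords.items():
            argv = [STATS_TOOL, f"{host}:{port}", "all"]
            if password:
                # The optional [bucket_password] argument from the post.
                argv.append(password)
            commands.append((host, bucket, argv))
    return commands


def collect_stats(hosts, bucket_passwords):
    """Run each command and save its output to stats_<host>_<bucket>.txt."""
    for host, bucket, argv in build_stats_commands(hosts, bucket_passwords):
        result = subprocess.run(argv, capture_output=True, text=True)
        with open(f"stats_{host}_{bucket}.txt", "w") as out:
            out.write(result.stdout)
            if result.returncode != 0:
                # Keep the error text so a failed run is still visible.
                out.write(result.stderr)
```

Calling collect_stats(["192.168.8.222"], {"default": ""}) on a machine that has the tool installed would write one file for that node and bucket; add the other node addresses and bucket names as needed.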
Perry
Unfortunately I had to shut it down, but it looked like it was stuck in some sort of loop. There was plenty of RAM left on that bucket, so I'm not sure why it was accessing the disk to that degree.
Without seeing some stats or logs I'd have a really hard time venturing a guess at what was going on.
If you can, please gather '/opt/membase/bin/ns_server/collect_info ' from each node and send it to perry -at- couchbase -dot- com.
Is this something you've seen happen before and can reproduce?
Perry
Hi Perry,
Thanks for getting back to me. I will keep tabs on the cluster, run it again, and see if I can get it to show the same bug. Once I see it again, I will send you the data.
Raj
Raj, please make sure you are using the latest version (currently 1.6.5.3) as it has a number of stability fixes and improvements that you'll want to take advantage of.
Perry
Thanks. What is the best way to upgrade? I am new at this. I saw some instructions for upgrading, but they seem to be for major releases.
Every release has specific information on upgrades and release notes here: http://techzone.couchbase.com/wiki/display/membase/Releases
We have now killed all connections to it. Operations have been at 0 for about 30 minutes. Disk fetches are through the roof and the CPU on each machine is maxed out. See the attached picture here:
http://bayimg.com/eAEpfAaDJ