Membase crashes every 2 days
Hello
Our Membase server crashes every 2 days. Error log:
--------------------------
Module Code: ns_memcached004
Control connection to memcached on 'ns_1@127.0.0.1' disconnected: {{badmatch,{error,timeout}},
 [{mc_client_binary,stats_recv,4},
  {mc_client_binary,stats,4},
  {ns_memcached,handle_call,3},
  {gen_server,handle_msg,5},
  {proc_lib,init_p_do_apply,3}]}
--------------------------
--------------------------
Module Code: ns_port_server000
Port server memcached on node 'ns_1@127.0.0.1' exited with status 134. Restarting. Messages: memcached: stored-value.cc:51: bool StoredValue::restoreValue(value_t, EPStats&, HashTable&): Assertion `v->length() == valLength()' failed.
--------------------------
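A side note on "exited with status 134" in the log above: by the usual Unix shell convention, a status above 128 means the process was terminated by signal number (status - 128). Here that gives signal 6, SIGABRT, which is what a failed assert() raises, so it is consistent with the assertion message. A minimal sketch of the decoding follows, assuming ns_port_server reports status using that convention (nobody in this thread confirms it); the helper name decode_exit_status is made up for illustration.

```python
import signal


def decode_exit_status(status: int) -> str:
    """Hypothetical helper: interpret a shell-style exit status.

    Assumes the 128 + signal-number convention for processes
    terminated by a signal.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with code {status}"


# 134 - 128 = 6, which is SIGABRT on Linux.
print(decode_exit_status(134))
```

On Linux this should report SIGABRT for status 134.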
# uname -a
Linux 2.6.32-5-amd64 #1 SMP Wed Jan 12 03:40:32 UTC 2011 x86_64 GNU/Linux
Server config: 2x Intel Xeon Processor E5620, RAM: 48 GB DDR3 1333 MHz
Bucket config: Cache Size: Dynamic RAM Quota 16 GB; Storage Size: Total Cluster Storage 836 GB
Membase usually crashes when we are at ~200 GB of disk usage and 70% RAM usage.
As far as I can see, the same bug is reported at http://bugs.membase.org/browse/MB-3443, but as I understand it, it was marked as fixed without any actual fix?
We have another memcached server on this machine and it works normally. The server is under heavy load: average CPU idle is 20%.
This is a really big problem...
Can somebody help us?
Thank you!
You may want to consider purchasing a subscription to our Enterprise Edition so that you can get access to our hotfixes and support department. These kinds of bugs may not be fixed in the Community Edition right away.
Perry
Hello, Perry
Yes, we will probably purchase a subscription to the Enterprise Edition in the future, once we finally decide to use Membase for our projects. We moved to Membase 2 weeks ago and we're still deciding whether this is the right product for us. Getting a stable version will help us make that decision. Currently we have to delete and recreate the data bucket every two days because of this bug.
Thank you for your answer.
Makes sense. In order to get the fix for this bug sooner, you can build the server from source and re-test.
You'll also be able to download the Enterprise Edition and use it for free in a pre-production environment. Keep an eye out for the 1.6.5.3 release in the coming week or so.
Perry
We just released Membase Server Enterprise Edition version 1.6.5.3 which has resolved these and many other stability issues.
Would you please download it and verify that your issues are fixed?
Thanks!
Perry
No, the issue is still there under Couchbase 1.8.0.
Port server memcached on node 'ns_1@10.xx.137.xxx' exited with status 134. Restarting. Messages: Vbucket <925> is going dead.
Vbucket <926> is going dead.
Vbucket <927> is going dead.
Vbucket <928> is going dead.
Vbucket <929> is going dead.
Vbucket <930> is going dead.
Vbucket <931> is going dead.
Vbucket <932> is going dead.
Vbucket <933> is going dead.
Vbucket <934> is going dead.
Vbucket <935> is going dead.
Vbucket <936> is going dead.
Vbucket <937> is going dead.
Vbucket <938> is going dead.
Vbucket <939> is going dead.
Vbucket <940> is going dead.
Vbucket <941> is going dead.
Vbucket <942> is going dead.
Vbucket <943> is going dead.
Vbucket <944> is going dead.
Vbucket <945> is going dead.
Vbucket <946> is going dead.
Vbucket <947> is going dead.
Vbucket <948> is going dead.
Vbucket <949> is going dead.
Vbucket <950> is going dead.
Vbucket <951> is going dead.
Vbucket <952> is going dead.
Vbucket <953> is going dead.
Vbucket <954> is going dead.
Vbucket <955> is going dead.
Vbucket <956> is going dead.
Vbucket <957> is going dead.
Vbucket <958> is going dead.
Vbucket <959> is going dead.
Vbucket <960> is going dead.
Vbucket <961> is going dead.
Vbucket <962> is going dead.
Vbucket <963> is going dead.
Vbucket <964> is going dead.
Vbucket <965> is going dead.
Vbucket <966> is going dead.
Vbucket <967> is going dead.
Vbucket <968> is going dead.
Vbucket <969> is going dead.
Vbucket <970> is going dead.
Vbucket <971> is going dead.
Vbucket <972> is going dead.
Vbucket <973> is going dead.
Vbucket <974> is going dead.
Vbucket <975> is going dead.
Vbucket <976> is going dead.
Vbucket <977> is going dead.
Vbucket <978> is going dead.
Vbucket <979> is going dead.
Vbucket <980> is going dead.
Vbucket <981> is going dead.
Vbucket <982> is going dead.
Vbucket <983> is going dead.
Vbucket <984> is going dead.
Vbucket <985> is going dead.
Vbucket <986> is going dead.
Vbucket <987> is going dead.
Vbucket <988> is going dead.
Vbucket <989> is going dead.
Vbucket <990> is going dead.
Vbucket <991> is going dead.
Vbucket <992> is going dead.
Vbucket <993> is going dead.
Vbucket <994> is going dead.
Vbucket <995> is going dead.
Vbucket <996> is going dead.
Vbucket <997> is going dead.
Vbucket <998> is going dead.
Vbucket <999> is going dead.
Vbucket <1000> is going dead.
Vbucket <1001> is going dead.
Vbucket <1002> is going dead.
Vbucket <1003> is going dead.
Vbucket <1004> is going dead.
Vbucket <1005> is going dead.
Vbucket <1006> is going dead.
Vbucket <1007> is going dead.
Vbucket <1008> is going dead.
Vbucket <1009> is going dead.
Vbucket <1010> is going dead.
Vbucket <1011> is going dead.
Vbucket <1012> is going dead.
Vbucket <1013> is going dead.
Vbucket <1014> is going dead.
Vbucket <1015> is going dead.
Vbucket <1016> is going dead.
Vbucket <1017> is going dead.
Vbucket <1018> is going dead.
Vbucket <1019> is going dead.
Vbucket <1020> is going dead.
Vbucket <1021> is going dead.
Vbucket <1022> is going dead.
Vbucket <1023> is going dead.
memcached: stored-value.cc:367: static void StoredValue::increaseCacheSize(HashTable&, size_t, bool): Assertion `ht.cacheSize.get() < ((size_t)1<<(sizeof(size_t)*8-1))' failed.
Been having this issue for 6 months now!
Chris
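A note on what the increaseCacheSize assertion in the log above checks: the bound (size_t)1<<(sizeof(size_t)*8-1) is 2^63 when size_t is 64 bits wide, so the assertion fails once the cache size counter has its top bit set. One way an unsigned counter can end up there is wraparound after subtracting more than was ever added. The sketch below emulates that in Python. It assumes a 64-bit build, and it only illustrates the arithmetic; whether wraparound is the actual cause of this crash is not established in this thread. All names are illustrative.

```python
SIZE_T_BITS = 64                 # assumption: 64-bit size_t
MASK = (1 << SIZE_T_BITS) - 1    # largest value a size_t can hold
LIMIT = 1 << (SIZE_T_BITS - 1)   # the bound used in the assertion, 2**63


def sub_size_t(a: int, b: int) -> int:
    """Subtract the way an unsigned 64-bit integer would, wrapping on underflow."""
    return (a - b) & MASK


cache_size = 100
# Subtracting more than the counter holds wraps to a huge value.
cache_size = sub_size_t(cache_size, 150)

print(cache_size)          # 2**64 - 50
print(cache_size < LIMIT)  # False, so the assertion condition would not hold
```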
The issue you posted there appears to be quite different from the one at the start of the thread.
Is this from one of the package installs? Is this on one of the OSs listed in the docs under requirements? If so, then...
Can you get a diagnostic report (under logs, generate diagnostic report), compress it and file a new issue at http://www.couchbase.com/issues with a description and the log attached?
It's fixed in the refresh branch. We're planning a new release soon, and it will include the fix.