Server crashed and doesn't start
Hi,
The server is part of a cluster of 3 nodes.
It crashed and now it doesn't start.
ERROR REPORT <5862.25264.38> 2012-06-22 03:59:43
===============================================================================
** Generic server <5862.25264.38> terminating
** Last message in was check_for_timeout
** When Server state == {state,0,#Port<5862.226293>,#Port<5862.226292>,
<5862.25265.38>,<<>>,<<>>,
{set,171,35,64,32,175,105,
{[],[],[],[],[],[],[],[],[],[],[],[],[],
[],[],[]},
{{[128,64,0],
"s3","¦f&",
[153,121,89,57,25],
[140,108,76,44,12],
[159,127,95,63,31],
[146,114,82,50,18],
[165,133,101,69,37,5],
[152,120,88,56,24],
[139,107,75,43,11],
[158,126,94,62,30],
[145,113,81,49,17],
[164,132,100,68,36,4],
[151,119,87,55,23],
[138,106,74,42,10],
[157,125,93,61,29]},
{[144,112,80,48,16],
[163,131,99,67,35,3],
[150,118,86,54,22],
[169,137,105,73,41,9],
[156,124,92,60,28],
[143,111,79,47,15],
[162,130,98,66,34,2],
[149,117,85,53,21],
[511,168,136,104,72,40,8],
[155,123,91,59,27],
[142,110,78,46,14],
[161,129,97,65,33,1],
[148,116,84,52,20],
[167,135,103,71,39,7],
[154,122,90,58,26],
[141,109,77,45,13]},
{" ` ",
[147,83,19],
[134,70,6],
[],[],[],[],[],[],[],[],[],[],[],[],[]},
{[],[],[],[],[],[],[],[],[],[],[],[],[],
[],[],[]}}},
-1,false,false,0,
{1340,333814,129633},
undefined}
** Reason for termination ==
** timeout
CRASH REPORT <5862.25264.38> 2012-06-22 03:59:43
===============================================================================
Crashing process
initial_call {ebucketmigrator_srv,init,['Argument__1']}
pid <5862.25264.38>
registered_name []
error_info
{exit,timeout,
[{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}
ancestors ['ns_vbm_sup-default','single_bucket_sup-default',<5862.272.0>]
messages [{'EXIT',<5862.25265.38>,killed}]
links [<5862.52.0>,<5862.20666.17>,#Port<5862.226292>]
dictionary []
trap_exit true
status running
heap_size 1597
stack_size 24
reductions 330901
The system is CentOS 6.2 with Couchbase Server 1.8.0.
I was not doing anything on the system before the crash; it was idle.
Now when I try to boot this node I get the error:
=ERROR REPORT==== 22-Jun-2012::21:24:50 ===
Error in process <0.245.0> on node 'ns_1@192.168.62.206' with exit value: {{badmatch,{'EXIT',{file,open}}},[{ns_log_mf_h,'-start_link/0-fun-1-',0},{misc,'-start_event_link/1-fun-0-',1}]}
{[47,111,112,116,47,99,111,117,99,104,98,97,115,101,47,118,97,114,47,108,105,98,47,99,111,117,99,104,98,97,115,101,47,108,111,103,115],10485760,20,#Fun}
=ERROR REPORT==== 22-Jun-2012::21:24:50 ===
Error in process <0.246.0> on node 'ns_1@192.168.62.206' with exit value: {{badmatch,{'EXIT',{file,open}}},[{ns_log_mf_h,'-start_link/0-fun-1-',0},{misc,'-start_event_link/1-fun-0-',1}]}
=ERROR REPORT==== 22-Jun-2012::21:24:50 ===
Error in process <0.247.0> on node 'ns_1@192.168.62.206' with exit value: {{badmatch,{'EXIT',{file,open}}},[{ns_log_mf_h,'-start_link/0-fun-1-',0},{misc,'-start_event_link/1-fun-0-',1}]}
=ERROR REPORT==== 22-Jun-2012::21:24:50 ===
Error in process <0.248.0> on node 'ns_1@192.168.62.206' with exit value: {{badmatch,{'EXIT',{file,open}}},[{ns_log_mf_h,'-start_link/0-fun-1-',0},{misc,'-start_event_link/1-fun-0-',1}]}
=ERROR REPORT==== 22-Jun-2012::21:24:50 ===
Error in process <0.249.0> on node 'ns_1@192.168.62.206' with exit value: {{badmatch,{'EXIT',{file,open}}},[{ns_log_mf_h,'-start_link/0-fun-1-',0},{misc,'-start_event_link/1-fun-0-',1}]}
=ERROR REPORT==== 22-Jun-2012::21:24:50 ===
Error in process <0.250.0> on node 'ns_1@192.168.62.206' with exit value: {{badmatch,{'EXIT',{file,open}}},[{ns_log_mf_h,'-start_link/0-fun-1-',0},{misc,'-start_event_link/1-fun-0-',1}]}
=SUPERVISOR REPORT==== 22-Jun-2012::21:24:50 ===
Supervisor: {local,ns_server_cluster_sup}
Context: child_terminated
Reason: {{badmatch,{'EXIT',{file,open}}},
[{ns_log_mf_h,'-start_link/0-fun-1-',0},
{misc,'-start_event_link/1-fun-0-',1}]}
Offender: [{pid,<0.250.0>},
{name,ns_log_mf_h},
{mfargs,{ns_log_mf_h,start_link,[]}},
{restart_type,permanent},
{shutdown,1000},
{child_type,worker}]
=PROGRESS REPORT==== 22-Jun-2012::21:24:50 ===
supervisor: {local,ns_server_cluster_sup}
started: [{pid,<0.251.0>},
{name,ns_log_mf_h},
{mfargs,{ns_log_mf_h,start_link,[]}},
{restart_type,permanent},
{shutdown,1000},
{child_type,worker}]
{[47,111,112,116,47,99,111,117,99,104,98,97,115,101,47,118,97,114,47,108,105,98,47,99,111,117,99,104,98,97,115,101,47,108,111,103,115],10485760,20,#Fun}
=ERROR REPORT==== 22-Jun-2012::21:24:50 ===
Error in process <0.251.0> on node 'ns_1@192.168.62.206' with exit value: {{badmatch,{'EXIT',{file,open}}},[{ns_log_mf_h,'-start_link/0-fun-1-',0},{misc,'-start_event_link/1-fun-0-',1}]}
=SUPERVISOR REPORT==== 22-Jun-2012::21:24:50 ===
Supervisor: {local,ns_server_cluster_sup}
Context: child_terminated
Reason: {{badmatch,{'EXIT',{file,open}}},
[{ns_log_mf_h,'-start_link/0-fun-1-',0},
{misc,'-start_event_link/1-fun-0-',1}]}
Offender: [{pid,<0.251.0>},
{name,ns_log_mf_h},
{mfargs,{ns_log_mf_h,start_link,[]}},
{restart_type,permanent},
{shutdown,1000},
{child_type,worker}]
=SUPERVISOR REPORT==== 22-Jun-2012::21:24:50 ===
Supervisor: {local,ns_server_cluster_sup}
Context: shutdown
Reason: reached_max_restart_intensity
Offender: [{pid,<0.251.0>},
{name,ns_log_mf_h},
{mfargs,{ns_log_mf_h,start_link,[]}},
{restart_type,permanent},
{shutdown,1000},
{child_type,worker}]
=INFO REPORT==== 22-Jun-2012::21:24:50 ===
moxi<0.230.0>: 2012-06-22 21:24:50: (cproxy_config.c.317) env: MOXI_SASL_PLAIN_USR (13)
moxi<0.230.0>: 2012-06-22 21:24:50: (cproxy_config.c.326) env: MOXI_SASL_PLAIN_PWD (9)
=INFO REPORT==== 22-Jun-2012::21:24:50 ===
memcached<0.232.0>: EOL on stdin. Exiting
=INFO REPORT==== 22-Jun-2012::21:24:50 ===
ns_1@192.168.62.206:<0.209.0>:mb_master:453: Got new peer supporting versioning: 'ns_1@192.168.62.207'
=ERROR REPORT==== 22-Jun-2012::21:24:50 ===
ns_1@192.168.62.206:<0.209.0>:mb_master:507: Got new-style heartbeat from 'ns_1@192.168.62.207' node when in compatible mode
=INFO REPORT==== 22-Jun-2012::21:24:51 ===
moxi<0.230.0>: EOL on stdin. Exiting
=INFO REPORT==== 22-Jun-2012::21:24:51 ===
menelaus_web streaming socket closed by client
Mnesia('ns_1@192.168.62.206'): mnesia_controller terminated: shutdown
Mnesia('ns_1@192.168.62.206'): mnesia_tm terminated: shutdown
Mnesia('ns_1@192.168.62.206'): mnesia_recover terminated: shutdown
Mnesia('ns_1@192.168.62.206'): mnesia_locker terminated: shutdown
Mnesia('ns_1@192.168.62.206'): mnesia_subscr terminated: shutdown
Mnesia('ns_1@192.168.62.206'): mnesia_monitor terminated: shutdown
=INFO REPORT==== 22-Jun-2012::21:24:51 ===
application: mnesia
exited: stopped
type: temporary
=INFO REPORT==== 22-Jun-2012::21:24:51 ===
ns_1@192.168.62.206:<0.68.0>:mb_mnesia:254: Shut Mnesia down: shutdown. Exiting.
=INFO REPORT==== 22-Jun-2012::21:24:51 ===
application: ns_server
exited: shutdown
type: temporary
Thanks,
Double-check that the directory /opt/couchbase/var/lib/couchbase/logs exists and is writable by the couchbase user. It looks like the logger fails to open its log file(s).
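For reference, the directory path can be read out of the startup log itself: the integer list printed next to the ns_log_mf_h errors is an Erlang string shown as character codes. A minimal Python sketch for decoding it (the helper name decode_charlist is made up for this illustration):

```python
# Integer list copied verbatim from the startup log in the question.
# Erlang prints a string as a list of character codes when it does not
# recognize it as printable text.
codes = [47, 111, 112, 116, 47, 99, 111, 117, 99, 104, 98, 97, 115, 101,
         47, 118, 97, 114, 47, 108, 105, 98, 47, 99, 111, 117, 99, 104,
         98, 97, 115, 101, 47, 108, 111, 103, 115]


def decode_charlist(char_codes):
    """Turn a list of character codes into a Python string."""
    return "".join(chr(code) for code in char_codes)


print(decode_charlist(codes))  # prints /opt/couchbase/var/lib/couchbase/logs
```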
If the directory does exist and is writable, try renaming all files in this directory or moving them away.
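One way to spot a problem file is to list every entry in the log directory whose owner is not the expected user. The following is only a sketch, assuming a POSIX system; the function names owner_name and find_foreign_files are hypothetical helpers, not part of Couchbase.

```python
import os
import pwd  # POSIX only


def owner_name(uid):
    """Return the user name for a uid, or the uid as text if it has no passwd entry."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def find_foreign_files(log_dir, expected_user):
    """Return (file name, owner) pairs for entries in log_dir not owned by expected_user."""
    mismatched = []
    for name in sorted(os.listdir(log_dir)):
        owner = owner_name(os.stat(os.path.join(log_dir, name)).st_uid)
        if owner != expected_user:
            mismatched.append((name, owner))
    return mismatched
```

Calling find_foreign_files("/opt/couchbase/var/lib/couchbase/logs", "couchbase") on the affected node would return the files that the couchbase user does not own; an empty list means ownership is not the problem.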
This directory had one file listed as "-rw-r--r--. 1 root root", while the rest were "-rw-r--r--. 1 couchbase couchbase". I changed the owner of this file to the couchbase user, and now couchbase-server starts.
Could this be the cause of the crash?
Thanks
Maybe. But I need to see logs in order to be sure.
How can I upload the log file?
File a bug report here: http://www.couchbase.com/issues/secure/Dashboard.jspa and upload the diagnostics, which are available on the Log page via the link named 'Generate diagnostics report'.
There isn't much in these logs to identify what the issue is. Can you tell me what system you are running on, what you were doing before the crash happened, and provide any other information you might have?