Hi, we have a cluster using the official docker image couchbase/server:community-6.6.0.
We deployed it on Kubernetes (no operator involved). The nodes typically start up properly, but server nodes running on certain compute hosts in our infrastructure fail to start. Has anybody experienced any of the errors below?
Note: Infrastructure layer:
- docker image: couchbase/server:community-6.6.0.
- Kubernetes nodes → OpenStack instances running Flatcar 2765.2.3
We checked the ulimits; this is the output of the command:
couchbase@devops-couchbase-0:/$ ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 95976
max locked memory (kbytes, -l) 8192
max memory size (kbytes, -m) unlimited
open files (-n) 40960
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 40960
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Errors spotted:
[ns_server:info,2024-03-19T09:16:47.633Z,ns_1@cb.local:ns_couchdb_port<0.233.0>:ns_port_server:log:224]ns_couchdb<0.233.0>: [os_mon] cpu supervisor port (cpu_sup): Erlang has closed
ns_couchdb<0.233.0>: [os_mon] memory supervisor port (memsup): Erlang has closed
couchbase@couchbase-7:/$ tail -f /opt/couchbase/var/lib/couchbase/logs/debug.log -n 100
[ns_server:debug,2024-03-20T14:04:14.471Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:cb_dist<0.176.0>:cb_dist:info_msg:754]cb_dist: Added connection {con,#Ref<0.1344419249.2259943427.12376>,
inet_tcp_dist,undefined,undefined}
[ns_server:info,2024-03-20T14:04:14.471Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:ns_couchdb_port<0.233.0>:ns_port_server:log:224]ns_couchdb<0.233.0>: 15932: Booted. Waiting for shutdown request
ns_couchdb<0.233.0>: 15932: got shutdown request. Exiting
ns_couchdb<0.233.0>: working as port
ns_couchdb<0.233.0>: [os_mon] memory supervisor port (memsup): Erlang has closed
ns_couchdb<0.233.0>: [os_mon] cpu supervisor port (cpu_sup): Erlang has closed
[ns_server:debug,2024-03-20T14:04:14.471Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:cb_dist<0.176.0>:cb_dist:info_msg:754]cb_dist: Updated connection: {con,#Ref<0.1344419249.2259943427.12376>,
inet_tcp_dist,<0.1152.0>,
#Ref<0.1344419249.2259943427.12378>}
[ns_server:debug,2024-03-20T14:04:14.471Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:<0.228.0>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {user_storage_events,<0.226.0>} exited with reason shutdown
[ns_server:debug,2024-03-20T14:04:14.471Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:<0.227.0>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {ns_config_events,<0.226.0>} exited with reason shutdown
[ns_server:debug,2024-03-20T14:04:14.471Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:<0.225.0>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {ns_config_events,<0.223.0>} exited with reason shutdown
[ns_server:debug,2024-03-20T14:04:14.471Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:<0.224.0>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {user_storage_events,<0.223.0>} exited with reason shutdown
[ns_server:debug,2024-03-20T14:04:14.471Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:cb_dist<0.176.0>:cb_dist:info_msg:754]cb_dist: Connection down: {con,#Ref<0.1344419249.2259943427.12376>,
inet_tcp_dist,<0.1152.0>,
#Ref<0.1344419249.2259943427.12378>}
[error_logger:info,2024-03-20T14:04:14.471Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]
=========================INFO REPORT=========================
{net_kernel,{'EXIT',<0.1152.0>,shutdown}}
[error_logger:info,2024-03-20T14:04:14.474Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]
=========================INFO REPORT=========================
{net_kernel,{net_kernel,913,nodedown,'couchdb_ns_1@cb.local'}}
[ns_server:debug,2024-03-20T14:04:14.474Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:<0.216.0>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {ns_config_events,<0.215.0>} exited with reason shutdown
[ns_server:debug,2024-03-20T14:04:14.475Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:<0.201.0>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {ns_config_events,<0.200.0>} exited with reason shutdown
[error_logger:error,2024-03-20T14:04:14.475Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]
=========================SUPERVISOR REPORT=========================
Supervisor: {local,ns_server_cluster_sup}
Context: start_error
Reason: {shutdown,
{failed_to_start_child,wait_for_couchdb_node,timeout}}
Offender: [{pid,undefined},
{id,ns_server_nodes_sup},
{mfargs,
{restartable,start_link,
[{ns_server_nodes_sup,start_link,[]},infinity]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
[ns_server:debug,2024-03-20T14:04:14.475Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:ns_config<0.193.0>:ns_config:wait_saver:866]Done waiting for saver.
[error_logger:error,2024-03-20T14:04:14.476Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]
=========================SUPERVISOR REPORT=========================
Supervisor: {local,root_sup}
Context: start_error
Reason: {shutdown,
{failed_to_start_child,ns_server_nodes_sup,
{shutdown,
{failed_to_start_child,wait_for_couchdb_node,
timeout}}}}
Offender: [{pid,undefined},
{id,ns_server_cluster_sup},
{mfargs,{ns_server_cluster_sup,start_link,[]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,supervisor}]
[error_logger:error,2024-03-20T14:04:14.476Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]
=========================CRASH REPORT=========================
crasher:
initial call: application_master:init/4
pid: <0.117.0>
registered_name: []
exception exit: {{shutdown,
{failed_to_start_child,ns_server_cluster_sup,
{shutdown,
{failed_to_start_child,ns_server_nodes_sup,
{shutdown,
{failed_to_start_child,wait_for_couchdb_node,
timeout}}}}}},
{ns_server,start,[normal,[]]}}
in function application_master:init/4 (application_master.erl, line 134)
ancestors: [<0.116.0>]
message_queue_len: 1
messages: [{'EXIT',<0.118.0>,normal}]
links: [<0.116.0>,<0.33.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 610
stack_size: 27
reductions: 274
neighbours:
[error_logger:info,2024-03-20T14:04:14.477Z,ns_1@couchbase-7.couchbase.lta.svc.cluster.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]
=========================INFO REPORT=========================
application: ns_server
exited: {{shutdown,
{failed_to_start_child,ns_server_cluster_sup,
{shutdown,
{failed_to_start_child,ns_server_nodes_sup,
{shutdown,
{failed_to_start_child,wait_for_couchdb_node,
timeout}}}}}},
{ns_server,start,[normal,[]]}}
type: permanent
babysitter.log:
[error_logger:error,2024-03-20T14:17:33.004Z,babysitter_of_ns_1@cb.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]
=========================SUPERVISOR REPORT=========================
Supervisor: {local,child_ns_server_sup}
Context: child_terminated
Reason: {abnormal,3}
Offender: [{pid,<0.1966.0>},
{id,{ns_server,"/opt/couchbase/lib/erlang/bin/erl",
["+A","16","-smp","enable","+sbt","u","+P","327680",
"+K","true","+swt","low","+sbwt","none","+MMmcs",
"30","+e102400","-setcookie","nocookie","-kernel",
"error_logger","false","-sasl","sasl_error_logger",
"false","-user","user_io","-run","child_erlang",
"child_start","ns_bootstrap","--","-smp","enable",
"-kernel","error_logger","false","inetrc",
"\"/opt/couchbase/etc/couchbase/hosts.cfg\"",
"dist_config_file",
"\"/opt/couchbase/var/lib/couchbase/config/dist_cfg\"",
"-proto_dist","cb","-epmd_module","cb_epmd",
"-ssl_dist_optfile",
"/opt/couchbase/etc/couchbase/ssl_dist_opts",
"-kernel","global_enable_tracing","false",
"-couch_ini",
"/opt/couchbase/etc/couchdb/default.ini",
"/opt/couchbase/etc/couchdb/default.d/capi.ini",
"/opt/couchbase/etc/couchdb/default.d/geocouch.ini",
"/opt/couchbase/etc/couchdb/local.ini"],
[{env,
[{"NS_SERVER_BABYSITTER_PID","43"},
{"CHILD_ERLANG_ENV_ARGS",
"[{sasl,\n [{sasl_error_logger,false},{included_applications,[]},{errlog_type,all}]},\n {ns_babysitter,\n [{cookiefile,\n \"/opt/couchbase/var/lib/couchbase/couchbase-server.babysitter.cookie\"},\n {included_applications,[]},\n {nodefile,\n \"/opt/couchbase/var/lib/couchbase/couchbase-server.babysitter.node\"},\n {pidfile,\"/opt/couchbase/var/lib/couchbase/couchbase-server.pid\"}]},\n {ns_server,\n [{path_config_datadir,\"/opt/couchbase/var/lib/couchbase\"},\n {path_config_etcdir,\"/opt/couchbase/etc/couchbase\"},\n {path_config_bindir,\"/opt/couchbase/bin\"},\n {error_logger_mf_dir,\"/opt/couchbase/var/lib/couchbase/logs\"},\n {loglevel_xdcr,debug},\n {max_t,10},\n {path_config_libdir,\"/opt/couchbase/lib\"},\n {net_kernel_verbosity,10},\n {loglevel_ns_doctor,debug},\n {config_path,\"/opt/couchbase/etc/couchbase/static_config\"},\n {path_config_tmpdir,\"/opt/couchbase/var/lib/couchbase/tmp\"},\n {loglevel_views,debug},\n {loglevel_default,debug},\n {loglevel_menelaus,debug},\n {loglevel_ns_server,debug},\n {loglevel_rebalance,debug},\n {loglevel_user,debug},\n {loglevel_error_logger,debug},\n {loglevel_mapreduce_errors,debug},\n {path_config_secdir,\"/opt/couchbase/etc/security\"},\n {loglevel_cbas,debug},\n {disk_sink_opts_json_rpc,\n [{rotation,\n [{compress,true},\n {size,41943040},\n {num_files,2},\n {buffer_size_max,52428800}]}]},\n {nodefile,\"/opt/couchbase/var/lib/couchbase/couchbase-server.node\"},\n {loglevel_couchdb,info},\n {included_applications,[]},\n {loglevel_access,info},\n {disk_sink_opts,\n [{rotation,\n [{compress,true},\n {size,41943040},\n {num_files,10},\n {buffer_size_max,52428800}]}]},\n {max_r,20},\n {loglevel_stats,debug},\n {loglevel_cluster,debug}]}]"},
{"ERL_CRASH_DUMP",
"erl_crash.dump.1710843935.43.ns_server"}]},
exit_status,use_stdio,stream,eof]}},
{mfargs,
{restartable,start_link,
[{supervisor_cushion,start_link,
[ns_server,5000,infinity,ns_port_server,
start_link,
[#Fun<ns_child_ports_sup.4.69815219>]]},
86400000]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,worker}]
[ns_server:debug,2024-03-20T14:17:33.028Z,babysitter_of_ns_1@cb.local:<0.2004.0>:restartable:start_child:98]Started child process <0.2005.0>
MFA: {supervisor_cushion,start_link,
[ns_server,5000,infinity,ns_port_server,start_link,
[#Fun<ns_child_ports_sup.4.69815219>]]}
[error_logger:info,2024-03-20T14:17:33.029Z,babysitter_of_ns_1@cb.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]
=========================PROGRESS REPORT=========================
supervisor: {local,child_ns_server_sup}
started: [{pid,<0.2004.0>},
{id,{ns_server,"/opt/couchbase/lib/erlang/bin/erl",
["+A","16","-smp","enable","+sbt","u","+P",
"327680","+K","true","+swt","low","+sbwt",
"none","+MMmcs","30","+e102400","-setcookie",
"nocookie","-kernel","error_logger","false",
"-sasl","sasl_error_logger","false","-user",
"user_io","-run","child_erlang","child_start",
"ns_bootstrap","--","-smp","enable","-kernel",
"error_logger","false","inetrc",
"\"/opt/couchbase/etc/couchbase/hosts.cfg\"",
"dist_config_file",
"\"/opt/couchbase/var/lib/couchbase/config/dist_cfg\"",
"-proto_dist","cb","-epmd_module","cb_epmd",
"-ssl_dist_optfile",
"/opt/couchbase/etc/couchbase/ssl_dist_opts",
"-kernel","global_enable_tracing","false",
"-couch_ini",
"/opt/couchbase/etc/couchdb/default.ini",
"/opt/couchbase/etc/couchdb/default.d/capi.ini",
"/opt/couchbase/etc/couchdb/default.d/geocouch.ini",
"/opt/couchbase/etc/couchdb/local.ini"],
[{env,
[{"NS_SERVER_BABYSITTER_PID","43"},
{"CHILD_ERLANG_ENV_ARGS",
"[{sasl,\n [{sasl_error_logger,false},{included_applications,[]},{errlog_type,all}]},\n {ns_babysitter,\n [{cookiefile,\n \"/opt/couchbase/var/lib/couchbase/couchbase-server.babysitter.cookie\"},\n {included_applications,[]},\n {nodefile,\n \"/opt/couchbase/var/lib/couchbase/couchbase-server.babysitter.node\"},\n {pidfile,\"/opt/couchbase/var/lib/couchbase/couchbase-server.pid\"}]},\n {ns_server,\n [{path_config_datadir,\"/opt/couchbase/var/lib/couchbase\"},\n {path_config_etcdir,\"/opt/couchbase/etc/couchbase\"},\n {path_config_bindir,\"/opt/couchbase/bin\"},\n {error_logger_mf_dir,\"/opt/couchbase/var/lib/couchbase/logs\"},\n {loglevel_xdcr,debug},\n {max_t,10},\n {path_config_libdir,\"/opt/couchbase/lib\"},\n {net_kernel_verbosity,10},\n {loglevel_ns_doctor,debug},\n {config_path,\"/opt/couchbase/etc/couchbase/static_config\"},\n {path_config_tmpdir,\"/opt/couchbase/var/lib/couchbase/tmp\"},\n {loglevel_views,debug},\n {loglevel_default,debug},\n {loglevel_menelaus,debug},\n {loglevel_ns_server,debug},\n {loglevel_rebalance,debug},\n {loglevel_user,debug},\n {loglevel_error_logger,debug},\n {loglevel_mapreduce_errors,debug},\n {path_config_secdir,\"/opt/couchbase/etc/security\"},\n {loglevel_cbas,debug},\n {disk_sink_opts_json_rpc,\n [{rotation,\n [{compress,true},\n {size,41943040},\n {num_files,2},\n {buffer_size_max,52428800}]}]},\n {nodefile,\"/opt/couchbase/var/lib/couchbase/couchbase-server.node\"},\n {loglevel_couchdb,info},\n {included_applications,[]},\n {loglevel_access,info},\n {disk_sink_opts,\n [{rotation,\n [{compress,true},\n {size,41943040},\n {num_files,10},\n {buffer_size_max,52428800}]}]},\n {max_r,20},\n {loglevel_stats,debug},\n {loglevel_cluster,debug}]}]"},
{"ERL_CRASH_DUMP",
"erl_crash.dump.1710843935.43.ns_server"}]},
exit_status,use_stdio,stream,eof]}},
{mfargs,
{restartable,start_link,
[{supervisor_cushion,start_link,
[ns_server,5000,infinity,ns_port_server,
start_link,
[#Fun<ns_child_ports_sup.4.69815219>]]},
86400000]}},
{restart_type,permanent},
{shutdown,infinity},
{child_type,worker}]
Thanks in advance