Cannot add a node on Couchbase 1.8 built from source
We deployed two Couchbase (1.8.0) servers on openSUSE. Because there is no pre-built Couchbase package for SUSE, and installing from the RPM produces a lot of dependency errors, we built both servers from source. Each server works fine as a single-node cluster: the data buckets are up, and our Java client connects and works correctly.
But we ran into a problem when trying to add one of the nodes to the other to form a cluster. We used Web console -> Manage -> Server Nodes -> Add Server to add a node to a cluster. (I know this will erase all data on the server being added, and that's fine in our case.) The UI displays the error message "Save request returned error". Repeating the same steps from the other server produces similar symptoms and logs.
The log in the web console shows the following:
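For anyone who wants to reproduce the failure without the web console, here is a minimal sketch of the request that the Add Server dialog appears to send, judging by the `/controller/addNode` path in the log. The form field names (`hostname`, `user`, `password`) are my assumption based on the Couchbase REST API as I understand it, the helper name `build_add_node_request` is hypothetical, and the password is a placeholder. The function only builds the request; it does not send anything.

```python
import base64
import urllib.parse
import urllib.request


def build_add_node_request(cluster_host, new_node_host, user, password, port=8091):
    """Build (but do not send) a POST to /controller/addNode on cluster_host.

    Assumption: the endpoint accepts form-encoded hostname/user/password
    fields and HTTP Basic authentication.
    """
    url = "http://%s:%d/controller/addNode" % (cluster_host, port)
    body = urllib.parse.urlencode(
        {"hostname": new_node_host, "user": user, "password": password}
    ).encode("ascii")
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8")).decode("ascii")

    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


if __name__ == "__main__":
    # Addresses taken from the logs; "secret" is a placeholder password.
    req = build_add_node_request(
        "10.190.248.245", "10.190.233.76", "Administrator", "secret"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req, timeout=30)
```

Sending this with `urllib.request.urlopen` should trigger the same server-side code path as the UI, which makes it easier to correlate one request with one set of log entries.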
Server error during processing: ["web request failed",
{path,"/controller/addNode"},
{type,exit},
{what,
{{undef,
[{http,request,
[post,
{"http://10.190.233.76:8091/engageCluster2",
[{"Authorization",
"Basic QWRtaW5pc3RyYXRvcjpmaXJlIzE4KnZhbGU="}],
"application/json",
"{\"availableStorage\":{\"hdd\":[{\"path\":\"/\",\"sizeKBytes\":8262036,\"usagePercent\":48},{\"path\":\"/dev\",\"sizeKBytes\":1802312,\"usagePercent\":1},{\"path\":\"/usr/local\",\"sizeKBytes\":8238464,\"usagePercent\":30},{\"path\":\"/data\",\"sizeKBytes\":187846516,\"usagePercent\":3}]},\"memoryQuota\":2516,\"storageTotals\":{\"ram\":{\"usedByData\":27058816,\"total\":3691134976.0,\"quotaTotal\":2638217216.0,\"used\":1231179776},\"hdd\":{\"usedByData\":2538496,\"total\":192354832384.0,\"quotaTotal\":192354832384.0,\"used\":5770644971.0,\"free\":186584187413.0}},\"storage\":{\"ssd\":[],\"hdd\":[{\"path\":\"/data/membase/var/lib/couchbase/data\",\"quotaMb\":\"none\",\"state\":\"ok\"}]},\"systemStats\":{\"cpu_utilization_rate\":0,\"swap_total\":0,\"swap_used\":0},\"interestingStats\":{\"curr_items\":0,\"curr_items_tot\":0,\"vb_replica_curr_items\":0},\"uptime\":\"10180\",\"memoryTotal\":3691134976.0,\"memoryFree\":2459955200.0,\"mcdMemoryReserved\":2816,\"mcdMemoryAllocated\":2816,\"otpNode\":\"ns_1@10.190.248.245\",\"otpCookie\":\"tdvokvqxteckkywc\",\"clusterMembership\":\"active\",\"status\":\"healthy\",\"hostname\":\"10.190.248.245:8091\",\"clusterCompatibility\":1,\"version\":\"1.8.0r-25-g1e1c2c0-enterprise\",\"os\":\"x86_64-unknown-linux-gnu\",\"ports\":{\"proxy\":11211,\"direct\":11210}}"},
[{timeout,30000},
{connect_timeout,30000}],
[]],
[]},
{menelaus_rest,json_request_hilevel,3,
[{file,"src/menelaus_rest.erl"},
{line,82}]},
{ns_cluster,
do_add_node_with_connectivity,3,
[{file,"src/ns_cluster.erl"},
{line,318}]},
{ns_cluster,handle_call,3,
[{file,"src/ns_cluster.erl"},
{line,110}]},
{gen_server,handle_msg,5,
[{file,"gen_server.erl"},{line,588}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,227}]}]},
{gen_server,call,
[ns_cluster,
{add_node,"10.190.233.76",8091,
{"Administrator","********"}},
30000]}}},
{trace,
[{gen_server,call,3,
[{file,"gen_server.erl"},{line,188}]},
{ns_cluster,add_node,3,
[{file,"src/ns_cluster.erl"},{line,61}]},
{menelaus_web,handle_add_node,1,
[{file,"src/menelaus_web.erl"},
{line,1361}]},
{menelaus_web,loop,3,
[{file,"src/menelaus_web.erl"},
{line,303}]},
{mochiweb_http,headers,5,
[{file,"src/mochiweb_http.erl"},
{line,133}]},
{proc_lib,init_p_do_apply,3,
Running cbbrowse_logs on the server where we performed the operation produces a log dump, in which I found the following noteworthy error reports:
ERROR REPORT <5996.170.0> 2012-07-10 22:03:49
===============================================================================
ns_1@10.190.248.245:<5996.170.0>:ns_heart:137: Failed to grab system stats:
{error,{exit,{badarg,[{erlang,hd,[[]],[]},
{stats_reader,'-do_handle_call/3-fun-0-',2,
[{file,"src/stats_reader.erl"},
{line,115}]},
{mnesia_tm,non_transaction,5,
[{file,"mnesia_tm.erl"},{line,734}]},
{stats_reader,do_handle_call,3,
[{file,"src/stats_reader.erl"},
{line,110}]},
{stats_reader,handle_call,3,
[{file,"src/stats_reader.erl"},
{line,104}]},
{gen_server,handle_msg,5,
[{file,"gen_server.erl"},{line,588}]},
{proc_lib,init_p_do_apply,3,
AND:
ERROR REPORT <5996.18239.0> 2012-07-10 22:02:00
===============================================================================
** Generic server ns_cluster terminating
** Last message in was {add_node,"10.190.233.76",8091,
{"Administrator","********"}}
** When Server state == {state}
** Reason for termination ==
** {'module could not be loaded',
[{http,request,
[post,
{"http://10.190.233.76:8091/engageCluster2",
[{"Authorization","Basic QWRtaW5pc3RyYXRvcjpmaXJlIzE4KnZhbGU="}],
"application/json",
"{\"availableStorage\":{\"hdd\":[{\"path\":\"/\",\"sizeKBytes\":8262036,\"usagePercent\":48},{\"path\":\"/dev\",\"size
KBytes\":1802312,\"usagePercent\":1},{\"path\":\"/usr/local\",\"sizeKBytes\":8238464,\"usagePercent\":30},{\"path\":\"/data\",\"size
KBytes\":187846516,\"usagePercent\":3}]},\"memoryQuota\":2516,\"storageTotals\":{\"ram\":{\"usedByData\":27058816,\"total\":36911349
76.0,\"quotaTotal\":2638217216.0,\"used\":1231179776},\"hdd\":{\"usedByData\":2538496,\"total\":192354832384.0,\"quotaTotal\":192354
832384.0,\"used\":5770644971.0,\"free\":186584187413.0}},\"storage\":{\"ssd\":[],\"hdd\":[{\"path\":\"/data/membase/var/lib/couchbas
e/data\",\"quotaMb\":\"none\",\"state\":\"ok\"}]},\"systemStats\":{\"cpu_utilization_rate\":0,\"swap_total\":0,\"swap_used\":0},\"in
terestingStats\":{\"curr_items\":0,\"curr_items_tot\":0,\"vb_replica_curr_items\":0},\"uptime\":\"10180\",\"memoryTotal\":3691134976
.0,\"memoryFree\":2459955200.0,\"mcdMemoryReserved\":2816,\"mcdMemoryAllocated\":2816,\"otpNode\":\"ns_1@10.190.248.245\",\"otpCooki
e\":\"tdvokvqxteckkywc\",\"clusterMembership\":\"active\",\"status\":\"healthy\",\"hostname\":\"10.190.248.245:8091\",\"clusterCompa
tibility\":1,\"version\":\"1.8.0r-25-g1e1c2c0-enterprise\",\"os\":\"x86_64-unknown-linux-gnu\",\"ports\":{\"proxy\":11211,\"direct\"
:11210}}"},
[{timeout,30000},{connect_timeout,30000}],
[]],
[]},
{menelaus_rest,json_request_hilevel,3,
[{file,"src/menelaus_rest.erl"},{line,82}]},
{ns_cluster,do_add_node_with_connectivity,3,
[{file,"src/ns_cluster.erl"},{line,318}]},
{ns_cluster,handle_call,3,[{file,"src/ns_cluster.erl"},{line,110}]},
{gen_server,handle_msg,5,[{file,"gen_server.erl"},{line,588}]},
{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}
AND:
CRASH REPORT <5996.18239.0> 2012-07-10 22:02:00
===============================================================================
Crashing process
initial_call {ns_cluster,init,['Argument__1']}
pid <5996.18239.0>
registered_name ns_cluster
error_info
{exit,{undef,[{http,request,
[post,
{"http://10.190.233.76:8091/engageCluster2",
[{"Authorization",
"Basic QWRtaW5pc3RyYXRvcjpmaXJlIzE4KnZhbGU="}],
"application/json",
"{\"availableStorage\":{\"hdd\":[{\"path\":\"/\",\"sizeKBytes\":8262036,\"usagePercent\":48},{\"path\"
:\"/dev\",\"sizeKBytes\":1802312,\"usagePercent\":1},{\"path\":\"/usr/local\",\"sizeKBytes\":8238464,\"usagePercent\":30},{\"path\":
\"/data\",\"sizeKBytes\":187846516,\"usagePercent\":3}]},\"memoryQuota\":2516,\"storageTotals\":{\"ram\":{\"usedByData\":27058816,\"
total\":3691134976.0,\"quotaTotal\":2638217216.0,\"used\":1231179776},\"hdd\":{\"usedByData\":2538496,\"total\":192354832384.0,\"quo
taTotal\":192354832384.0,\"used\":5770644971.0,\"free\":186584187413.0}},\"storage\":{\"ssd\":[],\"hdd\":[{\"path\":\"/data/membase/
var/lib/couchbase/data\",\"quotaMb\":\"none\",\"state\":\"ok\"}]},\"systemStats\":{\"cpu_utilization_rate\":0,\"swap_total\":0,\"swa
p_used\":0},\"interestingStats\":{\"curr_items\":0,\"curr_items_tot\":0,\"vb_replica_curr_items\":0},\"uptime\":\"10180\",\"memoryTo
tal\":3691134976.0,\"memoryFree\":2459955200.0,\"mcdMemoryReserved\":2816,\"mcdMemoryAllocated\":2816,\"otpNode\":\"ns_1@10.190.248.
245\",\"otpCookie\":\"tdvokvqxteckkywc\",\"clusterMembership\":\"active\",\"status\":\"healthy\",\"hostname\":\"10.190.248.245:8091\
",\"clusterCompatibility\":1,\"version\":\"1.8.0r-25-g1e1c2c0-enterprise\",\"os\":\"x86_64-unknown-linux-gnu\",\"ports\":{\"proxy\":
11211,\"direct\":11210}}"},
[{timeout,30000},{connect_timeout,30000}],
[]],
{menelaus_rest,json_request_hilevel,3,
[{file,"src/menelaus_rest.erl"},
{line,82}]},
{ns_cluster,do_add_node_with_connectivity,3,
[{file,"src/ns_cluster.erl"},{line,318}]},
{ns_cluster,handle_call,3,
[{file,"src/ns_cluster.erl"},{line,110}]},
{gen_server,handle_msg,5,
[{file,"gen_server.erl"},{line,588}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,227}]}]},
[{gen_server,terminate,6,[{file,"gen_server.erl"},{line,747}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,227}]}]}
ancestors [ns_server_cluster_sup,<5996.51.0>]
messages []
links [<5996.53.0>]
dictionary []
trap_exit false
status running
heap_size 75025
stack_size 24
By the way, the following error report appears in the cbbrowse_logs dump every five seconds. What could this indicate?
ERROR REPORT <5786.155.0> 2012-07-11 15:23:41
===============================================================================
ns_1@10.190.233.76:<5786.155.0>:ns_heart:137: Failed to grab system stats:
{error,{exit,{badarg,[{erlang,hd,[[]],[]},
{stats_reader,'-do_handle_call/3-fun-0-',2,
[{file,"src/stats_reader.erl"},
{line,115}]},
{mnesia_tm,non_transaction,5,
[{file,"mnesia_tm.erl"},{line,734}]},
{stats_reader,do_handle_call,3,
[{file,"src/stats_reader.erl"},
{line,110}]},
{stats_reader,handle_call,3,
[{file,"src/stats_reader.erl"},
{line,104}]},
{gen_server,handle_msg,5,
[{file,"gen_server.erl"},{line,588}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,227}]}]}}}
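To double-check the "every five seconds" observation, a rough sketch like the one below could measure the gaps between these reports in a saved dump. It assumes the report header format shown above (`ERROR REPORT <pid> YYYY-MM-DD HH:MM:SS`); the function name `stats_error_gaps` and the file name `cbbrowse_logs.txt` are hypothetical.

```python
import re
from datetime import datetime

# Matches report headers such as "ERROR REPORT <5786.155.0> 2012-07-11 15:23:41".
HEADER = re.compile(
    r"^[A-Z ]+REPORT\s+<[^>]+>\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)


def stats_error_gaps(lines, marker="Failed to grab system stats"):
    """Return the seconds between consecutive reports whose body contains marker."""
    times = []
    pending = None  # timestamp of the report currently being read
    for line in lines:
        match = HEADER.match(line)
        if match:
            pending = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        elif pending is not None and marker in line:
            times.append(pending)
            pending = None  # count each report at most once
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    # Hypothetical file holding the output of cbbrowse_logs.
    with open("cbbrowse_logs.txt") as dump:
        print(stats_error_gaps(dump))
```

A steady list of equal gaps would suggest the error comes from a periodic task rather than from the add-node attempts.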