Details
Type: Bug
Status: Resolved
Priority: Blocker
Resolution: Fixed
Affects Version/s: 2.0-beta
Fix Version/s: 2.0-beta
Component/s: ns_server, view-engine
Security Level: Public
Labels: None
Environment: See MB-6058 (second thing)
Description
After one of the nodes was rebooted during rebalance and the rebalance failed twice, there is now a problem on the rebooted node.
[couchdb:info] [2012-08-13 12:19:19] [ns_1@10.3.121.25:<0.710.0>:couch_log:info:39] Restarting compaction for replica group `_design/dev_view3`, set view `default`. Reason: partition states were updated
[couchdb:info] [2012-08-13 12:19:19] [ns_1@10.3.121.25:<0.1200.0>:couch_log:info:39] Restarting compaction for replica group `_design/dev_view2`, set view `default`. Reason: partition states were updated
[couchdb:info] [2012-08-13 12:19:19] [ns_1@10.3.121.25:<0.710.0>:couch_log:info:39] Set view `default`, replica group `_design/dev_view3`, compaction starting
[couchdb:info] [2012-08-13 12:19:19] [ns_1@10.3.121.25:<0.1200.0>:couch_log:info:39] Set view `default`, replica group `_design/dev_view2`, compaction starting
[couchdb:info] [2012-08-13 12:19:19] [ns_1@10.3.121.25:<0.705.0>:couch_log:info:39] Set view `default`, main group `_design/dev_view3`, defined new replica partitions: [47,48,49,50,51,52,150,151,152,153,154,155,156,157,187,188,189,190,205,254,255,256,277,278,279,280,308,356,357,358,359,401,403,457,458,459,460,461,462,463,487,488,489,672,673,674,675,712,713,714,740,741,742,792,793,794,795,796,797,798,799]
New full set of replica partitions is: [47,48,49,50,51,52,150,151,152,153,154,155,156,157,187,188,189,190,205,254,255,256,277,278,279,280,308,356,357,358,359,401,403,457,458,459,460,461,462,463,487,488,489,672,673,674,675,712,713,714,740,741,742,792,793,794,795,796,797,798,799]
[couchdb:info] [2012-08-13 12:19:19] [ns_1@10.3.121.25:<0.705.0>:couch_log:info:39] Set view `default`, main group `_design/dev_view3`, terminating with reason: {{badmatch,
{not_found,
no_db_file}},
[{couch_set_view_group,
monitor_partitions,
3},
{couch_set_view_group,
monitor_partitions,
2},
{couch_set_view_group,
handle_call,
3},
{gen_server,
handle_msg,
5},
{proc_lib,
init_p_do_apply,
3}]}
[couchdb:info] [2012-08-13 12:19:19] [ns_1@10.3.121.25:<0.710.0>:couch_log:info:39] Set view `default`, replica group `_design/dev_view3`, terminating with reason: shutdown
[views:info] [2012-08-13 12:19:19] [ns_1@10.3.121.25:'capi_set_view_manager-default':capi_set_view_manager:apply_index_states:367]
couch_set_view:add_replica_partitions([<<"default">>,<<"_design/dev_view3">>,
[47,48,49,50,51,52,150,151,152,153,154,
155,156,157,187,188,189,190,205,254,
255,256,277,278,279,280,308,356,357,
358,359,401,403,457,458,459,460,461,
462,463,487,488,489,672,673,674,675,
712,713,714,740,741,742,792,793,794,
795,796,797,798,799]]) raised exit:{{{badmatch,
{not_found,
no_db_file}},
[{couch_set_view_group,
monitor_partitions,
3},
{couch_set_view_group,
monitor_partitions,
2},
{couch_set_view_group,
handle_call,
3},
{gen_server,
handle_msg,
5},
{proc_lib,
init_p_do_apply,
3}]},
{gen_server,
call,
[<0.705.0>,
{add_replicas,
6641967501501417383806791632523375489625984170218009161305639732734490564036031833581652368858531824236549515887750274907866037489577590312730316390889342644699189815496146066239026917192134394144920743793928168199194568457626425643944116224},
infinity]}}
[error_logger:error] [2012-08-13 12:19:19] [ns_1@10.3.121.25:error_logger:ale_error_logger_handler:log_msg:76] ** Generic server <0.705.0> terminating
** Last message in was {add_replicas,6641967501501417383806791632523375489625984170218009161305639732734490564036031833581652368858531824236549515887750274907866037489577590312730316390889342644699189815496146066239026917192134394144920743793928168199194568457626425643944116224}
** When Server state == {state,
{"/data",<<"default">>,
{set_view_group,
<<101,48,96,72,5,6,250,227,107,51,35,214,124,13,132,
179>>,
nil,<<"default">>,<<"_design/dev_view3">>,[],
[{set_view,0,
[<<"view3">>],
<<"function (doc) {\n emit(doc['name'], [doc['achievements'], {location:{city: doc['city']}}]);\n}\n\n">>,
nil,[],[],undefined}],
nil,nil,
{set_view_index_header,1,0,0,0,0,[],nil,[],false,
[],nil,[]},
nil,main,nil,nil,nil,[]}},
<0.710.0>,
{set_view_group,
<<101,48,96,72,5,6,250,227,107,51,35,214,124,13,132,
179>>,
<0.5401.0>,<<"default">>,<<"_design/dev_view3">>,[],
[{set_view,0,
[<<"view3">>],
<<"function (doc) {\n emit(doc['name'], [doc['achievements'], {location:{city: doc['city']}}]);\n}\n\n">>,
{btree,<0.5401.0>,nil,
#Fun<couch_btree.3.59827385>,
#Fun<couch_btree.4.7841881>,
#Fun<couch_set_view_group.27.133282011>,
#Fun<couch_set_view_group.26.3863607>,5120,true},
[],[],#Ref<0.0.10.20933>}],
{btree,<0.5401.0>,nil,#Fun<couch_btree.3.59827385>,
#Fun<couch_btree.4.7841881>,
#Fun<couch_btree.5.72034400>,
#Fun<couch_set_view_group.11.83913258>,5120,true},
<0.5406.0>,
{set_view_index_header,1,1024,
225460087946287989877899396285904440311899765536343703386406136522124206147773051676300150243028163086763687936,
0,0,
[{74,0},
{92,0},
{93,0},
{263,0},
{264,0},
{298,0},
{365,0},
{366,0}],
nil,
Full logs below. This could be an issue in ns_server, but since it's getting late here I'm filing this in the hope that Filipe can find something obvious in the view engine's log messages.
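A note on reading the log above: the very large integer passed with `add_replicas` appears to be a bitmask in which bit N stands for partition N, so it should encode the same replica partition list that is printed in the `defined new replica partitions` message. The Python sketch below is only a reading aid under that assumption (the view engine itself is Erlang, and the helper names here are hypothetical). It builds a mask from the logged partition list and compares it with the logged argument.

```python
# Reading aid only: assumes the add_replicas argument is a bitmask where
# bit N set means "partition N". Helper names are hypothetical.

# Partition list copied from the "defined new replica partitions" log line.
REPLICA_PARTITIONS = [
    47, 48, 49, 50, 51, 52, 150, 151, 152, 153, 154, 155, 156, 157,
    187, 188, 189, 190, 205, 254, 255, 256, 277, 278, 279, 280, 308,
    356, 357, 358, 359, 401, 403, 457, 458, 459, 460, 461, 462, 463,
    487, 488, 489, 672, 673, 674, 675, 712, 713, 714, 740, 741, 742,
    792, 793, 794, 795, 796, 797, 798, 799,
]

# Integer copied from the {add_replicas, ...} term in the log.
LOGGED_ADD_REPLICAS_ARG = 6641967501501417383806791632523375489625984170218009161305639732734490564036031833581652368858531824236549515887750274907866037489577590312730316390889342644699189815496146066239026917192134394144920743793928168199194568457626425643944116224


def partitions_to_bitmask(partition_ids):
    """Set bit N in the result for every partition id N."""
    mask = 0
    for part_id in partition_ids:
        mask |= 1 << part_id
    return mask


def bitmask_to_partitions(mask):
    """Return the sorted list of bit positions that are set in mask."""
    partition_ids = []
    bit = 0
    while mask:
        if mask & 1:
            partition_ids.append(bit)
        mask >>= 1
        bit += 1
    return partition_ids


if __name__ == "__main__":
    mask = partitions_to_bitmask(REPLICA_PARTITIONS)
    # Report whether the bitmask reading of the logged argument holds.
    print("mask equals logged argument:", mask == LOGGED_ADD_REPLICAS_ARG)
    print("partitions decoded from logged argument:",
          bitmask_to_partitions(LOGGED_ADD_REPLICAS_ARG))
```

Decoding the argument this way makes it easier to see which partition IDs were involved in a failing `add_replicas` call without scanning the surrounding log lines.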
One of the new replica partitions/databases could not be opened, apparently because it doesn't exist (according to the couch_db:open_int result).
Looking at the logs, it doesn't seem that the database was deleted at the same time as, or shortly before, the add_replicas call.
I don't see what can be done other than crashing, or returning an explicit error to ns_server so that it can react to it (though it's probably unclear what ns_server could do there either). Once a new partition/database/vbucket is added, it must be opened for monitoring.
Submitted a change that will raise a more explicit error, mentioning the name of the database, when this happens: http://review.couchbase.org/#change,19598
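To illustrate the idea behind that change (not its actual contents), here is a minimal Python model of the behaviour. The real code is Erlang in `couch_set_view_group`; everything below, including `PartitionOpenError`, `open_partition_db`, and the `<set name>/<partition id>` database naming, is a hypothetical stand-in. The point is only the difference between failing on an anonymous mismatch such as `{badmatch, {not_found, no_db_file}}` and failing with an error that names the database.

```python
# Hypothetical model of "open every new partition database for monitoring,
# and name the database in the error if one cannot be opened".
# None of these names come from the real Erlang code.


class PartitionOpenError(Exception):
    """Raised when a partition database cannot be opened for monitoring."""

    def __init__(self, db_name, reason):
        self.db_name = db_name
        self.reason = reason
        super().__init__(
            "cannot open database %s for monitoring: %r" % (db_name, reason))


def open_partition_db(existing_dbs, db_name):
    """Stand-in for an open call: returns ("ok", name) or an error tuple."""
    if db_name in existing_dbs:
        return ("ok", db_name)
    return ("not_found", "no_db_file")


def monitor_partitions(existing_dbs, set_name, partition_ids):
    """Open each partition database, failing loudly with its name on error."""
    monitored = []
    for part_id in partition_ids:
        # Assumed naming scheme: "<set name>/<partition id>".
        db_name = "%s/%d" % (set_name, part_id)
        result = open_partition_db(existing_dbs, db_name)
        if result[0] != "ok":
            # Explicit error: says which database failed and why.
            raise PartitionOpenError(db_name, result)
        monitored.append(result[1])
    return monitored


if __name__ == "__main__":
    dbs_on_disk = {"default/47", "default/48"}
    try:
        monitor_partitions(dbs_on_disk, "default", [47, 48, 49])
    except PartitionOpenError as err:
        print(err)
```

With an error of this shape, whoever reads the log (or ns_server, if the error is returned to it) can tell which partition database was missing instead of having to infer it from the full list passed to `add_replicas`.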