Membase 1.7.0 keeps crashing/hanging upon (re)start

Thu, 06/30/2011 - 01:46
thijs

Starting Membase 1.7.0 after the initial upgrade (from 1.6.5) worked flawlessly, but ever since then restarting the service has caused major problems: most of the time it won't come back up, either crashing right away or hanging indefinitely.

If I keep restarting the service it eventually comes back up, but it can take ten to twenty attempts before it starts properly. During most restarts the erlsrv.exe process appears to crash within seconds. In some cases it gets stuck at 25% CPU (on a quad-core system).

I've attached the relevant output of ":8091/diag" in chronological order, from the point where I first attempted to restart the service until it actually managed to come back up (some duplicate entries omitted). It's a testing set-up, so I gave up at the end of the day and started again the following morning. Interestingly, the number of "stats_archiver" entries in each crash report seems to decrease up to the point where the service starts properly again. I also have an "erl_crash.dump" from one of the related erlsrv.exe crashes, should that be helpful for further diagnosis.

I'm running the x64 version of Windows Server 2008 (not R2).

CRASH REPORT  <0.68.0>                                      2011-06-29 18:23:41
===============================================================================
Crashing process                                                               
   initial_call                                {mb_mnesia,init,['Argument__1']}
   pid                                                                 <0.68.0>
   registered_name                                                           []
   error_info
         {exit,{{badmatch,{timeout,['stats_archiver-Statistics-week',
                                   'stats_archiver-default-week',
                                   'stats_archiver-Session-week',
                                   'stats_archiver-MediaWiki-week',
                                   'stats_archiver-Session-day',
                                   'stats_archiver-default-day',
                                   'stats_archiver-Statistics-day',
                                   'stats_archiver-MediaWiki-month',
                                   'stats_archiver-default-month',
                                   'stats_archiver-Statistics-month',
                                   'stats_archiver-Session-month',
                                   'stats_archiver-Statistics-minute',
                                   'stats_archiver-Session-minute',
                                   'stats_archiver-default-minute',
                                   'stats_archiver-default-year',
                                   'stats_archiver-Session-hour',
                                   'stats_archiver-Statistics-year',
                                   'stats_archiver-MediaWiki-year',
                                   'stats_archiver-default-hour',
                                   'stats_archiver-Session-year']}},
               [{mb_mnesia,ensure_schema,0},
                {mb_mnesia,init,1},
                {gen_server,init_it,6},
                {proc_lib,init_p_do_apply,3}]},
              [{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}
   ancestors                     [mb_mnesia_sup,ns_server_cluster_sup,<0.59.0>]
   messages                                                                  []
   links                                                    [<0.82.0>,<0.66.0>]
   dictionary                                                                []
   trap_exit                                                               true
   status                                                               running
   heap_size                                                                610
   stack_size                                                                24
   reductions                                                              7994

CRASH REPORT  <0.68.0>                                      2011-06-29 18:25:54
===============================================================================
Crashing process                                                               
   initial_call                                {mb_mnesia,init,['Argument__1']}
   pid                                                                 <0.68.0>
   registered_name                                                           []
   error_info
         {exit,{{badmatch,{timeout,['stats_archiver-Statistics-week',
                                   'stats_archiver-default-week',
                                   'stats_archiver-Session-week',
                                   'stats_archiver-MediaWiki-week',
                                   'stats_archiver-Session-day',
                                   'stats_archiver-default-day',
                                   'stats_archiver-Statistics-day',
                                   'stats_archiver-MediaWiki-month',
                                   'stats_archiver-default-month',
                                   'stats_archiver-Statistics-month',
                                   'stats_archiver-Session-month',
                                   'stats_archiver-Statistics-minute',
                                   'stats_archiver-Session-minute',
                                   'stats_archiver-default-minute',
                                   'stats_archiver-default-year',
                                   'stats_archiver-Session-hour',
                                   'stats_archiver-Statistics-year',
                                   'stats_archiver-MediaWiki-year',
                                   'stats_archiver-default-hour',
                                   'stats_archiver-Session-year']}},
               [{mb_mnesia,ensure_schema,0},
                {mb_mnesia,init,1},
                {gen_server,init_it,6},
                {proc_lib,init_p_do_apply,3}]},
              [{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}
   ancestors                     [mb_mnesia_sup,ns_server_cluster_sup,<0.59.0>]
   messages                                                                  []
   links                                                    [<0.82.0>,<0.66.0>]
   dictionary                                                                []
   trap_exit                                                               true
   status                                                               running
   heap_size                                                                610
   stack_size                                                                24
   reductions                                                              7994

CRASH REPORT  <0.68.0>                                      2011-06-30 08:53:45
===============================================================================
Crashing process                                                               
   initial_call                                {mb_mnesia,init,['Argument__1']}
   pid                                                                 <0.68.0>
   registered_name                                                           []
   error_info
         {exit,{{badmatch,{timeout,['stats_archiver-Statistics-week',
                                   'stats_archiver-default-week',
                                   'stats_archiver-default-day',
                                   'stats_archiver-default-month',
                                   'stats_archiver-default-minute',
                                   'stats_archiver-default-year',
                                   'stats_archiver-default-hour']}},
               [{mb_mnesia,ensure_schema,0},
                {mb_mnesia,init,1},
                {gen_server,init_it,6},
                {proc_lib,init_p_do_apply,3}]},
              [{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}
   ancestors                     [mb_mnesia_sup,ns_server_cluster_sup,<0.59.0>]
   messages                                                                  []
   links                                                    [<0.82.0>,<0.66.0>]
   dictionary                                                                []
   trap_exit                                                               true
   status                                                               running
   heap_size                                                                610
   stack_size                                                                24
   reductions                                                              7985

CRASH REPORT  <0.68.0>                                      2011-06-30 09:51:41
===============================================================================
Crashing process                                                               
   initial_call                                {mb_mnesia,init,['Argument__1']}
   pid                                                                 <0.68.0>
   registered_name                                                           []
   error_info
         {exit,{{badmatch,{timeout,['stats_archiver-default-week',
                                   'stats_archiver-default-month',
                                   'stats_archiver-default-year']}},
               [{mb_mnesia,ensure_schema,0},
                {mb_mnesia,init,1},
                {gen_server,init_it,6},
                {proc_lib,init_p_do_apply,3}]},
              [{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}
   ancestors                     [mb_mnesia_sup,ns_server_cluster_sup,<0.59.0>]
   messages                                                                  []
   links                                                    [<0.82.0>,<0.66.0>]
   dictionary                                                                []
   trap_exit                                                               true
   status                                                               running
   heap_size                                                                610
   stack_size                                                                24
   reductions                                                              7985
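
To make the shrinking "stats_archiver" lists easier to compare, the reports can be tallied with a short script. This is only a rough sketch: it assumes the diag output has been saved as plain text in the layout shown above, and the function names (`timed_out_tables`, `summarize`) are made up for illustration, not part of Membase.

```python
import re

# Report header, e.g. "CRASH REPORT  <0.68.0>      2011-06-29 18:23:41"
HEADER = re.compile(
    r"CRASH REPORT\s+<[\d.]+>\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"
)
# Quoted table atoms such as 'stats_archiver-default-week'
TABLE = re.compile(r"'(stats_archiver-[^']+)'")


def timed_out_tables(diag_text):
    """Return a list of (timestamp, [table names]), one item per crash report."""
    reports = []
    for line in diag_text.splitlines():
        header = HEADER.search(line)
        if header:
            # A new report starts; collect its tables in a fresh list.
            reports.append((header.group(1), []))
        elif reports:
            # Any table atoms on this line belong to the most recent report.
            reports[-1][1].extend(TABLE.findall(line))
    return reports


def summarize(diag_text):
    """Return (timestamp, number of timed-out tables) for each crash report."""
    return [(stamp, len(tables)) for stamp, tables in timed_out_tables(diag_text)]
```

Run against the four reports above, this should give 20, 20, 7 and 3 tables respectively, which matches the decrease described in the post.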

Thu, 06/30/2011 - 08:44
farshid

thijs,

This is a known issue in 1.7.0. Can you please apply the suggested workaround on the nodes where Membase hangs or crashes during startup?

http://www.couchbase.org/issues/browse/MB-4006

The workaround is to remove all files (*.*) under /opt/membase/var/lib/membase/mnesia/.
On Windows these files are located in program_files\membase\server\...
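
For anyone who would rather keep a copy of the files, here is a minimal sketch that moves them into a timestamped backup folder instead of deleting them. Note that this is a variation on the workaround, not what MB-4006 suggests. It assumes the Membase service has been stopped first (my assumption, not stated above), and both the helper name `move_mnesia_files` and the backup location are hypothetical.

```python
import shutil
import time
from pathlib import Path


def move_mnesia_files(mnesia_dir, backup_root):
    """Move every entry in mnesia_dir into a new timestamped backup folder.

    Returns the path of the backup folder. Assumes Membase is not running.
    """
    mnesia_dir = Path(mnesia_dir)
    if not mnesia_dir.is_dir():
        raise FileNotFoundError(f"no such directory: {mnesia_dir}")
    # A fresh folder per run, so an earlier backup is never overwritten.
    backup_dir = Path(backup_root) / time.strftime("mnesia-backup-%Y%m%d-%H%M%S")
    backup_dir.mkdir(parents=True, exist_ok=False)
    for entry in mnesia_dir.iterdir():
        shutil.move(str(entry), str(backup_dir / entry.name))
    return backup_dir
```

On Linux the call would look like `move_mnesia_files("/opt/membase/var/lib/membase/mnesia", "/tmp")`. On Windows, pass the full path of the mnesia folder under your own installation directory.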

Fri, 07/01/2011 - 04:15
thijs

Thanks, the workaround indeed solves the issue. Just a shame that it means deleting all archived statistics...

If there's anything I can do (or any additional information I can provide) to help resolve the issue, please let me know.

Mon, 04/16/2012 - 17:36
dc

I just had a similar issue on one of our servers, which is running 1.7.2r-20-g6604356.
OS: Windows Server 2003.
The service seemed to be running, but Membase could not be accessed. Even rebooting the system did not help.
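
A quick way to tell "the service is listed as running" apart from "the node is actually reachable" is to probe the admin port. Below is a minimal sketch, assuming port 8091 (the one used for the ":8091/diag" output earlier in this thread); the helper name `wait_for_port` is made up. A successful TCP connection only shows that something is listening, not that Membase is healthy.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is closed or unreachable.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_port("localhost", 8091, timeout=120)` after a restart would return False if nothing starts listening within two minutes.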

You can find the full log here:
http://www63.zippyshare.com/v/19241756/file.html

According to http://www.couchbase.com/issues/browse/MB-4006, it was meant to be fixed in 1.7.1.
The workaround (deleting the mnesia folder) did work, but it is not acceptable when we deploy our solution to our enterprise clients.

Is the "Fix Version/s" wrong? Was the fix not included in 1.7.2? Should this issue be re-opened? Or am I facing a new issue?

Wed, 05/02/2012 - 00:57
dc

The issue has not occurred since the specified workaround was applied.
However, we are still concerned that this may affect our customers.
Should I open my own thread? I thought it was the same issue, so I posted it here.

Thanks,
dc

Wed, 05/02/2012 - 01:01
dc

Can anyone help with this?
Thanks,
dc

Wed, 05/02/2012 - 01:05
ingenthr

That bug was reopened on the 17th of April, so it appears to be under investigation.

Wed, 05/02/2012 - 17:05
dc

Thanks @ingenthr!
I didn't realize it had been re-opened.
