[MB-4765] Erlang dump created on membase server restart Created: 23/Jan/12 Updated: 13/May/12 Resolved: 25/Apr/12 |
|
| Status: | Resolved |
| Project: | Couchbase Server |
| Component/s: | None |
| Affects Version/s: | 1.7.2, 1.8.0 |
| Fix Version/s: | 1.8.1 |
| Security Level: | Public |
| Type: | Bug | Priority: | Major |
| Reporter: | James Mauss | Assignee: | Aleksey Kondratenko |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | 1.8.1-release-notes | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | 1.7.2 | ||
| Attachments: |
|
| Description |
|
After a recent OS update and reboot, the 2-server clusters did not reload from disk successfully, and a membase-server restart resulted in an Erlang error.
While the cluster was warming up, both nodes responded to a membase-server restart command with:
{"init terminating in do_boot",{{badmatch,{error,{shutdown,{ns_server,start,[normal,[]]}}}},[{init,start_it,1},{init,start_em,1}]}}
Erlang has closed /opt/membase/lib/erlang/lib/os_mon-2.2.6/priv/bin/memsup: Erlang has closed.
Crash dump was written to: erl_crash.dump
Once the warmup finished, the cluster was back in a working state. Logs and crash file attached. |
| Comments |
| Comment by James Mauss [ 03/Feb/12 ] |
| The customer would like to know why the erlang dump was being created. |
| Comment by Aleksey Kondratenko [ 03/Feb/12 ] |
|
Another copy of Erlang was still running.
We have a known issue, fixed in 2.0, where the initscript stop action merely sends a shutdown signal to ns_server without waiting for the actual shutdown. The actual shutdown waits until memcached finishes persisting its data, so it may take time. Thus initscript restart doesn't really work in most real-world cases in 1.7 and current 1.8. |
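For illustration only, the difference between "signal and return" and "signal and wait" can be sketched as below. This is not the actual initscript or the 2.0 fix; the function names, the timeout, and the polling interval are assumptions made for this example. It sends SIGTERM to a process and then blocks until that process is really gone, which is the step the stop action was skipping.

```python
import os
import signal
import time


def process_alive(pid: int) -> bool:
    """Return True if a process with this pid still exists (hypothetical helper)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_and_wait(pid: int, timeout: float = 600.0, poll: float = 0.5) -> bool:
    """Send SIGTERM, then wait until the process has exited or the timeout expires.

    Returns True if the process is gone, False if it is still running.
    The default timeout and poll interval are illustrative assumptions.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return True  # already gone, nothing to wait for
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not process_alive(pid):
            return True
        time.sleep(poll)
    return not process_alive(pid)
```

A restart would call the start action only after stop_and_wait returns True. Note that the existence check also succeeds for an exited child that has not been reaped yet, so this sketch fits a caller that is not the parent of the process, as is the case for an init script run later.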
| Comment by Aleksey Kondratenko [ 03/Feb/12 ] |
| Added a Pivotal story to address that for 1.8.1: https://www.pivotaltracker.com/projects/212245 |
| Comment by Aleksey Kondratenko [ 25/Apr/12 ] |
|
The Pivotal link is broken. Anyway, this is done: 1.8.1 has the reliable shutdown backported from 2.0.