[MB-3158] cluster failover/rebalance expected behavior when one or more nodes are out of disk space Created: 10/Dec/10 Updated: 03/Aug/12 Resolved: 03/Aug/12 |
|
| Status: | Resolved |
| Project: | Couchbase Server |
| Component/s: | couchbase-bucket, ns_server |
| Affects Version/s: | None |
| Fix Version/s: | 2.0 |
| Security Level: | Public |
| Type: | Bug | Priority: | Major |
| Reporter: | Frank Weigel | Assignee: | Dipti Borkar |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | 1.7.0-release-notes, 1.7.1-release-notes | ||
| Σ Remaining Estimate: | Not Specified | Remaining Estimate: | Not Specified |
| Σ Time Spent: | Not Specified | Time Spent: | Not Specified |
| Σ Original Estimate: | Not Specified | Original Estimate: | Not Specified |
| Flagged: |
Release Note
|
||||||||||
| Description |
|
One or more nodes running out of disk space must not take down a cluster. Data already stored on a node should stay accessible; only writes to membase buckets should fail.
It should still be possible to remove nodes from the cluster or fail them over. Attention also needs to be paid to the behaviour of other areas that use disk space, such as logs. |
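As an illustration of the expected behaviour described above (reads keep working, only writes fail when the disk is full), here is a minimal Python sketch. It is not Membase code: `GuardedStore`, `DiskFullError`, `min_free_bytes`, and the injectable `usage_fn` are hypothetical names introduced for this example, and the in-memory dict only stands in for data already stored on a node.

```python
import shutil
from collections import namedtuple

# Same field names as the tuple returned by shutil.disk_usage (total, used, free).
Usage = namedtuple("Usage", "total used free")


class DiskFullError(Exception):
    """Raised when a write is refused because the disk has no headroom."""


class GuardedStore:
    """Hypothetical store: reads always succeed, writes need free disk space."""

    def __init__(self, path="/", min_free_bytes=0, usage_fn=shutil.disk_usage):
        self.path = path
        self.min_free_bytes = min_free_bytes
        self.usage_fn = usage_fn  # injectable so a full disk can be simulated
        self.data = {}            # stand-in for data already stored on the node

    def has_headroom(self):
        # True only while free space is strictly above the configured floor.
        return self.usage_fn(self.path).free > self.min_free_bytes

    def get(self, key):
        # Reads never check disk space, so stored data stays accessible.
        return self.data.get(key)

    def set(self, key, value):
        # Only the write path fails when the disk is full.
        if not self.has_headroom():
            raise DiskFullError("disk %r has no free space" % self.path)
        self.data[key] = value
```

Passing a fake `usage_fn` such as `lambda path: Usage(100, 100, 0)` simulates a full disk without filling a real one; in that state `get` still returns earlier values while `set` raises `DiskFullError`.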
| Comments |
| Comment by Frank Weigel [ 27/Jan/11 ] |
| A Pivotal Tracker story has been created for this Issue: http://www.pivotaltracker.com/story/show/9305789 |
| Comment by Farshid Ghods [ 24/May/11 ] |
| try this scenario and update before RC |
| Comment by Farshid Ghods [ 24/May/11 ] |
|
When one node runs out of disk space, memcached goes into pending mode and the user can rebalance this node out of the cluster. Log entries observed:
Shutting down bucket "default" on 'ns_1@172.16.75.128' for server shutdown ns_memcached002 ns_1@172.16.75.128 18:48:24 - Tue May 24, 2011
Usage of disk "/" on node "172.16.75.128" is over 100%
If two nodes run out of disk space, you will not be able to fail over those two nodes, because failover will time out. The workaround: when two or more nodes are out of disk space, stop Membase Server on those nodes, and then you can fail them over. |
| Comment by Perry Krug [ 25/May/11 ] |
| Just so long as we understand that this is still a bug. Failover should NEVER time out. |
| Comment by Peter Wansch [ 28/Jun/12 ] |
| Farshid, this can be closed, right? |