[MB-6323] Disk not drain (windows) Created: 20/Aug/12  Updated: 31/Aug/12  Resolved: 29/Aug/12

Status: Resolved
Project: Couchbase Server
Component/s: couchbase-bucket
Affects Version/s: None
Fix Version/s: 2.0-beta
Security Level: Public

Type: Bug Priority: Blocker
Reporter: Ronnie Sun (Inactive) Assignee: Jin Lim
Resolution: Fixed Votes: 0
Labels: pblock
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: 2.0.0-1578, windows

Attachments: Archive.zip (Zip Archive), cbstats.png (PNG File), disk_not_drain_windows.png (PNG File), ns_diag.zip (Zip Archive)

 Description   
Please refer to the screenshot. On build 1578, the disk write queue did not drain.

> Loaded 7M items and waited for them to be flushed. 3.5M active items stayed in the queue.

> Memory fragmentation was nearly 2.4 GB. This may be related to the disk issue too.

Node: 10.2.2.231 (vms)

Created issue:


 Comments   
Comment by Steve Yen [ 20/Aug/12 ]
(seemed assigned to wrong folk)
Comment by Chiyoung Seo [ 20/Aug/12 ]
Ronnie,

When you hit this kind of problem, please grab the diag log file and attach it to the bug. Otherwise, it is difficult for us to debug the issue.
Comment by Ronnie Sun (Inactive) [ 21/Aug/12 ]
attached diag logs
Comment by Ronnie Sun (Inactive) [ 21/Aug/12 ]
@chiyoung,

Attached diag logs.

The cluster is live, 10.2.2.231.

I'll leave it to you for debugging. Thanks
Comment by Chiyoung Seo [ 21/Aug/12 ]
The write dispatcher thread on 10.2.2.231 hit an exception and terminated, which is why the disk write queue is not being drained at all:

[ns_server:info,2012-08-19T14:27:38.036,ns_1@10.2.2.231:ns_port_memcached:ns_port_server:log:169]memcached<0.751.0>: 38 3: RW_Dispatcher: Caught an exception: MUTEX ERROR: Failed to destroy mutex: Invalid argument
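
The effect described above can be illustrated with a toy model. This is only a sketch under assumptions: `ToyDispatcher`, `makeFailingDispatcher` and `demoToyDispatcher` are hypothetical names, not ep-engine code, and the real dispatcher runs on its own thread. The point is just that once the task loop exits on an exception, whatever is still queued never gets processed.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>

// Toy stand-in for a dispatcher's main loop (hypothetical, not ep-engine code).
struct ToyDispatcher {
    std::deque<std::function<void()>> queue;  // pending tasks, e.g. flush batches
    std::string lastError;                    // message of the exception that ended the loop

    // Runs tasks in order and returns how many completed.
    // An exception thrown by a task ends the loop for good.
    std::size_t run() {
        std::size_t completed = 0;
        try {
            while (!queue.empty()) {
                std::function<void()> task = std::move(queue.front());
                queue.pop_front();
                task();  // may throw
                ++completed;
            }
        } catch (const std::exception& e) {
            lastError = e.what();  // remaining tasks are never run
        }
        return completed;
    }
};

// Builds a dispatcher with four tasks; the second one throws.
inline ToyDispatcher makeFailingDispatcher() {
    ToyDispatcher d;
    d.queue.push_back([] {});
    d.queue.push_back([] {
        throw std::runtime_error("MUTEX ERROR: Failed to destroy mutex: Invalid argument");
    });
    d.queue.push_back([] {});
    d.queue.push_back([] {});
    return d;
}

// Usage: one task completes, the failing one is consumed, two stay queued.
inline void demoToyDispatcher() {
    ToyDispatcher d = makeFailingDispatcher();
    std::size_t completed = d.run();
    assert(completed == 1);
    assert(d.queue.size() == 2);
}
```

In this model the two tasks left in `queue` play the role of the items stuck in the disk write queue.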

Comment by Jin Lim [ 23/Aug/12 ]
http://review.couchbase.org/#/c/20109/
Comment by Thuan Nguyen [ 23/Aug/12 ]
Integrated in github-ep-engine-2-0 #409 (See [http://qa.hq.northscale.net/job/github-ep-engine-2-0/409/])
    MB-6323 Remove a race condition in Dispatcher wake method (Revision 5ea867c641d33eb31b57bdb709e8513a3a57152a)

     Result = SUCCESS
Jin Lim :
Files :
* src/bgfetcher.hh
* src/flusher.cc
* src/dispatcher.cc
* src/dispatcher.hh
Comment by Chiyoung Seo [ 24/Aug/12 ]
This happened again with the latest fix.
Comment by Chiyoung Seo [ 24/Aug/12 ]
Ronnie, please attach the diag file.
Comment by Ronnie Sun (Inactive) [ 24/Aug/12 ]
Node 10-2-2-232 went down

Attached diag logs.
Comment by Jin Lim [ 24/Aug/12 ]
Please reattach the diag logs. I don't see them. Thanks!
Comment by Jin Lim [ 24/Aug/12 ]
Got the logs. Both write commit failure and mutex error occurred again. Looking into it.
Comment by Chiyoung Seo [ 29/Aug/12 ]
http://review.couchbase.org/#/c/20322/
Comment by Thuan Nguyen [ 30/Aug/12 ]
Integrated in github-ep-engine-2-0 #419 (See [http://qa.hq.northscale.net/job/github-ep-engine-2-0/419/])
    MB-6323 Ignore EINVAL from pthread_mutex/cond_destroy() (Revision 92e8151fcd8ea25b0799a5f9f59d1459f1aaea48)
    MB-6323 attempt to open db with initial rev = 1 if file is not found (Revision aea990ac4f589f785b283d8b98e7d09d6e140884)

     Result = SUCCESS
Jin Lim :
Files :
* src/mutex.cc
* Makefile.am
* src/syncobject.hh

Jin Lim :
Files :
* src/couch-kvstore/couch-kvstore.cc
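
For readers unfamiliar with the first change in this build ("Ignore EINVAL from pthread_mutex/cond_destroy()"), the general idea might look roughly like the sketch below. It is not the actual patch to src/mutex.cc: `ToyMutex`, `isFatalDestroyError` and `demoToyMutex` are hypothetical names, and logging instead of throwing in the destructor is an assumption of this sketch.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <pthread.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: EINVAL from a destroy call is tolerated, other errors are not.
inline bool isFatalDestroyError(int rc) {
    return rc != 0 && rc != EINVAL;
}

// Minimal RAII wrapper around a pthread mutex (illustrative only).
class ToyMutex {
public:
    ToyMutex() {
        int rc = pthread_mutex_init(&mutex_, nullptr);
        if (rc != 0) {
            throw std::runtime_error(
                std::string("MUTEX ERROR: Failed to initialize mutex: ") + std::strerror(rc));
        }
    }

    ~ToyMutex() {
        int rc = pthread_mutex_destroy(&mutex_);
        if (isFatalDestroyError(rc)) {
            // Assumption of this sketch: report the error rather than throw,
            // so that cleanup can never take down the calling thread.
            std::fprintf(stderr, "MUTEX ERROR: Failed to destroy mutex: %s\n",
                         std::strerror(rc));
        }
    }

    ToyMutex(const ToyMutex&) = delete;
    ToyMutex& operator=(const ToyMutex&) = delete;

    // Both return the raw pthread return code (0 on success).
    int lock() { return pthread_mutex_lock(&mutex_); }
    int unlock() { return pthread_mutex_unlock(&mutex_); }

private:
    pthread_mutex_t mutex_;
};

// Usage: lock, unlock, and let the destructor clean up.
inline void demoToyMutex() {
    ToyMutex m;
    assert(m.lock() == 0);
    assert(m.unlock() == 0);
}
```

With a check like this, an EINVAL returned during teardown no longer turns into an exception of the kind seen in the RW_Dispatcher log line.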
Comment by Thuan Nguyen [ 31/Aug/12 ]
Integrated in github-ep-engine-2-0 #421 (See [http://qa.hq.northscale.net/job/github-ep-engine-2-0/421/])
    MB-6323 return TMPFAIL for failed bgfetch instead of assert (Revision 4bf1696c646d8a7706d29de86d6814b0ff1004bf)

     Result = SUCCESS
Jin Lim :
Files :
* src/ep.cc
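
The change in this build ("return TMPFAIL for failed bgfetch instead of assert") follows a common pattern: report a temporary failure to the caller instead of aborting the process. A minimal sketch of that pattern, assuming hypothetical names (`ToyStatus`, `ToyFetchResult`, `completeBgFetch`, `demoBgFetch`) that do not come from src/ep.cc:

```cpp
#include <cassert>
#include <string>

// Hypothetical status codes, loosely modeled on a "temporary failure" result.
enum class ToyStatus { Success, TmpFail };

// Hypothetical outcome of a background fetch from disk.
struct ToyFetchResult {
    bool ok;            // false if the disk read failed
    std::string value;  // fetched value, meaningful only when ok is true
};

// Completes a background fetch. A failed fetch is reported as TmpFail so the
// caller can retry later; an assert here would abort the whole process instead.
inline ToyStatus completeBgFetch(const ToyFetchResult& fetched, std::string& out) {
    if (!fetched.ok) {
        return ToyStatus::TmpFail;  // out is left untouched
    }
    out = fetched.value;
    return ToyStatus::Success;
}

// Usage: a failed fetch yields TmpFail, a good one fills in the value.
inline void demoBgFetch() {
    std::string out;
    ToyFetchResult failed{false, ""};
    ToyFetchResult good{true, "v"};
    assert(completeBgFetch(failed, out) == ToyStatus::TmpFail);
    assert(out.empty());
    assert(completeBgFetch(good, out) == ToyStatus::Success);
    assert(out == "v");
}
```

The design choice is that a single failed disk read degrades one request rather than the whole node.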
Generated at Tue Sep 16 02:29:44 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.