Details
-
Type:
Bug
-
Status:
Open
-
Priority:
Blocker
-
Resolution: Unresolved
-
Affects Version/s: 2.0, 2.0.1
-
Fix Version/s: 2.1
-
Component/s: couchbase-bucket, storage-engine
-
Security Level: Public
-
Labels:
Description
SUBJ.
Just try single node dev cluster and try to load beers sample (it'll fail for perhaps different reason).
You can observe in logs:
We're doing set_vbucket_state requests from janitor (ns_memcached messages) and this is the first messages from mccouch about just created vbucket (capi_set_view_manager messages and mc_connection messages). Note: I've added extra log messages at the beginning of sync_notify and end to see if it's couchdb's or ns_servers' fault. From timestamps it's clear that it's not.
[ns_server:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify
[ns_server:info,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 740 state to active
[views:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/1023. Updated state: active
[ns_server:info,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 739 state to active
[ns_server:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done
[ns_server:info,2012-08-15T9:47:40.310,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 738 state to active
[ns_server:info,2012-08-15T9:47:40.310,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 737 state to active
And last message is at:
[ns_server:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify
[views:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/1. Updated state: active
[ns_server:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done
[ns_server:debug,2012-08-15T9:49:14.501,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify
[views:debug,2012-08-15T9:49:14.502,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/0. Updated state: active
[ns_server:debug,2012-08-15T9:49:14.502,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done
I've just retried with remounting with barrier=0 and it was blazing fast. What this mount option changes is short fsync are nearly instant because they merely send data to disk's buffer without waiting until data actually hits platter.
So that's evidence of some excessive use of fsyncs somewhere during vbucket db files creation and that's clearly on ep-engine/couchstore side.
Set this to blocker as it'll affect people trying to create buckets on real hardware, hard disks and any modern linux distro (where barriers are on by default).
Just try single node dev cluster and try to load beers sample (it'll fail for perhaps different reason).
You can observe in logs:
We're doing set_vbucket_state requests from janitor (ns_memcached messages) and this is the first messages from mccouch about just created vbucket (capi_set_view_manager messages and mc_connection messages). Note: I've added extra log messages at the beginning of sync_notify and end to see if it's couchdb's or ns_servers' fault. From timestamps it's clear that it's not.
[ns_server:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify
[ns_server:info,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 740 state to active
[views:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/1023. Updated state: active
[ns_server:info,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 739 state to active
[ns_server:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done
[ns_server:info,2012-08-15T9:47:40.310,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 738 state to active
[ns_server:info,2012-08-15T9:47:40.310,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 737 state to active
And last message is at:
[ns_server:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify
[views:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/1. Updated state: active
[ns_server:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done
[ns_server:debug,2012-08-15T9:49:14.501,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify
[views:debug,2012-08-15T9:49:14.502,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/0. Updated state: active
[ns_server:debug,2012-08-15T9:49:14.502,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done
I've just retried with remounting with barrier=0 and it was blazing fast. What this mount option changes is short fsync are nearly instant because they merely send data to disk's buffer without waiting until data actually hits platter.
So that's evidence of some excessive use of fsyncs somewhere during vbucket db files creation and that's clearly on ep-engine/couchstore side.
Set this to blocker as it'll affect people trying to create buckets on real hardware, hard disks and any modern linux distro (where barriers are on by default).
Issue Links
- is duplicated by
-
MB-7160
if flush times out final stage (janitor creating vbuckets back) it returns success causing clients to see TMPFAIL after flush succeeds (WAS: there are reports that even after invoking FLUSH nodes return TMPFAIL...)
-
Activity
- All
- Comments
- Work Log
- History
- Activity
- Gerrit Reviews
Aleksey Kondratenko
made changes -
| Field | Original Value | New Value |
|---|---|---|
| Description |
SUBJ.
Just try single node dev cluster and try to load beers sample (it's fail for perhaps different reason). You can observe in logs: (we're doing set_vbucket_state requests from janitor and are this is the first message from mccouch about just created bucket. Note: I've added extra log messages at the beginning of sync_notify and end to see if it's couchdb's or ns_servers' fault. It's clear that it's not. [ns_server:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify [ns_server:info,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 740 state to active [views:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/1023. Updated state: active [ns_server:info,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 739 state to active [ns_server:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done [ns_server:info,2012-08-15T9:47:40.310,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 738 state to active [ns_server:info,2012-08-15T9:47:40.310,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 737 state to active And last message is at: [ns_server:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify [views:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/1. Updated state: active [ns_server:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done [ns_server:debug,2012-08-15T9:49:14.501,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify [views:debug,2012-08-15T9:49:14.502,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/0. Updated state: active [ns_server:debug,2012-08-15T9:49:14.502,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done I've just retried with remounting with barrier=0 and it was blazing fast. What this mount option changes is short fsync are nearly instant because they merely send data to disk's buffer without waiting until data actually hits platter. So that's evidence of some excessive use of fsyncs somewhere during vbucket db files creation and that's clearly on ep-engine/couchstore side. Set this to blocker as it'll affect people trying to create buckets on real hardware, hard disks and any modern linux distro (where barriers are on by default). |
SUBJ.
Just try single node dev cluster and try to load beers sample (it'll fail for perhaps different reason). You can observe in logs: We're doing set_vbucket_state requests from janitor (ns_memcached messages) and this is the first messages from mccouch about just created vbucket (capi_set_view_manager messages and mc_connection messages). Note: I've added extra log messages at the beginning of sync_notify and end to see if it's couchdb's or ns_servers' fault. From timestamps it's clear that it's not. [ns_server:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify [ns_server:info,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 740 state to active [views:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/1023. Updated state: active [ns_server:info,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 739 state to active [ns_server:debug,2012-08-15T9:47:40.309,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done [ns_server:info,2012-08-15T9:47:40.310,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 738 state to active [ns_server:info,2012-08-15T9:47:40.310,n_0@127.0.0.1:<0.870.0>:ns_memcached:do_handle_call:485]Changed vbucket 737 state to active And last message is at: [ns_server:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify [views:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/1. Updated state: active [ns_server:debug,2012-08-15T9:49:14.410,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done [ns_server:debug,2012-08-15T9:49:14.501,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:106]sending out sync_notify [views:debug,2012-08-15T9:49:14.502,n_0@127.0.0.1:mc_couch_events:capi_set_view_manager:handle_mc_couch_event:418]Got set_vbucket event for beer-sample/0. Updated state: active [ns_server:debug,2012-08-15T9:49:14.502,n_0@127.0.0.1:<0.868.0>:mc_connection:do_notify_vbucket_update:113]done I've just retried with remounting with barrier=0 and it was blazing fast. What this mount option changes is short fsync are nearly instant because they merely send data to disk's buffer without waiting until data actually hits platter. So that's evidence of some excessive use of fsyncs somewhere during vbucket db files creation and that's clearly on ep-engine/couchstore side. Set this to blocker as it'll affect people trying to create buckets on real hardware, hard disks and any modern linux distro (where barriers are on by default). |
Farshid Ghods
made changes -
| Priority | Blocker [ 1 ] | Critical [ 2 ] |
Peter Wansch
made changes -
| Fix Version/s | .next [ 10342 ] | |
| Priority | Critical [ 2 ] | Major [ 3 ] |
Dipti Borkar
made changes -
| Assignee | Damien Katz [ damien ] | Peter Wansch [ peter ] |
| Fix Version/s | 2.0 [ 10114 ] | |
| Fix Version/s | .next [ 10342 ] |
Tommie McAfee
made changes -
| Attachment | debug.1.gz [ 14791 ] |
Tommie McAfee
made changes -
| Attachment | debug.1.gz [ 14792 ] |
Tommie McAfee
made changes -
| Attachment | debug.1.gz [ 14791 ] |
Farshid Ghods
made changes -
| Labels | 2.0-beta-release-notes |
Peter Wansch
made changes -
| Assignee | Peter Wansch [ peter ] | Mike Wiederhold [ mikew ] |
Chiyoung Seo
made changes -
| Sprint Status | Current Sprint |
Chiyoung Seo
made changes -
| Fix Version/s | .next [ 10342 ] | |
| Fix Version/s | 2.0 [ 10114 ] |
Chiyoung Seo
made changes -
| Sprint Status | Current Sprint |
Dipti Borkar
made changes -
| Priority | Major [ 3 ] | Blocker [ 1 ] |
Sergey Avseyev
made changes -
Mike Wiederhold
made changes -
| Fix Version/s | Backlog [ 10090 ] | |
| Fix Version/s | .next [ 10342 ] |
Mike Wiederhold
made changes -
| Planned Start | (set to new fixed version's start date) | |
| Planned End | (set to new fixed version's start date) |
Dipti Borkar
made changes -
| Fix Version/s | 2.0.2 [ 10418 ] | |
| Fix Version/s | Backlog [ 10090 ] |
Dipti Borkar
made changes -
| Planned Start | (set to new fixed version's start date) | |
| Planned End | (set to new fixed version's start date) |
Dipti Borkar
made changes -
| Affects Version/s | 2.0 [ 10114 ] | |
| Affects Version/s | 2.0.1 [ 10399 ] | |
| Affects Version/s | 2.0-beta [ 10113 ] |
Mike Wiederhold
made changes -
| Fix Version/s | 2.1 [ 10414 ] | |
| Fix Version/s | 2.0.2 [ 10418 ] |
Frank Weigel
made changes -
| Labels | 2.0-beta-release-notes | 2.0-beta-release-notes customer |
Dipti Borkar
made changes -
| Labels | 2.0-beta-release-notes customer | PM-PRIORITIZED customer |