<!-- 
RSS generated by JIRA (5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9) at Sat May 25 13:39:10 CDT 2013

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary add field=key&field=summary to the URL of your request.
For example:
http://www.couchbase.com/issues/si/jira.issueviews:issue-xml/MB-7147/MB-7147.xml?field=key&field=summary
-->
<rss version="0.92" >
<channel>
    <title>Couchbase</title>
    <link>http://www.couchbase.com/issues</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>5.2.4</version>
        <build-number>845</build-number>
        <build-date>26-12-2012</build-date>
    </build-info>

<item>
            <title>[MB-7147] swap rebalance slowness</title>
                <link>http://www.couchbase.com/issues/browse/MB-7147</link>
                <project id="10010" key="MB">Couchbase Server</project>
                        <description>____________________&lt;br/&gt;
From: Ronnie Sun&lt;br/&gt;
Sent: Friday, November 09, 2012 6:34 PM&lt;br/&gt;
To: Steve Yen&lt;br/&gt;
Subject: Re: swap rebalance perf?&lt;br/&gt;
&lt;br/&gt;
Hi Steve,&lt;br/&gt;
&lt;br/&gt;
Got some preliminary results showing 2.0 is 10x slower, which falls into the same area we&amp;#39;ve seen for reb-litmus tests.&lt;br/&gt;
&lt;br/&gt;
1.8.1-938: 203 seconds&lt;br/&gt;
2.0.0-1939: 2983 seconds&lt;br/&gt;
&lt;br/&gt;
Spec:  (reb-litmus-swap-2) 3M mixed, start with 2 nodes, swap 1 node with 9 clients firing 1k ops each.&lt;br/&gt;
&lt;br/&gt;
Current jenkins automation for this has some problems, so I don&amp;#39;t have comparison graphs.&lt;br/&gt;
&lt;br/&gt;
Thanks,&lt;br/&gt;
Ronnie&lt;br/&gt;
</description>
                <environment></environment>
            <key id="20669">MB-7147</key>
            <summary>swap rebalance slowness</summary>
                <type id="1" iconUrl="http://www.couchbase.com/issues/images/icons/issuetypes/bug.png">Bug</type>
                                <priority id="1" iconUrl="http://www.couchbase.com/issues/images/icons/priorities/blocker.png">Blocker</priority>
                    <status id="5" iconUrl="http://www.couchbase.com/issues/images/icons/statuses/resolved.png">Resolved</status>
                    <resolution id="1">Fixed</resolution>
                    <security id="10011">Public</security>
                        <assignee username="ronnie">Ronnie Sun</assignee>
                                <reporter username="ronnie">Ronnie Sun</reporter>
                        <labels>
                    </labels>
                <created>Fri, 9 Nov 2012 20:49:01 -0600</created>
                <updated>Thu, 3 Jan 2013 12:21:26 -0600</updated>
                    <resolved>Wed, 21 Nov 2012 13:25:22 -0600</resolved>
                            <version>2.0-beta-2</version>
                                <fixVersion>2.0</fixVersion>
                                <component>ns_server</component>
                                <votes>0</votes>
                        <watches>3</watches>
                                                    <comments>
                    <comment id="43727" author="steve" created="Fri, 9 Nov 2012 20:56:01 -0600"  >adding affected version and fix version values so this will show up in the right filters for 2.0 bug scrub.&lt;br/&gt;
&lt;br/&gt;
also, changing from major, and if it&amp;#39;s Blocker, we&amp;#39;ll raise it in the next bug scrub mtg.</comment>
                    <comment id="43733" author="chiyoung" created="Sat, 10 Nov 2012 00:21:36 -0600"  >Ronnie,&lt;br/&gt;
&lt;br/&gt;
I need cb_collect_info from the nodes that are newly added.</comment>
                    <comment id="43748" author="ronnie" created="Sat, 10 Nov 2012 16:54:16 -0600"  >chiyoung,&lt;br/&gt;
&lt;br/&gt;
I repeated the test on vms, this time reb took 1 hr.&lt;br/&gt;
&lt;br/&gt;
Left the cluster for you, 10.2.2.168.&lt;br/&gt;
&lt;br/&gt;
Note the ip addresses shown on the UI are internal ips.&lt;br/&gt;
&lt;br/&gt;
Here are the maps fyi:&lt;br/&gt;
&lt;br/&gt;
10.2.2.167 :   192.168.0.20  (reb out)&lt;br/&gt;
10.2.2.168 :    192.168.0.21  (new node)&lt;br/&gt;
10.2.2.169 :    192.168.0.22 &lt;br/&gt;
&lt;br/&gt;
</comment>
                    <comment id="43749" author="ronnie" created="Sat, 10 Nov 2012 16:55:12 -0600"  >btw: info.zip is from 10.2.2.168</comment>
                    <comment id="43800" author="chiyoung" created="Mon, 12 Nov 2012 13:03:06 -0600"  >The following are the timings of vbucket checkpoint persistence from the node that is newly added:&lt;br/&gt;
&lt;br/&gt;
&amp;nbsp;chk_persistence_cmd (1336 total)&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;16ms - 32ms   : (  0.07%)   1 &lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;32ms - 65ms   : (  0.15%)   1 &lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;65ms - 131ms  : (  0.37%)   3 &lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;131ms - 262ms : (  0.97%)   8 &lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;262ms - 524ms : ( 13.70%) 170 #################&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;524ms - 1s    : ( 34.66%) 280 #############################&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;1s - 2s       : ( 53.22%) 248 ##########################&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;2s - 4s       : ( 68.49%) 204 #####################&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;4s - 8s       : ( 78.89%) 139 ##############&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;8s - 16s      : ( 95.66%) 224 #######################&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;16s - 33s     : (100.00%)  58 ######&lt;br/&gt;
&lt;br/&gt;
From the above timings, we can easily see they took more than 50 minutes overall.&lt;br/&gt;
</comment>
                    <comment id="43810" author="steve" created="Mon, 12 Nov 2012 13:41:46 -0600"  >bug-scrub: chiyoung still diagnosing underlying issue</comment>
                    <comment id="43846" author="chiyoung" created="Mon, 12 Nov 2012 16:42:31 -0600"  >Alk,&lt;br/&gt;
&lt;br/&gt;
I have a quick question on checkpoint persistence. I saw that ep-engine sometimes receives two checkpoint_persistence commands for the same vbucket even though it doesn&amp;#39;t send any timeout tmp_fail response to the ebucketmigrator.&lt;br/&gt;
&lt;br/&gt;
For example, these are the timings from the node that was newly added for swap rebalance in a two-node cluster:&lt;br/&gt;
&lt;br/&gt;
cbstats 10.2.2.168:11210 timings&lt;br/&gt;
&amp;nbsp;chk_persistence_cmd (1354 total)&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;32ms - 65ms   : (  0.22%)   3 &lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;65ms - 131ms  : (  0.59%)   5 &lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;131ms - 262ms : (  1.92%)  18 #&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;262ms - 524ms : ( 14.03%) 164 ##############&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;524ms - 1s    : ( 35.08%) 285 #########################&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;1s - 2s       : ( 50.89%) 214 ###################&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;2s - 4s       : ( 71.34%) 277 #########################&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;4s - 8s       : ( 88.04%) 226 ####################&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;8s - 16s      : ( 97.71%) 131 ###########&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;16s - 33s     : (100.00%)  31 ##&lt;br/&gt;
&lt;br/&gt;
&lt;br/&gt;
There were no timeout responses to the ebucketmigrator. In this case, I expect that there would be 1024 checkpoint_persistence commands (512 for active and 512 for replica vbuckets).&lt;br/&gt;
&lt;br/&gt;
Please let me know if this is still fine and reassign it back to me.&lt;br/&gt;
</comment>
                    <comment id="43848" author="chiyoung" created="Mon, 12 Nov 2012 16:49:48 -0600"  >An example of such duplicate checkpoint_persistence requests is for vbucket 37 on 10.2.2.168:&lt;br/&gt;
&lt;br/&gt;
memcached&amp;lt;0.4538.0&amp;gt;: Mon Nov 12 12:52:59.092677 PST 3: Notified the completion of checkpoint persistence for vbucket 37, cookie 0x5c62000&lt;br/&gt;
...&lt;br/&gt;
memcached&amp;lt;0.4538.0&amp;gt;: Mon Nov 12 12:53:01.096198 PST 3: Notified the completion of checkpoint persistence for vbucket 37, cookie 0x5c63600</comment>
                    <comment id="43852" author="ronnie" created="Mon, 12 Nov 2012 16:57:09 -0600"  >collect info for chiyoung&amp;#39;s comments.</comment>
                    <comment id="43915" author="chiyoung" created="Tue, 13 Nov 2012 13:43:45 -0600"  >Reassigned it to me as I was told that it&amp;#39;s expected behavior.</comment>
                    <comment id="43936" author="alkondratenko" created="Tue, 13 Nov 2012 16:46:33 -0600"  >In a meeting on understanding views and rebalance, we decided that Ronnie will run the same rebalance with no mutations, and the same rebalance with rebalance_index_waiting_disabled set to false, to see whether that 10x is due to the need to persist deltas from ongoing mutations.</comment>
                    <comment id="43989" author="alkondratenko" created="Wed, 14 Nov 2012 12:30:09 -0600"  >Ronnie just posted on some internal forum that without mutations we&amp;#39;re still 4x slower. I&amp;#39;m still waiting for newer diags.&lt;br/&gt;
&lt;br/&gt;
I also lack information about the environment. It looks like this was run on VMware, but details about the environment are missing.</comment>
                    <comment id="43992" author="alkondratenko" created="Wed, 14 Nov 2012 12:56:31 -0600"  >Here are the related messages about vbucket 37:&lt;br/&gt;
&lt;br/&gt;
[rebalance:debug,2012-11-12T12:52:58.562,&lt;a href=&apos;mailto:ns_1@192.168.0.21&apos;&gt;ns_1@192.168.0.21&lt;/a&gt;:&amp;lt;0.7340.0&amp;gt;:janitor_agent:handle_call:651]Going to wait for persistence of checkpoint 2 in vbucket 37&lt;br/&gt;
[ns_server:info,2012-11-12T12:52:58.572,&lt;a href=&apos;mailto:ns_1@192.168.0.21&apos;&gt;ns_1@192.168.0.21&lt;/a&gt;:ns_port_memcached&amp;lt;0.4538.0&amp;gt;:ns_port_server:log:171]memcached&amp;lt;0.4538.0&amp;gt;: Mon Nov 12 12:52:58.371396 PST 3: TAP (Consumer) eq_tapq:anon_137 - disconnected&lt;br/&gt;
memcached&amp;lt;0.4538.0&amp;gt;: Mon Nov 12 12:52:58.453295 PST 3: TAP (Consumer) eq_tapq:anon_138 - Reset vbucket 37 was completed succecssfully.&lt;br/&gt;
&lt;br/&gt;
memcached&amp;lt;0.4538.0&amp;gt;: Mon Nov 12 12:52:59.092677 PST 3: Notified the completion of checkpoint persistence for vbucket 37, cookie 0x5c62000&lt;br/&gt;
&lt;br/&gt;
I.e. we can see our initial wait for the checkpoint on vbucket 37 completes in about 1 second. That&amp;#39;s where we expect the bulk of items to get persisted. After that we create another checkpoint on the source and wait for its persistence. It&amp;#39;s needed for view consistency. In this case, even though we don&amp;#39;t have any views, we still do that second checkpoint. But we always expect that second checkpoint to be persisted very quickly because it&amp;#39;ll have only a few items to persist.&lt;br/&gt;
&lt;br/&gt;
Here are the relevant messages:&lt;br/&gt;
&lt;br/&gt;
[rebalance:debug,2012-11-12T12:53:00.225,&lt;a href=&apos;mailto:ns_1@192.168.0.21&apos;&gt;ns_1@192.168.0.21&lt;/a&gt;:&amp;lt;0.7363.0&amp;gt;:janitor_agent:handle_call:651]Going to wait for persistence of checkpoint 3 in vbucket 37&lt;br/&gt;
&lt;br/&gt;
(hm, we see about a 1 second delay from ep-engine&amp;#39;s log message to ns_server proceeding to the next step, which is a potential problem in ns_server or memcached)&lt;br/&gt;
&lt;br/&gt;
[ns_server:info,2012-11-12T12:53:01.296,&lt;a href=&apos;mailto:ns_1@192.168.0.21&apos;&gt;ns_1@192.168.0.21&lt;/a&gt;:ns_port_memcached&amp;lt;0.4538.0&amp;gt;:ns_port_server:log:171]memcached&amp;lt;0.4538.0&amp;gt;: Mon Nov 12 12:53:01.096198 PST 3: Notified the completion of checkpoint persistence for vbucket 37, cookie&lt;br/&gt;
&lt;br/&gt;
We see another 1 second delay. Ronnie mentioned (not on the ticket, as usual) that the load is about 9k ops per second, which for vbucket 37 is about 9 items per second. So that&amp;#39;s at most a few tens of items that we needed to persist for that second &amp;quot;delta&amp;quot; checkpoint.&lt;br/&gt;
</comment>
                    <comment id="44010" author="ronnie" created="Wed, 14 Nov 2012 13:48:38 -0600"  >Repeated the test (physical cluster), took ~840 seconds,&lt;br/&gt;
&lt;br/&gt;
cbcollect info stats attached. (including diags)&lt;br/&gt;
&lt;br/&gt;
Starting rebalance, KeepNodes = [&amp;#39;&lt;a href=&apos;mailto:ns_1@10.2.1.63&apos;&gt;ns_1@10.2.1.63&lt;/a&gt;&amp;#39;,&amp;#39;&lt;a href=&apos;mailto:ns_1@10.2.1.58&apos;&gt;ns_1@10.2.1.58&lt;/a&gt;&amp;#39;], EjectNodes = [&amp;#39;&lt;a href=&apos;mailto:ns_1@10.2.1.61&apos;&gt;ns_1@10.2.1.61&lt;/a&gt;&amp;#39;]</comment>
                    <comment id="44027" author="chiyoung" created="Wed, 14 Nov 2012 16:51:14 -0600"  >I think this is a design defect in the consistent view. I don&amp;#39;t want to adapt the ep-engine flusher anymore at this time. I think we really should revisit the overall design of consistent view.&lt;br/&gt;
&lt;br/&gt;
For a safer and more reliable workaround, we should disable the consistent view during rebalance if there are no views defined.</comment>
                    <comment id="44029" author="ronnie" created="Wed, 14 Nov 2012 16:52:31 -0600"  >And test with index_waiting disabled took 2580 seconds.&lt;br/&gt;
&lt;br/&gt;
</comment>
                    <comment id="44043" author="alkondratenko" created="Wed, 14 Nov 2012 19:13:23 -0600"  >Actually there is something interesting and non-flusher related in the timestamps from the latest diags (without ongoing mutations).&lt;br/&gt;
&lt;br/&gt;
E.g.:&lt;br/&gt;
&lt;br/&gt;
[rebalance:debug,2012-11-14T11:10:17.702,&lt;a href=&apos;mailto:ns_1@10.2.1.58&apos;&gt;ns_1@10.2.1.58&lt;/a&gt;:&amp;lt;0.17937.1&amp;gt;:janitor_agent:handle_call:651]Going to wait for persistence of checkpoint 2 in vbucket 511&lt;br/&gt;
[ns_server:info,2012-11-14T11:10:17.899,&lt;a href=&apos;mailto:ns_1@10.2.1.58&apos;&gt;ns_1@10.2.1.58&lt;/a&gt;:ns_port_memcached&amp;lt;0.2008.0&amp;gt;:ns_port_server:log:171]memcached&amp;lt;0.2008.0&amp;gt;: Wed Nov 14 11:10:17.698884 PST 3: TAP (Consumer) eq_tapq:anon_1537 - Reset vbucket 511 was completed succecssfully.&lt;br/&gt;
&lt;br/&gt;
[ns_server:debug,2012-11-14T11:10:17.951,&lt;a href=&apos;mailto:ns_1@10.2.1.58&apos;&gt;ns_1@10.2.1.58&lt;/a&gt;:janitor_agent-default&amp;lt;0.2070.0&amp;gt;:janitor_agent:handle_info:682]Got done message from subprocess: &amp;lt;0.17937.1&amp;gt; (ok)&lt;br/&gt;
[ns_server:info,2012-11-14T11:10:18.151,&lt;a href=&apos;mailto:ns_1@10.2.1.58&apos;&gt;ns_1@10.2.1.58&lt;/a&gt;:ns_port_memcached&amp;lt;0.2008.0&amp;gt;:ns_port_server:log:171]memcached&amp;lt;0.2008.0&amp;gt;: Wed Nov 14 11:10:17.951680 PST 3: Notified the completion of checkpoint persistence for vbucket 511, cookie 0x13fef080&lt;br/&gt;
&lt;br/&gt;
[ns_server:info,2012-11-14T11:10:18.656,&lt;a href=&apos;mailto:ns_1@10.2.1.58&apos;&gt;ns_1@10.2.1.58&lt;/a&gt;:ns_port_memcached&amp;lt;0.2008.0&amp;gt;:ns_port_server:log:171]memcached&amp;lt;0.2008.0&amp;gt;: Wed Nov 14 11:10:18.455993 PST 3: Schedule cleanup of &amp;quot;eq_tapq:anon_1536&amp;quot;&lt;br/&gt;
&lt;br/&gt;
[rebalance:debug,2012-11-14T11:10:19.365,&lt;a href=&apos;mailto:ns_1@10.2.1.58&apos;&gt;ns_1@10.2.1.58&lt;/a&gt;:&amp;lt;0.17944.1&amp;gt;:janitor_agent:handle_call:651]Going to wait for persistence of checkpoint 2 in vbucket 511&lt;br/&gt;
[ns_server:debug,2012-11-14T11:10:19.366,&lt;a href=&apos;mailto:ns_1@10.2.1.58&apos;&gt;ns_1@10.2.1.58&lt;/a&gt;:janitor_agent-default&amp;lt;0.2070.0&amp;gt;:janitor_agent:handle_info:682]Got done message from subprocess: &amp;lt;0.17944.1&amp;gt; (ok)&lt;br/&gt;
&lt;br/&gt;
We see that we actually get a reply quite quickly from ep-engine in both invocations. The second is especially quick as it&amp;#39;s a NOP in this case. But there&amp;#39;s some weird 1 second delay in between those two calls. The only significant activity in between is a stats request for &amp;quot;checkpoint &amp;lt;vbucket-id&amp;gt;&amp;quot; and a possible create_new_checkpoint request.&lt;br/&gt;
</comment>
                    <comment id="44047" author="chiyoung" created="Wed, 14 Nov 2012 21:16:02 -0600"  >Steve,&lt;br/&gt;
&lt;br/&gt;
From now on, I won&amp;#39;t work on any rebalance slowness issues due to consistent view. As I mentioned, depending on persistence for view updates is NOT a scalable approach.&lt;br/&gt;
&lt;br/&gt;
Initially, I suggested streaming incoming mutations or takeover items to the indexer through TAP without waiting for persistence, but heard from someone that it won&amp;#39;t make any difference compared with the current approach. However, I recently showed that the current approach is very slow due to disk slowness and fsync per vbucket.&lt;br/&gt;
&lt;br/&gt;
I don&amp;#39;t expect that we will change the current architecture in post 2.0, and am not interested in giving any suggestions to people who are so stubborn.&lt;br/&gt;
&lt;br/&gt;
Again, my suggestion is to disable the consistent view if there are no views defined.&lt;br/&gt;
&lt;br/&gt;
Please assign it to someone who is willing to work on this issue.&lt;br/&gt;
&lt;br/&gt;
In post 2.0, I will remove this prioritization because it&amp;#39;s not a good approach, but rather a workaround. I will send a notice to the cluster and XDCR teams after 2.0.&lt;br/&gt;
</comment>
                    <comment id="44126" author="steve" created="Thu, 15 Nov 2012 13:21:06 -0600"  >Chiyoung made more changes in ep-engine flusher priority with toy-build.  4x faster.  But a more aggressive flusher can likely lead to more starvation of writes.</comment>
                    <comment id="44133" author="steve" created="Thu, 15 Nov 2012 13:45:26 -0600"  >Ronnie, reassigning to you to please rerun tests with Chiyoung&amp;#39;s toy-build, and help cover Pavel while he&amp;#39;s traveling.&lt;br/&gt;
&lt;br/&gt;
From Chiyoung&amp;#39;s email...&lt;br/&gt;
&lt;br/&gt;
Ronnie, Pavel,&lt;br/&gt;
&lt;br/&gt;
I made more changes in ep-engine, so that the flusher can work on persisting high priority vbuckets much more aggressively. In my tests, I observed ~ 4X faster rebalance.&lt;br/&gt;
&lt;br/&gt;
However, my main concern with this new change is that it might cause starvation in flushing regular vbuckets, and consequently grow the number of dirty items on those regular vbuckets (i.e., disk write queue size might grow a lot, especially in XDCR tests, because we prioritize flushing 32 vbuckets once every 30 minutes).&lt;br/&gt;
&lt;br/&gt;
You can download the toy build from&lt;br/&gt;
&lt;br/&gt;
&lt;a href=&quot;http://builds.hq.northscale.net/latestbuilds/couchbase-server-community_toy-chiyoung-x86_64_2.0.0-2002-toy.rpm&quot;&gt;http://builds.hq.northscale.net/latestbuilds/couchbase-server-community_toy-chiyoung-x86_64_2.0.0-2002-toy.rpm&lt;/a&gt;&lt;br/&gt;
&lt;br/&gt;
Ronnie, can you please test it with the swap rebalance?&lt;br/&gt;
&lt;br/&gt;
Pavel, can you test it with the XDCR?&lt;br/&gt;
&lt;br/&gt;
Thanks,&lt;br/&gt;
Chiyoung</comment>
                    <comment id="44209" author="steve" created="Fri, 16 Nov 2012 13:41:06 -0600"  >priority to critical, per bug-scrub, but still want more large-scale test runs (more nodes &amp;amp; more items).&lt;br/&gt;
&lt;br/&gt;
Ronnie,&lt;br/&gt;
please send a test case to Tony for larger scale retest, and please reassign.&lt;br/&gt;
thanks,&lt;br/&gt;
steve&lt;br/&gt;
</comment>
                    <comment id="44212" author="ronnie" created="Fri, 16 Nov 2012 14:04:37 -0600"  >test 1:&lt;br/&gt;
&lt;br/&gt;
- 8 nodes, swap 1&lt;br/&gt;
- 20M items. &lt;br/&gt;
- 30 clients with 500 ops per second (set:get = 50:50)&lt;br/&gt;
&lt;br/&gt;
test 2:&lt;br/&gt;
- 8 nodes, swap 3&lt;br/&gt;
- 20M items.&lt;br/&gt;
- 30 clients with 500 ops per second</comment>
                    <comment id="44342" author="steve" created="Mon, 19 Nov 2012 15:24:38 -0600"  >See also: &lt;a href=&quot;http://www.couchbase.com/issues/browse/CBD-682&quot;&gt;http://www.couchbase.com/issues/browse/CBD-682&lt;/a&gt;</comment>
                    <comment id="44363" author="Chisheng" created="Mon, 19 Nov 2012 18:48:03 -0600"  >[global]&lt;br/&gt;
#username:root&lt;br/&gt;
username:root&lt;br/&gt;
password:couchbase&lt;br/&gt;
port:8091&lt;br/&gt;
&lt;br/&gt;
data_path=/data&lt;br/&gt;
&lt;br/&gt;
[servers]&lt;br/&gt;
1:10.6.2.37&lt;br/&gt;
2:10.6.2.38&lt;br/&gt;
3:10.6.2.39&lt;br/&gt;
4:10.6.2.40&lt;br/&gt;
5:10.6.2.42&lt;br/&gt;
6:10.6.2.43&lt;br/&gt;
7:10.6.2.44&lt;br/&gt;
8:10.6.2.45&lt;br/&gt;
&lt;br/&gt;
#9:10.3.121.24&lt;br/&gt;
#10:10.3.121.25&lt;br/&gt;
#11:10.3.121.26&lt;br/&gt;
#12:10.3.121.27&lt;br/&gt;
&lt;br/&gt;
[membase]&lt;br/&gt;
rest_username:Administrator&lt;br/&gt;
rest_password:password&lt;br/&gt;
&lt;br/&gt;
&lt;br/&gt;
This is the cluster information for the orange cluster. Ronnie, you can run the tests on it.</comment>
                    <comment id="44366" author="ronnie" created="Mon, 19 Nov 2012 19:09:13 -0600"  >due to the cluster limitations:&lt;br/&gt;
&lt;br/&gt;
change test plan to:&lt;br/&gt;
&lt;br/&gt;
1) 4 node swap 3.&lt;br/&gt;
&lt;br/&gt;
2) 6 nodes swap 1.&lt;br/&gt;
&lt;br/&gt;
&lt;br/&gt;
&lt;br/&gt;
&lt;br/&gt;
</comment>
                    <comment id="44456" author="steve" created="Tue, 20 Nov 2012 13:38:52 -0600"  >bug-scrub - moved to 2.0.1</comment>
                    <comment id="44463" author="ronnie" created="Tue, 20 Nov 2012 13:59:05 -0600"  >Early results for larger number of nodes swap rebalance (swap 3 nodes from a 4 node cluster).&lt;br/&gt;
&lt;br/&gt;
No foreground load case looks promising.&lt;br/&gt;
&lt;br/&gt;
1.8.1-938: 988.58 seconds.&lt;br/&gt;
2.0.0-1954: 974.67 seconds.&lt;br/&gt;
</comment>
                    <comment id="44565" author="ronnie" created="Wed, 21 Nov 2012 12:21:41 -0600"  >4-3 swap rebalance:&lt;br/&gt;
&lt;br/&gt;
without foreground load (1x identical):&lt;br/&gt;
&lt;br/&gt;
1.8.1-938: 988.58 seconds. &lt;br/&gt;
2.0.0-1954: 974.67 seconds. &lt;br/&gt;
&lt;br/&gt;
with ~10k foreground (7x slower):&lt;br/&gt;
&lt;br/&gt;
1.8.1-938: 222 sec&lt;br/&gt;
2.0.0-1954: 1723 sec &lt;br/&gt;
&lt;br/&gt;
6-1 swap rebalance:&lt;br/&gt;
&lt;br/&gt;
with ~10k foreground  (1x identical):&lt;br/&gt;
1.8.1-938: 639.52 sec&lt;br/&gt;
2.0.0-1954: 654.22 sec &lt;br/&gt;
</comment>
                    <comment id="44566" author="ronnie" created="Wed, 21 Nov 2012 12:31:25 -0600"  >Based on 6-1 results (started with 6 nodes, swap 1 node):&lt;br/&gt;
&lt;br/&gt;
2.0 has similar reb time as 1.8.1, while latencies are ~30% worse.&lt;br/&gt;
&lt;br/&gt;
&lt;br/&gt;
</comment>
                    <comment id="44570" author="steve" created="Wed, 21 Nov 2012 12:38:38 -0600"  >&lt;a href=&quot;https://www.yammer.com/couchbase.com/#/Threads/show?threadId=236527848&quot;&gt;https://www.yammer.com/couchbase.com/#/Threads/show?threadId=236527848&lt;/a&gt;&lt;br/&gt;
&lt;br/&gt;
4-3 swap rebalance:&lt;br/&gt;
&lt;br/&gt;
without foreground load (1x identical):&lt;br/&gt;
&lt;br/&gt;
1.8.1-938: 988.58 seconds. &lt;br/&gt;
2.0.0-1954: 974.67 seconds. &lt;br/&gt;
&lt;br/&gt;
with ~10k foreground (7x slower):&lt;br/&gt;
&lt;br/&gt;
1.8.1-938: 222 sec&lt;br/&gt;
2.0.0-1954: 1723 sec &lt;br/&gt;
&lt;br/&gt;
6-1 swap rebalance:&lt;br/&gt;
&lt;br/&gt;
with ~10k foreground (1x identical):&lt;br/&gt;
1.8.1-938: 639.52 sec&lt;br/&gt;
2.0.0-1954: 654.22 sec&lt;br/&gt;
&amp;nbsp;&lt;br/&gt;
Ronnie Sun: Based on 6-1 results (started with 6 nodes, swap 1 node):&lt;br/&gt;
&lt;br/&gt;
2.0 has similar reb time as 1.8.1, while latencies are ~30% worse.&lt;br/&gt;
&lt;br/&gt;
reb-swap-6-1.loop_1.8.1-938-rel-enterprise_2.0.0-1954-rel-enterprise_orange_Nov-21-2012_10-07-08&lt;br/&gt;
&amp;nbsp;&lt;br/&gt;
Steve Yen: @Aliaksey Kandratsenka, @Chiyoung Seo&lt;br/&gt;
&lt;br/&gt;
&amp;quot;4-3 swap rebalance&amp;quot; == 4 nodes initially and swap 3 of them.&lt;br/&gt;
&lt;br/&gt;
&amp;quot;6-1 swap rebalance&amp;quot; == 6 nodes initially and swap 1 of them.&lt;br/&gt;
&lt;br/&gt;
@Ronnie Sun also reports latencies were worse in 2.0 during swap rebalance.&lt;br/&gt;
&lt;br/&gt;
This seems to indicate that we don&amp;#39;t need @Chiyoung Seo&amp;#39;s patch that prioritizes the vbucket takeover even more.&lt;br/&gt;
&lt;br/&gt;
This was on system test cluster (xen VM&amp;#39;s) with SSD, key-value mixed workload (no views), with consistent views enabled (the @Chiyoung Seo .</comment>
                    <comment id="44675" author="FilipeManana" created="Thu, 22 Nov 2012 10:51:16 -0600"  >No views defined, so updating component list.</comment>
                    <comment id="44676" author="FilipeManana" created="Thu, 22 Nov 2012 10:51:16 -0600"  >No views defined, so updating component list.</comment>
                </comments>
                    <attachments>
                    <attachment id="15811" name="10.2.1.58-11142012-1140-diag.zip" size="4539866" author="ronnie" created="Wed, 14 Nov 2012 13:48:38 -0600" />
                    <attachment id="15812" name="10.2.1.61-11142012-1139-diag.zip" size="3696169" author="ronnie" created="Wed, 14 Nov 2012 13:48:38 -0600" />
                    <attachment id="15813" name="10.2.1.63-11142012-1142-diag.zip" size="7327942" author="ronnie" created="Wed, 14 Nov 2012 13:48:38 -0600" />
                    <attachment id="15781" name="info_168.zip" size="6814570" author="ronnie" created="Mon, 12 Nov 2012 16:57:09 -0600" />
                    <attachment id="15782" name="info_169.zip" size="12085154" author="ronnie" created="Mon, 12 Nov 2012 16:57:09 -0600" />
                    <attachment id="15770" name="info.zip" size="8434394" author="ronnie" created="Sat, 10 Nov 2012 16:54:16 -0600" />
                    <attachment id="15780" name="ns-diag-168.txt.zip" size="5479361" author="ronnie" created="Mon, 12 Nov 2012 16:57:09 -0600" />
                    <attachment id="15858" name="reb-swap-6-1.loop_1.8.1-938-rel-enterprise_2.0.0-1954-rel-enterprise_orange_Nov-21-2012_10-07-08.pdf" size="1889122" author="ronnie" created="Wed, 21 Nov 2012 12:31:25 -0600" />
                </attachments>
            <subtasks>
        </subtasks>
                <customfields>
                                                                        <customfield id="customfield_10180" key="com.atlassian.jira.ext.charting:firstresponsedate">
                <customfieldname>Date of First Response</customfieldname>
                <customfieldvalues>
                    <customfieldvalue>Fri, 9 Nov 2012 20:56:01 -0600</customfieldvalue>

                </customfieldvalues>
            </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10081" key="com.pyxis.greenhopper.jira:gh-global-rank">
                <customfieldname>Rank</customfieldname>
                <customfieldvalues>
                    <customfieldvalue>3563</customfieldvalue>
                </customfieldvalues>
            </customfield>
                                                                                                                                                                                        <customfield id="customfield_10181" key="com.atlassian.jira.ext.charting:timeinstatus">
                <customfieldname>Time In Status</customfieldname>
                <customfieldvalues>
                    
                </customfieldvalues>
            </customfield>
                                                </customfields>
    </item>
</channel>
</rss>