[MB-6974] erlang FULL_SWEEP setting needs override to 512 instead of erlang's default Created: 19/Oct/12  Updated: 12/Nov/12  Resolved: 23/Oct/12

Status: Resolved
Project: Couchbase Server
Component/s: ns_server
Affects Version/s: 2.0-beta
Fix Version/s: 2.0-beta-2
Security Level: Public

Type: Bug
Priority: Blocker
Reporter: Steve Yen
Assignee: Aleksey Kondratenko
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Looks like 512 is a better FULL_SWEEP setting than Erlang's default. See:

  https://docs.google.com/spreadsheet/ccc?key=0AgLUessE73UXdGd2NmhLOThCRUc0ekFkZ0FDeGdLRXc#gid=0
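
For reference, a minimal sketch of one way such an override could be applied. It assumes the value is passed through the ERL_FULLSWEEP_AFTER environment variable before the Erlang VM starts. This ticket does not say which mechanism the actual fix uses, so treat the approach below as an illustration only. The Erlang documentation gives 65535 as the default for fullsweep_after.

```shell
#!/bin/sh
# Assumption: the override is supplied via the environment before the VM boots.
# The real ns_server change may set it differently (for example in a start script).
ERL_FULLSWEEP_AFTER=512
export ERL_FULLSWEEP_AFTER

# If an Erlang runtime is on PATH, ask the VM which default it picked up.
# erlang:system_info(fullsweep_after) reports the default used for new processes.
if command -v erl >/dev/null 2>&1; then
    erl -noshell -eval 'io:format("~p~n", [erlang:system_info(fullsweep_after)]), halt().'
fi
```

The same default can also be changed from inside a running node with erlang:system_flag(fullsweep_after, 512), which affects only processes spawned afterwards.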



 Comments   
Comment by Steve Yen [ 19/Oct/12 ]
Changing title to be clearer.

Also, I spoke with Damien and he's good with 512, too.
Comment by Pavel Paulau [ 19/Oct/12 ]
Aren't you going to wait for system test results? Don't you need extra perf runs for that?
Comment by Aleksey Kondratenko [ 19/Oct/12 ]
We can always revert later. But I'd wait at least a bit.
Comment by Farshid Ghods (Inactive) [ 22/Oct/12 ]
Tommie,

Can you update the ticket once you have initial results from the pine and plum clusters, where we are running with this setting?
Comment by Tommie McAfee [ 22/Oct/12 ]
Current status of verification on key-value and query clusters:
 
Overall looks good on the 23-node cluster with a key-value workload and SWEEP=512:
 * load 20 million items, size 356 bytes
 * access phase: --get 70% --create 10% --update 15% --delete 5% --ops 40000
 * rebalance out 3 nodes completed
 * load until 44% active resident
 * access phase with cache_miss ratio between 1-3%
 * swap rebalance in 3 nodes, out 3 nodes completed

Timeouts on the 4-node cluster with views and SWEEP=10000:
 * load 13 million items into 2 different buckets
 * run 300 queries per second against each bucket
 * access phase at 70% active resident
 * rebalance out 1 node
     - failed with etimeout
     - usually succeeds on retry

On the 4-node cluster with views and SWEEP=512:
 * verification pending...
Comment by Steve Yen [ 23/Oct/12 ]
Assigning back to alk to get the setting changed to 512.
Comment by Tommie McAfee [ 23/Oct/12 ]

Recent system test verification failed due to another issue, MB-6490, rather than the timeouts we saw at higher GC levels.
Comment by Steve Yen [ 23/Oct/12 ]
Changing priority to blocker (from critical).
Comment by Aleksey Kondratenko [ 23/Oct/12 ]
The change is in Gerrit and will be merged soon.
Comment by Steve Yen [ 23/Oct/12 ]
Aliaksey A. raised a good point about the fix: do we need a corresponding fix for Windows?
Comment by kzeller [ 12/Nov/12 ]
Added to RN: By default we run garbage collection more frequently than
Erlang's normal default. This keeps memory usage by the Erlang
virtual machine lower and enables better performance.
Generated at Fri Aug 29 09:23:58 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.