Rebalance Questions

I want to speed up rebalancing.

  1. If I increase the value from the default to 12, will it use more server resources?

  2. The maximum value is 64. Is there a problem if I change it to the maximum?

  3. Does the “rebalanceMovesPerNode” setting help speed up rebalancing?

Hi @bellpumpkin. Yes, those settings will help speed up rebalance, and they will also consume more resources, so we would advise increasing the value by a small amount initially and then testing for both the increase in speed and any impact on your application.
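To make the "small amount, then test" advice concrete, here is a minimal Python sketch that builds (but does not send) a request to change the setting. It assumes the cluster exposes the setting at POST /settings/rebalance on port 8091 with a form field named rebalanceMovesPerNode, and that the accepted range is 1 to 64; the upper bound comes from this thread, while the endpoint, port, and lower bound are assumptions to verify against the documentation for your server version. The helper name and the credentials are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_rebalance_setting_request(host, user, password, moves_per_node):
    """Build (without sending) a request that sets rebalanceMovesPerNode.

    Assumption: the setting lives at POST /settings/rebalance on port 8091.
    Check this against the docs for your server version before using it.
    """
    # 64 is the maximum mentioned in this thread; 1 as the minimum is assumed.
    if not 1 <= moves_per_node <= 64:
        raise ValueError("rebalanceMovesPerNode must be between 1 and 64")

    body = urllib.parse.urlencode(
        {"rebalanceMovesPerNode": moves_per_node}
    ).encode()
    request = urllib.request.Request(
        f"http://{host}:8091/settings/rebalance", data=body, method="POST"
    )
    # HTTP basic auth header built by hand to keep the sketch dependency-free.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    return request


# Step up modestly from the default rather than jumping straight to 64.
request = build_rebalance_setting_request(
    "localhost", "Administrator", "password", 6
)
print(request.get_method(), request.full_url, request.data)
```

Passing the result to urllib.request.urlopen(request) would actually apply the change, so run one rebalance per value and compare both rebalance time and application latency before going higher.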

If there is no speed difference between 12, 24, 32, and 64 when increasing the value of ‘rebalanceMovesPerNode’, is it a CPU problem?

The speed difference between 4 and 12, and between 4 and 64, is clear, but the difference between 12 and 32 or 64 is small. Can you tell me why?

It’s hard to be definitive without looking at stats/logs from a specific example run.

A few more details:

  • If you’re working with a large amount of data, there’s only so much impact that parallelising will have vs the overall time it takes to move the data.
  • The CPU and disk speed will certainly play a role, especially depending on whether your data is fully in RAM or not.
  • Similarly, if you are tight on RAM, then moving data from one node to another may be slowed down by the high water mark threshold on the destination node, so as not to overwhelm the destination. This is also one of the key areas for potential impact on the application, and an example of why it may be better to have a slower rebalance than to push a lot of active data out of RAM and slow down the running application.
  • Network bandwidth could in theory play a role, but that is doubtful with today’s high-speed networks.
  • [edit] There are multiple other factors that limit how many vbuckets can be moved concurrently. In particular, a “swap rebalance” (i.e. the same number of nodes coming in as going out) ends up being limited to one concurrent vbucket move per source:destination pair. So if you’re testing a swap rebalance, you really won’t see any effect from changing that value. If you’re adding a different number of nodes than you’re removing, then you will.
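The plateau you describe can be pictured with a toy model. This is only an illustrative sketch, not how the server schedules moves: it assumes every vbucket move takes the same time and that some other factor (disk, RAM pressure, or the swap-rebalance limit) caps concurrency at a fixed number. All names and numbers below are hypothetical.

```python
import math


def estimated_rebalance_seconds(vbuckets_to_move, seconds_per_move,
                                moves_per_node, other_limit):
    """Toy estimate: moves run in equal-length waves of limited width.

    other_limit stands in for whatever else caps concurrency
    (disk, RAM pressure, swap-rebalance pairing). Hypothetical model.
    """
    # The configured value only helps until another limit takes over.
    effective_concurrency = min(moves_per_node, other_limit)
    waves = math.ceil(vbuckets_to_move / effective_concurrency)
    return waves * seconds_per_move


# Made-up workload: 512 vbuckets to move, 10 seconds each, and some
# other bottleneck that allows at most 12 concurrent moves.
for setting in (4, 12, 24, 32, 64):
    seconds = estimated_rebalance_seconds(512, 10, setting, other_limit=12)
    print(f"rebalanceMovesPerNode={setting}: ~{seconds} s")
```

In this model the estimate drops sharply between 4 and 12 and then stays flat, because every setting above the other limit yields the same effective concurrency. Real runs will be messier, which is why stats/logs from an actual rebalance are needed to say which limit you are hitting.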

If we step back a bit, can you share some insight and feedback on why you are trying to increase rebalance speed? In our minds, rebalance should be able to happen on a live system and continue with minimal or no intervention. Are you experiencing any issues with rebalance taking “too long”?