...

  1. Configurable size threshold to qualify the system as off-balanced
  2. Configurable data-distribution skew
  3. Minimize the impact on concurrent operations caused by continuous rebalancing
    1. Configurable schedule
    2. Ability to disable auto-balancing
    3. Ability to plug in a custom auto-rebalance (AR) manager
  4. Reuse existing manual-rebalancing flow for a consistent rebalancing experience

Alternatives

The user can schedule a cron job to invoke the gfsh rebalance command on a periodic basis.
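For example, a nightly crontab entry could invoke gfsh non-interactively via its `-e` option (the locator host and port below are placeholders, not values from this proposal):

```
# Rebalance every night at 3 am; "gfsh -e" runs commands non-interactively
0 3 * * * gfsh -e "connect --locator=locator-host[10334]" -e "rebalance"
```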

...

Background

...

  1. A member is unhealthy if its heap is critical

...

  1. Ideally, a user would want to redistribute load from an unhealthy member to other members only if those members have sufficient capacity (i.e. totalBytes + newBucketSize << localMaxMemory). Otherwise, redistribution of load may cause healthy members to become unhealthy, and in some cases this can cause the entire cluster to fail. Rebalancing can also increase IO activity significantly. So it may be safer to manually rebalance the cluster if any node is unhealthy.
  2. The current implementation of the rebalance operation can be used to estimate transfer size before actually executing the transfer. Transfer size is the total number of bytes that may be moved during a rebalance operation; it is mainly based on the total number of buckets below redundancy level and the load on individual nodes. Rebalancing is inefficient if executed when the transfer size is too small, while rebalancing when the transfer size is high may overload the system.
  3. The capacity of a cluster can be increased by adding new nodes. A user can specify the rebalance flag after the last node is added; this way frequent rebalancing can be avoided.
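The capacity condition in point 1 can be sketched as a simple predicate. This is an illustration only; the class, method names, and the headroom factor are assumptions, not actual GemFire settings:

```java
public class CapacityCheck {
    // Require the target member to keep ample headroom after receiving the
    // bucket, approximating "totalBytes + newBucketSize << localMaxMemory".
    // The factor is illustrative only, not a product setting.
    private static final double HEADROOM_FACTOR = 0.75;

    static boolean hasSufficientCapacity(long totalBytes, long newBucketSize,
                                         long localMaxMemory) {
        return totalBytes + newBucketSize < HEADROOM_FACTOR * localMaxMemory;
    }

    public static void main(String[] args) {
        // 600 MB used + 100 MB bucket against a 1 GB cap: 700 < 750 -> true
        System.out.println(hasSufficientCapacity(600_000_000L, 100_000_000L, 1_000_000_000L));
    }
}
```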

Based on the points discussed above, we plan to use the transfer-size metric as the primary decision factor for triggering rebalance. The presence of an empty node will be ignored, assuming the user may be adding more capacity. Similarly, critical nodes will be ignored, assuming such nodes need specific region-based rebalance actions.
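As a rough sketch, the primary decision rule described above might look like the following (illustrative names only, not the actual implementation):

```java
public class RebalanceTrigger {
    // Trigger when the estimated transfer size exceeds the configured
    // percentage of the total data size (illustrative, not product code).
    static boolean shouldRebalance(long transferSizeBytes, long totalDataBytes,
                                   int sizeThresholdPercent) {
        if (totalDataBytes == 0) {
            return false; // nothing to balance in an empty cluster
        }
        // Compare transfer size against thresholdPercent of total data,
        // using integer math to avoid rounding.
        return transferSizeBytes * 100 > (long) sizeThresholdPercent * totalDataBytes;
    }

    public static void main(String[] args) {
        // 15 GB movable out of 100 GB with a 10% threshold -> rebalance
        System.out.println(shouldRebalance(15L << 30, 100L << 30, 10));
    }
}
```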

How is load defined?

Load on a member is a function of

...

When is a cluster off-balance?

  1. [Auto-balance candidate] if the transfer size is more than X% of the total data size; a rebalance can result in a consistent data distribution and create comparable free space on all nodes
  2. [Auto-balance candidate] if some nodes in the cluster are heavily loaded, i.e. their percentage heap utilization is much higher than that of other members in the cluster
  3. [Auto-balance candidate] if the cluster is not running at configured redundancy levels
  4. [Prefer manual rebalance] if any unhealthy node exists in the cluster
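The conditions above can be summarized in a small decision sketch (all names are illustrative; note that an unhealthy node steers the outcome to manual rebalancing regardless of the other checks):

```java
public class ClusterBalanceState {
    enum Action { NONE, AUTO_BALANCE, MANUAL_REBALANCE }

    // Classify the cluster per the off-balance conditions listed above.
    static Action classify(boolean anyUnhealthyNode, boolean redundancySatisfied,
                           long transferSizeBytes, long totalDataBytes,
                           int thresholdPercent) {
        if (anyUnhealthyNode) {
            return Action.MANUAL_REBALANCE; // prefer manual when nodes are unhealthy
        }
        boolean transferAboveThreshold = totalDataBytes > 0
                && transferSizeBytes * 100 > (long) thresholdPercent * totalDataBytes;
        if (!redundancySatisfied || transferAboveThreshold) {
            return Action.AUTO_BALANCE;
        }
        return Action.NONE;
    }

    public static void main(String[] args) {
        // Healthy cluster, redundancy OK, 20% of data movable vs 10% threshold
        System.out.println(classify(false, true, 20, 100, 10));
    }
}
```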

Use Cases

...

  1. After node failure and recovery, the gfsh command "rebalance --simulate" reports a high transfer size. In this case the nodes may have comparable utilization, but a rebalance would result in a uniform region data distribution, so action would be taken.
  2. Over time, some buckets may grow much larger than other buckets in the region, or some regions may grow more than others. Rebalance would get triggered, resulting in a uniform distribution.

...

  1. Schedule - cron string: In order to minimize the impact on concurrent operations, we feel it’s important to provide the user with the ability to configure the frequency and timing of automatic rebalancing. Bucket movement adds load to the system, and in our performance tests we can see that the throughput of concurrent operations drops during bucket movement. A user is expected to configure off-peak hours for rebalancing, so a schedule based on a cron-like configuration is useful.
  2. Size-threshold-percent - int between 1 and 99: Rebalancing will be triggered if the transfer size is more than this threshold, expressed as a percentage of the total data size. The rebalance operation computes transfer size based on the relationship between regions, primary ownership, and redundancy.
  3. Minimum-size - bytes: Rebalancing could be harmful when the cache is initially being populated, because bucket sizes may vary wildly when there is very little data. Because of that, we will also provide a minimum data-size threshold before automatic rebalancing will kick in.

E.g.

<cache>
...
 <initializer>
  <!-- Optional auto-rebalance manager -->
  <class-name> com.gemstone.gemfire.cache.util.AutoBalancer </class-name>
 
  <!-- Optional. Default: Once a week on Saturday. E.g. check at 3 am every night -->
  <parameter name="schedule"> 0 0 3 * * ? </parameter>
 
  <!-- Optional. Default: 10%. E.g. Don’t rebalance until the transfer size is more than 10% of the total data size -->
  <parameter name="size-threshold-percent"> 10 </parameter>
 
  <!-- Optional. Default: 100MB. E.g. Don’t rebalance until the transfer size is at least 100MB -->
  <parameter name="minimum-size"> 100000000 </parameter>
 </initializer>
... 
</cache>

...