Using a read-write lock gives the required semantics: either N concurrent transactions or 1 exchange.
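The "N transactions or 1 exchange" semantics can be illustrated with a plain java.util.concurrent read-write lock. This is only a minimal sketch: TopologyGuard and its methods are hypothetical names invented for illustration, not Ignite classes, and the real locking code may be organized differently.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical guard (not an Ignite class): many transactions OR one exchange. */
final class TopologyGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Runs a transaction under the shared (read) lock; many may hold it at once. */
    void runTransaction(Runnable tx) {
        lock.readLock().lock();
        try {
            tx.run();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Runs an exchange under the exclusive (write) lock; it waits until no transaction holds the read lock. */
    void runExchange(Runnable exchange) {
        lock.writeLock().lock();
        try {
            exchange.run();
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Number of read locks currently held, exposed for illustration only. */
    int activeTransactions() {
        return lock.getReadLockCount();
    }

    /** True while an exchange holds the write lock. */
    boolean exchangeInProgress() {
        return lock.isWriteLocked();
    }
}
```

Any number of threads can be inside runTransaction at the same time, while runExchange blocks until all of them have finished and then excludes new ones until it completes.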

Late Affinity Assignment

Late Affinity Assignment is an optimization that delays the switch of primary partitions after a topology change.

This feature is an optimization (centralized affinity) controlled by a flag that is true by default since 2.0 and is the only possible option since 2.1.

The affinity function assigns the primary partition (to be migrated) to the new node. Instead of switching the primary immediately, a temporary backup is created on this new node.

When the temporary backup has loaded all actual data, it becomes the primary.
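The idea can be modeled as a small function over owner lists. This is a simplified sketch, not the actual Ignite code: LateAffinitySketch and temporaryAssignment are hypothetical names, nodes are represented by plain strings, and the current primary is assumed to still be alive in the new topology.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of late affinity assignment for a single partition. */
final class LateAffinitySketch {
    /**
     * @param current owners before the topology change, primary first
     * @param ideal   owners computed by the affinity function, primary first
     * @return owners to use until the late affinity switch, primary first
     */
    static List<String> temporaryAssignment(List<String> current, List<String> ideal) {
        // No previous primary, or the primary did not change: apply the ideal assignment as is.
        if (current.isEmpty() || current.get(0).equals(ideal.get(0)))
            return new ArrayList<>(ideal);

        List<String> result = new ArrayList<>();

        // The old primary keeps serving as primary for now.
        result.add(current.get(0));

        // The ideal owners, including the future primary, follow as backups.
        for (String node : ideal) {
            if (!result.contains(node))
                result.add(node);
        }

        return result;
    }
}
```

For example, with current owners [A, B] and ideal owners [C, A], the temporary assignment is [A, C]: A remains primary and C is loaded as a backup. After the switch, the ideal assignment [C, A] is applied.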

On each topology change, the partition-to-node mapping for each started cache is calculated using the {@link AffinityFunction} configured for that cache. When late affinity assignment mode is disabled, the new affinity mapping is applied immediately.

In late affinity assignment mode, if the primary node changes for some partition, the current primary is kept and the new primary is temporarily assigned as a backup. Later, once all new "ideal affinity" primaries are ready to become true primaries, the cluster performs the late affinity switch procedure: on a separate PME (triggered by CacheAffinityChangeMessage), primary assignments are recalculated to match the {@link AffinityFunction}. Three cases may require late affinity assignment:

  • Node join. When a node joins the cluster, none of its partitions can be primary. See {@link CacheAffinitySharedManager#initAffinityOnNodeJoin}.
  • BLT change. On the PME triggered by a baseline change command, no primary partition changes its location. See {@link CacheAffinitySharedManager#onBaselineTopologyChanged}.
  • Cache start / cluster activation in persistent mode. If a persisted partition is outdated, its state is reset to MOVING. The primary is always chosen from OWNING partitions. Unlike the node join case, if a node is partially outdated and some of its partitions are MOVING, its up-to-date partitions can still be primary. See {@link CacheAffinitySharedManager#initAffinityBasedOnPartitionsAvailability}.
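The rule from the last case, that the primary is chosen only among OWNING partitions, can be sketched as follows. This is an illustrative assumption rather than the actual implementation: PrimarySelectionSketch, PartitionState and choosePrimary are hypothetical names, and taking the first OWNING node in ideal order is just one plausible selection policy.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: pick a primary for one partition based on partition states. */
final class PrimarySelectionSketch {
    /** Simplified partition states; only the two mentioned in the text. */
    enum PartitionState { OWNING, MOVING }

    /**
     * @param idealOwners owners computed by the affinity function, ideal primary first
     * @param states      state of this partition on each node
     * @return the first owner in ideal order whose partition is OWNING, if any
     */
    static Optional<String> choosePrimary(List<String> idealOwners, Map<String, PartitionState> states) {
        for (String node : idealOwners) {
            // A MOVING (outdated) copy is never eligible to be primary.
            if (states.get(node) == PartitionState.OWNING)
                return Optional.of(node);
        }

        return Optional.empty();
    }
}
```

With ideal owners [N1, N2], where N1 holds the partition in MOVING state and N2 in OWNING state, the sketch selects N2 as primary until N1 catches up.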

The moment of the late affinity switch is controlled by the coordinator node. It stores information about all ongoing rebalancing processes in CacheAffinitySharedManager#waitInfo. When it is empty, the coordinator node sends an affinity change message (see {@link CacheAffinitySharedManager#checkRebalanceState}), which triggers the late affinity switch PME. If all partitions on the joined node are up-to-date, this PME is triggered right after the topology change PME. An additional synthetic exchange will be issued.
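The coordinator's bookkeeping can be modeled with a small tracker. This is a sketch under stated assumptions: RebalanceWaitSketch and its methods are hypothetical, the real waitInfo structure and checkRebalanceState logic are more involved, and sending the affinity change message is represented by a plain Runnable callback.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the coordinator waiting for rebalancing to finish. */
final class RebalanceWaitSketch {
    /** Partition id -> nodes that are still loading this partition. */
    private final Map<Integer, Set<String>> waiting = new HashMap<>();

    /** Stand-in for sending the affinity change message. */
    private final Runnable sendAffinityChange;

    /** Ensures the message is sent at most once per topology change. */
    private boolean sent;

    RebalanceWaitSketch(Runnable sendAffinityChange) {
        this.sendAffinityChange = sendAffinityChange;
    }

    /** Registers that the given node still has to rebalance the partition. */
    void expect(int partition, String node) {
        waiting.computeIfAbsent(partition, p -> new HashSet<>()).add(node);
    }

    /** Called when a node reports that the partition is fully loaded. */
    void onRebalanced(int partition, String node) {
        Set<String> nodes = waiting.get(partition);

        if (nodes != null) {
            nodes.remove(node);

            if (nodes.isEmpty())
                waiting.remove(partition);
        }

        checkState();
    }

    /** Sends the affinity change message once nothing is being waited for. */
    void checkState() {
        if (waiting.isEmpty() && !sent) {
            sent = true;
            sendAffinityChange.run();
        }
    }
}
```

If nothing was registered through expect, the first checkState call fires the callback immediately, which corresponds to the case where all partitions on the joined node are already up-to-date.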