Status

Current state: Under Discussion (vote pending)

Discussion thread: here

Vote thread: here

JIRA: here 

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

Unclean Recovery deterministically elects as leader the replica that has persisted the most data. At a high level, once an unclean recovery is triggered, the controller uses a new API, GetReplicaLogInfo, to query the log end offset and the leader epoch from each replica. The replica with the highest leader epoch and, among those, the largest log end offset becomes the new leader. To explain when and how Unclean Recovery is performed, let's first introduce some config changes.
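The selection rule above can be sketched as follows. This is only an illustration, not the controller's implementation: the `ReplicaLogInfo` class, its field names, and `pick_unclean_leader` are hypothetical, and breaking a full tie by the lowest broker id is an assumption made here to keep the result deterministic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ReplicaLogInfo:
    """Hypothetical view of one replica's GetReplicaLogInfo response."""
    broker_id: int
    leader_epoch: int
    log_end_offset: int


def pick_unclean_leader(replies: Iterable[ReplicaLogInfo]) -> Optional[int]:
    """Return the broker id with the highest (leader_epoch, log_end_offset)."""
    candidates = list(replies)
    if not candidates:
        return None  # no replica responded, so nothing can be elected
    # Compare leader epoch first, then log end offset.
    # Assumption: a tie on both fields goes to the lowest broker id.
    best = max(
        candidates,
        key=lambda r: (r.leader_epoch, r.log_end_offset, -r.broker_id),
    )
    return best.broker_id
```

Note that a replica with a higher leader epoch wins even if its log is shorter than that of a replica on an older epoch.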

The new unclean.recovery.strategy config has the following three options.

...

  1. If there are other ISR members, choose an ISR member.

  2. If there are unfenced ELR members, choose an ELR member.

  3. If there are fenced ELR members:

    1. If the unclean.recovery.strategy=Aggressive, then an unclean recovery will happen.

    2. Otherwise, we will wait for the fenced ELR members to be unfenced.

  4. If there are no ELR members:

    1. If the unclean.recovery.strategy=Aggressive, the controller will do the unclean recovery.

    2. If the unclean.recovery.strategy=Balanced, the controller will do the unclean recovery once all the LastKnownELR members are unfenced. See the following section for an explanation.
    3. Otherwise (unclean.recovery.strategy=None), the controller will not attempt to elect a leader and will wait for operator action.
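The decision list above can be read as a single function. The sketch below is a rough illustration under stated assumptions: `Strategy`, `next_action`, the set-based arguments, and the returned action names are all hypothetical, and treating an empty LastKnownELR as "all unfenced" in Balanced mode is an assumption, since the list does not cover that case.

```python
from enum import Enum
from typing import AbstractSet


class Strategy(Enum):
    AGGRESSIVE = "Aggressive"
    BALANCED = "Balanced"
    NONE = "None"


def next_action(
    strategy: Strategy,
    other_isr: AbstractSet[int],
    elr: AbstractSet[int],
    last_known_elr: AbstractSet[int],
    fenced: AbstractSet[int],
) -> str:
    """Return a hypothetical action name for a partition that lost its leader."""
    # 1. Another ISR member is available.
    if other_isr:
        return "elect_isr_member"
    # 2. At least one ELR member is unfenced.
    if elr - fenced:
        return "elect_elr_member"
    # 3. ELR members exist, but every one of them is fenced.
    if elr:
        if strategy is Strategy.AGGRESSIVE:
            return "unclean_recovery"
        return "wait_for_elr_unfenced"
    # 4. There are no ELR members at all.
    if strategy is Strategy.AGGRESSIVE:
        return "unclean_recovery"
    if strategy is Strategy.BALANCED:
        # Assumption: an empty LastKnownELR counts as "all unfenced".
        if last_known_elr.isdisjoint(fenced):
            return "unclean_recovery"
        return "wait_for_last_known_elr"
    return "wait_for_operator"
```

For example, with no ISR or ELR members and LastKnownELR = {4, 5}, Balanced mode waits while broker 5 is fenced and proceeds to unclean recovery once neither broker is fenced.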

...

  1. The kafka-leader-election.sh tool will be upgraded to allow manual leader election.

    1. It can directly select a leader.

    2. It can trigger an unclean recovery for the replica with the longest log in either Aggressive or Balanced mode.

  2. Configs to update
    1. unclean.recovery.strategy. Described in the section above. The default is Balanced.
    2. unclean.recovery.manager.enabled. Set to true to use the unclean recovery manager to perform an unclean recovery, false otherwise. The default is false.
    3. unclean.recovery.timeout.ms. The maximum time to wait for the replicas' responses during an Unclean Recovery. The default is 5 minutes.
  3. For compatibility, the original unclean.leader.election.enable values true/false will be mapped to unclean.recovery.strategy options.
    1. unclean.leader.election.enable=false -> unclean.recovery.strategy=Balanced
    2. unclean.leader.election.enable=true -> unclean.recovery.strategy=Aggressive
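The defaults and the compatibility mapping above could be resolved along these lines. This is a minimal sketch, not the broker's config code: `DEFAULTS` and `effective_strategy` are hypothetical names, and letting an explicitly set unclean.recovery.strategy take precedence over the legacy flag is an assumption, since the text does not state which one wins.

```python
from typing import Mapping

# Defaults as listed above; 5 minutes is expressed in milliseconds.
DEFAULTS = {
    "unclean.recovery.strategy": "Balanced",
    "unclean.recovery.manager.enabled": "false",
    "unclean.recovery.timeout.ms": str(5 * 60 * 1000),
}


def effective_strategy(props: Mapping[str, str]) -> str:
    """Resolve the recovery strategy from new and legacy config keys."""
    # Assumption: an explicit new-style setting overrides the legacy flag.
    if "unclean.recovery.strategy" in props:
        return props["unclean.recovery.strategy"]
    legacy = props.get("unclean.leader.election.enable")
    if legacy is None:
        return DEFAULTS["unclean.recovery.strategy"]
    # Legacy mapping: true -> Aggressive, false -> Balanced.
    return "Aggressive" if legacy.strip().lower() == "true" else "Balanced"
```

With neither key set, the sketch falls back to Balanced, which matches both the stated default and the mapping for unclean.leader.election.enable=false.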

Public Interfaces

We will deliver the KIP in phases, so each API change is marked as shipping with either ELR or Unclean Recovery.

...

  • For the existing unclean.leader.election.enable config:
    1. If true, unclean.recovery.strategy will be set to Aggressive.

    2. If false, unclean.recovery.strategy will be set to Balanced.

  • unclean.leader.election.enable will be marked as deprecated.

...