Status
Current state: Draft
Discussion thread: TBD
JIRA: KAFKA-7408
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
When electing a leader, if none of the in-sync replicas is alive, the controller elects a replica that was not part of the in-sync replica set. Such a leader is called an unclean leader. Because this replica was not part of the last known in-sync replica set, it may be missing committed records, so electing it can cause data loss in the partition.
Currently, brokers have no way to tell whether a given leader was an unclean leader at the time of its election. Brokers could use this information to invoke the appropriate data loss handling routine. One such scenario is handling data loss in a partition that is part of a transaction. Another is a partition that holds some kind of metadata, where any data loss requires automated or manual intervention.
Changed Interfaces
This KIP will send the IsUncleanLeader boolean in the LeaderAndIsrRequest under the LeaderAndIsrPartitionState commonStructs array, like so:
...
Brokers will use the AlterISR RPC to set IsUncleanLeader to false.
Proposed Changes
This KIP proposes to append a boolean state to the LeaderAndIsr state maintained in ZooKeeper. The new boolean state will be called IsUncleanLeader. When set to true, it signifies that the current leader was elected as an unclean leader; otherwise it is false. The flag can be maintained according to the following rules:
...
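As a rough illustration of the two transitions this KIP states explicitly (the controller marks a leader as unclean when it is elected from outside the in-sync replica set, and the flag is later reset to false), here is a minimal Java sketch. It is not Kafka's controller code: the class, method, and parameter names are hypothetical, the clean-election path leaves the ISR unchanged for brevity, and the step that shrinks the ISR to the new leader on an unclean election is a simplifying assumption of this sketch.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; names and structure are assumptions, not Kafka internals.
final class UncleanLeaderSketch {

    /** Simplified stand-in for the LeaderAndIsr state, extended with the proposed flag. */
    static final class LeaderAndIsr {
        final int leader;
        final List<Integer> isr;
        final boolean isUncleanLeader;

        LeaderAndIsr(int leader, List<Integer> isr, boolean isUncleanLeader) {
            this.leader = leader;
            this.isr = isr;
            this.isUncleanLeader = isUncleanLeader;
        }
    }

    /**
     * Prefers a live in-sync replica. If none is alive and unclean election is
     * allowed, falls back to any live assigned replica and sets the flag to true.
     */
    static Optional<LeaderAndIsr> elect(List<Integer> assignedReplicas,
                                        List<Integer> isr,
                                        Set<Integer> liveBrokers,
                                        boolean uncleanElectionAllowed) {
        for (int replica : isr) {
            if (liveBrokers.contains(replica)) {
                // Clean election: the new leader comes from the ISR.
                return Optional.of(new LeaderAndIsr(replica, isr, false));
            }
        }
        if (uncleanElectionAllowed) {
            for (int replica : assignedReplicas) {
                if (liveBrokers.contains(replica)) {
                    // Unclean election: the leader was not in the last known ISR.
                    // Assumption: the ISR is reset to contain only the new leader.
                    return Optional.of(new LeaderAndIsr(replica, List.of(replica), true));
                }
            }
        }
        return Optional.empty(); // No eligible replica; the partition stays offline.
    }

    /** Models the later reset of the flag (which brokers request through AlterISR). */
    static LeaderAndIsr clearUncleanFlag(LeaderAndIsr state) {
        return new LeaderAndIsr(state.leader, state.isr, false);
    }
}
```

For example, with assigned replicas 1, 2, 3, an ISR of 1 and 2, and only broker 3 alive, `elect` returns leader 3 with `isUncleanLeader` set to true; passing that state to `clearUncleanFlag` yields the same leader and ISR with the flag set to false.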
- What impact (if any) will there be on existing users?
- If we are changing behavior how will we phase out the older behavior?
- If we need special migration tools, describe them here.
- When will we remove the existing behavior?
Rejected Alternatives
If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.