...

In the case of unclean leader election, one roundtrip of OffsetForLeaderEpoch request/response is still the common case. We may see multiple roundtrips of OffsetForLeaderEpoch request/response in rare cases: two roundtrips are required starting at three consecutive fast leader failovers (see Scenario 3 later in this section). There is no additional or special handling of the unclean leader election config in the proposed approach. The only difference is that the leader and follower may look further back than the High Watermark in the leader epoch history to find the largest epoch both brokers know about, and then truncate based on the latest offset of that epoch. Theoretically, the solution lets both brokers compare the complete epoch lineage between them, if needed. In practice, the solution should converge within a few roundtrips, usually exactly one.

...

  1. In the current protocol, when the follower sends an OffsetForLeaderEpoch request for the partition to the leader, the request includes the latest Leader Epoch in the follower's Leader Epoch Sequence. This step and the steps before it are the same as described in KIP-101.
  2. The leader responds with the largest epoch less than or equal to the requested epoch (LeaderEpoch) and the end offset of this epoch (LastOffset).
  3. If the follower has the LeaderEpoch received from the leader in its Leader Epoch Sequence file, then we go to step 4. Otherwise,
    1. The follower truncates to the end offset of the largest epoch in its own Leader Epoch Sequence that is less than LeaderEpoch, and
    2. The follower sends an OffsetForLeaderEpoch request with that epoch (the largest epoch less than LeaderEpoch), and
    3. Steps 2 and 3 repeat until the follower receives an epoch it knows about (the epoch is in the follower's Leader Epoch Sequence file).
  4. By following steps 2 and 3, the follower truncates all offsets with epochs larger than the epoch received from the leader (LeaderEpoch). In this step, the follower truncates its log to the leader's LastOffset if the leader's LastOffset is smaller than the follower's Log End Offset.
  5. The follower starts fetching from the leader and the remainder of the protocol remains unchanged.
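
The truncation steps above can be sketched as a small simulation. This is only an illustrative model, not Kafka's implementation: the function names (`end_offset`, `leader_end_offset_for`, `follower_truncate`), the representation of a Leader Epoch Sequence as a sorted list of `(epoch, start_offset)` pairs, and the example histories are all assumptions made for this sketch. Cases where the two brokers share no epoch at all are not covered and simply raise an error.

```python
# Hypothetical model: a Leader Epoch Sequence is a list of
# (epoch, start_offset) pairs sorted by epoch. Not Kafka's real code.

def end_offset(epochs, log_end_offset, epoch):
    """End offset of `epoch`: the start offset of the next epoch in the
    sequence, or the log end offset if `epoch` is the latest one."""
    for i, (e, _) in enumerate(epochs):
        if e == epoch:
            return epochs[i + 1][1] if i + 1 < len(epochs) else log_end_offset
    raise KeyError(epoch)

def leader_end_offset_for(leader_epochs, leader_leo, requested_epoch):
    """Step 2: largest leader epoch <= requested epoch, and its end offset."""
    candidates = [e for e, _ in leader_epochs if e <= requested_epoch]
    if not candidates:
        raise ValueError("leader has no epoch <= requested (not modelled)")
    leader_epoch = candidates[-1]
    return leader_epoch, end_offset(leader_epochs, leader_leo, leader_epoch)

def follower_truncate(follower_epochs, follower_leo, leader_epochs, leader_leo):
    """Returns (follower epochs, follower log end offset, roundtrips)
    after the follower has finished truncating (steps 1 to 4)."""
    epochs, leo, roundtrips = list(follower_epochs), follower_leo, 0
    requested = epochs[-1][0]  # step 1: latest epoch the follower knows
    while True:
        roundtrips += 1
        leader_epoch, last_offset = leader_end_offset_for(
            leader_epochs, leader_leo, requested)
        if any(e == leader_epoch for e, _ in epochs):
            break  # step 3: the follower knows this epoch, go to step 4
        older = [(e, s) for e, s in epochs if e < leader_epoch]
        if not older:
            raise ValueError("no common epoch (not modelled)")
        # Step 3.1: truncate to the end offset of the largest follower
        # epoch below LeaderEpoch; step 3.2: ask about that epoch next.
        requested = older[-1][0]
        leo = end_offset(epochs, leo, requested)
        epochs = older
    # Step 4: drop epochs above LeaderEpoch, then cap at leader's LastOffset.
    leo = min(end_offset(epochs, leo, leader_epoch), last_offset)
    epochs = [(e, s) for e, s in epochs if e <= leader_epoch]
    return epochs, leo, roundtrips

# Common case: both brokers know the follower's latest epoch.
common = follower_truncate([(0, 0), (1, 10)], 25,
                           [(0, 0), (1, 10), (2, 20)], 30)

# Made-up diverged histories that only share epochs 0 and 1.
diverged = follower_truncate([(0, 0), (1, 10), (3, 20), (5, 30)], 40,
                             [(0, 0), (1, 10), (2, 15), (4, 25), (6, 35)], 50)
```

In the first call the leader answers with an epoch the follower already has, so a single roundtrip suffices and the follower truncates from offset 25 to the leader's LastOffset of 20. In the second, invented history the follower walks back through epochs 5, 3 and 1 before finding a shared epoch, which takes three roundtrips and leaves its log end offset at 15.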

...