r is unable to send FETCH response
Status
Current state: Under Discussion
...
We add a new quorum state, Prospective, for servers which are sending Pre-Vote requests, as well as new state transitions. The original and new states are listed below for comparison.

Original states

* Unattached|Resigned transitions to:
    * Unattached: After learning of a new election with a higher epoch
    * Voted: After granting a vote to a candidate
    * Candidate: After expiration of the election timeout
    * Follower: After discovering a leader with an equal or larger epoch
* Voted transitions to:
    * Unattached: After learning of a new election with a higher epoch
    * Candidate: After expiration of the election timeout
* Candidate transitions to:
    * Unattached: After learning of a new election with a higher epoch
    * Candidate: After expiration of the election timeout
    * Leader: After receiving a majority of votes
* Leader transitions to:
    * Unattached: After learning of a new election with a higher epoch
    * Resigned: When shutting down gracefully
* Follower transitions to:
    * Unattached: After learning of a new election with a higher epoch
    * Candidate: After expiration of the fetch timeout
    * Follower: After discovering a leader with a larger epoch

New states

* Unattached|Resigned transitions to:
    * Unattached: After learning of a candidate with a higher epoch (clarifying language)
    * Voted: After granting a standard vote to a candidate (clarifying language)
    * Prospective: After expiration of the election timeout
    * Follower: After discovering a leader with an equal or larger epoch
* Voted transitions to:
    * Unattached: After learning of a candidate with a higher epoch
    * Prospective: After expiration of the election timeout
    * Follower: After discovering a leader with an equal or larger epoch (missed in original docs)
* Prospective transitions to:
    * Unattached: After learning of a candidate with a higher epoch
    * Prospective: After expiration of the election timeout
    * Candidate: After receiving a majority of pre-votes
    * Follower: After discovering a leader with an equal or larger epoch
* Candidate transitions to:
    * Unattached: After learning of a candidate with a higher epoch
    * Prospective: After expiration of the election timeout
    * Leader: After receiving a majority of standard votes
    * Follower: After discovering a leader with an equal or larger epoch (missed in original docs)
* Leader transitions to:
    * Unattached: After learning of a candidate with a higher epoch
    * Resigned: When shutting down gracefully
* Follower transitions to:
    * Unattached: After learning of a candidate with a higher epoch
    * Prospective: After expiration of the fetch timeout
    * Follower: After discovering a leader with a larger epoch
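The new transitions listed above can be read as a lookup table. The following is a minimal sketch of that reading, not the KRaft implementation: the `State` and `Event` names and the `next_state` helper are hypothetical, epoch comparisons are folded into the event names, and returning `None` for an unlisted pair is an assumption made for illustration.

```python
from enum import Enum, auto


class State(Enum):
    UNATTACHED = auto()
    RESIGNED = auto()
    VOTED = auto()
    PROSPECTIVE = auto()
    CANDIDATE = auto()
    LEADER = auto()
    FOLLOWER = auto()


class Event(Enum):
    # Hypothetical event names; each stands for one trigger in the list above.
    HIGHER_EPOCH_CANDIDATE = auto()   # learned of a candidate with a higher epoch
    GRANTED_STANDARD_VOTE = auto()    # granted a standard vote to a candidate
    ELECTION_TIMEOUT = auto()
    FETCH_TIMEOUT = auto()
    LEADER_DISCOVERED = auto()        # leader whose epoch satisfies the listed condition
    MAJORITY_PRE_VOTES = auto()
    MAJORITY_STANDARD_VOTES = auto()
    GRACEFUL_SHUTDOWN = auto()


S, E = State, Event

# Unattached and Resigned share one set of transitions in the list above.
_UNATTACHED_OR_RESIGNED = {
    E.HIGHER_EPOCH_CANDIDATE: S.UNATTACHED,
    E.GRANTED_STANDARD_VOTE: S.VOTED,
    E.ELECTION_TIMEOUT: S.PROSPECTIVE,
    E.LEADER_DISCOVERED: S.FOLLOWER,
}

TRANSITIONS = {
    S.UNATTACHED: _UNATTACHED_OR_RESIGNED,
    S.RESIGNED: _UNATTACHED_OR_RESIGNED,
    S.VOTED: {
        E.HIGHER_EPOCH_CANDIDATE: S.UNATTACHED,
        E.ELECTION_TIMEOUT: S.PROSPECTIVE,
        E.LEADER_DISCOVERED: S.FOLLOWER,
    },
    S.PROSPECTIVE: {
        E.HIGHER_EPOCH_CANDIDATE: S.UNATTACHED,
        E.ELECTION_TIMEOUT: S.PROSPECTIVE,
        E.MAJORITY_PRE_VOTES: S.CANDIDATE,
        E.LEADER_DISCOVERED: S.FOLLOWER,
    },
    S.CANDIDATE: {
        E.HIGHER_EPOCH_CANDIDATE: S.UNATTACHED,
        E.ELECTION_TIMEOUT: S.PROSPECTIVE,
        E.MAJORITY_STANDARD_VOTES: S.LEADER,
        E.LEADER_DISCOVERED: S.FOLLOWER,
    },
    S.LEADER: {
        E.HIGHER_EPOCH_CANDIDATE: S.UNATTACHED,
        E.GRACEFUL_SHUTDOWN: S.RESIGNED,
    },
    S.FOLLOWER: {
        E.HIGHER_EPOCH_CANDIDATE: S.UNATTACHED,
        E.FETCH_TIMEOUT: S.PROSPECTIVE,
        E.LEADER_DISCOVERED: S.FOLLOWER,  # only for a leader with a larger epoch
    },
}


def next_state(state, event):
    """Return the listed target state, or None if the list defines no transition."""
    return TRANSITIONS[state].get(event)
```

For example, `next_state(State.FOLLOWER, Event.FETCH_TIMEOUT)` yields `State.PROSPECTIVE`, where the original design would have moved the follower directly to Candidate.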
...
We prevent servers from increasing their epoch prior to establishing they can win an election.
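As a rough illustration of this point, the sketch below keeps the epoch fixed while a server is Prospective and only increments it once a majority of pre-votes has been granted. The `Replica` class, its method names, and the convention that the granted count includes the server's own pre-vote are assumptions made for the example, not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    epoch: int
    state: str = "follower"

    def on_timeout(self) -> None:
        # Becoming prospective does not touch the epoch.
        self.state = "prospective"

    def on_pre_vote_result(self, granted: int, voters: int) -> None:
        # `granted` is assumed to include this replica's own pre-vote.
        if self.state == "prospective" and granted > voters // 2:
            # Only now, with a majority of pre-votes, is the epoch increased.
            self.state = "candidate"
            self.epoch += 1
```

A server that cannot reach a majority therefore stays at its current epoch and does not disrupt the rest of the quorum when it reconnects.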
Can Pre-Vote prevent a quorum from electing a leader?
Yes. If a leader is unable to send FETCH responses to [majority - 1] of servers, no new metadata can be committed and we will need a new leader to make progress. We may need the minority of servers which are able to communicate with the leader to grant their vote to prospectives which can communicate with a majority of the cluster. Without Pre-Vote, the epoch bump would have forced servers to participate in the election. With Pre-Vote, the minority of servers which are connected to the leader will not grant Pre-Vote requests. This is why an additional "Check Quorum" safeguard is needed, which is what KAFKA-15489 implements. Check Quorum ensures a leader steps down if it is unable to send FETCH responses to a majority of servers. This will free up all servers to grant their votes to eligible prospectives.
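The Check Quorum condition can be sketched as a simple count. This is an illustration under stated assumptions, not the KAFKA-15489 implementation: the function name, the `timeout_ms` parameter, and the map of last successful FETCH times are hypothetical. The leader counts itself, so it needs recent contact with [majority - 1] other voters.

```python
def should_step_down(last_fetch_ms, now_ms, timeout_ms, voter_count):
    """Return True if the leader has lost contact with a majority of voters.

    last_fetch_ms maps each other voter's id to the time (ms) of the last
    FETCH response the leader sent it. The leader itself is not in the map.
    """
    majority = voter_count // 2 + 1
    # Voters that were served a FETCH response within the timeout window.
    recent = sum(1 for t in last_fetch_ms.values() if now_ms - t <= timeout_ms)
    # The leader counts toward the majority, so majority - 1 others suffice.
    return recent < majority - 1
```

In a three-voter cluster, a leader that has recently served one follower keeps leadership, while a leader that has served none steps down, which lets its connected followers grant Pre-Vote requests.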
...