Table of Contents |
---|
Master KIP
KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum (Accepted)
Status
Current state: DraftAdopted
Discussion thread: here
JIRA:
Jira | ||||||
---|---|---|---|---|---|---|
|
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Leader and ISR information is stored in the `/brokers/topics/[topic]/partitions/[partitionId]/state` znode. It can be modified by both the controller and the current leader in the following circumstances:
...
Below we discuss the behavior of the new API in more detail.
Public Interfaces
We will introduce a new AlterIsr API which requires CLUSTER_ACTION permission (similar to other InterBroker APIs such as LeaderAndIsr). The request and response schema definitions are provided below:
Code Block |
---|
AlterIsrRequest => BrokerId BrokerEpoch [Topic [PartitionId LeaderAndIsr]] BrokerId => Int32 BrokerEpoch => Int64 Topic => String PartitionId => Int32 LeaderAndIsr => Leader LeaderEpoch Isr CurrentVersionCurrentZkVersion Leader => Int32 LeaderEpoch => Int32 Isr => [Int32] CurrentVersionCurrentZkVersion => Int32 AlterIsrResponse => ErrorCode [Topic [PartitionId Version ErrorCode]] ErrorCode => Int16 Topic => String PartitionId => Int32 Version => Int32 |
Possible top-level errors:
...
- FENCED_LEADER_EPOCH: There is a new leader for the topic partition. The leader should not retry.
- INVALID_REQUEST: The update was rejected due to some internal inconsistency (e.g. invalid replicas specified in the ISR)
- INVALID_ISR_VERSION (NEW): The (Zk)version in the request is out of date. Likely there is a pending LeaderAndIsr request which the leader has not yet received.
- UNKNOWN_TOPIC_OR_PARTITION: The topic no longer exists.
Proposed Changes
The main change in this KIP is to send the AlterIsr request to the controller when shrinking or expanding the ISR on the leader. The controller will update the state asynchronously and send a LeaderAndIsr request once the change has been completed.
...
It can happen that a leader and ISR update fails after the AlterIsr was acknowledged. For example, if the controller fails before applying the changes, a new controller will be elected. It could also happen that there was already a pending change at the time the leader wanted to update ISR. In either case, the leader will receive a LeaderAndIsr update with the new version. The leader will check the status of current replicas at that time to see whether any additional changes are needed.
Compatibility, Deprecation, and Migration Plan
Use of the new AlterIsr API will be gated by the `inter.broker.protocol.version` config. Once the IBP has been bumped to the latest version, the controller will no longer register a watch for the leader and ISR notification znode.
Rejected Alternatives
None yet.