...

Each replica fetcher thread handles multiple partitions. If one partition fails, the replica fetcher thread associated with that partition terminates. The remaining partitions on that thread, which have caught up and are running well, are then left untracked, which leads to under-replicated partitions. A better approach would be that, whenever a partition fails, the thread stops tracking only the failed partition and continues handling the rest of its partitions.

...

If all partitions of a fetcher thread are marked as failed, the thread would be shut down. If a replica is deleted on a broker through a StopReplicaRequest while its partition is present in the failedPartitions set, the partition would be removed from the set.

A failed partition would remain in the failedPartitions set until the next leader epoch. At the next leader epoch, the failed partition would be marked as un-failed by removing it from the failedPartitions set. Thereafter, the controller can choose the partition as leader or follower, and the partition would follow the usual behavior.
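The life cycle described above can be sketched as a small piece of bookkeeping. This is only an illustrative sketch, not Kafka's actual fetcher code: the class `FetcherThreadSketch`, its method names, and the use of plain strings as partition identifiers are all assumptions made for the example. Only the `failedPartitions` set and the three triggers (partition failure, StopReplicaRequest, new leader epoch) come from the text.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the proposed behavior; names are assumptions.
class FetcherThreadSketch {
    private final Set<String> assigned = new HashSet<>();
    private final Set<String> failedPartitions = new HashSet<>();
    private boolean running = true;

    void addPartition(String partition) {
        assigned.add(partition);
    }

    // A fetch error on one partition isolates that partition
    // instead of terminating the whole thread.
    void markFailed(String partition) {
        if (!assigned.contains(partition)) {
            return;
        }
        failedPartitions.add(partition);
        // Shut down only when every assigned partition has failed.
        if (failedPartitions.containsAll(assigned)) {
            running = false;
        }
    }

    // Replica deleted through a StopReplicaRequest:
    // forget the partition, including any failed marker.
    void stopReplica(String partition) {
        assigned.remove(partition);
        failedPartitions.remove(partition);
    }

    // Next leader epoch: the partition is marked as un-failed.
    // Restarting a stopped thread is out of scope for this sketch.
    void onNewLeaderEpoch(String partition) {
        failedPartitions.remove(partition);
    }

    // Partitions the thread keeps fetching: assigned minus failed.
    Set<String> fetchablePartitions() {
        Set<String> result = new HashSet<>(assigned);
        result.removeAll(failedPartitions);
        return result;
    }

    boolean isFailed(String partition) {
        return failedPartitions.contains(partition);
    }

    boolean isRunning() {
        return running;
    }
}
```

With two partitions assigned, calling `markFailed` on one leaves the thread running and the other partition fetchable. The thread stops only once the second partition is also marked as failed.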

...