...

Each replica fetcher thread handles multiple partitions. If one partition fails, the replica fetcher thread associated with that partition terminates. The thread's other partitions, including those that have caught up and are running well, are then no longer tracked, which leads to under-replicated partitions. A better approach is for the thread to stop tracking only the failed partition and to continue handling the rest of its partitions.
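The proposed behavior can be illustrated with a minimal sketch. This is not Kafka's actual ReplicaFetcherThread code: the class FetcherSketch, its methods, and the use of plain strings as partition identifiers are all hypothetical, chosen only to show a failure being isolated to one partition while the loop carries on with the others.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical stand-in for a replica fetcher thread (not Kafka's real class).
class FetcherSketch {
    // Partitions this fetcher is still responsible for.
    private final Set<String> tracked = new LinkedHashSet<>();
    // Partitions that failed and were dropped from tracking.
    private final Set<String> failed = new LinkedHashSet<>();

    FetcherSketch(List<String> partitions) {
        tracked.addAll(partitions);
    }

    // One fetch pass over all tracked partitions. The processPartition
    // callback is an assumed placeholder for the real per-partition work.
    void fetchOnce(Consumer<String> processPartition) {
        // Iterate over a copy so that failed partitions can be removed safely.
        for (String partition : new ArrayList<>(tracked)) {
            try {
                processPartition.accept(partition);
            } catch (RuntimeException e) {
                // Isolate the failure: stop tracking only this partition
                // instead of letting the exception end the whole pass.
                tracked.remove(partition);
                failed.add(partition);
            }
        }
    }

    Set<String> trackedPartitions() {
        return tracked;
    }

    // Value that a failed-partitions metric could report.
    int failedPartitionsCount() {
        return failed.size();
    }
}
```

With three partitions and a callback that throws for one of them, a single call to fetchOnce leaves two partitions tracked and reports a failed count of one. How a failed partition would later be recovered is outside the scope of this sketch.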

Public Interfaces

New metrics:

  • FailedPartitionsCount - Count of partitions that have failed.

  • failed-log-dirs - Count of failed log directories.
  • TotalReplicaFetcherThreads - Total number of replica fetcher threads (may be added if it proves useful).

...