Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Log directory failure notifications are queued and batched together in the next broker heartbeat request. If there are any queued partition-to-directory assignments — sent in AssignReplicasToDirs —  to send to the controller, those that are respective to any of the newly failed log directories (i.e. assignments that are either into or out-of these directories) are prioritized and sent first. The broker retries these until it receives a successful reply, which conveys that the metadata change has been successfully persisted. This ensures that the Controller is in sync with regards to partition-to-directory assignments and can reliably determine which partitions need leadership and ISR update.

If the Broker repeatedly cannot communicate fails to communicate a log directory failure, or a replica assignment into a failed directory, after a configurable amount of time — log.dir.failure.timeout.ms — and it is the leader for any replicas in the failed log directory the broker will shutdown, as that is the only other way to guarantee that the controller will elect a new leader for those partitions.

...