...

Here are a few clarifications to make our solution easier to understand:
- Broker assumes a log directory to be good after it starts, and marks the log directory as bad once there is an IOException when the broker attempts to access (i.e. read or write) the log directory.
- Broker will be offline if all log directories are bad.
- Broker will stop serving replicas in any bad log directory. New replicas will only be created on a good log directory.
- If LeaderAndIsrResponse shows an error for a given replica, the controller will consider that replica to be offline, broadcast UpdateMetadataRequest, and do leader election if the replica is a leader.
- Broker will remove offline replicas from its replica fetcher threads.
- Even if isNewReplica=false and the replica is not found on any log directory, broker will still create the replica on a good log directory if there is no bad log directory.
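
The broker-side rules above can be sketched as a small state tracker. This is only an illustrative sketch, not the actual broker code: the class `LogDirTracker`, its method names, and the "pick the first good directory" placement policy are hypothetical. It also assumes that when isNewReplica=false, the replica is not found, and at least one log directory is bad, the broker reports an error instead of creating the replica (the replica may live in the unreadable directory).

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of the per-broker log directory rules; not Kafka's real classes.
final class LogDirTracker {
    private final Set<String> goodDirs = new LinkedHashSet<>();
    private final Set<String> badDirs = new LinkedHashSet<>();
    // Replica name (e.g. "topic-0") -> log directory that holds it.
    private final Map<String, String> replicaToDir = new LinkedHashMap<>();

    // Every configured log directory is assumed good when the broker starts.
    LogDirTracker(List<String> logDirs) {
        goodDirs.addAll(logDirs);
    }

    // An IOException while reading or writing a directory marks it bad.
    void onIoError(String dir, IOException cause) {
        if (goodDirs.remove(dir)) {
            badDirs.add(dir);
        }
    }

    // The broker goes offline once all log directories are bad.
    boolean isBrokerOnline() {
        return !goodDirs.isEmpty();
    }

    // Replicas in a bad log directory are no longer served.
    boolean isServing(String replica) {
        String dir = replicaToDir.get(replica);
        return dir != null && goodDirs.contains(dir);
    }

    // Returns the directory serving the replica, or empty if the broker
    // would report an error for it in its LeaderAndIsrResponse.
    Optional<String> handleLeaderAndIsr(String replica, boolean isNewReplica) {
        String dir = replicaToDir.get(replica);
        if (dir != null) {
            return goodDirs.contains(dir) ? Optional.of(dir) : Optional.empty();
        }
        if (goodDirs.isEmpty()) {
            return Optional.empty();
        }
        // Replica not found: create it if it is new, or if no directory is
        // bad (so it cannot be hiding in an unreadable directory).
        if (isNewReplica || badDirs.isEmpty()) {
            String target = goodDirs.iterator().next(); // assumed placement policy
            replicaToDir.put(replica, target);
            return Optional.of(target);
        }
        // Assumption: with isNewReplica=false and a bad directory present,
        // the replica is treated as offline rather than recreated.
        return Optional.empty();
    }

    public static void main(String[] args) {
        LogDirTracker tracker = new LogDirTracker(Arrays.asList("/data1", "/data2"));
        System.out.println(tracker.handleLeaderAndIsr("topic-0", false));
        tracker.onIoError("/data1", new IOException("simulated disk failure"));
        System.out.println(tracker.isServing("topic-0"));
    }
}
```

An empty result stands in for the per-replica error in LeaderAndIsrResponse, which is what would prompt the controller to mark the replica offline, broadcast UpdateMetadataRequest, and elect a new leader if needed.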

In the following we describe how our solution works under eight different scenarios. Some existing steps (e.g. kafka-topics.sh creates znode) are omitted for simplicity.

...