Versions Compared

Comment: Always assign dir & remove offlinedirs flag

...

RegisterBrokerRecord and BrokerRegistrationChangeRecord will both have a new field:

{ "name": "LogDirs", "type":  "[]uuid", "versions":  "3+", "taggedVersions": "3+", "tag": "0",
"about": "Log directories configured in this broker which are available." }

PartitionRecord and PartitionChangeRecord will both have a new Assignment field which replaces the current Replicas field:

{ "name": "Replicas", "type":  "[]int32", "versions":  "0", "entityType": "brokerId",
"about": "The replicas of this partition, sorted by preferred order." },
(...)
{ "name": "Assignment", "type": "[]ReplicaAssignment", "versions": "1+",
"about": "The replicas of this partition, sorted by preferred order.", "fields": [
{ "name": "Broker", "type": "int32", "versions": "1+", "entityType": "brokerId",
"about": "The broker ID hosting the replica." },
{ "name": "Directory", "type": "uuid", "versions": "1+",
"about": "The log directory hosting the replica." }
]}

Although not explicitly specified in the schema, the default value for Directory is Uuid.UnknownDir (Uuid.ZERO), as that is the default value for UUID types.

...

BrokerRegistrationRequest will include the following new field:

{ "name": "LogDirs", "type":  "[]uuid", "versions":  "2+",
"about": "Log directories configured in this broker which are available." }

BrokerHeartbeatRequest will include the following new field:

{ "name": "OfflineLogDirs", "type":  "[]uuid", "versions": "1+", "taggedVersions": "1+", "tag": "0",
"about": "Log directories that failed and went offline." }

...

The set of all loaded log directory UUIDs is sent along in the broker registration request to the controller as the LogDirs field.

Metadata caching

Currently, replicas are considered offline if the hosting broker is offline. Additionally, replicas will also be considered offline if the replica references a log directory UUID (in the new field partitionRecord.Assignment.Directory) that is not present in the hosting broker's latest registration under LogDirs and either:

  • the log directory UUID is UUID.OfflineDir
  • the hosting broker's registration indicates multiple online log directories, i.e. brokerRegistration.LogDirs.length > 1

If neither of the above conditions is true, we assume that the broker is not configured with multiple log directories, that all replicas live in the same directory, and that neither log directory assignments nor log directory failures shall be communicated to the Controller.
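
The offline check described above can be sketched as follows. This is a minimal illustration rather than the proposal's implementation: the function name, the parameter names and the placeholder value chosen for UUID.OfflineDir are assumptions, and the sketch only covers the checks on the directory UUID and on the number of registered log directories.

```python
import uuid

# Uuid.UnknownDir is Uuid.ZERO according to the text above.
UNKNOWN_DIR = uuid.UUID(int=0)
# Placeholder value for UUID.OfflineDir; the real value is not given here (assumption).
OFFLINE_DIR = uuid.UUID(int=1)


def is_replica_offline(broker_online, registered_dirs, replica_dir):
    """Return True if a replica should be treated as offline.

    broker_online   -- whether the hosting broker is online
    registered_dirs -- log directory UUIDs from the broker's latest registration
    replica_dir     -- the replica's Directory value from the partition record
    """
    if not broker_online:
        return True
    if replica_dir in registered_dirs:
        return False
    # The directory is not registered: the replica is offline if it was
    # explicitly marked offline or if the broker runs multiple directories.
    return replica_dir == OFFLINE_DIR or len(registered_dirs) > 1
```

With a single registered directory, a replica whose directory is still Uuid.UnknownDir is treated as online, which matches the assumption that all replicas live in that one directory.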

...

  • Persist a BrokerRegistrationChange record, with the new list of online log directories and update the offline log directories flag.
  • Update the Leader and ISR for all the replicas assigned to the failed log directories, persisting PartitionChangeRecords, in a similar way to how leadership and ISR are updated when a broker becomes fenced, unregistered or shuts down.
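
As a rough illustration of the second step, the sketch below computes which partitions would need a change when some of a broker's log directories fail. It is a simplified model under stated assumptions: the Partition class, the function name, the NO_LEADER sentinel and the election policy (first remaining ISR member in preferred order) are hypothetical, and the controller's actual leader election rules are not modelled.

```python
import uuid
from dataclasses import dataclass

NO_LEADER = -1  # assumption: sentinel meaning "no leader elected"


@dataclass
class Partition:
    assignment: list  # [(broker_id, directory_uuid), ...] in preferred order
    isr: list         # broker IDs currently in sync
    leader: int


def partition_changes_for_failed_dirs(partitions, broker_id, failed_dirs):
    """Return {partition_name: (new_isr, new_leader)} for affected partitions."""
    changes = {}
    for name, p in partitions.items():
        on_failed_dir = any(
            b == broker_id and d in failed_dirs for b, d in p.assignment
        )
        if not on_failed_dir or broker_id not in p.isr:
            continue  # replica unaffected, or already out of the ISR
        new_isr = [b for b in p.isr if b != broker_id]
        if p.leader != broker_id:
            new_leader = p.leader
        else:
            # Simplified policy: first remaining ISR member in preferred order.
            preferred = [b for b, _ in p.assignment if b in new_isr]
            new_leader = preferred[0] if preferred else NO_LEADER
        changes[name] = (new_isr, new_leader)
    return changes
```

Each entry in the returned mapping corresponds to one PartitionChangeRecord that would be persisted in this simplified model.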

...

  • If there are no indicated online log directory UUIDs the request is invalid and the controller replies with an error 42 – INVALID_REQUEST.
  • If multiple log directories are registered, the broker will remain fenced until the controller learns of all the partition to log directory placements in that broker, i.e. until no replicas remain assigned to Uuid.UnknownDir. The broker will indicate these placements using the AssignReplicasToDirs RPC.

    • The broker remains fenced by not requesting to be unfenced in its heartbeat requests until the number of mismatching replica to log directory assignments is zero. This number is represented by the new metric QueuedReplicaToDirAssignments.
  • If multiple log directories are registered and some of them are new (not present in previous registration) then these log directories are assumed to be empty. If they are not, the broker will use the AssignReplicasToDirs RPC to correct assignment and choose not to become UNFENCED before the metadata is correct.
  • In the special case where the previous broker registration indicates a single online log directory and no offline log directories, the inbound broker registration request indicates more than one log directory, and one of the indicated log directories is the one previously registered, a logical update to all partitions in that broker takes place, assigning each replica's directory to the single directory previously registered. That is, it is assumed that all replicas are still in the same directory, and this transition to JBOD avoids creating partition change records. The same logic is applied in every node while consuming and caching metadata changes. Any metadata snapshot created after this change explicitly refers to the exact log directory UUID for each partition in that broker in each respective PartitionRecord.
  • If the registration request does not indicate any offline directories (i.e. OfflineLogDirs=false) and it does not include all directories previously registered (i.e. OnlineLogDirs in the previous registration contains UUIDs that are not present in the request's OnlineLogDirs) then the Controller assumes that those directories have been removed from configuration and that any hosted partitions in those directories will need to be re-created by the broker in the remaining configured log directories. So another logical update takes place here, applied to all partitions assigned to the removed directory UUIDs, assigning them instead to UUID.OfflineDir.

Brokers whose registration indicates that multiple log directories are configured remain FENCED until all log directory assignments for that broker are learnt by the active controller and persisted into metadata.
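
The registration rules above can be summarised in a small sketch. It is a minimal model under stated assumptions, not the controller's implementation: the function and parameter names are hypothetical, the value used for UUID.OfflineDir is a placeholder, the two boolean flags simply mirror the "offline log directories" wording of the bullets, and the rule about newly added directories being assumed empty is not modelled.

```python
import uuid

INVALID_REQUEST = 42            # error code named in the text
UNKNOWN_DIR = uuid.UUID(int=0)  # Uuid.UnknownDir (Uuid.ZERO)
OFFLINE_DIR = uuid.UUID(int=1)  # placeholder value for UUID.OfflineDir (assumption)


def process_registration(prev_dirs, prev_had_offline, req_dirs,
                         req_has_offline, replica_dirs):
    """Apply the registration rules to one broker.

    replica_dirs maps partition name -> directory UUID for replicas on
    this broker. Returns (error_code_or_None, updated_dirs, stays_fenced).
    """
    if not req_dirs:
        # No online log directories indicated: the request is invalid.
        return INVALID_REQUEST, dict(replica_dirs), True

    updated = dict(replica_dirs)
    prev, req = set(prev_dirs), set(req_dirs)

    # Single directory -> JBOD: replicas are assumed to still be in the
    # directory that was previously registered.
    if len(prev) == 1 and not prev_had_offline and len(req) > 1 and prev <= req:
        (only_dir,) = prev
        updated = {p: only_dir for p in updated}

    # Directories dropped from configuration: their replicas must be
    # re-created elsewhere, so they are reassigned to the offline marker.
    if not req_has_offline:
        removed = prev - req
        updated = {p: (OFFLINE_DIR if d in removed else d)
                   for p, d in updated.items()}

    # With multiple directories, stay fenced while any placement is unknown.
    stays_fenced = len(req) > 1 and any(
        d == UNKNOWN_DIR for d in updated.values()
    )
    return None, updated, stays_fenced
```

Returning the updated mapping instead of mutating the input keeps the sketch easy to test; how the controller actually applies and persists these logical updates is outside its scope.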

...