Comment: Clarifications around ApiVersions, remove final IBP

...

We will introduce a new feature flag named metadata.version which takes over (and expands on) the role of inter.broker.protocol.version. This new feature flag will track changes to the metadata record format and RPCs. Whenever a new record or RPC is introduced, or an incompatible change is made to an existing record or RPC, we will increase this version. The metadata.version is free to increase many times between Kafka releases. This is similar to the IV (inter-version) versions of the IBP.

...

When the quorum leader is starting up for the first time after this feature flag has been introduced, it will need a way to initialize the finalized version. After the leader finishes loading its state from disk, if it has not encountered a FeatureLevelRecord, it will read an initial value for this feature from its local meta.properties file and generate a FeatureLevelRecord. We will extend the format sub-command of kafka-storage.sh to allow operators to specify which version is initialized. If no value has been specified by the operator, the tool will select the latest known value for that version of the software.

One special case is when we are upgrading from an existing "preview" KRaft cluster. In this case, the meta.properties file will already exist and will not have an initial metadata.version specified. For this case, the controller will automatically initialize metadata.version as 1. By requiring that metadata.version 1 is backwards compatible with KRaft 3.0, we allow for a straightforward downgrade path back to an earlier (preview) KRaft software version.
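The initialization rule described above can be sketched as follows. This is a minimal illustration in Python, not Kafka's implementation: the function name, the tuple-based record representation, and the use of "metadata.version" as the meta.properties key are all assumptions made for the example.

```python
from typing import Iterable, Mapping, Optional, Tuple

# Hypothetical record representation: (record_type, feature_name, level).
Record = Tuple[str, str, int]

# Version 1 is required to stay backwards compatible with KRaft 3.0.
PREVIEW_COMPATIBLE_VERSION = 1


def initial_metadata_version(
    loaded_records: Iterable[Record],
    meta_properties: Mapping[str, str],
) -> Tuple[int, Optional[Record]]:
    """Return (version, record_to_append).

    record_to_append is None when the log already holds a FeatureLevelRecord.
    """
    level = None
    for record_type, feature, value in loaded_records:
        if record_type == "FeatureLevelRecord" and feature == "metadata.version":
            level = value  # a later record overrides an earlier one
    if level is not None:
        return level, None

    # No record in the log: fall back to meta.properties.
    # The key name "metadata.version" is an assumption for this sketch.
    raw = meta_properties.get("metadata.version")
    version = int(raw) if raw is not None else PREVIEW_COMPATIBLE_VERSION
    return version, ("FeatureLevelRecord", "metadata.version", version)
```

A preview cluster whose meta.properties lacks the value falls through to version 1, while a freshly formatted cluster uses whatever the format tool wrote.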

Compatibility

It is possible that brokers and controllers attempt to join the cluster or quorum, but cannot support the current metadata.version. For brokers, this is already handled by the controller during registration. If a broker attempts to register with the controller, but the controller determines that the broker cannot support the current set of finalized features (which includes metadata.version), it will reject the registration request and the broker will remain fenced. For controllers, it is more complicated since we need to allow the quorum to be established in order to allow records to be exchanged and learn about the new metadata.version. A controller running old software will join the quorum and begin replicating the metadata log. If this inactive controller encounters a FeatureLevelRecord for metadata.version that it cannot support, it should terminate.
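As a rough sketch of the registration check described above, the controller can compare each finalized feature level against the range the joining node advertises. The function names, the return strings, and the dictionary shapes here are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Feature name -> (min supported level, max supported level), inclusive.
SupportedFeatures = Dict[str, Tuple[int, int]]


def unsupported_features(
    finalized: Dict[str, int], supported: SupportedFeatures
) -> List[str]:
    """Return the sorted names of finalized features the node cannot support."""
    missing = []
    for name, level in finalized.items():
        bounds = supported.get(name)
        if bounds is None or not (bounds[0] <= level <= bounds[1]):
            missing.append(name)
    return sorted(missing)


def handle_broker_registration(
    finalized: Dict[str, int], supported: SupportedFeatures
) -> str:
    # A rejected broker is never unfenced, so it cannot serve traffic.
    return "REJECTED" if unsupported_features(finalized, supported) else "ACCEPTED"
```

An inactive controller replicating the log could apply the same range test to each FeatureLevelRecord it reads and terminate when the test fails.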

In the unlikely event that an active controller encounters an unsupported metadata.version, it should resign and terminate.

If a broker encounters an unsupported metadata.version, it should unregister itself and terminate.

Upgrades

KRaft upgrades are done in two steps with only a single rolling restart of the cluster required. After all the nodes of the cluster are running the new software version, they will continue using the previous version of RPCs and record formats. Only after increasing the metadata.version will these new RPCs and records be used. Since a software upgrade may span multiple metadata.version values, it should be possible to perform many online upgrades without restarting any nodes. This provides a mechanism for incrementally increasing metadata.version to try out new features introduced between the initial software version and the upgraded software version.

In order to ensure that a given metadata.version can be used by the quorum, the active controller will check that the given version is compatible with at least a majority of the quorum. Since it's possible for a minority of the quorum to be offline while committing a new metadata.version, we cannot require that all controller nodes support the version (otherwise we affect the availability of upgrades).
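The majority check that the active controller performs before committing a new metadata.version might look like the sketch below. The function name and the voter map are hypothetical. Treating an offline controller as not supporting the version is a conservative assumption of this example, not something the text specifies.

```python
from typing import Mapping, Optional, Tuple


def majority_supports(
    version: int,
    voters: Mapping[str, Optional[Tuple[int, int]]],
) -> bool:
    """Check whether a strict majority of quorum voters can support `version`.

    `voters` maps a controller id to its inclusive (min, max) supported range,
    or to None when the controller is offline or its range is unknown.
    """
    supporting = sum(
        1
        for bounds in voters.values()
        if bounds is not None and bounds[0] <= version <= bounds[1]
    )
    return supporting > len(voters) // 2
```

With three voters, two supporting controllers are enough, so an upgrade can proceed while one controller is down.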

One major difference with the static IBP-based upgrade is that the metadata.version may be changed arbitrarily at runtime. This means broker and controller components which depend on this version will need to dynamically adjust their state and behavior as the version changes.

...

Now that the RPCs in use by a broker or controller can change at runtime (due to changing metadata.version), we will need a way to inform a node's remote clients that new RPCs are available. One example of this would be holding back a new RPC that has corresponding new metadata records. We do not want brokers to start using this RPC until the controller can actually persist the new record type.

Brokers will observe changes to metadata.version as they replicate records from the metadata log. If a new metadata.version is seen, brokers will renegotiate compatible RPCs with other brokers through the ApiVersions workflow. This will allow for new RPCs to be put into effect without restarting the brokers. Note that not all broker-to-broker RPCs use ApiVersions negotiation. We will likely want to migrate all inter-broker clients to use ApiVersions, but we will evaluate on a case-by-case basis.
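To illustrate what renegotiation through ApiVersions amounts to, the sketch below intersects the version ranges two nodes advertise and picks the highest common version per RPC. The function and the RPC names ("ExistingRpc", "NewRpc") are hypothetical and do not correspond to real Kafka APIs.

```python
from typing import Dict, Tuple

# RPC name -> (min version, max version), inclusive.
ApiRanges = Dict[str, Tuple[int, int]]


def negotiate(local: ApiRanges, remote: ApiRanges) -> Dict[str, int]:
    """For each RPC both sides advertise, choose the highest common version."""
    usable = {}
    for api, (local_min, local_max) in local.items():
        if api not in remote:
            continue  # the peer does not expose this RPC (yet)
        remote_min, remote_max = remote[api]
        low, high = max(local_min, remote_min), min(local_max, remote_max)
        if low <= high:
            usable[api] = high
    return usable


local = {"ExistingRpc": (0, 3), "NewRpc": (0, 0)}
# Before the metadata.version bump the peer holds back NewRpc.
before = negotiate(local, {"ExistingRpc": (0, 2)})
# After the bump the peer advertises it, and renegotiation picks it up.
after = negotiate(local, {"ExistingRpc": (0, 3), "NewRpc": (0, 0)})
```

Rerunning the negotiation after a version change is what lets the new RPC take effect without a restart.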

Since clients have no visibility into changes in metadata.version, the only mechanism we have for updating the negotiated ApiVersions is connection establishment. By closing the broker side of the connection, clients would be forced to reconnect and receive an updated set of ApiVersions. We may want to investigate alternative approaches to this in a future KIP.

Downgrades

...

Compatibility, Deprecation, and Migration Plan

Starting with the release that includes this KIP, clusters running self-managed mode will ignore inter.broker.protocol.version. Instead, components will begin using metadata.version as the gatekeeper for new features, RPCs, and metadata records. The new version will be managed using Kafka's feature flag capabilities.

For clusters in ZooKeeper mode, there may be additional increases of inter.broker.protocol.version to introduce new RPCs. While ZooKeeper is still supported, we will need to take care that whenever an IBP (or IV) is added, a metadata.version is also added. Keeping a one-to-one mapping between IBP and metadata.version will help keep the implementation simple and provide a clear path for migrating ZooKeeper clusters to KRaft.

This design assumes a Kafka cluster in self-managed mode as a starting point. A future KIP will detail the procedure for migrating a ZooKeeper managed Kafka to a self-managed Kafka.

...

The main downside of this approach is that there is an undetermined amount of time when both versions of a record would be needed in the metadata log. This could lead to increased space requirements for an extended period of time. 

Final IBP

Initially, this design included details about a final IBP version that would be used by KRaft clusters to signal the transition from IBP to metadata.version. Instead, we decided to simply require that a metadata.version be set for KRaft clusters, starting with the release that includes this KIP. An additional constraint of making version 1 backwards compatible was added to deal with the KRaft preview upgrade case. Overall, this will simplify the code quite a bit.