Versions Compared

...

  1. group_coordinator: As described in the above section.

  2. transaction_coordinator: This feature flag could cover changes to the message format for the transaction state internal topic schema, and protocol changes related to the transaction coordinator. The advantages are similar to #1.

  3. consumer_offsets_topic_schema: Currently the message format used in the consumer internal topics is tied to IBP. It’s a candidate for a feature, because we will be able to dynamically finalize a new version of the topic schema, without having to roll the brokers twice.

  4. inter_broker_protocol: For transitional purposes, the inter.broker.protocol itself can be made a coarse-grained feature flag. This can be a way to operationally avoid the double roll during IBP bumps.

Compatibility, deprecation and migration plan

...

Using the versioning scheme to avoid double rolls

There is a configuration key in Kafka called inter.broker.protocol.version (IBP). Currently, the IBP can only be set by changing the static configuration file supplied to the broker during startup. Here are the various phases of work that would be required to use this KIP to avoid broker double rolls in the cluster (when IBP values are advanced).

Phase #1

Once the feature versioning system described in this KIP has been developed, we reach phase #1, where the new versioning system is ready to be used. Here we would like to migrate clusters from IBP-based validations to a setup based on the new versioning system. In order to achieve this, we shall once again use an IBP double roll. This means that once a cluster has fully migrated to a certain IBP version, we can almost entirely switch to using the new versioning scheme. Let’s say the value for such an IBP version is migration_ibp_version. Then, in order for the versioning system to safely provide a migration path, we do the following:

  • The feature versioning system will be released under a new IBP version: migration_ibp_version.

  • The versioning system itself is new, so for the initial roll out we should only operate it after the second IBP roll. This roll brings the IBP of the cluster to migration_ibp_version (that’s when the versioning system is fully deployed).

    • As a safeguard, each broker will validate that its IBP version is at least at migration_ibp_version before applying broker validations for feature versions, and before advertising its features in ZK.

    • As a safeguard, the controller will validate that its IBP version is at least at migration_ibp_version before allowing feature version changes to be finalized in a cluster (via the ApiKeys.UPDATE_FEATURES API).

  • All other decision-making logic based on feature versions in the broker code will always validate that the broker’s current IBP version is at least at migration_ibp_version.
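The safeguards in the list above can be illustrated with a minimal sketch. It is not the broker's actual implementation: the `Broker` class, the `parse_ibp` helper, and the concrete value of `MIGRATION_IBP_VERSION` are all hypothetical names chosen for illustration, and IBP values are assumed to be simple dotted version strings.

```python
from dataclasses import dataclass

# Hypothetical value: the IBP version under which the versioning system ships.
MIGRATION_IBP_VERSION = (2, 7)


def parse_ibp(value: str) -> tuple:
    """Turn a dotted IBP string such as '2.7' into a comparable tuple of ints."""
    return tuple(int(part) for part in value.split("."))


@dataclass
class Broker:
    static_ibp: str  # IBP value read from the static broker configuration

    def versioning_enabled(self) -> bool:
        # Safeguard: feature version logic applies only once the broker's
        # IBP is at least migration_ibp_version.
        return parse_ibp(self.static_ibp) >= MIGRATION_IBP_VERSION

    def should_advertise_features(self) -> bool:
        # A broker advertises its features only when the safeguard passes.
        return self.versioning_enabled()


def controller_allows_feature_update(controller: Broker) -> bool:
    # The controller applies the same check before finalizing
    # feature version changes.
    return controller.versioning_enabled()
```

Comparing parsed tuples rather than raw strings keeps the check correct for values such as "2.10", which would sort before "2.7" as plain text.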

Phase #2

There can be a transitional phase where a Kafka cluster uses both the IBP-based setup and the new versioning scheme. In order to completely deprecate the existing IBP-based setup, we would want to ensure we no longer have any references to IBP configuration in the broker code base. Once that is done, we can stop using IBP configuration altogether, and deprecate/remove the relevant support in the code.

Phase #3

This is the phase when we no longer use the IBP-based setup and have completely switched to using the new versioning scheme.

Phase #4 (long-term)

Completely deprecate IBP, and remove references to it from the code.

Migrating from versioning scheme back to IBP (i.e. emergency downgrade)

Phase #2

This is the phase when both the new feature versioning system and the existing IBP setup (in the form of static configuration) are active in the Kafka broker. Feature flags may be optionally defined in the code as part of Kafka development. When these flags get released, they will need to be finalized using the provided API/tooling, as required by this KIP. At this point, we still have not eliminated the requirement for a double roll during broker upgrades.


Phase #3

For the future, we would like to move away from using the existing IBP-based setup (in the form of a static configuration) in the broker code base. This requires several steps, as proposed below:

  1. We need a way to map the usage of IBP in the code (in the form of a static configuration) to the usage of IBP in the new feature versioning system. To achieve this, we introduce a feature flag that represents IBP, called ibp-feature. We will use the ibp-feature flag in the code wherever newer IBP values (from static configuration) need to be used:
    1. The max version values for this flag will start from 1 and continue increasing for future IBP version advancements.
    2. The min version value for this flag will start from 1, and it is unlikely to be modified (since we rarely, if ever, deprecate IBP versions).
  2. By this point, IBP-based decision making in the broker code will be such that:
    1. If the ibp-feature flag is finalized and the static IBP config value is >= migration_ibp_version, then the value of the ibp-feature flag is preferred for decision making over the IBP value from static configuration.
    2. Otherwise, if the ibp-feature flag is not finalized yet, we continue to use the latest IBP value based on static configuration for decision making.
  3. We then release the ibp-feature flag as part of the next major Kafka release. The release would eventually get deployed to Kafka clusters, and the ibp-feature flag is expected to be finalized in the cluster (via the provided tooling).
  4. Once #3 happens, all future Kafka code changes can continue to use the ibp-feature flag, thereby effectively stopping the use of IBP as a static configuration.
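The decision rule in step 2 can be sketched as follows. This is an illustrative assumption rather than the KIP's actual code: `effective_ibp`, the `IBP_FEATURE_LEVELS` mapping, and the concrete version values are hypothetical. The text does not say what happens when the flag is finalized but the static IBP is below migration_ibp_version; the sketch assumes the static value is used in that case.

```python
from typing import Optional

# Hypothetical value for migration_ibp_version.
MIGRATION_IBP_VERSION = (2, 7)

# Hypothetical mapping from ibp-feature version levels (starting at 1)
# to the IBP values they represent.
IBP_FEATURE_LEVELS = {1: (2, 7), 2: (2, 8)}


def effective_ibp(static_ibp: tuple, finalized_level: Optional[int]) -> tuple:
    """Pick the IBP value the broker code should base its decisions on.

    finalized_level is the finalized ibp-feature version level, or None
    when the flag has not been finalized in the cluster yet.
    """
    if finalized_level is not None and static_ibp >= MIGRATION_IBP_VERSION:
        # The finalized flag takes precedence over the static configuration.
        return IBP_FEATURE_LEVELS[finalized_level]
    # Flag not finalized (or, by assumption, static IBP too old):
    # fall back to the statically configured IBP.
    return static_ibp
```

For example, a broker whose static IBP is (2, 7) with the flag finalized at level 2 would behave as if its IBP were (2, 8), without a configuration change or an extra rolling restart.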

Phase #4

This is the phase when we no longer advance the IBP values in the old IBP-based setup (in the form of static broker configuration) and have completely switched to the ibp-feature flag in the Kafka code. The former will be kept around for legacy purposes only.

We do not foresee a need to migrate from the versioning scheme back to IBP once the migration is over and IBP has been deprecated. However, we may want to switch back to the IBP-based setup while we are in Phase #3 above (for example, due to an unforeseen issue). This will be supported, because we deprecate IBP only in Phase #4 above. To "downgrade", a user would first need one rolling restart of the cluster to reduce the IBP to the required version, and then another rolling restart to change the binary. Note that each feature will have its own set of requirements to make this possible (in some cases, it may not be possible), but this is outside the scope of this document.

Future work

As part of future work, we could consider adding better support for downgrades & deprecation. We may consider providing a mechanism through which the cluster operator could learn whether a feature is being actively used by clients at a certain feature version. This information can be useful to decide whether the feature is safe to be deprecated. Note that this could involve collecting data from clients about which feature versions are actively being used, and the scheme for this needs to be thought through.

...