Status

Current state: "Under Discussion"

Discussion thread: here

JIRA: here

Motivation

Kafka depends on Zookeeper 3.6.3, which reached its end of life in December 2022. We would like to upgrade Zookeeper to version 3.8.1, which is the latest release in the 3.8.x line.

Zookeeper 3.8.1 servers support clients no older than 3.5.x, and Zookeeper 3.8.1 clients support servers no older than 3.5.x:

ZooKeeper clients from 3.5.x onwards are fully compatible with 3.8.x servers.
The upgrade from 3.6.x and 3.7.x can be executed as usual, no particular additional upgrade procedure is needed.
ZooKeeper 3.8.x clients are compatible with 3.5.x, 3.6.x and 3.7.x servers as long as you are not using new APIs not present in these versions.

In comparison, Zookeeper 3.6.3 servers support clients no older than 3.4.x, and Zookeeper 3.6.3 clients support servers no older than 3.5.x:

ZooKeeper clients from 3.4 and 3.5 branch are fully compatible with 3.6 servers.
The upgrade from 3.5.7 to 3.6.0 can be executed as usual, no particular additional upgrade procedure is needed.
ZooKeeper 3.6.0 clients are compatible with 3.5 servers as long as you are not using new APIs not present in 3.5.
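
The server-side half of these rules can be written down as a small lookup. The sketch below is illustrative only: `ZkCompat`, `branch` and `serverAcceptsClient` are hypothetical names, not part of Zookeeper or Kafka, and the table holds nothing more than the two "clients no older than" statements quoted in this section. Server branches that this section does not mention yield `None` rather than a guess.

```scala
object ZkCompat {
  // (major, minor) of a version string such as "3.8.1"; the patch level is ignored.
  def branch(version: String): (Int, Int) = {
    val parts = version.split("\\.")
    (parts(0).toInt, parts(1).toInt)
  }

  // True when branch v is the same as or newer than branch min.
  private def atLeast(v: (Int, Int), min: (Int, Int)): Boolean =
    v._1 > min._1 || (v._1 == min._1 && v._2 >= min._2)

  // Oldest client branch each server branch accepts, copied from the text above.
  val minClientFor: Map[(Int, Int), (Int, Int)] = Map(
    (3, 6) -> (3, 4),
    (3, 8) -> (3, 5)
  )

  // Some(true/false) for a server branch listed above, None for any other branch.
  def serverAcceptsClient(server: String, client: String): Option[Boolean] =
    minClientFor.get(branch(server)).map(min => atLeast(branch(client), min))
}
```

For example, `ZkCompat.serverAcceptsClient("3.6.3", "3.4.14")` is `Some(true)`, while the same client against `"3.8.1"` is `Some(false)`, which is the compatibility break this proposal accepts.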

Public Interfaces

No public interfaces are being changed.

Proposed Changes

Similarly to https://github.com/apache/kafka/pull/12620/files we would like to upgrade to 3.8.1.

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users? Kafka clusters whose Zookeeper clients are older than 3.5.x won't be able to communicate with a Zookeeper cluster running 3.8.1. As mentioned in the accompanying JIRA ticket, Apache Kafka has been using Zookeeper 3.5.x clients since version 2.4, so this version and everything above it should be stable. It is acceptable to break compatibility with Apache Kafka versions prior to 2.4, as they are beyond their end of life and are not maintained (source: Time Based Release Plan#WhatIsOurEOLPolicy).

These are the configurations that Kafka passes on to its Zookeeper clients:

def zkClientConfigFromKafkaConfig(config: KafkaConfig, forceZkSslClientEnable: Boolean = false): ZKClientConfig = {
  val clientConfig = new ZKClientConfig
  if (config.zkSslClientEnable || forceZkSslClientEnable) {
    KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslClientEnableProp, "true")
    config.zkClientCnxnSocketClassName.foreach(KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkClientCnxnSocketProp, _))
    config.zkSslKeyStoreLocation.foreach(KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslKeyStoreLocationProp, _))
    config.zkSslKeyStorePassword.foreach(x => KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslKeyStorePasswordProp, x.value))
    config.zkSslKeyStoreType.foreach(KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslKeyStoreTypeProp, _))
    config.zkSslTrustStoreLocation.foreach(KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslTrustStoreLocationProp, _))
    config.zkSslTrustStorePassword.foreach(x => KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslTrustStorePasswordProp, x.value))
    config.zkSslTrustStoreType.foreach(KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslTrustStoreTypeProp, _))
    KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslProtocolProp, config.ZkSslProtocol)
    config.ZkSslEnabledProtocols.foreach(KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslEnabledProtocolsProp, _))
    config.ZkSslCipherSuites.foreach(KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslCipherSuitesProp, _))
    KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslEndpointIdentificationAlgorithmProp, config.ZkSslEndpointIdentificationAlgorithm)
    KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslCrlEnableProp, config.ZkSslCrlEnable.toString)
    KafkaConfig.setZooKeeperClientProperty(clientConfig, KafkaConfig.ZkSslOcspEnableProp, config.ZkSslOcspEnable.toString)
  }
  // The zk sasl is enabled by default so it can produce false error when broker does not intend to use SASL.
  if (!JaasUtils.isZkSaslEnabled) clientConfig.setProperty(JaasUtils.ZK_SASL_CLIENT, "false")
  clientConfig
}

Below is a list of changes to behaviours which Kafka uses to communicate with Zookeeper:

Kafka-related changes in Zookeeper 3.7.0

  • Zookeeper now allows multiple super users. Kafka does not pass on the value of zookeeper.superUser to its Zookeeper client, so this change should not affect it.
  • Quotas which were previously logged but not enforced are now enforced. These quotas apply to create/update/delete etc. operations. This will affect Kafka users who have set quotas on their Zookeeper clusters.
  • Kerberos authentication over SSL is now supported. Kafka does not support Kerberos authentication with Zookeeper, so this change should not affect it.
  • Before this change, user enforced authentication was only available for SASL. It now extends to all other types of authentication supported by Zookeeper. The point of this change is that no additional ACLs are needed to prevent unauthenticated access if one authentication method is enabled.

Kafka-related changes in Zookeeper 3.8.0

  • Zookeeper used to accept only plaintext passwords for its trust and key stores. With this change, files which store those passwords take precedence, but the already working plaintext logic is not removed.
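
The precedence described for the store passwords can be pictured with a small sketch. This is not Zookeeper's implementation: `StorePassword` and `resolve` are hypothetical names, and trimming the file contents is an assumption made here for convenience. It only shows the ordering, where a configured password file wins and the plaintext value remains as a fallback.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object StorePassword {
  // Returns the password read from passwordFile when one is configured,
  // otherwise falls back to the plaintext value (which may itself be absent).
  def resolve(plaintext: Option[String], passwordFile: Option[String]): Option[String] =
    passwordFile
      .map { path =>
        // Assumption: the file holds only the password, possibly followed by a newline.
        new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8).trim
      }
      .orElse(plaintext)
}
```

Under this ordering, a deployment that sets only the plaintext password behaves the same before and after the upgrade.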


  • If we are changing behavior, how will we phase out the older behavior? It should gradually be phased out as users update their Kafka versions.
  • If we need special migration tools, describe them here. N/A
  • When will we remove the existing behavior? N/A

Test Plan

We ran the following test on the latest trunk of Kafka with Zookeeper 3.6.3 and Zookeeper 3.8.1:

1) Start 1 Zookeeper node on an m5.4xlarge machine

2) Start 1 Kafka broker on a different m5.4xlarge machine

3) Using 4 admin clients sequentially create up to 2000 topics with 1 partition

4) Using 4 admin clients sequentially change the number of partitions on all 2000 topics to 2

5) Using 4 admin clients sequentially delete all topics
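
Steps 3 to 5 can be sketched as a three-phase loop. This is a hedged outline, not the harness that produced the results: `TopicWorkload`, `topicNames`, `run` and the "perf-" topic prefix are hypothetical, and it assumes the 2000 topics are split evenly across the 4 clients, which take turns rather than running concurrently. The three callbacks stand in for the actual admin operations so that the outline stays independent of any client library.

```scala
object TopicWorkload {
  // Topic names owned by one client; the "perf-" prefix is an arbitrary choice.
  def topicNames(client: Int, perClient: Int): Seq[String] =
    (0 until perClient).map(i => s"perf-$client-$i")

  // Runs the create, grow and delete phases in order, one client after another.
  def run(clients: Int, totalTopics: Int)(
      create: (String, Int) => Unit,        // (topic, initial partition count)
      setPartitions: (String, Int) => Unit, // (topic, new total partition count)
      delete: String => Unit): Unit = {
    val perClient = totalTopics / clients
    val names = (0 until clients).flatMap(c => topicNames(c, perClient))
    names.foreach(name => create(name, 1))        // step 3
    names.foreach(name => setPartitions(name, 2)) // step 4
    names.foreach(name => delete(name))           // step 5
  }
}
```

In a real run, the three callbacks would wrap calls on Kafka's Admin client (createTopics, createPartitions and deleteTopics), each waiting for the returned future before moving on.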


Zookeeper 3.8.1 request latency (PROPOSED)

https://g-576b9cd7b5.grafana-workspace.us-east-1.amazonaws.com/dashboard/snapshot/zmXo3V1hC7MkNsGTfD2lZNe6AREk2DwZ?orgId=1

https://g-576b9cd7b5.grafana-workspace.us-east-1.amazonaws.com/dashboard/snapshot/naH2y9jFJsg9greTv72DUu22Q0WvVE2P?orgId=1

Zookeeper 3.6.3 request latency (CURRENT)

https://g-576b9cd7b5.grafana-workspace.us-east-1.amazonaws.com/dashboard/snapshot/yF2EoSNMeSK7BALUdSmqOPAv6QljK2Yn?orgId=1

https://g-576b9cd7b5.grafana-workspace.us-east-1.amazonaws.com/dashboard/snapshot/Vgqt3I8OuPm9upqNvpvMBTs5TrtFY2k5?orgId=1

Rejected Alternatives

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way. N/A
