...
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
ZooKeeper TLS Functionality
Since the merge of KAFKA-8634 (https://github.com/apache/kafka/commit/d67495d6a7f4c5f7e8736a25d6a11a1c1bef8d87) into trunk, Apache Kafka ships with a version of Apache ZooKeeper that supports TLS and Dynamic Reconfiguration (AK 2.4 ultimately shipped with ZooKeeper version 3.5.6 rather than 3.5.5, but the general functionality is the same). In a security-minded deployment, the desire is to use TLS to encrypt communication in transit.
Note that the current version of ZooKeeper (3.5.6 as of this writing and the version shipped with Apache Kafka 2.4) only supports mutual certificate authentication. There is a server-side config "ssl.clientAuth" that the ZooKeeper code recognizes (case-insensitively: want/need/none are the valid options), but this config has no effect in 3.5.6 (ZOOKEEPER-3674). A recent build from source confirms that this config worked in the 3.6 SNAPSHOT, but that version is not yet released.
Note also that ZooKeeper will associate multiple identities with any session that successfully authenticates multiple ways (e.g. both client certificate and SASL). The X.509 identity is the full Distinguished Name from the client's certificate, and this can be changed (i.e. use just a part of the DN) only by implementing and using a custom ZooKeeper authentication provider that overrides the method protected String getClientId(X509Certificate clientCert). A client that accesses an ACL-protected Znode is authorized if it has at least one of the identities present in any authorizing ACL.
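As a rough illustration of "use just a part of the DN", the sketch below extracts the CN component from a Distinguished Name using only the JDK. The class and method names (DnIdentity, commonNameOrFullDn) are hypothetical and are not part of ZooKeeper or Kafka. A custom authentication provider's getClientId override could, under that assumption, pass clientCert.getSubjectX500Principal().getName() to a helper like this.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

/** Hypothetical helper for illustration only; not part of ZooKeeper or Kafka. */
final class DnIdentity {
    private DnIdentity() {}

    /**
     * Returns the value of the first CN component found in the DN.
     * Falls back to the unchanged DN when there is no CN or the DN
     * cannot be parsed, which mirrors the default full-DN identity.
     */
    static String commonNameOrFullDn(String distinguishedName) {
        try {
            LdapName name = new LdapName(distinguishedName);
            for (Rdn rdn : name.getRdns()) {
                if ("CN".equalsIgnoreCase(rdn.getType())) {
                    return String.valueOf(rdn.getValue());
                }
            }
        } catch (InvalidNameException e) {
            // Unparseable DN: fall through and keep the full string.
        }
        return distinguishedName;
    }

    public static void main(String[] args) {
        System.out.println(commonNameOrFullDn("CN=kafka-broker-1,OU=Platform,O=Example"));
    }
}
```

Any ACLs would then need to be written against the shortened identity rather than the full DN, so the choice has to be made consistently across all clients.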
ZooKeeper also supports TLS connectivity between ZK nodes for Quorum-related communication. This is configured independent of Kafka within ZooKeeper.
It is possible to enable TLS connectivity to ZooKeeper from Apache Kafka 2.4. The problem is that the configuration has to be passed via system properties, as -D command line options on the Java invocation of the broker or CLI tool (e.g. ZooKeeper Security Migration). Such -D options are not secure because anyone with access to the box can see the command line used to start the process, and the configuration includes sensitive keystore/truststore password information. We therefore need a secure mechanism for passing the configuration values. The motivation for this KIP is to harden/secure the configuration mechanism for ZooKeeper TLS connectivity.
With this KIP we aim to introduce the necessary changes to enable the use of secure configuration values when defining TLS encrypted channels for communications with ZooKeeper. These changes will enable the secure use of TLS from brokers as well as from any CLI tools that still require direct ZooKeeper communication in the next AK (AK 2.5) release.
Brokers
Brokers talk to ZooKeeper, of course. In addition, the class kafka.security.authorizer.AclAuthorizer talks directly to ZooKeeper and supports being pointed to a separate ZooKeeper quorum, so it must be possible to configure that ZooKeeper connection for TLS as well. Note that kafka.security.auth.SimpleAclAuthorizer was deprecated in AK 2.4 (in favor of AclAuthorizer) and will not support TLS connectivity to ZooKeeper.
CLI Tools
The list of CLI tools that used non-deprecated direct ZooKeeper access in the previous AK (AK 2.4) release was as follows:
1. zookeeper-security-migration.sh (kafka.admin.ZkSecurityMigrator)
2. kafka-reassign-partitions.{bat,sh} (kafka.admin.ReassignPartitionsCommand)
3. kafka-configs.{bat,sh} (kafka.admin.ConfigCommand)
4. zookeeper-shell.{bat,sh}
It doesn't make sense to address direct ZK access in #1 above since connecting to ZooKeeper and applying/removing ZooKeeper ACLs is the whole point of the tool. (In theory we could replace its direct ZK access in favor of a Kafka API, but that seems silly.)
Direct ZK access in #2 above is being addressed via the already-accepted KIP-455: Create an Administrative API for Replica Reassignment, and the direct access flag will be deprecated via KIP-555: Deprecate Direct Zookeeper access in Kafka Administrative Tools.
Direct ZK access in #3 has already been replaced via a --bootstrap-server flag and will be deprecated in the next release via KIP-555 as well. However, ConfigCommand presents a bit of a conundrum because it explicitly states in a comment that a supported use case is bootstrapping a Kafka cluster with encrypted passwords in ZooKeeper (see https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/admin/ConfigCommand.scala#L65). This is a very special use case for sure, but it does mean that it will be especially difficult to fully deprecate this particular direct ZooKeeper connectivity without a different storage mechanism for dynamic configuration values being available (e.g. the self-managed quorum referred to in KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum). Accessing a ZooKeeper instance via this CLI tool will therefore be required, and passing TLS configuration to it in a secured way will be necessary.
There is an additional CLI tool that supports bootstrapping information into ZooKeeper besides ConfigCommand: kafka-acls.{bat,sh} (kafka.admin.AclCommand). Accessing a ZooKeeper instance via this CLI tool will also be required, and passing TLS configuration to it in a secured way will also be necessary.
Goals
- Harden/secure the configuration mechanism for ZooKeeper TLS connectivity from:
  - Kafka Brokers (including from kafka.security.authorizer.AclAuthorizer if/when configured)
  - zookeeper-security-migration.sh
  - kafka-configs.{bat,sh} and kafka-acls.{bat,sh}
  - zookeeper-shell.{bat,sh}
- Support client certificate authentication to ZooKeeper both with and without SASL authentication in ZK Security Migrator and the broker (when zookeeper.set.acl is true).
- Add system tests to confirm the hardened/secured configuration for TLS connectivity to ZooKeeper
- Add explicit Kafka documentation on how to configure TLS connectivity to ZooKeeper
- Add a reference in the Kafka documentation to the ZooKeeper Quorum TLS configuration (https://zookeeper.apache.org/doc/r3.5.6/zookeeperAdmin.html#Communication+using+the+Netty+framework)
...
- ZooKeeper-to-ZooKeeper Quorum TLS system tests and in-depth documentation (the ZooKeeper project already has such tests and documentation)
- Kafka API for kafka.admin.ConfigCommand and deprecating its direct ZooKeeper connectivity (this must be addressed as a separate KIP)
- Dynamic reconfiguration of ZooKeeper TLS configs
Public Interfaces
New Broker and AclAuthorizer Configurations
The below table contains the complete list of added configs. All configs being added are optional Strings with no default value unless otherwise noted. These values are potentially required to access ZooKeeper in the first place, so they are not dynamically reconfigurable (dynamic reconfiguration values are currently stored in ZooKeeper). Sensitive values (e.g. those of type Password) can be encrypted as described in KIP-421: Automatically resolve external configurations.
As an example, these are some of the configs that will be introduced:
...
Every config can be prefixed with "authorizer." for the case when kafka.security.authorizer.AclAuthorizer connects via TLS to a ZooKeeper quorum separate from the one that Kafka is using; this specific use case will be identified in the configuration by explicitly setting authorizer.zookeeper.client.secure=true. In this case the configs prefixed with "authorizer." are not "overrides" like the other authorizer ZooKeeper connectivity configs such as connection/session timeouts and max inflight requests. ZooKeeper TLS connectivity values for the authorizer are not "merged" with Kafka's ZooKeeper TLS configs (if any) because semantically the two sets of configs are for different ZooKeeper quorums and there is no guarantee that they would be applicable across the two quorums. Any configs that need to be identical across the two ZooKeeper quorums will have to be repeated with and without the prefix. The same defaults described below will apply to the prefixed configs.
Config Key | Documentation |
---|---|
zookeeper.client.secure | Set client to use TLS when connecting to ZooKeeper. When true, <code>zookeeper.clientCnxnSocket</code> must be set (typically to <code>org.apache.zookeeper.ClientCnxnSocketNetty</code>); other values to set may include <include list of all other properties below> |
zookeeper.clientCnxnSocket | Typically set to <code>org.apache.zookeeper.ClientCnxnSocketNetty</code> when using TLS connectivity to ZooKeeper |
zookeeper.ssl.keyStore.location | Keystore location when using a client-side certificate with TLS connectivity to ZooKeeper. Note ZooKeeper's use of camel-case <code>keyStore</code>, which differs from Kafka. |
zookeeper.ssl.keyStore.password | Keystore password when using a client-side certificate with TLS connectivity to ZooKeeper. Note ZooKeeper's use of camel-case <code>keyStore</code>, which differs from Kafka. |
zookeeper.ssl.keyStore.type | Keystore type when using a client-side certificate with TLS connectivity to ZooKeeper. Note ZooKeeper's use of camel-case <code>keyStore</code>, which differs from Kafka. The default value of <code>null</code> means the type will be auto-detected based on the filename extension of the keystore. |
zookeeper.ssl.trustStore.location | Truststore location when using TLS connectivity to ZooKeeper. Note ZooKeeper's use of camel-case <code>trustStore</code>, which differs from Kafka. |
zookeeper.ssl.trustStore.password | Truststore password when using TLS connectivity to ZooKeeper. Note ZooKeeper's use of camel-case <code>trustStore</code>, which differs from Kafka. |
zookeeper.ssl.trustStore.type | Truststore type when using TLS connectivity to ZooKeeper. Note ZooKeeper's use of camel-case <code>trustStore</code>, which differs from Kafka. The default value of <code>null</code> means the type will be auto-detected based on the filename extension of the truststore. |
zookeeper.ssl.protocol | Specifies the protocol to be used in ZooKeeper TLS negotiation |
zookeeper.ssl.enabledProtocols | Specifies the enabled protocol(s) in ZooKeeper TLS negotiation (csv). Note ZooKeeper's use of camel-case <code>enabledProtocols</code>, which differs from Kafka. The default value of <code>null</code> means the enabled protocol will be the value of the <code>zookeeper.ssl.protocol</code> configuration property. |
zookeeper.ssl.ciphersuites | Specifies the enabled cipher suites to be used in ZooKeeper TLS negotiation (csv). The default value of <code>null</code> means the list of enabled cipher suites is determined by the Java runtime being used. |
zookeeper.ssl.context.supplier.class | Specifies the class to be used for creating SSL context in ZooKeeper TLS communication |
| Specifies whether to enable hostname verification in the ZooKeeper TLS negotiation process. Disabling it is only recommended for testing purposes. |
| Specifies whether to enable Certificate Revocation List in the ZooKeeper TLS protocols |
| Specifies whether to enable Online Certificate Status Protocol in the ZooKeeper TLS protocols |
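
The prefix rule and the defaults described above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the extension-to-type mapping (.jks and .pem) is assumed for the example rather than taken from ZooKeeper's code, and the logic is not the broker's actual implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical sketch; names and mappings are assumptions, not Kafka code. */
final class ZkTlsConfigResolution {
    static final String AUTHORIZER_PREFIX = "authorizer.";

    private ZkTlsConfigResolution() {}

    private static boolean isZkTlsKey(String key) {
        return key.equals("zookeeper.client.secure")
                || key.equals("zookeeper.clientCnxnSocket")
                || key.startsWith("zookeeper.ssl.");
    }

    /**
     * TLS configs for a separate authorizer quorum: only prefixed keys are
     * used (with the prefix stripped). Non-prefixed values are never merged in.
     */
    static Map<String, String> authorizerTlsConfigs(Map<String, String> allConfigs) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : allConfigs.entrySet()) {
            String key = entry.getKey();
            if (key.startsWith(AUTHORIZER_PREFIX)) {
                String unprefixed = key.substring(AUTHORIZER_PREFIX.length());
                if (isZkTlsKey(unprefixed)) {
                    result.put(unprefixed, entry.getValue());
                }
            }
        }
        return result;
    }

    /** A null type means "detect from the filename extension" (assumed mapping). */
    static Optional<String> effectiveStoreType(String configuredType, String location) {
        if (configuredType != null) {
            return Optional.of(configuredType);
        }
        String lower = location.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".jks")) return Optional.of("JKS");
        if (lower.endsWith(".pem")) return Optional.of("PEM");
        return Optional.empty();
    }

    /** A null enabledProtocols value falls back to the single configured protocol. */
    static List<String> effectiveEnabledProtocols(String enabledProtocolsCsv, String protocol) {
        if (enabledProtocolsCsv == null) {
            return Collections.singletonList(protocol);
        }
        return Arrays.stream(enabledProtocolsCsv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }
}
```

Keeping the authorizer lookup free of any fallback to the non-prefixed keys is what makes the "repeat the value with and without the prefix" requirement explicit.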
ZooKeeper Security Migration CLI
...
The same --zk-tls-config-file parameter will be added if (and only if) direct ZooKeeper connectivity for this tool is not deprecated in this Kafka release.
...
ACL Command CLI
The same --zk-tls-config-file parameter will be added if (and only if) direct ZooKeeper connectivity for this tool is not deprecated in this Kafka release.
ZooKeeper Shell CLI
A -zk-tls-config-file parameter will be added. Note the use of single-dash as opposed to double-dash here since all of the tool's parameters follow the ZooKeeper project's style and are specified via a single dash.
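To make the file-based approach concrete, here is a minimal sketch of how a tool might read the file named by the parameter. It assumes the file uses the standard Java properties format and that only keys beginning with "zookeeper." are relevant. Both points, along with the class name ZkTlsConfigFile, are assumptions for illustration rather than the implemented behavior. Unlike -D options, the file's contents do not appear on the process command line and can be protected with file system permissions.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical loader for illustration; not the tool's actual implementation. */
final class ZkTlsConfigFile {
    private ZkTlsConfigFile() {}

    /**
     * Loads a properties file and keeps only keys starting with "zookeeper.".
     * Other keys are ignored (an assumption made for this sketch).
     */
    static Map<String, String> load(Path file) throws IOException {
        Properties props = new Properties();
        try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        Map<String, String> result = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith("zookeeper.")) {
                result.put(name, props.getProperty(name));
            }
        }
        return result;
    }
}
```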
Proposed Changes
The proposed changes include the public interface changes:
- New Kafka configurations, both non-prefixed as well as prefixed with "authorizer."
- A new --zk-tls-config-file parameter for:
  - ZooKeeper Security Migration Tool
  - Config Command CLI (for the special use case of bootstrapping TLS-enabled ZooKeeper)
  - ACL Command CLI (for the special use case of bootstrapping TLS-enabled ZooKeeper)
- A new -zk-tls-config-file parameter in the ZooKeeper Shell (again, note the single dash as opposed to the double-dash used above)
The proposed changes also include the addition of:
- System tests to confirm the hardened/secured configuration for TLS connectivity to ZooKeeper
- The use of ZooKeeper Security Migrator and Kafka Brokers with client certificate authentication both with and without SASL
- Explicit Kafka documentation on how to configure TLS connectivity to ZooKeeper
...
The changes are additions only, and there is no compatibility issue in the broker because the default for the broker config zookeeper.client.secure is false.
Test Plan
System tests will cover the following:
- Migrating ZooKeeper/Kafka clusters from non-TLS-enabled ZooKeeper to TLS-enabled ZooKeeper
- Invoking the ZooKeeper Security Migration tool against TLS-enabled ZooKeeper both with and without ZK SASL authentication enabled
Compatibility testing is unnecessary because ZooKeeper TLS is not available in prior versions.
The connection between Kafka and ZooKeeper is not on a performance-critical path (brokers don't repeatedly communicate with ZooKeeper as they process messages, for example), so introducing TLS encryption here does not require explicit performance testing.
Rejected Alternatives
N/A