Status

Current state: Adopted

Discussion thread: here

JIRA: KAFKA-4565 (0.10.2.0), KAFKA-4636 (0.11.0.0)

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

A new broker config, listener.security.protocol.map, will be introduced so that we can map a listener name to a security protocol. The config value should be in the CSV Map format that is currently used by max.connections.per.ip.overrides. The config value should follow map semantics: each key should only appear once, but values may appear multiple times. For example, the config could be defined in the following way to match the existing behaviour:

...
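
The map semantics described above (unique keys, repeatable values) can be illustrated with a minimal Python sketch. This is not Kafka's actual parser: the function name parse_listener_security_protocol_map is hypothetical, and the sample value is the identity mapping implied by the compatibility section of this document.

```python
def parse_listener_security_protocol_map(value):
    """Parse 'NAME1:PROTOCOL1,NAME2:PROTOCOL2' into a dict, rejecting duplicate keys."""
    result = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty entries such as a trailing comma
        name, sep, protocol = entry.partition(":")
        name, protocol = name.strip(), protocol.strip()
        if not sep or not name or not protocol:
            raise ValueError(f"malformed map entry: {entry!r}")
        if name in result:
            # Map semantics: each key (listener name) may appear only once.
            raise ValueError(f"listener name appears more than once: {name}")
        result[name] = protocol  # values (security protocols) may repeat
    return result


# Identity mapping: every existing security protocol doubles as a listener name.
default_map = parse_listener_security_protocol_map(
    "PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL"
)
```

Two listener names may share a security protocol (for example CLIENT:SSL,REPLICATION:SSL), but repeating a listener name is rejected.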

The next step is to change the validation of advertised.listeners and listeners so that the listener name has to be one of the keys in listener.security.protocol.map (only security protocols are allowed currently). For example, the following would configure a broker with two different host:port pairs mapped to the same security protocol in two cases:

...
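
The validation rule can be sketched as follows. This is a rough illustration in Python rather than the broker's implementation: the helper names, the INTERNAL listener, and the 10.1.1.5 address are hypothetical, and the sketch assumes listener names are compared exactly as written.

```python
def listener_names(listeners):
    """Extract the listener name from each 'NAME://host:port' entry."""
    names = []
    for entry in listeners.split(","):
        name, sep, _address = entry.strip().partition("://")
        if not sep or not name:
            raise ValueError(f"malformed listener: {entry!r}")
        names.append(name)
    return names


def validate_listeners(listeners, protocol_map):
    """Raise if any listener name is not a key of the listener-to-protocol map."""
    unknown = [n for n in listener_names(listeners) if n not in protocol_map]
    if unknown:
        raise ValueError(f"listener names without a security protocol: {unknown}")


# Two host:port pairs that share one security protocol (hypothetical values).
protocol_map = {"CLIENT": "SASL_PLAINTEXT", "INTERNAL": "SASL_PLAINTEXT"}
validate_listeners("CLIENT://192.1.1.8:9092,INTERNAL://10.1.1.5:9093", protocol_map)
```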

We then introduce a second broker config as an alternative to security.inter.broker.protocol:

Code Block
inter.broker.listener.name=REPLICATION

It is an error to set both security.inter.broker.protocol and inter.broker.listener.name at the same time. inter.broker.listener.name will be null by default, which means that the PLAINTEXT protocol will be used by default (as is currently the case).
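
A minimal sketch of how the two settings could be reconciled, assuming the configs are available as a plain Python dict. The function name is hypothetical. It also assumes that, under the default identity mapping, the value of security.inter.broker.protocol can be used directly as a listener name.

```python
def resolve_inter_broker_listener(configs):
    """Return the listener name used for inter-broker traffic."""
    protocol = configs.get("security.inter.broker.protocol")
    listener = configs.get("inter.broker.listener.name")
    if protocol is not None and listener is not None:
        raise ValueError(
            "only one of security.inter.broker.protocol and "
            "inter.broker.listener.name may be set"
        )
    if listener is not None:
        return listener
    # Assumption: with the default map, a security protocol name is also a listener name.
    # When neither config is set, PLAINTEXT is used, as is currently the case.
    return protocol if protocol is not None else "PLAINTEXT"
```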

Finally, we make it possible to provide different security (SSL and SASL) settings for each listener name by adding a normalised prefix (the listener name is lowercased) to the config name. For example, if we wanted to set a different keystore for the CLIENT listener, we would set a config with name listener.name.client.ssl.keystore.location. If the config for the listener name is not set, we will fall back to the generic config (i.e. ssl.keystore.location) for compatibility and convenience. For the SASL case, some configs are provided via a JAAS file, which consists of one or more entries. The broker currently looks for an entry named KafkaServer. We will extend this so that the broker first looks for an entry with a lowercased listener name followed by a dot as a prefix to the existing name. For the CLIENT listener example, the broker would first look for client.KafkaServer with a fallback to KafkaServer, if necessary.
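
The lookup order for both the prefixed configs and the JAAS entries can be sketched as below. This is an illustrative Python sketch under the assumption that configs are held in a plain dict; the function names and the keystore paths are hypothetical.

```python
def listener_config(configs, listener_name, key):
    """Look up a listener-specific config, falling back to the generic one."""
    prefixed = f"listener.name.{listener_name.lower()}.{key}"
    if prefixed in configs:
        return configs[prefixed]
    return configs.get(key)  # generic config, e.g. ssl.keystore.location


def jaas_entry_candidates(listener_name, base="KafkaServer"):
    """JAAS entry names in lookup order: listener-specific first, then generic."""
    return [f"{listener_name.lower()}.{base}", base]


configs = {
    "ssl.keystore.location": "/etc/kafka/generic.jks",  # hypothetical path
    "listener.name.client.ssl.keystore.location": "/etc/kafka/client.jks",  # hypothetical path
}
client_keystore = listener_config(configs, "CLIENT", "ssl.keystore.location")
replication_keystore = listener_config(configs, "REPLICATION", "ssl.keystore.location")
```

Here the CLIENT listener resolves to its own keystore, while REPLICATION has no prefixed entry and falls back to the generic one.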

...

Version 4 of the broker registration data stored in ZooKeeper will have listener names instead of security protocols in the elements of the endpoints array and an additional listener.security.protocol.map field. The latter is not strictly needed if we assume that all brokers have the same config, but that assumption would make config updates trickier (e.g. two rolling bounces would be required to add a new mapping from listener name to security protocol). Also, we add an additional field instead of changing the endpoints schema to allow for rolling upgrades.

Code Block
{
	"version": 4,
	"jmx_port": 9999,
	"timestamp": 2233345666,
	"host": "localhost",
	"port": 9092,
	"rack": "rack1",
	"listener_security_protocol_map": {
		"PLAINTEXT": "PLAINTEXT",
		"SSL": "SSL",
		"SASL_PLAINTEXT": "SASL_PLAINTEXT",
		"SASL_SSL": "SASL_SSL"
	},
	"endpoints": [
		"CLIENT://cluster1.foo.com:9092",
		"REPLICATION://broker1.replication.local:9093",
		"INTERNAL_PLAINTEXT://broker1.local:9094",
		"INTERNAL_SASL://broker1.local:9095"
	]
}

Protocol

Version 3 of UpdateMetadataRequest will be introduced and the elements of the end_points array will also have a listener_name field.

Code Block
UpdateMetadata Request (Version: 3) => controller_id controller_epoch [partition_states] [live_brokers] 
  controller_id => INT32
  controller_epoch => INT32
  partition_states => topic partition controller_epoch leader leader_epoch [isr] zk_version [replicas] 
    topic => STRING
    partition => INT32
    controller_epoch => INT32
    leader => INT32
    leader_epoch => INT32
    isr => INT32
    zk_version => INT32
    replicas => INT32
  live_brokers => id [end_points]
    id => INT32
    end_points => port host listener_name security_protocol_type
      port => INT32
      host => STRING
      listener_name => STRING (new)
      security_protocol_type => INT16
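
To make the new end_points layout concrete, here is a rough Python sketch that serialises a single element in the field order shown above. It assumes the usual Kafka wire conventions (big-endian integers, STRING as an INT16 length followed by UTF-8 bytes) and the existing security protocol ids (PLAINTEXT=0, SSL=1, SASL_PLAINTEXT=2, SASL_SSL=3). The function names are hypothetical and this is not the broker's serialisation code.

```python
import struct

SECURITY_PROTOCOL_IDS = {"PLAINTEXT": 0, "SSL": 1, "SASL_PLAINTEXT": 2, "SASL_SSL": 3}


def encode_string(value):
    """Kafka STRING: INT16 length followed by UTF-8 bytes."""
    data = value.encode("utf-8")
    return struct.pack(">h", len(data)) + data


def encode_end_point(port, host, listener_name, security_protocol):
    """Serialise one end_points element: port host listener_name security_protocol_type."""
    return (
        struct.pack(">i", port)            # port => INT32
        + encode_string(host)              # host => STRING
        + encode_string(listener_name)     # listener_name => STRING (new)
        + struct.pack(">h", SECURITY_PROTOCOL_IDS[security_protocol])  # INT16
    )


encoded = encode_end_point(9092, "cluster1.foo.com", "CLIENT", "SASL_PLAINTEXT")
```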

Client

Listener names only exist in the brokers; clients never see them.

...

We would have to change a number of places in the code that currently use SecurityProtocol as a key to use the listener name instead. A few examples:

  1. Acceptor thread
  2. Metadata request handler
  3. ReplicaManager
  4. Broker class

...

We would also have to change the various authenticator classes to look for security configs for the relevant listener name before falling back to the generic ones.

As stated previously, clients never see listener names and will make metadata requests exactly as before. The difference is that the list of endpoints they get back is restricted to the listener name of the endpoint where they made the request. In the example above, let's assume that all brokers are configured similarly and that a client sends a metadata request to cluster1.foo.com:9092 and it reaches broker1's 192.1.1.8:9092 interface via a load balancer. The security protocol would be SASL_PLAINTEXT and the metadata response would contain host=cluster1.foo.com,port=9092 for each broker returned.
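
The restriction of the metadata response to a single listener can be sketched as a simple filter. This Python sketch is illustrative only: the function name, the in-memory layout of the broker data, and the broker2 host name are hypothetical.

```python
def endpoints_for_listener(brokers, listener_name):
    """Return {broker_id: (host, port)} for the listener the request arrived on.

    brokers maps broker id -> {listener name -> (host, port)}.
    Brokers that do not expose the listener are omitted.
    """
    return {
        broker_id: endpoints[listener_name]
        for broker_id, endpoints in brokers.items()
        if listener_name in endpoints
    }


brokers = {
    1: {
        "CLIENT": ("cluster1.foo.com", 9092),
        "REPLICATION": ("broker1.replication.local", 9093),
    },
    2: {
        "CLIENT": ("cluster1.foo.com", 9092),
        "REPLICATION": ("broker2.replication.local", 9093),  # hypothetical host
    },
}

# A request that arrived on the CLIENT listener only sees CLIENT endpoints.
client_view = endpoints_for_listener(brokers, "CLIENT")
```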

The exception is ZooKeeper-based consumers. These consumers retrieve the broker registration information directly from ZooKeeper and will choose the first listener with PLAINTEXT as the security protocol (the only security protocol they support).
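
A minimal sketch of that selection rule, assuming the registration JSON has already been parsed into a Python dict and that the map field is named listener_security_protocol_map. The function name and the listener-to-protocol mapping in the sample are hypothetical.

```python
def first_plaintext_endpoint(registration):
    """Return (host, port) of the first endpoint whose listener maps to PLAINTEXT."""
    protocol_map = registration["listener_security_protocol_map"]
    for endpoint in registration["endpoints"]:  # order matters: the first match wins
        name, _, address = endpoint.partition("://")
        if protocol_map.get(name) == "PLAINTEXT":
            host, _, port = address.rpartition(":")
            return host, int(port)
    return None  # no PLAINTEXT listener is available to this consumer


registration = {
    "listener_security_protocol_map": {  # hypothetical mapping
        "CLIENT": "SASL_PLAINTEXT",
        "REPLICATION": "PLAINTEXT",
        "INTERNAL_PLAINTEXT": "PLAINTEXT",
    },
    "endpoints": [
        "CLIENT://cluster1.foo.com:9092",
        "REPLICATION://broker1.replication.local:9093",
        "INTERNAL_PLAINTEXT://broker1.local:9094",
    ],
}
chosen = first_plaintext_endpoint(registration)
```

With this sample, the consumer picks the REPLICATION endpoint because it is listed before INTERNAL_PLAINTEXT, which shows why listener ordering matters for such consumers.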

Compatibility, Deprecation, and Migration Plan

As mentioned previously, the default value of listener.security.protocol.map maps each existing security protocol to a listener with the same name to maintain compatibility:

...

Users who are upgrading should only use listener names once all the brokers and ZooKeeper-based consumers have been upgraded to a version that supports listener names. ZooKeeper-based consumers will use the first listener with PLAINTEXT as the security protocol, so listener ordering is important in such cases.

Rejected Alternatives

  1. Instead of adding the listener.security.protocol.map config, we could extend the protocol part of the listener definition to include both the listener name and security protocol. For example, CLIENT+SASL_PLAINTEXT://192.1.1.8:9092. This is appealing from a clarity perspective (the listeners are fully defined in a single config value), but it may lead to duplication between listeners and advertised.listeners. A way to avoid that issue (at the cost of loss of symmetry) would be for advertised.listeners to only include the listener name (we can infer the security protocol by looking at the listeners entry with the same name).
  2. Assume that listener.security.protocol.map is the same in every broker. The slight benefit in terms of smaller broker registration JSON is not worth the additional operational complexity when it comes to changing the config values in a running cluster (two rolling upgrades would be needed in some simple cases).
  3. Using hard-coded listener domains for internal and replication traffic. The config format is simpler and there's less scope for hard-to-understand configs. The main disadvantage is that it's a bit too specific and may need to be extended again as more sophisticated use cases appear. The current proposal is more general and seems like a natural evolution of the existing system.

...