Status
Current state: "Under Discussion"
Discussion thread: here
JIRA: KAFKA-6863
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
co-authored-by: Mickael Maison <mickael.maison@gmail.com>
Motivation
Currently the Kafka clients - in org.apache.kafka.clients.NetworkClient.initiateConnect() -
resolve a symbolic hostname using :
new InetSocketAddress(String hostname, int port)
This picks only one IP address even if the DNS has multiple records for the hostname, because the lookup is equivalent to:
InetAddress.getAllByName(hostname)[0]
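The difference can be seen with the JDK alone. The following is a minimal sketch, not client code: the class and method names (DnsResolutionDemo, allAddresses, singleAddress) are made up for illustration, and port 9092 is only a placeholder.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class DnsResolutionDemo {

    // Every address the resolver returns for the hostname.
    static List<InetAddress> allAddresses(String host) {
        try {
            return Arrays.asList(InetAddress.getAllByName(host));
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The single address kept by InetSocketAddress(String, int);
    // getAddress() returns null if the hostname could not be resolved.
    static InetAddress singleAddress(String host, int port) {
        return new InetSocketAddress(host, port).getAddress();
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println("InetSocketAddress keeps: " + singleAddress(host, 9092));
        System.out.println("getAllByName returns:    " + allAddresses(host));
    }
}
```

Run against a hostname with several A/AAAA records, the second line should list all of them while the first shows only one.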
For some environments where the broker hostnames are mapped by the DNS to multiple IPs, e.g. in clouds where the IPs point to the external load balancers, it is desirable that the client, on failing to connect to one of the IPs, would try the other ones before giving up the connection.
Our use case is multiple load balancers fronting the Kafka cluster. The Kafka advertised listeners advertise hostnames that the DNS server maps to the IPs of all LBs, not just one.
If one LB isn't available but the client is able to use another IP for the same hostname (for example, it connects to the 2nd LB), the service stays available.
Another case would be where brokers are fronted by two proxies in active/standby mode. This KIP would enable using a standby proxy for HA if connection to the active proxy fails.
Although this KIP and KIP-235 both deal with multiple DNS records, they address separate concerns.
Public Interfaces
Client configuration
Introduce a new configuration parameter :
enable.all.dns.ips = true / false
The default value for this parameter is false, so there will be no backwards compatibility issue.
Setting the parameter to true will have the client try to connect to all resolved IPs.
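As a rough illustration, the proposed parameter would be set like any other client property. The sketch below uses only java.util.Properties. The class and helper names and the hostname kafka.example.com are hypothetical, and the key enable.all.dns.ips is the one proposed here, not necessarily a setting in any released client.

```java
import java.util.Properties;

// Hypothetical helper class, for illustration only.
final class AllDnsIpsConfigExample {

    // Configuration key proposed by this KIP.
    static final String ENABLE_ALL_DNS_IPS = "enable.all.dns.ips";

    // Builds client properties with the proposed flag set explicitly.
    static Properties clientProps(String bootstrapServers, boolean enableAllDnsIps) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty(ENABLE_ALL_DNS_IPS, Boolean.toString(enableAllDnsIps));
        return props;
    }

    // Reads the flag, falling back to the proposed default of false.
    static boolean isEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(ENABLE_ALL_DNS_IPS, "false"));
    }

    public static void main(String[] args) {
        Properties props = clientProps("kafka.example.com:9092", true);
        System.out.println(ENABLE_ALL_DNS_IPS + " = " + isEnabled(props));
    }
}
```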
Proposed Changes
If the configuration parameter enable.all.dns.ips
is set to true, the network client code will use
InetAddress.getAllByName(hostname)
to obtain from the DNS server all IPs for the broker hostnames.
The NIO client will attempt a connection to one of the IPs. If the connection is refused or times out and the DNS had returned multiple IPs, then rather than failing the connection it will try to connect to the other IPs.
If all of them fail to connect, the behavior is the same as in the current client when the single resolved IP fails to connect:
- at bootstrap, move to the next hostname
- past bootstrap, retry the connection to the given node starting from the resolution of the hostname
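The fallback described above can be sketched as a loop over the resolved addresses. This is only an illustration under simplifying assumptions: the actual client uses non-blocking NIO, whereas this sketch hides the connection attempt behind a hypothetical Connector interface, and the names MultiIpConnectSketch and connectToAny are invented for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the proposed fallback, for illustration only.
final class MultiIpConnectSketch {

    // Stand-in for one connection attempt; throws IOException when the
    // connection is refused or times out.
    @FunctionalInterface
    interface Connector {
        void connect(InetSocketAddress target) throws IOException;
    }

    // Tries each resolved address in order and returns the first one that
    // accepts the connection. An empty result means every address failed,
    // in which case the caller falls back to the existing behavior:
    // move to the next hostname at bootstrap, or re-resolve and retry later.
    static Optional<InetSocketAddress> connectToAny(List<InetSocketAddress> targets,
                                                    Connector connector) {
        for (InetSocketAddress target : targets) {
            try {
                connector.connect(target);
                return Optional.of(target);
            } catch (IOException e) {
                // Refused or timed out: try the next IP instead of giving up.
            }
        }
        return Optional.empty();
    }
}
```

A blocking Connector could, for example, wrap java.net.Socket.connect(SocketAddress, int) with a timeout. Keeping the attempt behind an interface lets the ordering and fallback logic be exercised without a network.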
Compatibility, Deprecation, and Migration Plan
- What impact (if any) will there be on existing users?
By default enable.all.dns.ips will be disabled, so there will be no impact on existing users.
Rejected Alternatives
Making the client use all IPs by default was rejected, as it may have impacted some users.