...
Client Behaviors
Clients won’t attempt to resolve the bootstrap addresses upon initialization.
Clients won’t exit fatally if DNS resolution fails.
KafkaConsumer: Users must poll to retry the lookup if it fails.
KafkaAdminClient: Users will need to resend the request if it fails.
KafkaProducer: The sender loop should already be polling continuously.
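The behaviors above can be illustrated with a minimal sketch that uses only the Java standard library. It is not the Kafka client implementation: LazyBootstrap, its Resolver interface, and tryResolve() are hypothetical names introduced here for illustration. The sketch shows only two ideas from the list: construction performs no DNS lookup, and a failed lookup is logged and retried on the next poll instead of being fatal.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the Kafka client API.
final class LazyBootstrap {
    /** Pluggable lookup so the sketch can be exercised without real DNS. */
    interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    private final String host;
    private final Resolver resolver;
    private List<InetAddress> addresses = new ArrayList<>();

    // The constructor only stores the configuration; no DNS lookup happens here.
    LazyBootstrap(String host, Resolver resolver) {
        this.host = host;
        this.resolver = resolver;
    }

    // Intended to be called on every poll: retries the lookup until it succeeds.
    // A failed lookup is logged and reported through the return value.
    boolean tryResolve() {
        if (!addresses.isEmpty()) {
            return true; // already resolved, nothing to do
        }
        try {
            addresses = List.of(resolver.resolve(host));
        } catch (UnknownHostException e) {
            System.err.println("WARN: could not resolve " + host + ", will retry on next poll");
        }
        return !addresses.isEmpty();
    }

    List<InetAddress> addresses() {
        return addresses;
    }
}
```

Against real DNS, the resolver argument could simply be InetAddress::getAllByName. Keeping the lookup behind an interface is a choice made for this sketch so that a transient failure can be simulated.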
Exception Handling
Failed DNS resolution will result in a NetworkException.
Case Study
To help illustrate the proposed changes, we provide some examples of how clients might behave in different scenarios.
KafkaConsumer
Case 1: Unable to connect to the bootstrap (For example: misconfiguration)
Suppose the user instantiates a KafkaConsumer with an invalid bootstrap config. When the user invokes assign() and starts poll(), the poll() method will continue to return empty ConsumerRecords and log a warning message.
The user can continue to retry for the configured duration. After the bootstrap timeout expires, the client will throw a BootstrapConnectionException.
Case 2: Transient Network Issue (For example: transient DNS failure)
Now, suppose the user instantiates a KafkaConsumer with a valid bootstrap config, but there is a transient network issue, such as slow DNS resolution.
When the user starts poll(), the poll() method will return empty ConsumerRecords and log a warning message.
The user can continue to retry. Once the network issue is resolved, the KafkaConsumer will function normally.
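The two consumer cases can be modeled with a small standalone sketch. Everything in it is an assumption made for illustration: BootstrapAwarePoller is a hypothetical class, the BootstrapConnectionException defined here is a local stand-in for the exception this proposal describes, and the choice to start the timeout clock at the first poll is a guess, since the text does not say when the timer begins. A boolean return value stands in for "empty ConsumerRecords" versus normal operation.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Local stand-in for the exception proposed in this document.
class BootstrapConnectionException extends RuntimeException {
    BootstrapConnectionException(String message) {
        super(message);
    }
}

// Hypothetical model of poll() while the client is still bootstrapping.
final class BootstrapAwarePoller {
    private final BooleanSupplier tryBootstrap; // performs one bootstrap attempt
    private final LongSupplier clockMs;         // injectable clock, in milliseconds
    private final long timeoutMs;               // models the bootstrap timeout
    private long firstAttemptMs = -1;
    private boolean bootstrapped = false;

    BootstrapAwarePoller(BooleanSupplier tryBootstrap, LongSupplier clockMs, long timeoutMs) {
        this.tryBootstrap = tryBootstrap;
        this.clockMs = clockMs;
        this.timeoutMs = timeoutMs;
    }

    /** Returns true once bootstrapped; false models an empty poll result. */
    boolean poll() {
        if (bootstrapped) {
            return true;
        }
        long now = clockMs.getAsLong();
        if (firstAttemptMs < 0) {
            firstAttemptMs = now; // assumption: the timer starts at the first poll
        }
        if (tryBootstrap.getAsBoolean()) {
            bootstrapped = true;  // Case 2: the transient issue has cleared
            return true;
        }
        if (now - firstAttemptMs >= timeoutMs) {
            // Case 1: the bootstrap timeout expired without a successful attempt
            throw new BootstrapConnectionException(
                "Unable to bootstrap within " + timeoutMs + " ms");
        }
        System.err.println("WARN: bootstrap attempt failed, returning empty result");
        return false;
    }
}
```

With a supplier that always fails, poll() returns false until the timeout elapses and then throws, which corresponds to Case 1. With a supplier that fails a few times and then succeeds, poll() starts returning true without any exception, which corresponds to Case 2.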
KafkaProducer
Case 1: Unable to connect to the bootstrap (For example: misconfiguration)
...
Suppose the user instantiates a KafkaProducer with an invalid bootstrap config.
...
As the producer is instantiated, the sender thread starts running
...
. A warning message is logged every time the NetworkClient tries to poll().
If the user tries to produce messages
...
, the producer callback may be completed with a TimeoutException until the bootstrap timeout runs out.
...
Eventually, a BootstrapConnectionException will be thrown.
Case 2: Transient Network Issue (For example: transient DNS failure)
...
Now, suppose the user instantiates a KafkaProducer with a valid bootstrap config, but there is a transient network issue. As the sender thread starts running, a
...
warning message is logged upon trying to bootstrap the client.
...
If the network issue is resolved before the user tries to produce a message
...
, only warning messages will be logged.
If the user tries to produce a message before the issue is resolved
...
, the producer callback will be completed with a TimeoutException if the network issue persists.
...
The send
...
will be completed normally if the network issue is resolved before max.block.ms is exhausted.
AdminClient
Case 1: Unable to connect to the bootstrap (For example: misconfiguration)
...