Current state: Accepted
Discussion thread: here
Released: 2.0.0
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
This KIP addresses a fundamental weakness in Kafka's quota design.
Kafka currently throttles clients that violate quotas by delaying the responses. In other words, a client cannot know that it has been throttled until it receives the response. This works as long as the throttle time is short and the clients are cooperative, but it breaks down when the delay is long. Consider the following case for producers (although the same defect can affect consumers):
This scenario is not far-fetched. In fact, we have seen this problem on multiple occasions when a MapReduce job tries to push data to a Kafka cluster.
The fundamental problem is that, on violating a quota, clients see only a timeout. They therefore have no way to distinguish timeout scenarios (such as network partitions) from quota violation scenarios.
The current quota mechanism has a few other caveats:
These two caveats are largely independent of the lack of communication between brokers and clients, and solving them properly would require major changes in the network layer. This KIP therefore does not attempt to address them; it focuses only on improving the quota communication between brokers and clients.
In order to let clients know whether a received response has already been throttled, we will need to bump all the related request versions.
We propose the following changes:
To indicate that the broker has not throttled the response, we will bump all request versions (without wire format changes) so that the client knows whether it should hold back from sending the next request.
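To illustrate the client-side responsibility this change introduces, here is a minimal sketch of a version-aware client honoring the throttle time reported in a response. This is not actual Kafka client code; the class and method names are hypothetical.

```java
// Hypothetical sketch: a client that backs off on its own when the broker
// reports a throttle time instead of delaying the response itself.
// Names (ThrottleAwareClient, onResponse, shouldHoldBack) are illustrative.
public class ThrottleAwareClient {

    // Earliest time (in ms) at which the next request may be sent.
    private long sendAllowedAtMs = 0L;

    /**
     * Called when a response arrives carrying a throttle_time_ms field.
     * With the bumped request version, the broker signals that it has NOT
     * already delayed the response, so the client must back off itself.
     */
    public void onResponse(long nowMs, int throttleTimeMs, boolean brokerAppliedThrottle) {
        if (!brokerAppliedThrottle && throttleTimeMs > 0) {
            sendAllowedAtMs = nowMs + throttleTimeMs;
        }
    }

    /** True if the client should hold back from sending the next request. */
    public boolean shouldHoldBack(long nowMs) {
        return nowMs < sendAllowedAtMs;
    }

    public static void main(String[] args) {
        ThrottleAwareClient client = new ThrottleAwareClient();
        client.onResponse(1000L, 500, false);             // broker asks for a 500 ms back-off
        System.out.println(client.shouldHoldBack(1200L)); // still inside the back-off window
        System.out.println(client.shouldHoldBack(1500L)); // window has elapsed
    }
}
```

An old-version client would simply never call the back-off logic, which is why the broker-side channel muting described below remains necessary.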
Although older client implementations (predating this KIP) will send the next request immediately after the broker responds, without paying attention to the throttle time field, the broker is protected by muting the channel for time X: the next request will not be processed until the channel is unmuted. This does not prevent the socket buffers from being filled, but the broker's request handlers are not recruited to handle a throttled client's request if the client has not backed off long enough to honor the throttle time. Since the subsequent request is not actually handled until the broker unmutes the channel, the client can hit request.timeout.ms and reconnect; however, this is no worse than the current state.
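The muting behavior described above can be sketched as follows. This is a simplified model, not the broker's actual network-layer code; the class and method names are illustrative.

```java
// Hypothetical sketch of the broker-side protection: after responding, the
// broker mutes the channel for the throttle time, so a premature follow-up
// request sits in the socket buffer rather than reaching a request handler.
// Names (MutedChannel, mute, canDispatchRequest) are illustrative.
public class MutedChannel {

    // Time (in ms) at which the channel becomes readable again.
    private long unmuteAtMs = 0L;

    /** Mute the channel for throttleTimeMs starting at nowMs. */
    public void mute(long nowMs, long throttleTimeMs) {
        unmuteAtMs = nowMs + throttleTimeMs;
    }

    /** A request is handed to a request handler only once the channel is unmuted. */
    public boolean canDispatchRequest(long nowMs) {
        return nowMs >= unmuteAtMs;
    }

    public static void main(String[] args) {
        MutedChannel channel = new MutedChannel();
        channel.mute(1000L, 500L);                          // throttle for 500 ms
        System.out.println(channel.canDispatchRequest(1200L)); // still muted
        System.out.println(channel.canDispatchRequest(1500L)); // unmuted
    }
}
```

Note that whether the client backs off or not, at most one premature request per channel occupies a request handler slot, because dispatch is gated on the unmute deadline.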
The change is fully backwards compatible.
Set the quota to a very low value to verify that clients are neither timing out nor sending more requests to the broker before the throttle time has passed.
One potential solution to the above problem is to set a very high request.timeout.ms on the client side. This is not ideal because the request timeout is not meant to be tuned in response to a throttling condition.