...

  1. Network thread utilization (Scenarios 5, 6) will be addressed separately in a future KIP since it is significantly more complex.
  2. This KIP attempts to avoid unintended denial-of-service scenarios where a misconfigured application (e.g. zero heartbeat interval) or a client library with a bug causes broker overload. While it can reduce the load in some DoS scenarios, it does not completely protect against DoS attacks from malicious clients. A DDoS attack with a large number of connections resulting in many expensive authentication or CPU-intensive requests can still overwhelm the broker.

...

bin/kafka-configs --zookeeper localhost:2181 --alter --add-config 'io_thread_units=2.0' --entity-type users

...

Responses for all client requests which are not exempt from request throttling will include a new field containing the time in milliseconds that the request was throttled for.

...

  • If O is the observed usage and T is the target usage over a window of W, to bring O down to T, we need to add a delay of X to W such that: O * W / (W + X) = T.
  • Solving for X, we get X = (O - T)/T * W.
  • The response will be throttled by min(X, W) (a minimal sketch of this calculation follows the list).
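
As a rough illustration of the calculation above, the sketch below computes the delay and applies the cap described in the next paragraph. It is illustrative only; the method and parameter names (computeThrottleTimeMs, observedUsage, targetUsage, windowMs) are assumptions and not identifiers from the Kafka code base.

Code Block
languagejava
titleThrottle time calculation (illustrative sketch)
// Illustrative sketch only; method and parameter names are hypothetical.
// observedUsage (O) and targetUsage (T) are fractions of the quota window,
// windowMs (W) is the quota window size in milliseconds.
public static long computeThrottleTimeMs(double observedUsage, double targetUsage, long windowMs) {
    if (observedUsage <= targetUsage)
        return 0;                                          // under quota, no delay needed
    // X = (O - T) / T * W brings observed usage back down to the target
    double delayMs = (observedUsage - targetUsage) / targetUsage * windowMs;
    // The delay is capped at one quota window (see the paragraph below)
    return Math.min((long) delayMs, windowMs);
}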

The maximum throttle time for any single request will be the quota window size (one second by default). This ensures that timing-sensitive requests like heartbeats are not delayed for extended durations. For example, if a user has a quota of 0.001 and a stop-the-world GC pause takes 100ms during the processing of the user's request, we don't want all the requests from the user to be delayed by 100 seconds. By limiting the maximum delay, we reduce the impact of GC pauses and single large requests. To exploit this limit to bypass quota limits, clients would need to generate requests that take significantly longer than the quota limit. If R is the amount of time taken to process one request and the user has C active connections, the maximum amount of time a user/client can use in the long term per quota window is max(quota, min(C, num.io.threads) * R). In practice, quotas are expected to be much larger than the time taken to process individual requests, and hence this limit should be sufficient. Byte rate quotas will additionally help to increase throttling in the case where large produce/fetch requests result in larger per-request times. DoS attacks using large numbers of connections are not addressed in this KIP.
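
To make the GC-pause example concrete, here is the arithmetic using the hypothetical computeThrottleTimeMs sketch above; the numbers follow directly from the formula, but the method name is an assumption.

Code Block
languagejava
titleWorked example of the throttle cap (illustrative)
// A 100ms GC pause in a 1 second window means observed usage O = 0.1,
// while the quota is T = 0.001.
double uncappedMs = (0.1 - 0.001) / 0.001 * 1000;           // ~99,000 ms, i.e. ~100 seconds
long actualMs = computeThrottleTimeMs(0.1, 0.001, 1000);    // capped at 1,000 ms (one window)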

Sample configuration in ZooKeeper

...

Code Block
languagejs
titleSample quota configuration
// Quotas for user1
// Zookeeper persistence path /config/users/<encoded-user1>
{
    "version":1,
    "config": {
        "producer_byte_rate":"1024",
        "consumer_byte_rate":"2048",
		"io_.thread_.units" : "0.1"
    }
}

 

Co-existence of multiple quotas

...

Consumer requests such as heartbeat are timing-sensitive, and throttling these requests could result in the consumer falling out of the consumer group. However, if some requests are exempted from throttling, a misbehaving application, a misconfigured application with a low heartbeat interval, or a client library with a bug can cause broker overload. To protect the cluster from DoS attacks and avoid these overload scenarios, all client requests will be throttled. In most deployments, quotas will be set to values large enough to act as a safety net rather than to very small values that throttle well-behaved clients, so this should not be an issue.

Use separate quotas for timing-sensitive requests such as consumer heartbeat

Two different quotas could be configured to ensure that timing-sensitive requests like consumer heartbeats are not impacted by throttling of other requests. But this would add more configuration and metrics for administrators, and a single quota is simpler to use. Since coordinator requests use their own connection and throttling is performed only after the request is processed, the impact of request quotas on these requests should be limited.