...

Producer side quotas are defined in terms of bytes written per second per client id. Consumer quotas are defined in terms of bytes read per second per client id. These two quotas will be enforced separately. For example: if a client deletes their consumer offsets and "bootstraps" quickly, this will cause a "read bytes per second" violation and will throttle that consumer group. It should have no effect on any producer.

These metrics should be aggregated over a short period of time (5-10 seconds) before we declare a quota violation. This reduces the likelihood that a short burst in traffic is treated as a violation.
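The aggregation described above could be sketched as a sliding window per client id. This is only an illustration: the class name WindowedByteRate, its methods, and the choice of a sample queue are assumptions made here, not part of the proposal.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: tracks bytes per second for one client id over a sliding window. */
class WindowedByteRate {
    private static final class Sample {
        final long timeMs;
        final long bytes;

        Sample(long timeMs, long bytes) {
            this.timeMs = timeMs;
            this.bytes = bytes;
        }
    }

    private final long windowMs;
    private final Deque<Sample> samples = new ArrayDeque<>();
    private long bytesInWindow = 0;

    WindowedByteRate(long windowMs) {
        this.windowMs = windowMs;
    }

    /** Records bytes observed at nowMs. Timestamps are expected to be non-decreasing. */
    void record(long nowMs, long bytes) {
        samples.addLast(new Sample(nowMs, bytes));
        bytesInWindow += bytes;
        expire(nowMs);
    }

    /** Average bytes per second over the full window ending at nowMs. */
    double bytesPerSecond(long nowMs) {
        expire(nowMs);
        return bytesInWindow * 1000.0 / windowMs;
    }

    /** True if the windowed average exceeds the given quota. */
    boolean violates(long nowMs, double quotaBytesPerSecond) {
        return bytesPerSecond(nowMs) > quotaBytesPerSecond;
    }

    /** Drops samples that have fallen out of the window. */
    private void expire(long nowMs) {
        while (!samples.isEmpty() && samples.peekFirst().timeMs <= nowMs - windowMs) {
            bytesInWindow -= samples.removeFirst().bytes;
        }
    }
}
```

Because the rate is averaged over the whole window, a single large request only counts as a violation if the total over the window exceeds the quota.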

Quota Overrides

  • Cluster operators should be able to dynamically disable/enable quota enforcement for the entire cluster. This is very useful while rolling out quotas to production environments.
  • Replication traffic will be exempt from quotas.
  • Quotas can be disabled on a per-client basis. For example: we may not want to apply quotas to mirror makers.
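As a rough sketch of how these three overrides might combine into a single decision, consider the following. The class QuotaOverrides and all of its method names are hypothetical and only illustrate the precedence of the rules.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical override registry; names are illustrative, not part of the proposal. */
class QuotaOverrides {
    // Cluster-wide switch that operators could flip at runtime.
    private volatile boolean enforcementEnabled = true;

    // Client ids that are never throttled (for example, mirror makers).
    private final Set<String> exemptClientIds = ConcurrentHashMap.newKeySet();

    void setEnforcementEnabled(boolean enabled) {
        this.enforcementEnabled = enabled;
    }

    void exemptClient(String clientId) {
        exemptClientIds.add(clientId);
    }

    void removeExemption(String clientId) {
        exemptClientIds.remove(clientId);
    }

    /** Returns true only if quotas apply to this request. */
    boolean shouldEnforce(String clientId, boolean isReplicationTraffic) {
        if (!enforcementEnabled) {
            return false; // enforcement is disabled for the entire cluster
        }
        if (isReplicationTraffic) {
            return false; // replication traffic is always exempt
        }
        return !exemptClientIds.contains(clientId);
    }
}
```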

...

Broker Interface

Quota enforcement in the broker is done by the QuotaManager. check() is expected to be called before processing a request. If check() returns false, the broker is expected to take some quota action; otherwise the request is processed as usual.

Code Block
/**
 * This is the basic interface we need to implement in the Kafka broker. This class will record all events processed by the broker. 
 */
public interface QuotaManager {
    /** 
     * The function check should be called before processing any request. If this returns false, then some Quota action has to be taken.
     */
    <T extends RequestOrResponse> boolean check(T request);

    /**
     * This function is called after any client request has been processed and lets the implementation track information about the response.
     * This is useful for consumer side quotas where we may not know the response sizes and number of messages before processing the request entirely.
     *
     * @param <T> any type of request, typically FetchRequest or ProducerRequest
     * @param <C> the response object, typically FetchResponse or ProducerResponse
     */
    <T extends RequestOrResponse, C extends RequestOrResponse> void onResponse(T request, C response);

    void shutdown();
}

One alternative (proposed by Jun) is to simply use the "fetchSize" parameter on the FetchRequest (per partition) when calling check() to determine if the request is violating the quota or not. This removes the need to have the onResponse method.

However, fetchSize is the maximum amount of data that can be returned for that partition and may not accurately reflect the actual response size, which IMO is important. For example: if the fetchSize is set to Integer.MAX_VALUE, every request will get throttled immediately.
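To make the concern concrete, here is a small sketch of what a check() based only on fetchSize would see. The class FetchSizeBound and the string partition keys are assumptions for illustration only.

```java
import java.util.Map;

/** Hypothetical helper: the only size information available before the fetch is served. */
class FetchSizeBound {
    /** Upper bound on the response size: the sum of the per-partition fetchSize values. */
    static long upperBoundBytes(Map<String, Integer> fetchSizeByPartition) {
        long total = 0L; // long, so that several large fetchSize values do not overflow
        for (int fetchSize : fetchSizeByPartition.values()) {
            total += fetchSize;
        }
        return total;
    }
}
```

A request for two partitions with fetchSize set to Integer.MAX_VALUE has an upper bound of roughly 4 GB, even if the actual response carries only a few kilobytes, so a quota check on this bound alone would flag it.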

Example usage in KafkaApis:

Code Block
// Example for consumers
def handleFetchRequest(request: RequestChannel.Request) {
  val fetchRequest = request.requestObj.asInstanceOf[FetchRequest]
  if (!quotaManager.check(fetchRequest)) {
    // take quota actions
  }

  // Notify the quota manager after the request has been processed
  def sendResponseCallback(responsePartitionData: Map[TopicAndPartition, FetchResponsePartitionData]) {
    val fetchResponse = FetchResponse(fetchRequest.correlationId, responsePartitionData)
    quotaManager.onResponse(fetchRequest, fetchResponse)
  }
}

Quota Actions

What do we do when check() returns false? We have a few options:

Immediately return an error: This is the simplest possible option. However, this requires clients to implement some sort of backoff mechanism since there is no point in retrying immediately.

Delay the request and return an error: If a client is currently exceeding its quota, we can park the request in the purgatory for 'n' milliseconds. After the request expires, the broker can return a QuotaViolationException to the client. This is a nice feature because it effectively throttles the client up to the client timeout and makes it less critical for them to implement backoff logic. After the error is caught by the client, they can retry immediately. Note that in the case of producers, we should be careful about retaining references to large Message bodies because we could easily exhaust broker memory when parking hundreds of requests.

Delay the response but don't return an error: In this alternative (proposed by Jay), no error is returned to the client. Produce requests will get appended to the log immediately and will be kept in purgatory until this client is no longer throttled, after which the produce request returns successfully. Fetch requests will also be deposited into the purgatory and serviced only after that client is no longer violating quota.
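The "delay the response" option can be illustrated with a plain java.util.concurrent.DelayQueue standing in for the broker's purgatory. This is a deliberate simplification, not how the purgatory is implemented, and the class ThrottledResponse and its fields are hypothetical.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a response parked until its throttle delay has elapsed. */
class ThrottledResponse implements Delayed {
    final String clientId;
    private final long releaseAtMs;

    ThrottledResponse(String clientId, long nowMs, long delayMs) {
        this.clientId = clientId;
        this.releaseAtMs = nowMs + delayMs;
    }

    /** Remaining delay; zero or negative means the response may be sent. */
    @Override
    public long getDelay(TimeUnit unit) {
        long remainingMs = releaseAtMs - System.currentTimeMillis();
        return unit.convert(remainingMs, TimeUnit.MILLISECONDS);
    }

    /** Orders parked responses by release time, earliest first. */
    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                            other.getDelay(TimeUnit.MILLISECONDS));
    }

    /** Small demonstration: only responses whose delay has elapsed can be polled. */
    public static void main(String[] args) {
        DelayQueue<ThrottledResponse> parked = new DelayQueue<>();
        long now = System.currentTimeMillis();
        parked.add(new ThrottledResponse("throttled-client", now, 60_000));
        parked.add(new ThrottledResponse("ready-client", now, 0));
        ThrottledResponse ready = parked.poll(); // non-blocking; null if nothing has expired
        System.out.println(ready == null ? "nothing ready" : ready.clientId);
    }
}
```

In this sketch the parked object holds only the client id. That mirrors the note above about producers: whatever is parked should avoid retaining references to large message bodies.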

Custom Implementations

The interface has been kept very generic to allow multiple implementations of the quota policy. The default policy that we have proposed should suffice for the majority of use cases. Since we pass in the actual request and response objects to the QuotaManager, these implementations should have enough information to build complex policies if required.

...