Status

Current state: [One of "Under Discussion", "Accepted", "Rejected"]

Discussion thread: here

JIRA: KAFKA-3492

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

KIP-13 introduced client quotas in Kafka 0.9.0.0. Rate limits on producers and consumers are enforced to prevent clients from saturating the network or monopolizing broker resources. The current implementation allocates quotas to client-ids. This works well in single-user clusters or in clusters that use PLAINTEXT, where all users have the same identity. But since client-id is unauthenticated and can be set to any value by the client, multi-tenant secure installations require quotas to be enforced for authenticated principals to guarantee fair allocation of resources and prevent denial-of-service.

This KIP addresses the following extensions to the existing implementation:

  1. The option to apply quotas based on authenticated principal instead of client-id. This prevents users generating heavy traffic from monopolizing resources and impacting the performance of other users in a multi-tenant cluster.
  2. Sub-quotas for clients of an authenticated user.  Like the current client-id implementation, this enables a user to rate-limit some producers or consumers to ensure that they don’t impact other more critical clients.  For instance, users may be able to rate-limit an auditing client running in the background, leaving resources always available for a critical event processing client.

Public Interfaces

Configuration Options

A configuration option quota.secure will be added to choose between the existing client-id based implementation and the new authenticated-principal based implementation.  The default value will be false to be consistent with Kafka 0.9.0.x.

Default quota configs will apply to authenticated user principals if quota.secure=true.

Metrics

When quota.secure=true, quota related metrics will be generated for authenticated principals rather than client-ids.

Tools

kafka-configs.sh will be extended to support authenticated user quotas and sub-quotas for clients of a user.  A new entity type “users” will be added.  The key-value pairs supported for users will be:

Proposed Changes

Authenticated Principal 

The authenticated user principal will be obtained from the Session when quota.secure=true. A Base64-encoded hex string version of the principal will be used so that it can serve as a node name in Zookeeper and as a metric name without placing any restrictions on the characters allowed in the principal.
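The text above does not pin down the exact encoding, so the following is only a minimal sketch of the idea, assuming URL-safe Base64 over the UTF-8 bytes of the principal name. The function names encode_principal and decode_principal are hypothetical and are not part of Kafka. The URL-safe alphabet is used here because the standard Base64 alphabet contains '/', which is the path separator in Zookeeper.

```python
import base64


def encode_principal(principal: str) -> str:
    """Return a name usable as a Zookeeper node or metric tag (illustrative only)."""
    raw = principal.encode("utf-8")
    # The URL-safe alphabet replaces '+' and '/' with '-' and '_', so the
    # encoded name never contains the '/' path separator.
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_principal(name: str) -> str:
    """Recover the original principal from an encoded name."""
    return base64.urlsafe_b64decode(name.encode("ascii")).decode("utf-8")
```

Because the encoding is reversible, a tool that reads the node names back can still display the original principal to the administrator.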
 

Quota Configuration

Quotas are currently configured as the total rate limits (p, c) for all the producers or consumers with a specific client-id. Default values are specified in server.properties (quota.producer.default=defaultP, quota.consumer.default=defaultC) for client-ids which don’t have a config override. Producer quotas and consumer quotas can be configured independently and default values are applied when an override is not specified. In the examples below, both are overridden together for simplicity.

  1. Current Implementation: Client-id based quota (quota.secure=false): { clientA : (pA, cA) }
  2. Authenticated-principal based quota (quota.secure=true): { user1 : (p1,c1) }
  3. Hierarchical quotas for clients of a user (quota.secure=true): { user2 : { total : (p2, c2), clientA : (p2A, c2A), clientB : (p2B, c2B)}}

 

// Quotas for user1 (without sub-quotas)
{
  "version":1,
  "config": {
    "+" : {"producer_byte_rate":"1024","consumer_byte_rate":"2048"}
  }
}
// Quotas for user2 (with sub-quotas)
{
  "version":1,
  "config": {
    "+" : {"producer_byte_rate":"1024","consumer_byte_rate":"2048"},
    "clientA" : {"producer_byte_rate":"10","consumer_byte_rate":"20"},
    "clientB" : {"producer_byte_rate":"30","consumer_byte_rate":"40"}
  }
} 
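As a rough illustration of how the persisted JSON for user2 could be read, the sketch below parses it and extracts the rate limits. It assumes that the "+" key holds the user's total quota, which is how the examples above appear to use it. The helper names user_total and client_sub_quota are hypothetical. What applies to a client that has no sub-quota entry is not stated here, so the sketch simply returns None in that case.

```python
import json

# The user2 example from above, as it might be stored.
USER2_CONFIG = json.loads("""
{
  "version": 1,
  "config": {
    "+": {"producer_byte_rate": "1024", "consumer_byte_rate": "2048"},
    "clientA": {"producer_byte_rate": "10", "consumer_byte_rate": "20"},
    "clientB": {"producer_byte_rate": "30", "consumer_byte_rate": "40"}
  }
}
""")

TOTAL_KEY = "+"  # assumption: "+" marks the user's total quota


def _as_limits(entry):
    """Convert the string-valued rates into a (producer, consumer) int pair."""
    return int(entry["producer_byte_rate"]), int(entry["consumer_byte_rate"])


def user_total(entity):
    """Return the user's total (producer, consumer) byte-rate limits."""
    return _as_limits(entity["config"][TOTAL_KEY])


def client_sub_quota(entity, client_id):
    """Return the sub-quota for client_id, or None if none is configured."""
    if client_id == TOTAL_KEY:
        return None  # the total entry is not a client
    entry = entity["config"].get(client_id)
    return _as_limits(entry) if entry is not None else None
```

For example, client_sub_quota(USER2_CONFIG, "clientA") yields the pair (10, 20), while user_total(USER2_CONFIG) yields (1024, 2048).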

 

Quota Identifier

Quota configuration and metrics currently use client-id as the unique key, enforcing one quota for all clients with the same client-id. This will be replaced with a new quota-id. Each quota-id is associated with a pair of producer and consumer rate limits which may be config overrides or the default quota.  

quota.secure=false
quota.secure=true

Quota Persistence in Zookeeper

Client-id based quotas will continue to be stored under /config/clients. Authenticated user quotas will be stored under /config/users. Only one of these will be processed and watched by the brokers, depending on the value of quota.secure. Note that the Base64-encoded hex version of the user principal will be used as the node name under /config/users to cope with Zookeeper naming restrictions.
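The path selection described above can be sketched as follows. This is only an illustration of the rule that quota.secure picks exactly one root to watch. The function names are hypothetical, and node_name is assumed to be either a client-id or an already-encoded user principal.

```python
CLIENTS_ROOT = "/config/clients"
USERS_ROOT = "/config/users"


def quota_config_root(quota_secure: bool) -> str:
    """Return the single Zookeeper root a broker would watch for quota overrides."""
    return USERS_ROOT if quota_secure else CLIENTS_ROOT


def quota_config_path(quota_secure: bool, node_name: str) -> str:
    """Build the node path for one quota entity under the selected root."""
    if not node_name or "/" in node_name:
        # '/' is the Zookeeper path separator, so it cannot appear in a node name.
        raise ValueError("node name must be non-empty and must not contain '/'")
    return f"{quota_config_root(quota_secure)}/{node_name}"
```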

Tools

kafka-configs.sh will be extended to support a new entity type "users". Quota configuration for users will be provided as key-value pairs to be consistent with other configuration options. Hence no new command line arguments will be added to the tool. The tool will parse the key-value pairs specifying total user quota and possibly some client quotas, validate these and convert them to the equivalent JSON for persistence in Zookeeper.

Compatibility, Deprecation, and Migration Plan

quota.secure is set to false by default to be consistent with Kafka 0.9.0.x. Hence the existing quota configurations will apply if new secure quotas are not defined. If quota.secure is set to true and default or new quotas are configured for users, clients may be throttled based on the quota limits. But no client API changes are necessary to work with the new implementation.

Rejected Alternatives

Unified configuration for client-id and authenticated-principal based quotas

This KIP proposes to use a broker configuration option to switch between client-id based quotas and authenticated-principal based quotas for simplicity. An alternative would be to define a unified configuration where client-id based quotas are a special case of a unified quota config with the same username applied to all clients. The internal quota implementation will use common code for both options with only the quota-id being different. But the externally visible configuration and defaults are much simpler to define with separate options that are consistent with 0.9.0.x since it is unlikely that a cluster would support both.