...

With the introduction of security in Kafka 0.9, the identity of Kafka clients is the user principal. The user principal is an authenticated user, or a grouping of unauthenticated users chosen by the broker using a configurable PrincipalBuilder, and is currently used for ACLs. Client-id is a logical grouping of clients with a meaningful name that is used in client metrics and logs. Multi-user systems have a hierarchy: a user owns zero or more clients. The pair <user-principal, client-id> defines a safe group of clients. The shorter, unsafe client-id is sufficient in client metrics and logs, but quotas should be allocated to safe groups so that clients of one user cannot throttle clients of another user that share the same client-id.

...

The goal of this KIP is to provide a single unified quota implementation with a simple configuration for client-id based quotas, a similar simple configuration for user quotas and a more complex but flexible configuration for hierarchical quotas.

Public Interfaces

...

Overview

Quotas are currently configured for client-ids. All clients with the same client-id are currently grouped together as a quota entity, enforcing one quota for all clients with the same client-id. This KIP proposes to define quotas for safe client groups which share the same user-principal and client-id. In a single user cluster, this retains the current semantics of client-id quotas.

Configuration Options

Two new configuration options will be added to specify default producer and consumer quotas for users. The existing default configuration options for client-id quotas will be applied only if default user quota is unlimited.

New properties

quota.user.producer.default, quota.user.consumer.default: Default quota for producers/consumers of users without a user quota override. This will be set to unlimited quota (Long.MaxValue) by default.

Changes to existing properties

Quotas can be set at <user, client-id>, user or client-id levels. For a given client connection, the most specific quota matching the connection will be applied. For example, if both a <user, client-id> quota and a user quota match a connection, the <user, client-id> quota will be used. Otherwise, a user quota takes precedence over a client-id quota.

quota.producer.default, quota.consumer.default: The default client-id producer/consumer quota is currently applied to each unique client-id across all users. This will be modified to be a per-user quota for each unique client-id of each user. This client-id default is applied only if the default user quota is unlimited.

Default configuration

  • All clients have unlimited quota by default
  • If quota.user.producer.default, quota.user.consumer.default are set, these default quotas are allocated to each user principal.
  • If quota.user.producer.default, quota.user.consumer.default are unlimited, quota.producer.default, quota.consumer.default are allocated to each unique client-id of each user.

Metrics

Quota related metrics are currently generated for client-ids and use the tag client-id. The metrics tag will be changed to quota-id and the value will include base-64 encoded user principal.

Sensor names are currently a sensor type concatenated with the client id value, eg. FetchThrottleTime-clientA. This will be modified to use quota-id instead of client-id. This is not a public interface change since sensor names are not reflected in JMX metrics.

Tools

kafka-configs.sh will be extended to support authenticated user quotas and sub-quotas for clients of a user.  A new entity type “users” will be added.  The key-value pairs supported for users will be:

  • producer_byte_rate : The total rate limit for the user’s producers
  • consumer_byte_rate : The total rate limit for the user’s consumers

Sub-quotas can be set for a user's clients by specifying the rate limits with both user and client entities in a single command.

The existing entity type "clients" will be retained for backward compatibility. But quotas set for clients are used only for users without a config override and only if default user quota is unlimited.

Proposed Changes

User Principal 

Authenticated user principal will be obtained from the Session object. URL-encoded string version of the Principal will be used so that it can be used as a node name in Zookeeper and in metrics without placing any restrictions on the characters allowed in the principal. Characters that cannot be used for Zookeeper node names or metrics (eg. *) will be percent-encoded. Encoded user principal will be cached in Session. For PLAINTEXT, the principal is "ANONYMOUS" by default and quotas will be applied for that principal. But principal can be overridden using a custom principal builder even for PLAINTEXT, enabling different user quotas, for example, for connections from different IP addresses.
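The encoding step described above can be illustrated with a minimal Python sketch. It is not the broker's implementation: the function names are hypothetical, and the standard library's percent-encoding with an empty safe set is assumed to be an acceptable stand-in for the URL-encoding the proposal refers to.

```python
from urllib.parse import quote, unquote


def encode_principal(principal: str) -> str:
    """Percent-encode every character except ASCII letters, digits and '_.-~'.

    safe="" means "/" is encoded too, so the result can never introduce an
    extra level into a Zookeeper path.
    """
    return quote(principal, safe="")


def decode_principal(encoded: str) -> str:
    """Recover the original principal, e.g. for display to an operator."""
    return unquote(encoded)


if __name__ == "__main__":
    # Hypothetical principal containing characters that need encoding.
    print(encode_principal("CN=alice,O=example/*"))
```

A principal made only of letters and digits, such as ANONYMOUS, is unchanged by this encoding, so the default PLAINTEXT principal keeps a readable node name.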

...

  1. If a client-id sub-quota override is defined for clientX of userN, this sub-quota is allocated for the sole use of <userN, clientX>.
  2. If a user quota override is defined for userN, clientX shares this quota with other clients of userN.
  3. If quota.user.producer.default is not unlimited, clientX shares this default quota with other clients of userN.
  4. If a client-id quota override is defined for clientX, this quota is allocated for the sole use of <userN, clientX>.
  5. If quota.producer.default is configured, this default quota is allocated for the sole use of <userN, clientX>.
  6. Otherwise, the client is not throttled.
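The precedence order above can be sketched as a small lookup function. This is an illustration in Python, not the broker code: the function name, the dictionary-based stores and the "exclusive"/"shared" labels are assumptions made for the example, and "configured" in step 5 is assumed to mean "not unlimited".

```python
import math
from typing import Dict, Tuple

UNLIMITED = math.inf  # stands in for Long.MaxValue in this sketch


def resolve_producer_quota(
    user: str,
    client_id: str,
    sub_quotas: Dict[Tuple[str, str], float],  # <user, client-id> overrides
    user_quotas: Dict[str, float],             # user overrides
    client_quotas: Dict[str, float],           # client-id overrides
    user_default: float = UNLIMITED,           # quota.user.producer.default
    client_default: float = UNLIMITED,         # quota.producer.default
) -> Tuple[str, float]:
    """Return (scope, bytes_per_second) for one connection.

    scope is "exclusive" when the quota is for the sole use of
    <user, client-id>, "shared" when it is shared with the user's other
    clients, and "unthrottled" when no quota applies.
    """
    if (user, client_id) in sub_quotas:        # step 1
        return ("exclusive", sub_quotas[(user, client_id)])
    if user in user_quotas:                    # step 2
        return ("shared", user_quotas[user])
    if user_default != UNLIMITED:              # step 3
        return ("shared", user_default)
    if client_id in client_quotas:             # step 4
        return ("exclusive", client_quotas[client_id])
    if client_default != UNLIMITED:            # step 5
        return ("exclusive", client_default)
    return ("unthrottled", UNLIMITED)          # step 6
```

The same function would apply to consumer quotas with the consumer defaults substituted. Note that the sketch returns the user's total quota in step 2; it does not subtract sub-quotas reserved for the user's other clients.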

Use cases:

...

Code Block
languagejava
titleUser quota without sub-quotas
// Quotas for user1 (without sub-quotas).
// Zookeeper persistence path /users/<encoded-user1>
{
    "version":1,
    "config": {
        "producer_byte_rate":"1024",
        "consumer_byte_rate":"2048",
        "user_principal" : "user1"
    }
}

Code Block
languagejava
titleUser quota with client-id sub-quotas
// Top-level total quotas for user2
// Zookeeper persistence path /users/<encoded-user2>
{
    "version":1,
    "config": {
        "producer_byte_rate":"4096",
        "consumer_byte_rate":"8192",
        "user_principal" : "user2"
    }
}
// Sub-Quotas for <user2, clientA>
// Zookeeper persistence path /users/<encoded-user2>/clients/clientA
{
    "version":1,
    "config": {
        "producer_byte_rate":"10",
        "consumer_byte_rate":"20"
    }
}

// Sub-Quotas for <user2, clientB>
// Zookeeper persistence path /users/<encoded-user2>/clients/clientB
{
    "version":1,
    "config": {
        "producer_byte_rate":"30",
        "consumer_byte_rate":"40"
    }
} 

Code Block
languagejava
titleClient-id quota
// Quotas for client-id clientA of users without config override if default user quota is unlimited. 
// Zookeeper persistence path /clients/clientA
{
    "version":1,
    "config": {
        "producer_byte_rate":"100",
        "consumer_byte_rate":"200"
    }
}

In the sample configuration above:

  1. The total rate limits for all clients with user principal user1 are (1024, 2048).
  2. The total rate limits for all clients with user principal user2 are (4096, 8192).
    • The rate limits for clients with user principal user2 AND client-id clientA are (10, 20).
    • Clients of user2 with a client-id other than clientA and clientB share the remaining quota (4056, 8132).
  3. The total rate limits for all clients of user3 are (quota.user.producer.default, quota.user.consumer.default) as configured in server.properties, since no config override is specified.
  4. If the default user quota is unlimited, clients of user3 use the client-id quota configuration. For example, the quota for client-id clientA of user3 is (100, 200), and the quota for client-id clientB of user3, which has no client-id override, is (quota.producer.default, quota.consumer.default).
    • In a single-user cluster, this provides the same semantics as the current client-id implementation.
    • In a multi-user cluster, quotas are now per-user, treating clientA of user4 as a different group from clientA of user2.
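The "remaining quota" shared by a user's clients without a sub-quota is the user's total minus the reserved sub-quotas. The following Python sketch shows the arithmetic with made-up numbers; the function name is hypothetical, and rejecting reservations that exceed the total is an assumption of this example rather than behaviour stated in the proposal.

```python
from typing import Dict


def remaining_quota(user_total: int, sub_quotas: Dict[str, int]) -> int:
    """Quota left for the user's clients that have no sub-quota override."""
    reserved = sum(sub_quotas.values())
    if reserved > user_total:
        # Assumed validation: sub-quotas may not reserve more than the total.
        raise ValueError("sub-quotas exceed the user's total quota")
    return user_total - reserved


if __name__ == "__main__":
    # Hypothetical user with a total of 1000 bytes/sec and two reserved clients.
    print(remaining_quota(1000, {"clientA": 100, "clientB": 250}))
```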

Quota Identifier

Quota configuration and metrics currently use client-id as the unique key, enforcing one quota for all clients with the same client-id. This will be replaced with a new quota-id that includes user principal. Each quota-id is associated with a pair of producer and consumer rate limits which may be config overrides or the default quota.

  • quota-id is the concatenation of the url-encoded user principal and the client-id. Client-ids without a sub-quota override share the user's quota and hence use the encoded user principal as quota-id.
  • In the example (non-encoded user principal is used here for readability):
    • All clients of user1 share the quota-id user1
    • clientA of user2 uses the quota-id user2clientA
    • clientC of user2 uses the quota-id user2 since it does not have a client quota override, sharing a quota with other clients of user2.
    • clientA of user3 uses the quota-id user3clientA
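A minimal Python sketch of how a quota-id might be derived follows. The function names and the boolean flag are hypothetical; the sketch follows the examples above literally, joining the encoded principal and the client-id with no separator.

```python
from urllib.parse import quote


def quota_id(user_principal: str, client_id: str, exclusive: bool) -> str:
    """Build the quota-id for a connection.

    exclusive is True when the quota belongs solely to <user, client-id>
    (a sub-quota or a client-id quota) and False when the client shares
    the user's quota.
    """
    encoded_user = quote(user_principal, safe="")
    return encoded_user + client_id if exclusive else encoded_user


def sensor_name(sensor_type: str, qid: str) -> str:
    """Sensor type concatenated with the quota-id, e.g. FetchThrottleTime-<quota-id>."""
    return f"{sensor_type}-{qid}"
```

For instance, quota_id("user2", "clientC", False) yields user2, so clientC is metered together with the other clients of user2 that have no sub-quota.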

Quota Persistence in Zookeeper

Client-id based quota configuration overrides will continue to be stored under /config/clients, but these will be applied only to clients of users without a quota override and only if the default user quota is unlimited. Quota configuration overrides for user principals will be stored under /config/users. Note that the url-encoded version of the user principal will be used as the node name under /config/users to cope with Zookeeper naming restrictions. The non-encoded user principal will be stored as a property to make it easy to identify the actual user associated with the path. Sub-quotas for clients of a user will be stored under /config/users/<user>/clients.
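The three storage locations can be summarised with a small path-building sketch in Python. The helper name is hypothetical, and only the user principal is encoded here because the text above mentions encoding only for the user node name.

```python
from typing import Optional
from urllib.parse import quote


def quota_config_path(user: Optional[str] = None,
                      client_id: Optional[str] = None) -> str:
    """Return the Zookeeper node that would hold a quota override."""
    if user is None and client_id is None:
        raise ValueError("a user, a client-id, or both must be given")
    if user is None:
        return f"/config/clients/{client_id}"          # client-id quota
    encoded_user = quote(user, safe="")
    if client_id is None:
        return f"/config/users/{encoded_user}"         # user quota
    return f"/config/users/{encoded_user}/clients/{client_id}"  # sub-quota
```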

Configuration change notifications will be generated for changes to the quota configuration, similar to the current notifications for client-id quotas. Changes to client-id sub-quotas of a user specify users as the entity_type and the sub-path of the node containing both user and client-id as the entity_name. A change to a sub-quota affects both the sub-quota of the particular <user, client-id> and the remainder quota allocated to the user's clients without a sub-quota override.

Code Block
languagejava
titleSample configuration change notification
// Change notification for user quota of user1
{
    "version":1,
    "entity_type": "users",
    "entity_name": "user1"
}
// Change notification for client sub-quota of <user2, clientA> that impacts clientA as well as clients of user2 without a sub-quota override
{
    "version":1,
    "entity_type": "users",
    "entity_name": "user2/clients/clientA"
 } 
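The notification payloads shown above could be produced along these lines. This Python sketch only mirrors the JSON shape in the sample; the function name is hypothetical and nothing here reflects how the broker actually writes notifications.

```python
import json
from typing import Optional


def change_notification(user: str, client_id: Optional[str] = None) -> str:
    """Build the JSON notification for a user quota or a client sub-quota change."""
    entity_name = user if client_id is None else f"{user}/clients/{client_id}"
    return json.dumps(
        {"version": 1, "entity_type": "users", "entity_name": entity_name}
    )


if __name__ == "__main__":
    print(change_notification("user1"))
    print(change_notification("user2", "clientA"))
```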

Tools

kafka-configs.sh will be extended to support a new entity type "users". Quota configuration for users will be provided as key-value pairs to be consistent with other configuration options, so no new command line arguments will be added to the tool. The tool will parse the key-value pairs specifying rate limits, validate them and convert them to the equivalent JSON for persistence in Zookeeper. The existing entity type "clients" will continue to be supported to set client-id quotas for users with unlimited quota. Sub-quotas for clients of a user can be configured by specifying the entity types "users" and "clients" in the same command line. For example, the following command sets quotas for <user2, clientA>:

bin/kafka-configs.sh --zookeeper localhost:2181 --alter --add-config 'producer_byte_rate=10,consumer_byte_rate=20' --entity-name clientA --entity-type clients --entity-name user2 --entity-type users
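The parse, validate and convert step can be sketched in Python as follows. This is an illustration of the idea only, not the tool's code: the function name, the set of accepted keys and the exact validation rules are assumptions, and rate values are kept as strings to match the sample JSON.

```python
import json

# Keys accepted for quota entities in this sketch.
ALLOWED_KEYS = {"producer_byte_rate", "consumer_byte_rate"}


def quota_config_json(add_config: str) -> str:
    """Convert 'key=value,key=value' into the JSON persisted in Zookeeper."""
    config = {}
    for pair in add_config.split(","):
        key, sep, value = pair.partition("=")
        key, value = key.strip(), value.strip()
        if not sep or key not in ALLOWED_KEYS:
            raise ValueError(f"unsupported quota config: {pair!r}")
        if not (value.isascii() and value.isdigit()):
            raise ValueError(f"rate must be a non-negative integer: {pair!r}")
        config[key] = value  # stored as a string, as in the sample JSON
    return json.dumps({"version": 1, "config": config})


if __name__ == "__main__":
    print(quota_config_json("producer_byte_rate=10,consumer_byte_rate=20"))
```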

Compatibility, Deprecation, and Migration Plan

...