Status

Current state: Draft in progress, not yet under discussion.

Discussion thread: here

JIRA: KAFKA-1696

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Kafka 0.9.0 introduced support for security, using Kerberos as the authentication layer. Kafka is designed to work with a large number of producers and consumers, so in a secure environment all of these clients need access to a keytab or a TGT in order to communicate with a secure Kafka broker. This has several disadvantages:

  • Performance/load on the KDC, as each client has to go to the KDC to get a ticket.

  • Renewal has to go through the KDC, and the renewed TGTs then need to be redistributed to all the clients.

  • The blast radius is large if a TGT is compromised, as a TGT may grant access to more than just the Kafka service.

  • Only compatible with the Kerberos authentication scheme.

  • Administration cost: for any new client to work, it must have access to a keytab or some way to get a TGT from another node.

Please read the HDFS section of http://carfield.com.hk:8080/document/distributed/hadoop-security-design.pdf for a more detailed explanation of the disadvantages above. To address these problems, we propose adding support for delegation tokens to secure Kafka. Delegation tokens are shared secrets between Kafka brokers and clients, so authentication can be done without going through the KDC.

Delegation tokens will help processing frameworks distribute workload to available workers in a secure environment without the added cost of distributing keytabs or TGTs. For example, in the case of Storm, Storm’s master (Nimbus) is the only node that needs a keytab. Using this keytab, Nimbus authenticates with a Kafka broker and acquires a delegation token. Nimbus can then distribute this delegation token to all of its worker hosts, and all workers will be able to authenticate to Kafka using the token and will have all the access that the Nimbus keytab principal has.

Public Interfaces

The following new APIs and request/response classes will be added:

getDelegationToken(request: DelegationTokenRequest): DelegationTokenResponse

class DelegationTokenRequest(renewer: Option[KafkaPrincipal] = None, maxLifeTime: Long = -1)

class DelegationTokenResponse(owner: KafkaPrincipal, expiryTimeMillis: Long, renewer: Option[KafkaPrincipal] = None, maxLifeTime: Long = -1, tokenId: String, hmac: Array[Byte])

renewDelegationToken(request: RenewDelegationTokenRequest): DelegationTokenResponse

class RenewDelegationTokenRequest(hmac: Array[Byte], expiryTimeMillis: Long)

expireToken(request: ExpireTokenRequest)

class ExpireTokenRequest(hmac: Array[Byte], expireAt: Long = System.currentTimeMillis)

Proposed Changes

Token acquisition

The following steps describe how tokens can be acquired:

  • A client connects to one of the Kafka brokers. The client must be authenticated using any of the available secure channels.

  • Once a client is authenticated, it makes a broker-side call to issue a delegation token. The request may contain an optional renewer identity and an optional max lifetime for the token. The renewer is the user that is allowed to renew this token before the max lifetime expires. The renewer defaults to the owner if not provided. If no max lifetime is provided, the token can be renewed forever, but it will still expire if it is not renewed by its expiry time. The expiry time will be a broker-side configuration and will default to min(24 hours, max lifetime). A delegation token request can be represented as class DelegationTokenRequest(renewer: Option[KafkaPrincipal] = None, maxLifeTime: Long = -1). The owner is implicit in the request connection as the user who requested the delegation token.

  • The broker generates a shared secret based on HMAC-SHA256(a password/secret shared between all brokers, a randomly generated tokenId). A token can be represented as the Scala case class DelegationToken(owner: KafkaPrincipal, renewer: KafkaPrincipal, maxLifeTime: Long, id: String, hmac: String, expirationTime: Long).

  • The broker stores this token in its in-memory cache. The broker also stores the DelegationToken in ZooKeeper. This is unsafe, as ZooKeeper does not support SSL, so the token itself would be transferred over the wire without encryption. An alternative is to store the DelegationToken in ZooKeeper without the hmac. As all brokers share the password/secret used to generate the HMAC-SHA256, they can read the request info from ZooKeeper, generate the hmac, and store the delegation token in their local cache.

  • All brokers will have a cache backed by ZooKeeper, so they will all be notified whenever a new token is generated and will update their local cache whenever token state changes.

  • The broker returns the token to the client. The client is expected to make delegation token requests only over an encrypted channel, so the token is encrypted over the wire.

  • The client is free to distribute this token to other clients. It is the client’s responsibility to distribute the token securely.


kafka_token_acquisition.png
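The acquisition steps above can be sketched roughly as follows. This is an illustration only, not the proposed broker code: it is written in Python for brevity, principals are plain strings, and the names issue_token, compute_hmac and DEFAULT_EXPIRY_MS are hypothetical. It also assumes that the max lifetime in the request is a duration in milliseconds and that the token stores it as an absolute deadline, which the proposal does not specify.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Optional

DEFAULT_EXPIRY_MS = 24 * 60 * 60 * 1000  # the 24-hour default mentioned above


@dataclass(frozen=True)
class DelegationToken:
    owner: str
    renewer: str
    max_life_time: int    # assumption: absolute deadline in ms, -1 means unlimited
    token_id: str
    hmac_value: bytes
    expiration_time: int  # absolute time in ms


def compute_hmac(shared_secret: bytes, token_id: str) -> bytes:
    """HMAC-SHA256 keyed with the broker-wide secret over the token id."""
    return hmac.new(shared_secret, token_id.encode("utf-8"), hashlib.sha256).digest()


def issue_token(shared_secret: bytes, owner: str, now_ms: int,
                renewer: Optional[str] = None, max_life_time_ms: int = -1,
                expiry_ms: int = DEFAULT_EXPIRY_MS) -> DelegationToken:
    """Build a token for an already authenticated owner."""
    token_id = secrets.token_hex(16)  # randomly generated token id
    unlimited = max_life_time_ms < 0
    # Expiry defaults to min(configured expiry, max lifetime).
    lifetime_ms = expiry_ms if unlimited else min(expiry_ms, max_life_time_ms)
    return DelegationToken(
        owner=owner,
        renewer=renewer if renewer is not None else owner,  # renewer defaults to owner
        max_life_time=-1 if unlimited else now_ms + max_life_time_ms,
        token_id=token_id,
        hmac_value=compute_hmac(shared_secret, token_id),
        expiration_time=now_ms + lifetime_ms,
    )
```

Under the alternative storage scheme, a broker that reads only the token id from ZooKeeper could call compute_hmac with the shared secret to rebuild the same hmac for its local cache.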

Authentication using Token

 We will reuse the current SASL channel, but for authentication using delegation tokens we will use DIGEST-MD5.

  • The client will already have the delegation token, which it will present during the authentication phase.

  • The server will look up the token in its token cache. If it finds a match and the token is not expired, it will authenticate the client, and the identity will be established as the owner of the delegation token.

  • If the token is not matched or the token is expired, the broker returns an appropriate exception and does not allow the client to continue.

kafka_authentication_using_tokens.png
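The broker-side check described above can be sketched as follows. This is a simplification under stated assumptions: the real exchange would be a DIGEST-MD5 challenge/response in which the hmac acts as the password and is never sent directly, whereas this sketch compares the hmac in constant time to show only the lookup, expiry and identity logic. CachedToken, TokenAuthenticationError and authenticate are hypothetical names.

```python
import hmac
from dataclasses import dataclass
from typing import Dict


class TokenAuthenticationError(Exception):
    """Raised when a token is unknown, mismatched, or expired."""


@dataclass(frozen=True)
class CachedToken:
    owner: str
    hmac_value: bytes
    expiration_time: int  # absolute time in ms


def authenticate(cache: Dict[str, CachedToken], token_id: str,
                 presented_hmac: bytes, now_ms: int) -> str:
    """Return the owner identity if the token is known and still valid."""
    token = cache.get(token_id)
    # compare_digest avoids leaking how many leading bytes matched.
    if token is None or not hmac.compare_digest(token.hmac_value, presented_hmac):
        raise TokenAuthenticationError("unknown or mismatched delegation token")
    if now_ms >= token.expiration_time:
        raise TokenAuthenticationError("delegation token has expired")
    # The connection's identity is the token owner, not the presenting host.
    return token.owner
```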

Token renewal

  • The client authenticates using Kerberos or any other available authentication scheme. (Open question: can this authentication be done using a delegation token? If it is allowed, we probably do not want to default the renewer to the owner, as anyone holding the delegation token could then renew their own token forever. Instead, if no renewer is provided, we should either mark those tokens as non-renewable, or make the renewer a mandatory request field during token acquisition and ensure the renewer cannot be set to the owner.)

  • The client sends a request to renew a token with an optional renew lifetime, which must be less than the max lifetime of the token.

  • The broker looks up the token. If the token is expired, if the renewer’s identity does not match the token’s renewer, or if the renewal would go beyond the max lifetime of the token, the broker disallows the operation.

  • If none of the above conditions apply, the broker updates the token’s expiry. Note that the HMAC-SHA256 is unchanged, so the token on the client side is unchanged. The broker updates the expiration in its local cache and in ZooKeeper, so other brokers are notified and update their caches as well.

token_renewal.png
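The renewal checks above can be sketched as follows. As before, this is a hedged illustration: principals are plain strings, max_life_time is assumed to be an absolute deadline in milliseconds with -1 meaning unlimited, and renew and RenewalDenied are hypothetical names. Propagating the new expiry to ZooKeeper and other brokers is left out.

```python
from dataclasses import dataclass, replace


class RenewalDenied(Exception):
    """Raised when a renewal request must be rejected."""


@dataclass(frozen=True)
class DelegationToken:
    owner: str
    renewer: str
    max_life_time: int    # assumption: absolute deadline in ms, -1 means unlimited
    token_id: str
    hmac_value: bytes
    expiration_time: int  # absolute time in ms


def renew(token: DelegationToken, requester: str,
          new_expiry_ms: int, now_ms: int) -> DelegationToken:
    """Return a copy of the token with a later expiry, or raise RenewalDenied."""
    if now_ms >= token.expiration_time:
        raise RenewalDenied("token has already expired")
    if requester != token.renewer:
        raise RenewalDenied("requester is not the token's renewer")
    if token.max_life_time >= 0 and new_expiry_ms >= token.max_life_time:
        raise RenewalDenied("renewal would exceed the max lifetime")
    # Only the expiry changes; id and hmac stay the same, so clients keep their token.
    return replace(token, expiration_time=new_expiry_ms)
```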

Token expiration and cancellation

 If a token is not renewed by its expiration time, or if the token is beyond its max lifetime, it will be deleted from all broker caches as well as from ZooKeeper. Alternatively, an owner or renewer can issue an expiration/cancellation by following a process similar to renewal.
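A minimal sketch of the two removal paths follows, assuming a single dictionary stands in for both the broker cache and the ZooKeeper store. The names purge_expired and cancel are hypothetical, and max_life_time is again assumed to be an absolute deadline in milliseconds with -1 meaning unlimited.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DelegationToken:
    owner: str
    renewer: str
    max_life_time: int    # assumption: absolute deadline in ms, -1 means unlimited
    expiration_time: int  # absolute time in ms


def is_dead(token: DelegationToken, now_ms: int) -> bool:
    """A token is dead once it is past its expiry or past its max lifetime."""
    past_expiry = now_ms >= token.expiration_time
    past_max = token.max_life_time >= 0 and now_ms >= token.max_life_time
    return past_expiry or past_max


def purge_expired(store: Dict[str, DelegationToken], now_ms: int) -> List[str]:
    """Delete dead tokens from the store and return their ids."""
    dead = [token_id for token_id, token in store.items() if is_dead(token, now_ms)]
    for token_id in dead:
        del store[token_id]
    return dead


def cancel(store: Dict[str, DelegationToken], token_id: str, requester: str) -> bool:
    """Remove a token on request; only its owner or renewer may do so."""
    token = store.get(token_id)
    if token is None or requester not in (token.owner, token.renewer):
        return False
    del store[token_id]
    return True
```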

Invalidating all tokens

 If the password is compromised, all tokens can be deleted from ZooKeeper, which invalidates all of them. We can provide a simple CLI tool for this.

Command line tool

 We will provide a CLI to acquire delegation tokens, renew tokens, and invalidate/expire tokens.

Alternatives

Originally we considered not having any shared secret at the config level. This would have required us to choose one of two options:

  • Let each broker generate a random secret on each acquisition request and use this secret to generate the hmac. The broker would store the hmac in ZooKeeper. However, as ZooKeeper does not support SSL, the hmac would be on the wire unencrypted, which is not safe.
  • A client goes to every broker and acquires a token for that broker. The tokens are not stored in ZooKeeper at all. The downside of this approach is that any time a new broker is added, the initial client has to be notified, and it has to acquire a token for the new broker and distribute it.

 
