...

  • SSL for access from applications (must)
  • Kerberos for access on behalf of people (must)
  • Hadoop delegation tokens to enable MapReduce, Samza, or other frameworks running in the Hadoop environment to access Kafka
  • LDAP username/password (nice-to-have)

We will use SASL for Kerberos and LDAP.

Open question: Does SASL actually transmit the bits required for authentication?

The LinkedIn security team (Arvind M) believes it does (at least for Kerberos), in which case there is no need to change the protocol. If it doesn't, we will need to add a new AuthRequest/AuthResponse API to our protocol. This will contain only a simple byte array carrying the auth bytes SASL needs.
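For illustration, here is a minimal broker-side sketch of that fallback using the JDK's javax.security.sasl. The AuthRequest/AuthResponse types and every other name in it are assumptions for the sketch, not settled protocol:

    import javax.security.auth.callback.CallbackHandler
    import javax.security.sasl.{Sasl, SaslServer}

    // Hypothetical wire types: each carries only the opaque SASL token bytes.
    case class AuthRequest(authBytes: Array[Byte])
    case class AuthResponse(challenge: Array[Byte], complete: Boolean)

    class SaslAuthenticator(callbackHandler: CallbackHandler) {
      // "GSSAPI" is the standard SASL mechanism name for Kerberos.
      private val server: SaslServer =
        Sasl.createSaslServer("GSSAPI", "kafka", "broker.example.com",
                              null, callbackHandler)

      // The SASL library computes the tokens but leaves transporting them to
      // the application protocol: evaluate each client token, return the next
      // challenge (if any), and repeat until the exchange completes.
      def handle(request: AuthRequest): AuthResponse = {
        val challenge = server.evaluateResponse(request.authBytes)
        AuthResponse(Option(challenge).getOrElse(Array.emptyByteArray),
                     server.isComplete)
      }

      // Once complete, this is the principal to associate with the connection.
      def authenticatedUser: String = server.getAuthorizationID
    }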

Generally you would expect authentication to happen at connection time, but I don't think there is really any reason we need to require this. I think instead we can allow it at any time during a session, or even multiple times during a session (if the client wishes to change their user a la su). However, authentication is fundamentally attached to the connection, so if the client reconnects they will lose it and need to re-authenticate.

All connections that have not yet been authenticated will be assigned a fake user ("nobody" or "josephk" or something). Note: admins should be able to disable fake users, since auditors hate those.

Regardless of the mechanism by which you connect and authenticate, the mechanism by which we check your permissions should be the same. The side effect of a successful connection via SSL with a client cert, or of a successful auth request, will be that we store the user information along with that connection. The user will be passed along with the request object to KafkaApis on each subsequent request.

Implementation notes

We want three separate ports: SSL, SASL, and plaintext. Admins should be able to disable any of these at the Kafka level.

For TLS we would need a separate, configurable TLS port on each broker. Presumably this port would need to be maintained in the cluster metadata so clients can choose to connect to the appropriate one. The plaintext port also needs to be configurable so brokers that do not want to expose an insecure port can avoid doing so.
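For illustration only, the broker configuration could take a shape like the following; the property names are hypothetical and not part of this proposal:

    # Hypothetical broker properties; names are illustrative only.
    # One port per security mechanism; omitting one disables that endpoint.
    plaintext.port=9092
    ssl.port=9093
    sasl.port=9094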

This feature requires some cooperation between the socket server and the API layer. The API layer will handle the auth request, but the username will be associated with the connection itself. One approach to implementing this would be to add the concept of a Session object that is maintained with the connection and contains the username. The session would be stored in the socket's context in the socket server and destroyed as part of socket close. The session would be passed down to the API layer with each request, and we would have something like session.authenticatedAs() to get the username to use for authorization purposes. We will also record in the session information about the security level of the connection (does it use encryption? integrity checks?) for use in authorization.

All future checks for authorization will just check this session information.
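A minimal Scala sketch of the idea; the Session fields and the stub permission check are illustrative, not a settled design:

    // Created by the socket server per connection, stored in the socket's
    // context, destroyed on close, and passed to KafkaApis with each request.
    case class Session(
        principal: String,        // authenticated user, or the fake user "nobody"
        usesEncryption: Boolean,  // security level of the connection,
        usesIntegrity: Boolean) { // recorded for use in authorization
      def authenticatedAs(): String = principal
    }

    object ApiLayerSketch {
      // Stub standing in for the real permission check (see Authorization below).
      def isAllowed(user: String, op: String, topic: String): Boolean =
        user != "nobody"

      // All authorization decisions consult only the session information.
      def handleFetchRequest(session: Session, topic: String): Unit = {
        if (!isAllowed(session.authenticatedAs(), "read", topic))
          throw new SecurityException(s"${session.authenticatedAs()} cannot read $topic")
        // ... serve the fetch ...
      }
    }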

Open question: Do we want to support all of the 0.8 producer/consumer APIs, or just the new producer/consumer?

 

Authorization

The plan will be to support unix-like permissions on a per-topic level.
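For illustration only, here is one way the unix-like model could look; the bit layout and all names are assumptions at this stage:

    // Octal mode bits as in unix: owner/group/other triples, read = 4, write = 2.
    case class TopicAcl(owner: String, group: String, mode: Int) {
      def allows(user: String, groups: Set[String], read: Boolean): Boolean = {
        val shift = if (user == owner) 6      // owner bits
                    else if (groups(group)) 3 // group bits
                    else 0                    // everyone else
        val bit = if (read) 4 else 2
        ((mode >> shift) & bit) != 0
      }
    }

    object AclExample extends App {
      // Mode 640: owner may read and write, group may read, others get nothing.
      val acl = TopicAcl("alice", "analytics", Integer.parseInt("640", 8))
      assert(acl.allows("alice", Set.empty, read = false))     // owner write: ok
      assert(acl.allows("bob", Set("analytics"), read = true)) // group read: ok
      assert(!acl.allows("eve", Set.empty, read = true))       // others: denied
    }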

...