...

The adoption of KIP-255: OAuth Authentication via SASL/OAUTHBEARER in release 2.0.0 creates the possibility of using information in the bearer token to make authorization decisions.  Unfortunately, Kafka connections are long-lived, so there is no ability to change the bearer token associated with a particular connection.  Allowing SASL connections to periodically re-authenticate would resolve this.

In addition to this motivation there are two others that are security-related.  First, to eliminate access to Kafka for connected clients, the current requirement is to remove all authorizations (i.e. remove all ACLs).  This is necessary because of the long-lived nature of the connections.  It is operationally simpler to shut off access at the point of authentication, and with the release of KIP-86: Configurable SASL Callback Handlers it is increasingly likely that installations will authenticate users against external directories (e.g. via LDAP).  The ability to stop Kafka access by simply disabling an account in an LDAP directory (for example) is desirable.

Second, the use of short-lived tokens is a common OAuth security recommendation, but issuing a short-lived token to a Kafka client (or a broker when OAUTHBEARER is the inter-broker protocol) currently has no benefit: once a client is connected to a broker the client is never challenged again, and the connection may remain intact beyond the token expiration time (indefinitely, under perfect circumstances).

This KIP proposes adding the ability for SASL clients (and brokers when a SASL mechanism is the inter-broker protocol) to re-authenticate their connections to brokers.  If OAUTHBEARER is the SASL mechanism then a new bearer token will appear on the session, replacing the old one.  This KIP also proposes to add the ability for brokers to close connections that continue to use expired sessions.

...

This KIP proposes the addition of a configuration option to enable the server-side expired-connection-kill feature (the option's default leaves the feature off, so there is no change to existing behavior in the absence of an explicit opt-in).  This KIP also proposes bumping the version number for the SASL_AUTHENTICATE API to 1 (with a change in wire format) so that servers can indicate the session expiration time to clients via an additional value on the last round-trip response.  Clients can use the maximum SASL_AUTHENTICATE version number supported by the server to determine whether they are connected to a broker that supports re-authentication (true if version > 0).  This KIP also adds new metrics as described below.

The configuration option this KIP proposes to add to enable server-side expired-connection-kill is 'connections.max.reauth.ms' – it must be prefixed with the listener prefix and the SASL mechanism name in lower-case.  For example, "sasl_ssl.oauthbearer.connections.max.reauth.ms=3600000".  The value represents the maximum value that could potentially be communicated as part of the new V1 SaslAuthenticateResponse.  The default value is 0, which means there is effectively no maximum communicated (0 will be sent, meaning "none"), server-side kill is disabled, clients are not required to re-authenticate, and whether clients re-authenticate, and at what interval, is entirely up to them.  Existing SASL clients upgraded to v2.1.0 will be coded to not re-authenticate in this scenario.
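
As an illustration of the naming rule above, the following Python sketch assembles the full property name from its parts.  The helper `reauth_config_key` is hypothetical (it is not part of Kafka), and it assumes the caller supplies the listener prefix in the same form as in the example above.

```python
def reauth_config_key(listener_prefix: str, mechanism: str) -> str:
    """Build the prefixed option name described in the text.

    Hypothetical helper for illustration only; the listener prefix is
    used as given, and the SASL mechanism name is lower-cased.
    """
    return f"{listener_prefix}.{mechanism.lower()}.connections.max.reauth.ms"


# Reproduces the example from the text (1 hour expressed in milliseconds).
print(reauth_config_key("sasl_ssl", "OAUTHBEARER") + "=3600000")
```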

When explicitly set to a positive number the server will disconnect any SASL connection that does not re-authenticate and subsequently uses the connection for any purpose other than re-authentication at any point beyond the communicated expiration point (which will not exceed the configured maximum value).  For example, if the configured value is 3600000 (1 hour) and the remaining token lifetime at the time of authentication is 45 minutes, then 45 minutes is communicated back to the client and the server would kill the connection if it is not re-authenticated within 45 minutes and it is then actively used for anything other than re-authentication.  As a further example, if the configured value is 3600000 and the mechanism is not OAUTHBEARER (e.g. it is PLAIN, SCRAM-related, or GSSAPI) and the credential lifetime is either unspecified or greater than 1 hour (perhaps due to a delegation token) then 1 hour would be communicated to the client and the server would kill the connection if it is not re-authenticated within 1 hour and it is then actively used for anything other than re-authentication.
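
The two examples above amount to communicating the smaller of the configured maximum and the remaining credential lifetime.  A minimal Python sketch of that arithmetic follows; the function `session_lifetime_ms` is a hypothetical name used only for illustration, and an unspecified credential lifetime is represented as `None` by assumption.

```python
from typing import Optional


def session_lifetime_ms(max_reauth_ms: int,
                        credential_remaining_ms: Optional[int]) -> int:
    """Return the session lifetime (ms) a broker would communicate.

    Hypothetical helper for illustration only.  A result of 0 means "none".
    """
    if max_reauth_ms <= 0:
        # Option left at its default: 0 is sent and nothing is enforced.
        return 0
    if credential_remaining_ms is None:
        # Credential lifetime unspecified: the configured maximum applies.
        return max_reauth_ms
    # Never communicate more than the configured maximum.
    return min(max_reauth_ms, credential_remaining_ms)


ONE_HOUR_MS = 60 * 60 * 1000
FORTY_FIVE_MIN_MS = 45 * 60 * 1000

print(session_lifetime_ms(ONE_HOUR_MS, FORTY_FIVE_MIN_MS))  # 2700000 (45 minutes)
print(session_lifetime_ms(ONE_HOUR_MS, None))               # 3600000 (1 hour)
print(session_lifetime_ms(0, FORTY_FIVE_MIN_MS))            # 0 ("none")
```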

...

From a behavior perspective on the client side (again, including the broker when it is acting as an inter-broker client), when a v2.1.0-or-later SASL client connects to a broker that supports re-authentication, the broker will communicate the session expiration time as part of the final SASL_AUTHENTICATE response.  If this value is positive, then the client will automatically re-authenticate before anything else unrelated to re-authentication is sent beyond that expiration point.  If the re-authentication attempt fails then the connection will be closed by the broker; retries are not supported.  If re-authentication succeeds then any requests that queued up during re-authentication will subsequently be able to flow through, and eventually the connection will re-authenticate again, etc.
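
The client-side decision described above can be sketched in Python as follows.  This is an illustration under stated assumptions, not the actual client implementation: the class `ClientSession` and its field names are hypothetical, and the sketch uses the expiration point itself as the threshold, whereas a real client may choose to re-authenticate somewhat earlier.

```python
from dataclasses import dataclass


@dataclass
class ClientSession:
    """Hypothetical client-side view of one SASL connection."""
    max_sasl_authenticate_version: int  # highest version the broker supports
    session_lifetime_ms: int            # from the final response; 0 means "none"
    authenticated_at_ms: int            # client clock when authentication finished

    def broker_supports_reauth(self) -> bool:
        # Re-authentication is supported when the broker speaks version > 0.
        return self.max_sasl_authenticate_version > 0

    def must_reauthenticate(self, now_ms: int) -> bool:
        """True if re-authentication must precede any other request."""
        if not self.broker_supports_reauth() or self.session_lifetime_ms <= 0:
            return False
        expires_at_ms = self.authenticated_at_ms + self.session_lifetime_ms
        return now_ms >= expires_at_ms


session = ClientSession(max_sasl_authenticate_version=1,
                        session_lifetime_ms=45 * 60 * 1000,
                        authenticated_at_ms=0)
print(session.must_reauthenticate(now_ms=10 * 60 * 1000))  # False
print(session.must_reauthenticate(now_ms=45 * 60 * 1000))  # True
```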

From a behavior perspective on the server (broker) side, when the broker-side expired-connection-kill feature is enabled with a positive value for a particular SASL mechanism the broker will communicate a session time via SASL_AUTHENTICATE and will close a connection when the connection is used past the expiration time and the specific API request is not one of those directly related to re-authentication (ApiVersionsRequest, SaslHandshakeRequest, and SaslAuthenticateRequest).  In other words, if a connection sits idle, it will not be closed – something unrelated to re-authentication must traverse the connection before a disconnect will occur.
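
The broker-side rule above can be expressed as a small predicate evaluated whenever a request arrives.  The Python sketch below is illustrative only: the function name `should_close_connection` and the use of `None` for "feature disabled" are assumptions, and the request names are taken from the text.

```python
from typing import Optional

# Requests that are part of re-authentication and therefore always allowed.
REAUTH_RELATED_REQUESTS = frozenset({
    "ApiVersionsRequest",
    "SaslHandshakeRequest",
    "SaslAuthenticateRequest",
})


def should_close_connection(now_ms: int,
                            session_expires_at_ms: Optional[int],
                            request_name: str) -> bool:
    """Decide, on arrival of a request, whether to kill the connection.

    Hypothetical helper; session_expires_at_ms is None when the
    expired-connection-kill feature is disabled for the mechanism.
    """
    if session_expires_at_ms is None:
        return False
    if request_name in REAUTH_RELATED_REQUESTS:
        return False
    return now_ms > session_expires_at_ms


# An expired session may still re-authenticate, but nothing else.
print(should_close_connection(5000, 4000, "SaslHandshakeRequest"))  # False
print(should_close_connection(5000, 4000, "ProduceRequest"))        # True
```

Because the predicate runs only when a request arrives, an idle connection is never closed by it, which matches the behavior described above.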

...

The description of this KIP is actually quite straightforward from a behavior perspective – turn the feature on with the configuration option and it just works.  From an implementation perspective, though, the KIP is not so straightforward; a description of how it works therefore follows below.  Note that this description applies to the implementation only – none of this is part of the public API.

...

With respect to compatibility, there is no impact to existing installations because the default is for the server-side connection kill feature to be turned off, older clients never try to re-authenticate because they don't support it, and newer clients that connect to older brokers will know that the broker does not support re-authentication and will therefore not attempt it.

...

  1. Upgrade all brokers to v2.1.0 or later at whatever rate is desired with 'connections.max.reauth.ms' allowed to default to 0.  If SASL is used for the inter-broker protocol then brokers will check the SASL_AUTHENTICATE API version and use a V1 request when communicating to a broker that has been upgraded to 2.1.0, but the client will see the "0" session max lifetime and will not re-authenticate.  Their connections will not be killed.
  2. In parallel with (1) above, upgrade non-broker clients to v2.1.0 or later at whatever rate is desired.  SASL clients will check the SASL_AUTHENTICATE API version and use a V1 request when communicating to a broker that has been upgraded to 2.1.0, but the client will see the "0" session max lifetime and will not re-authenticate.  Their connections will not be killed.
  3. After (1) and (2) are complete, perform a rolling restart of all brokers and check the broker metrics failed-v0-authentication-{rate,total} and successful-v0-authentication-{rate,total} to confirm that they remain at zero.  This gives confidence that (1) and (2) are indeed complete.
  4. Update '[listener].[mechanism].connections.max.reauth.ms' to a positive value and perform a rolling restart of brokers again. 
  5. Monitor the failed-v0-authentication-{rate,total} and successful-v0-authentication-{rate,total} metrics – they will remain at 0 unless an older client connects to the broker.
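
The metric checks in steps (3) and (5) can be automated along the following lines.  This Python sketch assumes the metric values have already been collected into a dictionary by some external means (how they are gathered is out of scope here); the function name `nonzero_v0_metrics` is hypothetical, and the metric names are expanded from the `{rate,total}` shorthand used above.

```python
from typing import Dict, List

# The four metric names implied by the {rate,total} shorthand in the steps.
V0_METRICS = tuple(
    f"{outcome}-v0-authentication-{kind}"
    for outcome in ("failed", "successful")
    for kind in ("rate", "total")
)


def nonzero_v0_metrics(metrics: Dict[str, float]) -> List[str]:
    """Return the V0 authentication metrics that are not zero.

    Hypothetical helper; a missing metric is treated as zero.  An empty
    result suggests no older (V0) client is authenticating.
    """
    return [name for name in V0_METRICS if metrics.get(name, 0) != 0]


sample = {
    "failed-v0-authentication-rate": 0.0,
    "failed-v0-authentication-total": 0.0,
    "successful-v0-authentication-rate": 0.0,
    "successful-v0-authentication-total": 3.0,
}
print(nonzero_v0_metrics(sample))  # ['successful-v0-authentication-total']
```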

...