...

  1. Serialized Request Data
  2. Initial Principal Name for audit logging and throttling purposes
  3. Initial Client Id for throttling purposes


Code Block: EnvelopeRequest.json
{
  "apiKey": N,
  "type": "request",
  "name": "EnvelopeRequest",
  "validVersions": "0",
  "flexibleVersions": "0+",
  "fields": [
    { "name": "RequestData", "type": "ApiMessage", "versions": "0+",
      "about": "The embedded request data."},
    { "name": "InitialPrincipalName", "type": "string", "versions": "0+", "ignorable": true,
      "about": "Optional value of the initial principal name when the request is redirected by a broker." },
    { "name": "InitialClientId", "type": "string", "versions": "0+", "ignorable": true,
      "about": "Optional value of the initial client id when the request is redirected by a broker." }
  ]
}

...

Step 3. Handle the returned EnvelopeResponse
           3.1 If the top-level error code is NOT_CONTROLLER, retry until timeout
        3.2

Security Access Changes

Broker Authorization Override During Forwarding

To support the authorization of RPCs during redirection, we would let CLUSTER_ACTION override the following operation principals:

...

Operation

...

Resource

...

API

...

ALTER

...

CreateAcls/DeleteAcls/AlterPartitionReassignments/UpdateFeatures

...

ALTER

...

ALTER_CONFIGS

...

Topic/Cluster

...

AlterConfig/IncrementalAlterConfig

...

CREATE

...

Topic

...

CreateTopics

...

token authentication

...

token

...

Create/Renew/DeleteToken

...

DELETE

...

If the error is CLUSTER_AUTHORIZATION_FAILURE, set the top-level or resource-level error code in the original RPC response.
           3.3 Merge with other unauthorized resources and return to the admin client
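The retry rule in Step 3 can be sketched as a small loop. This is only an illustration under stated assumptions: the class name, the `EnvelopeError` enum, the `send` callback, and the backoff parameter are all hypothetical and do not come from the Kafka code base.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of Step 3; these names are not from the Kafka code base. */
final class EnvelopeRetrySketch {

    /** The top-level outcomes this sketch distinguishes. */
    enum EnvelopeError { NONE, NOT_CONTROLLER, CLUSTER_AUTHORIZATION_FAILURE }

    /**
     * Calls send repeatedly while it reports NOT_CONTROLLER and the timeout
     * has not elapsed. Any other result is returned to the caller unchanged,
     * so that it can be mapped into the original RPC response.
     */
    static EnvelopeError forwardWithRetry(Supplier<EnvelopeError> send,
                                          long timeoutMs, long backoffMs) {
        long deadlineNanos = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            EnvelopeError error = send.get();
            // Step 3.1: only NOT_CONTROLLER is retried.
            if (error != EnvelopeError.NOT_CONTROLLER) {
                return error;
            }
            if (System.nanoTime() - deadlineNanos >= 0) {
                return error; // timed out without reaching the controller
            }
            try {
                Thread.sleep(backoffMs); // assumed fixed backoff between attempts
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return error;
            }
        }
    }
}
```

A caller would pass a callback that sends the envelope and extracts the top-level error, then translate a returned CLUSTER_AUTHORIZATION_FAILURE into the error fields of the original response.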


Routing in KIP-500 

In addition, to avoid exposing this forwarding power to admin clients, forwarded requests shall be sent to the controller's internal endpoint, which in the KIP-500 controller should be visible only to other brokers inside the cluster. Any admin configuration request carrying a broker principal should not go through the public endpoint and will be rejected for security purposes. For the pre-KIP-500 controller, we would allow a broker principal to go through only when the message arrives on the inter-broker listener, which indicates a forwarded request. A pre-KIP-500 cluster cannot fully prevent a malicious client from pretending to send a forwarded request, but the attacker must have super user access to gain CLUSTER_ACTION.

This ensures that the forwarding broker can use its own principal to authenticate and proceed with certain ZK mutation operations. To determine which requests are forwarded, the controller will differentiate requests arriving on the inter-broker listener from those arriving on the advertised listener. If a request arrives on the inter-broker listener, we treat it as a forwarded request and apply the authorization override.

Although some users may configure the same listener name for both client and inter-broker communication, which defeats this differentiation, the override approach still introduces no additional security exposure, since holding CLUSTER_ACTION implies that the principal is either a broker or a super user.
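The listener check and override described above might look roughly like the following. This is a minimal sketch, not the controller's actual logic: the class and method names are hypothetical, the permissions are reduced to booleans, and the fallback to the normal per-operation check for non-forwarded requests is an assumption.

```java
/** Hypothetical sketch of the override decision; not the controller's real code. */
final class ForwardingAuthSketch {

    /** A request is treated as forwarded when it arrives on the inter-broker listener. */
    static boolean isForwardedRequest(String requestListener, String interBrokerListener) {
        return requestListener.equals(interBrokerListener);
    }

    /**
     * Forwarded requests pass when the sender holds CLUSTER_ACTION (the override).
     * Everything else falls back to the ordinary check for the requested
     * operation, which is an assumption made for this sketch.
     */
    static boolean authorize(String requestListener, String interBrokerListener,
                             boolean hasClusterAction, boolean hasRequestedOperation) {
        if (isForwardedRequest(requestListener, interBrokerListener) && hasClusterAction) {
            return true;
        }
        return hasRequestedOperation;
    }
}
```

Note that when the client and inter-broker listeners share a name, every request looks forwarded, but the override still only applies to principals that already hold CLUSTER_ACTION.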

If authorization still fails on the controller side, it indicates an internal security setup error that should be addressed on the broker cluster, not the client. We shall propagate a new error code to the original client so that users know where to fix the problem:

Code Block: Errors.java
BROKER_AUTHORIZATION_FAILURE(92, "Authorization failed for the request during forwarding. This indicates an internal error on the broker cluster security setup.", BrokerAuthorizationFailureException::new);

Unfortunately, older admin clients cannot interpret this code, so they will present an UNKNOWN_SERVER_ERROR instead. This is less than ideal but still good enough to prompt users to check the broker-side log for the authorization failure. We intentionally avoid returning an AUTHORIZATION failure to old clients so that users don't waste time debugging client-side security setup.
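The fallback for older clients can be illustrated with a stripped-down error table. The enum below is a hypothetical stand-in for an old client's error list, not the real Errors class; it only shows why a code the client has never seen (92 in this proposal) surfaces as UNKNOWN_SERVER_ERROR.

```java
/** Hypothetical stand-in for the error table of a client that predates code 92. */
enum OldClientErrorSketch {
    UNKNOWN_SERVER_ERROR((short) -1),
    NONE((short) 0),
    CLUSTER_AUTHORIZATION_FAILED((short) 31);

    private final short code;

    OldClientErrorSketch(short code) {
        this.code = code;
    }

    short code() {
        return code;
    }

    /** Looks up a wire code, falling back to UNKNOWN_SERVER_ERROR when it is not known. */
    static OldClientErrorSketch forCode(short code) {
        for (OldClientErrorSketch error : values()) {
            if (error.code == code) {
                return error;
            }
        }
        return UNKNOWN_SERVER_ERROR;
    }
}
```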

New Tag for Principal Name And Client Id

We will also add new tagged fields to the request header to carry the original request's principal name and client id.

Code Block: RequestHeader.json
{
  "type": "header",
  "name": "RequestHeader",
  // Version 0 of the RequestHeader is only used by v0 of ControlledShutdownRequest.
  //
  // Version 1 is the first version with ClientId.
  //
  // Version 2 is the first flexible version.
  "validVersions": "0-2",
  "flexibleVersions": "2+",
  "fields": [
    { "name": "RequestApiKey", "type": "int16", "versions": "0+",
      "about": "The API key of this request." },
    { "name": "RequestApiVersion", "type": "int16", "versions": "0+",
      "about": "The API version of this request." },
    { "name": "CorrelationId", "type": "int32", "versions": "0+",
      "about": "The correlation ID of this request." },
    ...
    // ----- new optional field ----
    { "name": "InitialPrincipalName", "type": "string", "versions": "2+", "tag": 0, "taggedVersions": "2+", "ignorable": true,
      "about": "Optional value of the initial principal name when the request is redirected by a broker." },
    { "name": "InitialClientId", "type": "string", "versions": "2+", "tag": 1, "taggedVersions": "2+", "ignorable": true,
      "about": "Optional value of the initial client id when the request is redirected by a broker." }
    // ----- end new field ---------
  ]
}

For audit logging purposes, we add the initial principal name.

To meet the KIP-599 throttling requirement, both the initial principal name and the initial client id must be present.
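One way to picture how the two tagged fields feed throttling is a helper that picks the identity to charge quota against. This is a sketch under assumptions: the class names are hypothetical, and falling back to the connection's own identity when either field is absent is a choice made for illustration, not something the proposal specifies.

```java
import java.util.Optional;

/** Hypothetical sketch of selecting the identity used for throttling. */
final class QuotaIdentitySketch {

    /** Simple holder for a principal name and client id pair. */
    static final class Identity {
        final String principalName;
        final String clientId;

        Identity(String principalName, String clientId) {
            this.principalName = principalName;
            this.clientId = clientId;
        }
    }

    /**
     * Uses the initial principal name and client id only when both are present,
     * matching the requirement above. Otherwise the identity of the connection
     * itself is used, which is an assumption of this sketch.
     */
    static Identity effectiveIdentity(Identity connection,
                                      Optional<String> initialPrincipalName,
                                      Optional<String> initialClientId) {
        if (initialPrincipalName.isPresent() && initialClientId.isPresent()) {
            return new Identity(initialPrincipalName.get(), initialClientId.get());
        }
        return connection;
    }
}
```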

We also need to support optional fields for all the RPCs mentioned above. The following RPCs will bump their version by 1 to introduce flexible versions:

...

AlterConfig

...

Monitoring Metrics

To effectively monitor the admin request forwarding status, we would add the following metered metric:

...