Comment: refinement

...

  • AlterPartitionReassignment
  • CreatePartition
  • CreateTopics
  • DeleteTopics
  • UpdateFeatures (ongoing with KIP-584)
  • Scram

This is because we want to isolate the controller from admin clients in the post-KIP-500 world. Old admin clients that still try to send requests directly to the controller will be given a random broker id and will rely on the receiving broker to forward the original requests.

...

Note that direct ZK access currently bypasses the CreateTopicPolicy. This is a hole in the topic creation logic that we should fix. From now on, if a MetadataRequest tries to create an internal topic and fails, the receiving broker will reply with a fatal error so that the client fails fast and surfaces the message to the user.

Routing Request Security

For ZK mutation requests that need redirection, the forwarding broker will use its own authorizer to verify the principals. If the request is authorized, the broker forwards it with its own credentials, so that the controller broker only validates the broker principal in the forwarded request. The one exception is the controller audit log, which needs the principal name of the original request, so we will add an optional tag called "InitialPrincipalName" to the header when sending the proxy request.

To illustrate how the security check works, take AlterConfig as an example. The intended workflow for a KIP-590 broker would be:

Step 1. Filter out resources that are authorized
        1.1 Verify with the traditional principal first. If authorized, continue
        1.2 If not authorized, check whether the request is from the control plane. Note that this is only a best-effort check of whether the request is internal.
        1.3 If the request is not from the control plane, return an authorization failure
        1.4 If the request is from the control plane, use CLUSTER_ACTION to verify and determine the result

Step 2. Check the request context to see if this is a forwarding request, by checking whether it is from the control plane and uses the extra header fields
        2.1 If the resource is authorized and this is the active controller, process it
        2.2 If the resource is authorized but this is not the active controller, return NOT_CONTROLLER to the sender (the forwarding broker) for retry
        2.3 If the resource is not authorized, return CLUSTER_AUTHORIZATION_FAILED to propagate back to the original client through the forwarding broker

Step 3. If the request is not a forwarding request
        3.1 If the resource is authorized and this is the active controller, process it
        3.2 If the resource is authorized but this is not the active controller, add the resource to a new AlterConfig request to be forwarded
        3.3 If the resource is not authorized, reply to the original client with AUTHORIZATION_FAILURE once the forwarding request has returned
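
The three steps above can be read as a small decision function per resource. The following is a minimal sketch under stated assumptions, not broker code: the class `AlterConfigRouting`, the `Outcome` enum, and the boolean inputs are hypothetical names introduced for illustration, and the two authorization failure codes from 2.3 and 3.3 are collapsed into a single outcome.

```java
/** Hypothetical sketch of the per-resource routing decision in Steps 1-3. */
final class AlterConfigRouting {

    /** What the broker does with one resource of the request (names are illustrative). */
    enum Outcome { PROCESS, FORWARD_TO_CONTROLLER, NOT_CONTROLLER, AUTHORIZATION_FAILURE }

    /** Step 1: try the request principal first, then fall back to CLUSTER_ACTION for control-plane requests. */
    static boolean isAuthorized(boolean principalAuthorized,
                                boolean fromControlPlane,
                                boolean hasClusterAction) {
        if (principalAuthorized) {
            return true;              // 1.1
        }
        if (!fromControlPlane) {
            return false;             // 1.3
        }
        return hasClusterAction;      // 1.4
    }

    /** Steps 2 and 3: route on authorization, forwarding status, and controllership. */
    static Outcome route(boolean authorized, boolean isForwardedRequest, boolean isActiveController) {
        if (!authorized) {
            return Outcome.AUTHORIZATION_FAILURE;   // 2.3 and 3.3
        }
        if (isActiveController) {
            return Outcome.PROCESS;                 // 2.1 and 3.1
        }
        return isForwardedRequest
            ? Outcome.NOT_CONTROLLER                // 2.2: the forwarding broker retries
            : Outcome.FORWARD_TO_CONTROLLER;        // 3.2: queue for a new AlterConfig request
    }

    public static void main(String[] args) {
        // A client request that is authorized, received by a non-controller broker.
        boolean authorized = isAuthorized(true, false, false);
        System.out.println(route(authorized, false, false));
    }
}
```

Keeping the authorization check separate from the routing step mirrors the text: Step 1 is the same for forwarded and direct requests, and only the handling of a non-controller broker differs.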

In addition, to avoid exposing this forwarding power to admin clients, the routing request shall be forwarded to the controller broker's internal endpoint, which in the KIP-500 controller should be visible only to other brokers inside the cluster. Any admin configuration request with a broker principal should not go through the public endpoint and will be rejected for security purposes. For the pre-KIP-500 controller, we would allow a broker principal to go through only when the message comes in on the inter-broker listener, which indicates a forwarding request. A pre-KIP-500 cluster cannot fully prevent a malicious client from pretending to send a forwarding request, but the attacker must have super user access to gain CLUSTER_ACTION.

Public Interfaces

Deprecate Client Side Controller Access

We shall remove "ControllerNodeProvider" from the admin client, so that new clients no longer have direct access to the controller. All admin requests would use the same "LeastLoadedNodeProvider" to get a random node to talk to. The active controller is thus properly isolated from the outside world, in line with KIP-631.

Protocol Bump

We also need to bump the Metadata RPC to v10 to propagate internal topic creation policy violation. Specifically:

1. For newer clients, return POLICY_VIOLATION when the topic creation policy is violated. At the application level, we should replace the error message with the actual failure reason, such as "violation of topic creation policy when attempting to auto create internal topic through MetadataRequest."

2. For older clients, return AUTHORIZATION_FAILED to fail the client quickly as well. This is not a perfect solution, as we have no notification path for older clients, but at least the system admin can check the broker log when hitting this issue.

To protect controller information more strictly, the "ControllerId" field in the new MetadataResponse shall be set to a random broker for v0-v9 requests, and is deprecated in v10. Note that only existing clients use the Metadata RPC to get controller info, so it should be safe to deprecate, and we would explicitly mention that in the NetworkClient meta comments.
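
The version-dependent behavior of the Metadata RPC can be summarized in code. This is a minimal sketch, assuming hypothetical names (`MetadataCompat`, `policyViolationError`, `controllerIdFor`); error codes are modeled as strings, and the `-1` returned for v10 and above is a placeholder chosen here for the deprecated field, not something the text specifies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of the Metadata v10 compatibility rules described above. */
final class MetadataCompat {

    /** First Metadata version that can carry the policy violation error. */
    static final short FIRST_POLICY_AWARE_VERSION = 10;

    /** Error returned when auto-creating an internal topic violates the creation policy. */
    static String policyViolationError(short requestVersion) {
        return requestVersion >= FIRST_POLICY_AWARE_VERSION
            ? "POLICY_VIOLATION"        // newer clients see the real cause
            : "AUTHORIZATION_FAILED";   // older clients just fail fast
    }

    /**
     * Value placed in the ControllerId field. For v0-v9 a random live broker is
     * reported instead of the real controller. For v10+ the field is deprecated;
     * -1 is an assumed placeholder.
     */
    static int controllerIdFor(short requestVersion, List<Integer> liveBrokerIds, Random random) {
        if (requestVersion >= FIRST_POLICY_AWARE_VERSION || liveBrokerIds.isEmpty()) {
            return -1;
        }
        return liveBrokerIds.get(random.nextInt(liveBrokerIds.size()));
    }

    public static void main(String[] args) {
        List<Integer> brokers = Arrays.asList(1, 2, 3);
        System.out.println(policyViolationError((short) 9));
        System.out.println(controllerIdFor((short) 9, brokers, new Random()));
    }
}
```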

New Envelope RPC

We are also going to add a new RPC type to wrap the original request during forwarding. We will make corresponding changes to the `ApiMessageTypeGenerator` class to recognize the new `Header` and `ApiMessage` fields during auto generation. For authentication and audit logging purposes, we propose to add the following fields:

  1. Serialized Principal information
  2. Client host IP address
  3. Listener name
  4. Security protocol being used

...

New Envelope RPC

We are also going to add a new RPC type to wrap the original request during forwarding. We will make corresponding changes to the `ApiMessageTypeGenerator` class to recognize the new `Header` and `ApiMessage` fields during auto generation. For authentication and audit logging purposes, we propose to add the following fields:

  1. Serialized Request Data
  2. Initial Principal Name for audit logging purpose
  3. Initial Client Id for throttling purpose


EnvelopeRequest.json
{
  "apiKey": N,
  "type": "request",
  "name": "EnvelopeRequest",
  "validVersions": "0",
  "flexibleVersions": "0+",
  "fields": [
    { "name": "RequestData", "type": "ApiMessage", "versions": "0+",
      "about": "The embedded request data."},
    { "name": "InitialPrincipalName", "type": "string", "ignorable": true,
      "about": "Optional value of the initial principal name when the request is redirected by a broker." },
    { "name": "InitialClientId", "type": "string", "ignorable": true,
      "about": "Optional value of the initial client id when the request is redirected by a broker." },
  ]
}

EnvelopeRequest Handling

When receiving an EnvelopeRequest, the broker shall authorize the request with the forwarding broker's principal. If the outer request is verified, the broker unwraps the inner request and handles it as normal, which means it continues to perform authorization for the inner-layer principal. Within the KIP-590 scope, the possible top-level error codes are:

  • NOT_CONTROLLER as we are only forwarding admin write requests.
  • CLUSTER_AUTHORIZATION_FAILED if the inter-broker verification failed.

The CLUSTER authorization for EnvelopeRequest takes place during request handling, similar to LeaderAndIsrRequest. This ensures the EnvelopeRequest is not sent from a malicious client pretending to be a fellow broker. Any inner request error is still embedded inside the `ResponseData` struct defined in EnvelopeResponse below.
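
The two-layer check on the receiving side can be sketched as follows. This is an illustration under assumptions rather than the broker implementation: `EnvelopeReceiver`, `EnvelopeResult`, and the `innerHandler` callback are hypothetical names, the request and response payloads are modeled as plain strings, and the order of the two outer checks (authorization before controllership) is one plausible reading of the text.

```java
import java.util.function.Function;

/** Hypothetical sketch of how a broker could handle an incoming EnvelopeRequest. */
final class EnvelopeReceiver {

    /** Top-level error codes listed for the KIP-590 scope, plus NONE for success. */
    enum TopLevelError { NONE, NOT_CONTROLLER, CLUSTER_AUTHORIZATION_FAILED }

    /** Simplified stand-in for EnvelopeResponse: a top-level error plus the embedded response. */
    static final class EnvelopeResult {
        final TopLevelError error;
        final String responseData; // null when the outer check fails

        EnvelopeResult(TopLevelError error, String responseData) {
            this.error = error;
            this.responseData = responseData;
        }
    }

    static EnvelopeResult handle(String requestData,
                                 boolean senderHasClusterAction,
                                 boolean isActiveController,
                                 Function<String, String> innerHandler) {
        // Outer layer: the sender must be a fellow broker.
        if (!senderHasClusterAction) {
            return new EnvelopeResult(TopLevelError.CLUSTER_AUTHORIZATION_FAILED, null);
        }
        if (!isActiveController) {
            return new EnvelopeResult(TopLevelError.NOT_CONTROLLER, null);
        }
        // Inner layer: handled as a normal request; any inner error stays inside the embedded response.
        return new EnvelopeResult(TopLevelError.NONE, innerHandler.apply(requestData));
    }

    public static void main(String[] args) {
        EnvelopeResult result = handle("AlterConfig", true, true, inner -> inner + ":handled");
        System.out.println(result.error + " " + result.responseData);
    }
}
```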

EnvelopeResponse.json
{
  // Possible top level error code:
  //
  // NOT_CONTROLLER
  // CLUSTER_AUTHORIZATION_FAILED
  //
  "apiKey": N,
  "type": "response",
  "name": "EnvelopeResponse",
  "validVersions": "0",
  "flexibleVersions": "0+",
  "fields": [
    { "name": "ResponseData", "type": "ApiMessage", "versions": "0+",
      "about": "The embedded response data."},
    { "name": "ErrorCode", "type": "int16", "versions": "0+",
      "about": "The error code, or 0 if there was no error." },
  ]
}

EnvelopeResponse Handling

When the response contains the NOT_CONTROLLER error code, the forwarding broker will keep looking for the correct controller until the request eventually times out. CLUSTER_AUTHORIZATION_FAILED indicates an internal error in the broker security setup that has nothing to do with the client, so we have no option but to return UNKNOWN_SERVER_ERROR to the admin client.

The forwarding broker does not inspect whatever result the controller returns for the inner request. As long as the top level has no error, the forwarding broker treats the request as successful and returns the inner response to the admin client, which handles any remaining errors.
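
The response handling described above amounts to a retry loop on the forwarding broker. The sketch below is illustrative only: `EnvelopeForwarder`, `Envelope`, and `sendToController` are hypothetical names, the timeout is modeled as a fixed attempt budget, and the `REQUEST_TIMED_OUT` result for an exhausted budget is an assumption, since the text does not say which error the client sees on timeout.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of EnvelopeResponse handling on the forwarding broker. */
final class EnvelopeForwarder {

    enum TopLevelError { NONE, NOT_CONTROLLER, CLUSTER_AUTHORIZATION_FAILED }

    /** Simplified stand-in for EnvelopeResponse with a string payload. */
    static final class Envelope {
        final TopLevelError error;
        final String inner;

        Envelope(TopLevelError error, String inner) {
            this.error = error;
            this.inner = inner;
        }
    }

    /** Returns what the admin client receives for one forwarded request. */
    static String forward(Supplier<Envelope> sendToController, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            Envelope response = sendToController.get();
            switch (response.error) {
                case NONE:
                    // The inner response is passed through without inspection.
                    return response.inner;
                case CLUSTER_AUTHORIZATION_FAILED:
                    // Broker-side misconfiguration, unrelated to the client.
                    return "UNKNOWN_SERVER_ERROR";
                case NOT_CONTROLLER:
                    // Fall out of the switch and retry against the next controller candidate.
                    break;
            }
        }
        return "REQUEST_TIMED_OUT"; // assumed outcome once the retry budget is exhausted
    }

    public static void main(String[] args) {
        Supplier<Envelope> alwaysOk = () -> new Envelope(TopLevelError.NONE, "inner-response");
        System.out.println(forward(alwaysOk, 3));
    }
}
```

Note that an inner error, such as a per-resource authorization failure, would travel inside `inner` and reach the admin client unchanged.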

Routing Request Security

For ZK mutation requests that need redirection, the forwarding broker will use its own authorizer to verify the principals. If the request is authorized, the broker forwards it as an Envelope with its own credentials, so that the controller broker only validates the broker principal in the forwarded request. The one exception is the controller audit log, which needs the principal name of the original request, so we will add an optional field called "InitialPrincipalName" as stated in the Envelope template.

To illustrate how the security check works, take AlterConfig as an example. The intended workflow for a KIP-590 broker would be:

Step 1. Filter out resources that are authorized
        1.1 Verify with the traditional principal first
        1.2 If the resource is authorized and this is the active controller, process it
        1.3 Otherwise, package the authorized resources and send them to the active controller as an Envelope

Step 2. Check the Envelope request to see if this is a forwarding request, by checking whether it sets the initial principal fields and comes from a privileged listener
        2.1 Use CLUSTER_ACTION to verify; if the resource is not authorized, return CLUSTER_AUTHORIZATION_FAILED to propagate back to the original client through the forwarding broker
        2.2 If the resource is authorized but this is not the active controller, return NOT_CONTROLLER to the sender (the forwarding broker) for retry
        2.3 Otherwise, process the resource

Step 3. Handle the returned EnvelopeResponse
        3.1 If the top level error code is NOT_CONTROLLER, retry until timeout

3.2



In addition, to avoid exposing this forwarding power to admin clients, the routing request shall be forwarded to the controller broker's internal endpoint, which in the KIP-500 controller should be visible only to other brokers inside the cluster. Any admin configuration request with a broker principal should not go through the public endpoint and will be rejected for security purposes. For the pre-KIP-500 controller, we would allow a broker principal to go through only when the message comes in on the inter-broker listener, which indicates a forwarding request. A pre-KIP-500 cluster cannot fully prevent a malicious client from pretending to send a forwarding request, but the attacker must have super user access to gain CLUSTER_ACTION.

EnvelopeResponse.json

...

{
  // Possible top level error code:
  //
  // NOT_CONTROLLER
  // CLUSTER_AUTHORIZATION_FAILED
  //
  "apiKey": N,
  "type": "response",
  "name": "EnvelopeResponse",
  "validVersions": "0",
  "flexibleVersions": "0+",
  "fields": [
    { "name": "ResponseHeader", "type": "Header", "versions": "0+",
      "about": "The embedded response header." },
    { "name": "ResponseData", "type": "ApiMessage", "versions": "0+",
      "about": "The embedded response data."},
    { "name": "ErrorCode", "type": "int16", "versions": "0+",
      "about": "The error code, or 0 if there was no error." },
  ]
}

...

Security Access Changes

Broker Authorization Override During Forwarding

...

RequestHeader.json
{
  "type": "header",
  "name": "RequestHeader",
  // Version 0 of the RequestHeader is only used by v0 of ControlledShutdownRequest.
  //
  // Version 1 is the first version with ClientId.
  //
  // Version 2 is the first flexible version.
  "validVersions": "0-2",
  "flexibleVersions": "2+",
  "fields": [
    { "name": "RequestApiKey", "type": "int16", "versions": "0+",
      "about": "The API key of this request." },
    { "name": "RequestApiVersion", "type": "int16", "versions": "0+",
      "about": "The API version of this request." },
    { "name": "CorrelationId", "type": "int32", "versions": "0+",
      "about": "The correlation ID of this request." },
    ...
    // ----- new optional field ----
    { "name": "InitialPrincipalName", "type": "string", "tag": 0, "taggedVersions": "2+", "ignorable": true,
      "about": "Optional value of the initial principal name when the request is redirected by a broker." },
    ...
    // ----- new optional field ----
    { "name": "InitialClientId", "type": "string", "tag": 0, "taggedVersions": "2+", "ignorable": true,
      "about": "Optional value of the initial client id when the request is redirected by a broker." },
    // ----- end new field ---------
  ]
}

For audit logging purposes, we add the initial principal name.

To meet the KIP-599 throttling requirement, both the initial principal name and the initial client id must be present.

We also need to support optional fields for all the RPCs mentioned above. The following RPCs will bump their version by 1 to introduce flexible versions:

  • AlterConfig

  • AlterClientQuotas

Monitoring Metrics

To effectively monitor the admin request forwarding status, we would add the following metered metric:

MBean:kafka.server:type=RequestMetrics,name=NumRequestsForwardingToControllerPerSec,clientId=([-.\w]+)

to visualize how many RPCs are in flight from each admin client. It will be added via Yammer metrics.

Compatibility, Deprecation, and Migration Plan

The upgrade path shall be guarded by inter.broker.protocol.version (IBP) to make sure the routing behavior is consistent. After the first rolling bounce, which upgrades the binary version, all brokers still handle ZK mutation requests by themselves. After the second rolling bounce, which bumps the IBP, all upgraded brokers use the new routing algorithm described in this KIP.

For older admin clients that still try to send the request to the controller, the receiving broker will redirect the request to the active controller as stated in this KIP. The handling should be exactly the same whether the ZK mutation request comes from an old or a new admin client.

Rejected Alternatives

  • We discussed the possibility of immediately building a metadata topic to propagate the changes. This seems aligned with the eventual metadata quorum path, but it would block the current API migration towards the bridge release, since the metadata quorum design is much more complicated and requires more iterations. To avoid this extra dependency on other tracks, we should go ahead and migrate the existing protocols to meet the bridge release goal sooner.
  • We thought about adding an alerting metric called request-forwarding-to-controller-authorization-fail-count to help administrators detect a wrong security setup sooner. However, there should already be metrics monitoring request failures, so this metric could be optional.

  • We thought about monitoring older client connections in the long term after the bridge release, when we make some incompatible changes to the Raft Quorum, to better capture the timing for a major version bump. However, KIP-511 has already exposed metrics such as an "unknown" software name and an "unknown" software version, which could serve this purpose.

  • We discussed adding a new RPC type called Envelope to wrap the original request during forwarding. Although the Envelope API provides certain privileges like data embedding and principal embedding, it creates a security hole by letting a malicious user impersonate any resending broker. Passing the principal around also increases the vulnerability compared with standard alternatives such as passing a verified token, which is unfortunately not fully supported by Kafka security. Because of these security concerns, we are abandoning the Envelope approach and falling back to forwarding the raw admin requests.

  • We discussed maintaining client access to the controller, which conflicts with KIP-631, so we decided to go the extra step of giving the existing ZK mutation RPCs forwarding power as well.
  • We could embed the original request version as a tag in the forwarded request and always forward with the minimum version that supports flexible fields when the original request version is older than the minimum version that supports initial principals. When handling the request, the controller would use the original version for deserialization when it is defined. This proposal was rejected because handling RPC upgrades and downgrades during the transition is error-prone: the pattern requires all parties to parse the request and package the response correctly.

...