Status

Current state: Under Discussion

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

The ReassignPartitionsCommand (which is used by the kafka-reassign-partitions.sh tool) talks directly to ZooKeeper. This prevents the tool from being used in deployments where only the brokers are exposed to clients (i.e. where the ZooKeeper servers are intentionally not exposed).

In addition, there is a general push to refactor/rewrite/replace tools which need ZooKeeper access with equivalents which use the AdminClient API.

Thus it is necessary to change the ReassignPartitionsCommand so that it no longer talks to ZooKeeper directly and instead goes through an intermediating broker.

Public Interfaces

A new network protocol API will be added:

  • AssignPartitionsRequest and AssignPartitionsResponse

The AdminClient API will have a new method added (plus overloads for options):

  • assignPartitions()

The options accepted by the kafka-reassign-partitions.sh command will change:

  • --zookeeper will be deprecated, with a warning message
  • a new --bootstrap-server option will be added

Proposed Changes

kafka-reassign-partitions.sh and ReassignPartitionsCommand

The --zookeeper option will be retained and will:

  1. Cause a deprecation warning to be printed to standard error. The message will say that the --zookeeper option will be removed in a future version and that --bootstrap-server is the replacement option.
  2. Perform the reassignment via ZooKeeper, as currently.

A new --bootstrap-server option will be added and will:

  1. Perform the reassignment via the given intermediating broker.

Using both --zookeeper and --bootstrap-server in the same command will produce an error message and the tool will exit without doing the intended operation.
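
The option rules above can be sketched roughly as follows. This is only an illustration: the class `ReassignOptionsCheck`, its `Mode` enum, and the message wording are hypothetical and not part of the proposal, and the behaviour when neither option is given is an assumption, since the text above does not specify it. The argument scan is also simplified (it does not handle the `--option=value` form).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Kafka: illustrates the option rules described above.
class ReassignOptionsCheck {
    enum Mode { ZOOKEEPER, BROKER }

    static Mode resolveMode(List<String> args) {
        boolean zk = args.contains("--zookeeper");
        boolean bootstrap = args.contains("--bootstrap-server");
        if (zk && bootstrap) {
            // Both options together: report an error and do not perform the operation.
            throw new IllegalArgumentException(
                "Only one of --zookeeper and --bootstrap-server may be given");
        }
        if (zk) {
            // Deprecation warning goes to standard error, then the legacy path is used.
            System.err.println("Warning: --zookeeper is deprecated and will be removed "
                + "in a future version; use --bootstrap-server instead.");
            return Mode.ZOOKEEPER;
        }
        if (bootstrap) {
            return Mode.BROKER;
        }
        // Assumption: one of the two options is required.
        throw new IllegalArgumentException(
            "One of --zookeeper or --bootstrap-server is required");
    }

    public static void main(String[] args) {
        System.out.println(resolveMode(Arrays.asList("--bootstrap-server", "localhost:9092")));
    }
}
```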

It is anticipated that a future version of Kafka would remove support for the --zookeeper option.

Internally, the ReassignPartitionsCommand will be refactored to support the above changes to the options. An interface will abstract the commands currently issued directly to ZooKeeper.

There will be an implementation which makes the current calls to ZooKeeper, and another implementation which uses the AdminClient API described below.
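
One possible shape for this refactoring is sketched below. All names here (`ReassignmentBackend`, `ZooKeeperBackend`, `AdminClientBackend`, `ReassignCommandSketch`) are hypothetical, partitions are keyed by plain strings to keep the example self-contained, and the two implementations only print what they would do rather than calling ZooKeeper or the AdminClient.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical abstraction over "how a reassignment is submitted".
interface ReassignmentBackend {
    /** Keys are "topic-partition" strings; values are the target broker ids. */
    void reassign(Map<String, List<Integer>> assignments);
}

// Stand-in for the existing behaviour (direct ZooKeeper access).
class ZooKeeperBackend implements ReassignmentBackend {
    @Override
    public void reassign(Map<String, List<Integer>> assignments) {
        System.out.println("would write to ZooKeeper: " + assignments);
    }
}

// Stand-in for the new behaviour (request sent via a broker).
class AdminClientBackend implements ReassignmentBackend {
    @Override
    public void reassign(Map<String, List<Integer>> assignments) {
        System.out.println("would call the AdminClient: " + assignments);
    }
}

// The command depends only on the interface, so the option chosen on the
// command line decides which implementation is plugged in.
class ReassignCommandSketch {
    private final ReassignmentBackend backend;

    ReassignCommandSketch(ReassignmentBackend backend) {
        this.backend = backend;
    }

    void execute(Map<String, List<Integer>> assignments) {
        if (assignments.isEmpty()) {
            throw new IllegalArgumentException("No partitions to reassign");
        }
        backend.reassign(assignments);
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> plan =
            Collections.singletonMap("my-topic-0", Arrays.asList(1, 2, 3));
        new ReassignCommandSketch(new AdminClientBackend()).execute(plan);
    }
}
```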

In all other respects, the public API of ReassignPartitionsCommand will not be changed.

AdminClient

The following methods will be added to AdminClient:

 

    public AssignPartitionsResult assignPartitions(Map<TopicPartition, List<Integer>> assignments)
    public AssignPartitionsResult assignPartitions(Map<TopicPartition, List<Integer>> assignments, AssignPartitionsOptions options)

Where:

    public class AssignPartitionsResult {
        // private constructor
        public Map<TopicPartition, KafkaFuture<Void>> values()
        public KafkaFuture<Void> all()
    }
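
To illustrate how a caller might consume such a result, here is a self-contained sketch. It makes several assumptions: `CompletableFuture` stands in for `KafkaFuture`, plain strings stand in for `TopicPartition`, and `all()` is assumed to complete once every per-partition future has completed (failing if any of them failed). The proposal does not spell out these semantics.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Stand-in for the proposed result class; not the real Kafka type.
class AssignPartitionsResultSketch {
    private final Map<String, CompletableFuture<Void>> futures;

    AssignPartitionsResultSketch(Map<String, CompletableFuture<Void>> futures) {
        this.futures = futures;
    }

    /** Per-partition outcome, so callers can see which partitions failed. */
    Map<String, CompletableFuture<Void>> values() {
        return futures;
    }

    /** Assumed semantics: done when every partition is done, failed if any failed. */
    CompletableFuture<Void> all() {
        return CompletableFuture.allOf(futures.values().toArray(new CompletableFuture[0]));
    }

    public static void main(String[] args) {
        Map<String, CompletableFuture<Void>> futures = new LinkedHashMap<>();
        futures.put("my-topic-0", CompletableFuture.completedFuture(null));
        CompletableFuture<Void> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("unknown broker id"));
        futures.put("my-topic-1", failed);

        AssignPartitionsResultSketch result = new AssignPartitionsResultSketch(futures);
        // Inspect each partition individually rather than relying only on all().
        result.values().forEach((tp, f) ->
            System.out.println(tp + " failed=" + f.isCompletedExceptionally()));
        System.out.println("all failed=" + result.all().isCompletedExceptionally());
    }
}
```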

Authorization

With broker-mediated reassignment it becomes possible to limit the authority to perform reassignment to something finer-grained than "anyone with access to ZooKeeper".

The reasons for reassignment are usually operational. For example, migrating partitions to new brokers when expanding the cluster, or attempting to find a more balanced assignment (according to some notion of balance). These are cluster-wide considerations and so authority should be for the reassign operation being performed on the cluster.

Given the standard form for authorization in Kafka

"Principal P is [Allowed/Denied] Operation O From Host H On Resource R"

the authorized operation will be ClusterAction, on the CLUSTER resource. So to authorize Alice and Bob to perform reassignment one might need to configure an ACL like this:

  bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 \
    --add --allow-principal User:Bob --allow-principal User:Alice \
    --allow-host 198.51.100.0 --allow-host 198.51.100.1 \
    --operation ClusterAction --cluster

Network Protocol: AssignPartitionsRequest and AssignPartitionsResponse

An AssignPartitionsRequest will initiate the process of partition reassignment.

    AssignPartitionsRequest => [PartitionAssignment]
      PartitionAssignment => Topic Partition Replicas
        Topic => string
        Partition => int32
        Replicas => [int32]

Where:

  • Topic: a topic name
  • Partition: a partition of that topic
  • Replicas: the broker ids which will host the partition
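
As a rough illustration of the request body layout, the sketch below serializes a list of assignments using the usual Kafka protocol primitive encodings: big-endian int32, strings as an int16 length followed by UTF-8 bytes, and arrays as an int32 element count followed by the elements. The class and method names are hypothetical, and the request header (API key, version, correlation id, client id) is deliberately omitted.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical encoder for the body of the proposed AssignPartitionsRequest.
class AssignPartitionsRequestSketch {
    static final class PartitionAssignment {
        final String topic;
        final int partition;
        final List<Integer> replicas;

        PartitionAssignment(String topic, int partition, List<Integer> replicas) {
            this.topic = topic;
            this.partition = partition;
            this.replicas = replicas;
        }
    }

    static byte[] encode(List<PartitionAssignment> assignments) {
        int size = 4; // int32 count of PartitionAssignment entries
        for (PartitionAssignment a : assignments) {
            size += 2 + a.topic.getBytes(StandardCharsets.UTF_8).length // Topic => string
                  + 4                                                  // Partition => int32
                  + 4 + 4 * a.replicas.size();                         // Replicas => [int32]
        }
        ByteBuffer buf = ByteBuffer.allocate(size); // ByteBuffer is big-endian by default
        buf.putInt(assignments.size());
        for (PartitionAssignment a : assignments) {
            byte[] topic = a.topic.getBytes(StandardCharsets.UTF_8);
            buf.putShort((short) topic.length);
            buf.put(topic);
            buf.putInt(a.partition);
            buf.putInt(a.replicas.size());
            for (int replica : a.replicas) {
                buf.putInt(replica);
            }
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] bytes = encode(Arrays.asList(
            new PartitionAssignment("my-topic", 0, Arrays.asList(1, 2, 3))));
        System.out.println(bytes.length + " bytes");
    }
}
```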

Possible Error Codes:

  • ClusterAuthorizationFailedCode (31)
  • PartitionReassignmentInProgress (new)

As is currently the case, it will not be possible to have multiple reassignments running concurrently, hence the addition of PartitionReassignmentInProgress.
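
The "one reassignment at a time" rule can be pictured as a simple gate. The sketch below is purely illustrative: `ReassignmentGateSketch` is a hypothetical name, and an in-memory flag is an assumption made for brevity. A real cluster would need to track this state cluster-wide rather than in a single process.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical in-memory gate modelling the single-reassignment rule.
class ReassignmentGateSketch {
    private final AtomicBoolean inProgress = new AtomicBoolean(false);

    /**
     * Returns true if the caller may start a reassignment. A false result
     * corresponds to answering with PartitionReassignmentInProgress.
     */
    boolean tryBegin() {
        return inProgress.compareAndSet(false, true);
    }

    /** Marks the running reassignment as finished so that a new one may start. */
    void finish() {
        inProgress.set(false);
    }

    public static void main(String[] args) {
        ReassignmentGateSketch gate = new ReassignmentGateSketch();
        System.out.println("first request accepted: " + gate.tryBegin());
        System.out.println("second request accepted: " + gate.tryBegin());
        gate.finish();
        System.out.println("request after completion accepted: " + gate.tryBegin());
    }
}
```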

It is not necessary to send an AssignPartitionsRequest to the leader for a given partition. Any broker will do.

    AssignPartitionsResponse => [PartitionAssignmentResult]
      PartitionAssignmentResult => Topic Partition Error
        Topic => string
        Partition => int32
        Error => int16

The AssignPartitionsResponse enumerates those topics and partitions in the request, together with any error for reassigning that partition. The anticipated errors are:

  • INVALID_TOPIC_EXCEPTION (17): if the topic doesn't exist
  • INVALID_PARTITIONS (37): if the partition doesn't exist
  • UNKNOWN_MEMBER_ID (25): if the Replicas in the AssignPartitionsRequest included an unknown broker id
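
The per-partition checks listed above might look something like the following. This is a sketch under stated assumptions: the class and method names are hypothetical, cluster metadata is reduced to a topic-to-partition-count map plus a set of known broker ids, and the order in which the checks run is an assumption rather than something the proposal specifies. The numeric codes are the ones listed above, with 0 meaning no error.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical validation of one PartitionAssignment against simplified metadata.
class AssignmentValidationSketch {
    static final short NONE = 0;
    static final short INVALID_TOPIC_EXCEPTION = 17;
    static final short UNKNOWN_MEMBER_ID = 25;
    static final short INVALID_PARTITIONS = 37;

    static short validate(Map<String, Integer> partitionCounts,
                          Set<Integer> knownBrokers,
                          String topic, int partition, List<Integer> replicas) {
        Integer count = partitionCounts.get(topic);
        if (count == null) {
            return INVALID_TOPIC_EXCEPTION; // the topic doesn't exist
        }
        if (partition < 0 || partition >= count) {
            return INVALID_PARTITIONS; // the partition doesn't exist
        }
        for (int replica : replicas) {
            if (!knownBrokers.contains(replica)) {
                return UNKNOWN_MEMBER_ID; // an unknown broker id was requested
            }
        }
        return NONE;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = Collections.singletonMap("my-topic", 3);
        Set<Integer> brokers = new HashSet<>(Arrays.asList(1, 2, 3));
        System.out.println(validate(counts, brokers, "my-topic", 0, Arrays.asList(1, 2)));
        System.out.println(validate(counts, brokers, "my-topic", 0, Arrays.asList(1, 9)));
    }
}
```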

Compatibility, Deprecation, and Migration Plan

Existing users of the kafka-reassign-partitions.sh tool will receive a deprecation warning when they use the --zookeeper option. The option will be removed in a future version of Kafka. If this KIP is introduced in version 1.0.0, the removal could happen in 2.0.0.

Rejected Alternatives

One alternative is to do nothing: Let the ReassignPartitionsCommand continue to communicate with ZooKeeper directly.

Another alternative is to do exactly this KIP, but without the deprecation of --zookeeper. That would carry a higher long-term maintenance burden, and would stand in the way of any future plans to, for example, support cluster coordination technologies other than ZooKeeper.
