The existing partition reassignment tool requires a lot of manual intervention. The goal is to produce a fairly balanced, consistent result that can be used for partition reassignment.

Status

Current state: [One of "Under Discussion", "Accepted", "Rejected"]

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Describe the problems you are trying to solve. Operational

Public Interfaces

Briefly list any new interfaces that will be introduced as part of this proposal or any existing interfaces that will be removed or changed. The purpose of this section is to concisely call out the public contract that will come along with this feature.

A public interface is any change to the following:

  • Binary log format

  • The network protocol and api behavior

  • Any class in the public packages under clients

  • Configuration, especially client configuration

    • org/apache/kafka/common/serialization

    • org/apache/kafka/common

    • org/apache/kafka/common/errors

    • org/apache/kafka/clients/producer

    • org/apache/kafka/clients/consumer (eventually, once stable)

  • Monitoring

  • Command line tools and arguments

  • Anything else that will likely break existing users in some way when they upgrade

Proposed Changes

The current implementation produces a fair replica distribution across the specified list of brokers. Unfortunately, it does not take the current replica assignment into account.

For instance, if we have 3 brokers with id=[0..2] and add a fourth broker with id=3, --generate will create an assignment config that redistributes replicas fairly across brokers [0..3], as if those partitions were being created from scratch. It ignores the current replica assignment and therefore makes no attempt to minimize the number of replica moves between brokers.

This should be improved. The output of the new --re-balance algorithm should satisfy the following requirements:

  • fairness of replica distribution: every broker will have R or R+1 replicas assigned, where R is the total number of replicas divided by the number of brokers, rounded down;
  • minimum of reassignments: the number of replica moves between brokers will be minimal.
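The two requirements can be stated as simple checks on an assignment. The following is a minimal Python sketch, not part of the tool: the function names (replica_counts, is_fair, count_moves) and the representation of an assignment as a dict from partition name to a list of broker ids are assumptions made for illustration.

```python
from collections import Counter


def replica_counts(assignment, brokers):
    """Count replicas per broker, including brokers that hold none."""
    counts = Counter({b: 0 for b in brokers})
    for replicas in assignment.values():
        counts.update(replicas)
    return dict(counts)


def is_fair(assignment, brokers):
    """True if every broker holds R or R+1 replicas (max and min differ by at most 1)."""
    counts = replica_counts(assignment, brokers)
    return max(counts.values()) - min(counts.values()) <= 1


def count_moves(old, new):
    """Number of replicas that must be copied to a broker that did not hold them before."""
    return sum(len(set(new[p]) - set(old.get(p, []))) for p in new)


# Hypothetical topic "t" with 3 partitions and 2 replicas each; broker 2 was just added.
old = {"t-0": [0, 1], "t-1": [1, 0], "t-2": [0, 1]}
new = {"t-0": [0, 1], "t-1": [1, 2], "t-2": [2, 0]}
brokers = [0, 1, 2]

print(is_fair(old, brokers), is_fair(new, brokers), count_moves(old, new))
```

Checks like these could also serve as unit-test assertions when comparing the output of different initial assignments.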

Example.
Consider the following replica distribution across brokers [0..3] (brokers 2 and 3 have just been added):

  • broker - 0, 1, 2, 3
  • replicas - 7, 6, 0, 0

The new algorithm will produce the following assignment:

  • broker - 0, 1, 2, 3
  • replicas - 4, 3, 3, 3
  • moves - -3, -3, +3, +3

The result is fair, and the number of moves is 6, which is minimal for the given initial distribution.
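The arithmetic of this example can be reproduced with a short Python sketch. It works on per-broker replica counts only and is an illustration under assumptions, not the proposed algorithm: the names balanced_targets and moves_needed are hypothetical, and the sketch ignores the constraint that two replicas of the same partition cannot share a broker.

```python
def balanced_targets(current):
    """Return a fair target count per broker that keeps moves low.

    Every broker gets R or R+1 replicas. The R+1 slots go to the brokers
    that already hold the most replicas, so fewer replicas have to leave them.
    """
    brokers = sorted(current)
    total = sum(current.values())
    base, extra = divmod(total, len(brokers))  # base is R
    by_load = sorted(brokers, key=lambda b: (-current[b], b))
    plus_one = set(by_load[:extra])
    return {b: base + (1 if b in plus_one else 0) for b in brokers}


def moves_needed(current, target):
    """Replicas that must leave over-loaded brokers (equals replicas arriving elsewhere)."""
    return sum(max(0, current[b] - target[b]) for b in current)


# Initial distribution from the example: brokers 2 and 3 were just added.
current = {0: 7, 1: 6, 2: 0, 3: 0}
target = balanced_targets(current)
deltas = {b: target[b] - current[b] for b in current}

print(target)                         # {0: 4, 1: 3, 2: 3, 3: 3}
print(deltas)                         # {0: -3, 1: -3, 2: 3, 3: 3}
print(moves_needed(current, target))  # 6
```

Giving the R+1 slots to the most loaded brokers is what keeps the count at 6 here; handing the extra replica to broker 2 or 3 instead would still be fair but would need 7 moves.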

The scope of this issue is:

  • design an algorithm matching the above requirements;
  • implement this algorithm and unit tests;
  • test it manually using different initial assignments.

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users?

--generate continues to run the existing logic; the new logic is exposed through a separate --re-balance option.

  • If we are changing behavior how will we phase out the older behavior?

...

  • When will we remove the existing behavior?

--generate can be removed in 0.8.3.

Rejected Alternatives

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.