...

public enum RebalanceProtocol {
    EAGER((byte) 0), COOPERATIVE((byte) 1);

    private final byte id;

    RebalanceProtocol(byte id) {
        this.id = id;
    }

    public byte id() {
        return id;
    }

    public static RebalanceProtocol forId(byte id) {
        switch (id) {
            case 0:
                return EAGER;
            case 1:
                return COOPERATIVE;
            default:
                throw new IllegalArgumentException("Unknown rebalance protocol id: " + id);
        }
    }
}

interface PartitionAssignor {

    // existing interfaces

    short version();                                         // new API: the assignor's version, which indicates differences in user metadata or in the assignment algorithm.

    String name();

    List<RebalanceProtocol> supportedProtocols();            // new API: indicates which rebalance protocols this assignor works with,
                                                             // and associates those protocols with the assignor's unique name.

    class Subscription {
        public List<String> topics();

        public List<TopicPartition> ownedPartitions();       // new API; on the older version 1 this should always be empty

        public ByteBuffer userData();
    }

    class Assignment {
        public List<TopicPartition> partitions();

        public ConsumerProtocol.Errors error();             // new API; on the older version 1 this should always be NONE

        public ByteBuffer userData();
    }
}
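
As a rough illustration of how the new supportedProtocols() API might be consumed, the sketch below picks a single rebalance protocol for a consumer from its configured assignors. The selection rule used here (keep only the protocols that every configured assignor supports, then take the one with the highest id) is an assumption made for illustration, as are the helper names Assignor, assignor and selectProtocol and the protocol lists given to "range" and "sticky". The enum is a trimmed copy of the one above so that the example compiles on its own.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

class ProtocolSelectionSketch {

    // Trimmed copy of the RebalanceProtocol enum above.
    enum RebalanceProtocol {
        EAGER((byte) 0), COOPERATIVE((byte) 1);

        private final byte id;

        RebalanceProtocol(byte id) {
            this.id = id;
        }

        byte id() {
            return id;
        }
    }

    // Hypothetical stand-in for the two PartitionAssignor methods used here.
    interface Assignor {
        String name();

        List<RebalanceProtocol> supportedProtocols();
    }

    // Hypothetical factory for a named assignor with a fixed protocol list.
    static Assignor assignor(final String name, final RebalanceProtocol... protocols) {
        final List<RebalanceProtocol> supported = Arrays.asList(protocols);
        return new Assignor() {
            @Override
            public String name() {
                return name;
            }

            @Override
            public List<RebalanceProtocol> supportedProtocols() {
                return supported;
            }
        };
    }

    // Assumed rule: intersect the protocols of all configured assignors,
    // then pick the surviving protocol with the highest id.
    static RebalanceProtocol selectProtocol(List<Assignor> assignors) {
        Set<RebalanceProtocol> common = EnumSet.allOf(RebalanceProtocol.class);
        for (Assignor a : assignors) {
            common.retainAll(a.supportedProtocols());
        }
        if (common.isEmpty()) {
            throw new IllegalArgumentException("Configured assignors share no rebalance protocol");
        }
        RebalanceProtocol best = null;
        for (RebalanceProtocol p : common) {
            if (best == null || p.id() > best.id()) {
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumption: "range" supports only EAGER, "sticky" supports both.
        Assignor range = assignor("range", RebalanceProtocol.EAGER);
        Assignor sticky = assignor("sticky", RebalanceProtocol.EAGER, RebalanceProtocol.COOPERATIVE);

        System.out.println(selectProtocol(Arrays.asList(range, sticky))); // prints EAGER
        System.out.println(selectProtocol(Arrays.asList(sticky)));        // prints COOPERATIVE
    }
}
```

Under this assumed rule, a consumer configured with both assignors stays on EAGER, and only a consumer configured with assignors that all support COOPERATIVE moves to it.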

...


From the user's perspective, the upgrade path for leveraging the new protocols is the same as switching to a new assignor. For example, assume the current version of the Kafka consumer is 2.2 and the "range" assignor is specified in the config. The upgrade path would be:

...

  • Do a first rolling bounce to replace the byte code (i.e. swap the jars), and set the assignors to "range, sticky". At this stage, members running the new byte code will still choose EAGER as the protocol and send both assignors in their join-group requests. Since at least one member has not been bounced yet and therefore sends only "range", the "range" assignor will be selected to assign partitions while everyone follows the EAGER protocol. This rolling bounce is safe.
  • Do a second rolling bounce to remove the "range" assignor, i.e. leave only the "sticky" assignor in the config. At this stage, members that have been bounced will choose the COOPERATIVE protocol and not revoke partitions, while not-yet-bounced members will still use EAGER and revoke everything. However, the "sticky" assignor will be chosen, since at least one already-bounced member no longer has "range". The "sticky" assignor works even though some members are on EAGER and some are on COOPERATIVE: this is fine as long as the leader can recognize them and make assignment choices accordingly. EAGER members have revoked everything and hence no longer have any pre-assigned partitions in their subscription information, so it is safe to move those partitions to other members immediately based on the assignor's output.
  • The key point behind these two rolling bounces is that we want to avoid the situation where the leader is on the old byte code and only recognizes "eager", but due to compatibility is still able to deserialize the new protocol data from newer-versioned members, and hence goes ahead and does the assignment while the newer-versioned members have not revoked their partitions before joining the group. Note the difference from KIP-415 here: on the consumer we do not have the luxury of relying on a list of built-in assignors, since assignors are user-customizable and hence a black box to the consumer coordinator. We therefore need two rolling bounces instead of one to complete the upgrade, whereas Connect only needs one rolling bounce.
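
The assignor outcome of each rolling bounce can be sketched with a small simulation. This is a simplified model, not the coordinator's actual implementation: it assumes the coordinator keeps only the assignors that every member listed in its join-group request and then takes the first survivor in the first member's preference order. The class and method names (RollingBounceSketch, selectAssignor) are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class RollingBounceSketch {

    // Simplified model (an assumption): intersect the assignor lists of all
    // members, then return the first surviving assignor in the first member's
    // preference order.
    static String selectAssignor(List<List<String>> memberAssignors) {
        if (memberAssignors.isEmpty()) {
            throw new IllegalArgumentException("Group has no members");
        }
        Set<String> common = new LinkedHashSet<>(memberAssignors.get(0));
        for (List<String> assignors : memberAssignors) {
            common.retainAll(assignors);
        }
        if (common.isEmpty()) {
            throw new IllegalStateException("No assignor is supported by every member");
        }
        return common.iterator().next();
    }

    public static void main(String[] args) {
        List<String> notYetBounced = Arrays.asList("range");            // original 2.2 config
        List<String> afterFirstBounce = Arrays.asList("range", "sticky");
        List<String> afterSecondBounce = Arrays.asList("sticky");

        // During the first bounce: some members send both assignors, others only "range".
        System.out.println(selectAssignor(Arrays.asList(afterFirstBounce, notYetBounced)));     // prints range

        // During the second bounce: some members send only "sticky", others still send both.
        System.out.println(selectAssignor(Arrays.asList(afterSecondBounce, afterFirstBounce))); // prints sticky
    }
}
```

In this model the group never ends up without a common assignor at any point during either bounce, which is why the intermediate "range, sticky" configuration is needed.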

    ...