Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Set current state -> Discarded

Table of Contents

Status

Current state: "Under Discussion" Discarded

Discussion thread: here

JIRA: here

...

After Kafka moved the rebalancing responsibilities from brokers to clients, many other applications (Kafka Connect, Kafka Streams, Confluent Schema Registry) have relied on the Group Membership Protocol to implement resource allocation amongst distributed processes.

Kafka Connect's WorkerCoordinator class extends AbstractCoordinator, and uses the rebalance mechanism to distribute tasks to its workers. In the same spirit, Confluent Kafka Registry - and its SchemaRegistryCoordinator - relies in the AbstractCoordinator class for leadership election (which instance of a Schema Registry cluster can accept writes). These two are in addition to the ConsumerCoordinator that the Kafka Consumer uses internally, and indirectly Kafka Streams.

We've adopted this generic, extensible protocol to implement a framework that solves both the distributed resource management and leader election use cases our engineering teams face when building their systems. The powerful primitives exposed by the AbstractCoordinator class have made it relatively easy to build distributed resource management systems on top of Apache Kafka.

A good example of how other projects can leverage AbstractCoordinator APIs to implement resource management is this PR to add HA capabilities to Kafka Monitor. We're looking into adding these capabilities to other systems in our Kafka Infrastructure (Kafka Cruise Control being the next one).

We think it's time for the AbstractCoordinator to become part of Kafka's public API, so we can ensure backwards compatibility in future versions of the client libraries. This idea was addressed In fact, we were inspired by Gwen Shapira on her  's talk at the strangeloop conference in '18 [https://www.youtube.com/watch?v=MmLezWRI3Ys] which gave us a new perspective when redesigning our existing distributed leader election system.

Finally, advertising the Coordinator as one of the features Apache Kafka offers should be another goal, as the we feel that the protocol is well designed and extensible, so advertising Kafka's Coordinator capabilities might be a good idea, as it's applicable to many other use cases.

Public Interfaces

This KIP will add introduces a new Coordinator interface to in the org.apache.kafka.consumer.clients.consumer package.

Code Block
languagejava
titleCoordinator.java
package org.apache.kafka.clients.consumer;

import java.io.Closeable;
import java.nio.ByteBuffer;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import org.apache.kafka.clients.consumer.internals.ConsumerCoordinator;
import org.apache.kafka.common.message.JoinGroupRequestData;
import org.apache.kafka.common.message.JoinGroupResponseData;
import org.apache.kafka.common.requests.JoinGroupRequest;
import org.apache.kafka.common.requests.OffsetCommitRequest;

/**
 * Coordinator handles group management for a single group member by interacting with a
 * designated Kafka broker (the coordinator). Group semantics are provided by extending the
 * {@link org.apache.kafka.clients.consumer.internals.AbstractCoordinator} class.
 * See {@link ConsumerCoordinator} for example usage.
 * <p>
 * From a high level, Kafka's group management protocol consists of the following sequence of
 * actions:
 *
 * <ol>
 *     <li>Group Registration: Group members register with the coordinator providing their own metadata
 *         (such as the set of topics they are interested in).</li>
 *     <li>Group/Leader Selection: The coordinator select the members of the group and chooses one member
 *         as the leader.</li>
 *     <li>State Assignment: The leader collects the metadata from all the members of the group and
 *         assigns state.</li>
 *     <li>Group Stabilization: Each member receives the state assigned by the leader and begins
 *         processing.</li>
 * </ol>
 * <p>
 * To leverage this protocol, an implementation must define the format of metadata provided by each
 * member for group registration in {@link #metadata()} and the format of the state assignment provided
 * by the leader in {@link #performAssignment(String, String, List)} and becomes available to members in
 * {@link #onJoinComplete(int, String, String, ByteBuffer)}.
 * <p>
 * Note on locking: this class shares state between the caller and a background thread which is
 * used for sending heartbeats after the client has joined the group. All mutable state as well as
 * state transitions are protected with the class's monitor. Generally this means acquiring the lock
 * before reading or writing the state of the group (e.g. generation, memberId) and holding the lock
 * when sending a request that affects the state of the group (e.g. JoinGroup, LeaveGroup).
 */
public interface Coordinator extends Closeable {

    /**
     * Unique identifier for the class of supported protocols (e.g. "consumer" or "connect").
     * @return Non-null protocol type name
     */
    String protocolType();

    /**
     * Get the current list of protocols and their associated metadata supported
     * by the local member. The order of the protocols in the list indicates the preference
     * of the protocol (the first entry is the most preferred). The coordinator takes this
     * preference into account when selecting the generation protocol (generally more preferred
     * protocols will be selected as long as all members support them and there is no disagreement
     * on the preference).
     * @return Non-empty map of supported protocols and metadata/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.kafka.clients.consumer;

import java.io.Closeable;
import java.nio.ByteBuffer;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import org.apache.kafka.clients.consumer.internals.ConsumerCoordinator;
import org.apache.kafka.common.message.JoinGroupRequestData.JoinGroupRequestProtocol;
import org.apache.kafka.common.message.JoinGroupResponseData;
import org.apache.kafka.common.message.JoinGroupResponseData.JoinGroupResponseMember;

/**
 * Coordinator interface implements group management for a single group member by interacting with
 * a designated Kafka broker (the coordinator). Group semantics are provided by extending this class.
 * See {@link ConsumerCoordinator} for example usage.
 *
 * From a high level, Kafka's group management protocol consists of the following sequence of actions:
 *
 * <ol>
 *     <li>Group Registration: Group members register with the coordinator providing their own metadata
 *         (such as the set of topics they are interested in).</li>
 *     <li>Group/Leader Selection: The coordinator select the members of the group and chooses one member
 *         as the leader.</li>
 *     <li>State Assignment: The leader collects the metadata from all the members of the group and
 *         assigns state.</li>
 *     <li>Group Stabilization: Each member receives the state assigned by the leader and begins
 *         processing.</li>
 * </ol>
 *
 * To leverage this protocol, an implementation must define the format of metadata provided by each
 * member for group registration in {@link #metadata()} and the format of the state assignment provided
 * by the leader in {@link #performAssignment(String, String, List)} and becomes available to members in
 * {@link #onJoinComplete(int, String, String, ByteBuffer)}.
 *
 * Note on locking: this class shares state between the caller and a background thread which is
 * used for sending heartbeats after the client has joined the group. All mutable state as well as
 * state transitions are protected with the class's monitor. Generally this means acquiring the lock
 * before reading or writing the state of the group (e.g. generation, memberId) and holding the lock
 * when sending a request that affects the state of the group (e.g. JoinGroup, LeaveGroup).
 */
public interface Coordinator extends Closeable {

    /**
     * Unique identifier for the class of supported protocols (e.g. "consumer" or "connect").
     * @return Non-null protocol type name
     */
    JoinGroupRequestData.JoinGroupRequestProtocolCollectionString metadataprotocolType();

    /**
     * InvokedGet priorthe tocurrent eachlist groupof joinprotocols orand rejoin.their Thisassociated ismetadata typicallysupported
 used to perform any
 * by the local member. *The cleanuporder fromof the previousprotocols generationin (suchthe asmap committing offsetsindicates for the consumer)preference
     * @paramof generationthe Theprotocol previous(the generationfirst orentry -1is ifthe theremost was none
     * @param memberId The identifier of this member in the previous group or "" if there was nonepreferred). The coordinator takes this
     * preference into account when selecting the generation protocol (generally more preferred
     */
 protocols will be voidselected onJoinPrepare(int generation, String memberId);

    /**
     * Perform assignment for the group. This is used by the leader to push state to all the membersas long as all members support them and there is no disagreement
     * on the preference).
     * @return Non-empty map of supported protocols and metadata
     */
  of the groupList<JoinGroupMetadata> (e.g. to push partition assignments in the case of the new consumer)
     * @param leaderId The id of the leader (which is this member)
     * @param protocol The protocol selected by the coordinatormetadata();

    /**
     * Invoked prior to each group join or rejoin. This is typically used to perform any
     * cleanup from the previous generation (such as committing offsets for the consumer)
     * @param allMemberMetadatageneration MetadataThe from all members of the groupprevious generation or -1 if there was none
     * @param @returnmemberId AThe mapidentifier fromof eachthis member to their state assignment in the previous group or "" if there was none
     */
    Map<String, ByteBuffer> performAssignment(String leaderId,
  void onJoinPrepare(int generation, String memberId);

    /**
     * Perform assignment for the group. This is used by the leader to push state to all the members
     * of the group (e.g. to push partition assignments in the case of the new consumer)
    String protocol,
* @param leaderId The id of the leader (which is this member)
     * @param protocol The protocol selected by the coordinator
     * @param allMemberMetadata Metadata from all members of the group
     * @return A map from each member to their state List<JoinGroupResponseData.JoinGroupResponseMember> allMemberMetadata);

assignment
     */**
    Map<String, *ByteBuffer> Invoked when a group member has successfully joined a group. If this call fails with an exception,
performAssignment(String leaderId,
                    * then it will be retried using the same assignment state on the next call to {@link #ensureActiveGroup()}.
     *
     * @param generation The generation that wasString joinedprotocol,
     * @param memberId The identifier for the local member in the group
     * @param protocol The protocol selected by the coordinator
     * @param memberAssignment The assignment propagated from the group leader
     */
    void onJoinComplete(int generation,List<AssignmentMetadata> allMemberMetadata);

    /**
     * Invoked when a group member has successfully joined a group.
     *
    String memberId,
* @param generation The generation that was joined
     * @param memberId The identifier for the local member in the group
     String* @param protocol,
 The protocol selected by the coordinator
     * @param memberAssignment The assignment propagated from the group leader
     */
   ByteBuffer memberAssignment);

    /**
void onJoinComplete(int generation,
       * Invoked prior to each leave group event. This is typically used to cleanup assigned partitions;
     * note itString ismemberId,
 triggered by the consumer's API caller thread (i.e. background heartbeat thread would
     * not trigger it even if it tries to force leaving group uponString heartbeatprotocol,
 session expiration)
     */
    default void onLeavePrepare() {}

    /**
     * Ensure that the group is active (i.e. joined and synced) ByteBuffer memberAssignment);

    /**
     */
 Invoked prior to void ensureActiveGroup();

    /**
     * Get the current generation state, regardless of whethereach leave group event. This is typically used to clean up assigned partitions;
     * note it is currently stable.triggered by the consumer's API caller thread (i.e. background heartbeat thread would
     * Notenot thattrigger theit generationeven informationif canit betries updatedto whileforce weleaving aregroup stillupon inheartbeat thesession middleexpiration)
     */
 of a rebalance, afterdefault the join-group response is received.void onLeavePrepare() {}

     /**
     * @return the current generationWrapper for {@link JoinGroupRequestProtocol} protocol class.
     */
    Generation generation();class JoinGroupMetadata {

    /**
    private *final Get the current generation state if the group is stable, otherwise return nullJoinGroupRequestProtocol delegate;

        public JoinGroupMetadata() {
     *
     * @return thethis.delegate current= generation or nullnew JoinGroupRequestProtocol();
     */
   }

 Generation generationIfStable();

    String memberId();

 public JoinGroupMetadata(String name, class Generationbyte[] metadata) {
        public static final Generation NO_GENERATION = new Generation(
 this();
            this.delegate.setName(name).setMetadata(metadata);
        OffsetCommitRequest.DEFAULT_GENERATION_ID,}

        public String name() {
     JoinGroupRequest.UNKNOWN_MEMBER_ID,
       return delegate.name();
        null);}

        public finalByteBuffer int generationId;
metadata() {
          public final String memberIdreturn ByteBuffer.wrap(delegate.metadata());
        public final String protocolName;}

        public JoinGroupMetadata GenerationsetName(int generationId, String memberId, String protocolNamename) {
            this.generationId = generationIddelegate.setName(name);
            return this.memberId = memberId;
        }

    this.protocolName   = protocolName;
public JoinGroupMetadata setMetadata(byte[] v) {
    }

        /**delegate.setMetadata(v);
         * @return true ifreturn this;
 generation  has a valid member id, false}

 otherwise. A member might have an id before@Override
        public *boolean it becomes part of a group generation.equals(Object o) {
         */
   if (this == o) {
 public boolean hasMemberId() {
            return !memberId.isEmpty()true;
        }

    }
    @Override
        public booleanif equals(final Object o) {
== null || getClass() != o.getClass()) {
      if (this == o) return true;
     return false;
      if (o == null || getClass() != o.getClass()) return false; }
            final GenerationJoinGroupMetadata that = (GenerationJoinGroupMetadata) o;
            return generationId ==Objects.equals(delegate, that.generationId &&delegate);
        }

        @Override
    Objects.equals(memberId, that.memberId) &&
  public int hashCode() {
            return   Objects.equals(protocolName, that.protocolNamehash(delegate);
        }

        @Override
        public intString hashCodetoString() {
            return Objects.hash(generationId, memberId, protocolName);
"JoinGroupMetadata{" +
         }

        @Override"delegate=" + delegate +
        public String toString() {
     '}';
       return "Generation{" +
 }
    }

    /**
     * Wrapper for {@link JoinGroupResponseMember} protocol class.
  "generationId=" + generationId +*/
    class AssignmentMetadata {

        private final JoinGroupResponseData.JoinGroupResponseMember delegate;

    ", memberId='" + memberId + '\'' +public AssignmentMetadata() {
            this.delegate = new JoinGroupResponseMember();
     ", protocol='" + protocolName}

 + '\'' +
     public String memberId() {
            '}'return delegate.memberId();
        }

        public String groupInstanceId() {
            return delegate.groupInstanceId();
        }

        public  }
}

Proposed Changes

This KIP proposes to:

...

ByteBuffer metadata() {
            return ByteBuffer.wrap(delegate.metadata());
        }

        public AssignmentMetadata setMemberId(String memberId) {
            delegate.setMemberId(memberId);
            return this;
        }

        public AssignmentMetadata setGroupInstanceId(String groupInstanceId) {
            delegate.setGroupInstanceId(groupInstanceId);
            return this;
        }

        public AssignmentMetadata setMetadata(byte[] metadata) {
            delegate.setMetadata(metadata);
            return this;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (o == null || getClass() != o.getClass()) {
                return false;
            }
            AssignmentMetadata that = (AssignmentMetadata) o;
            return Objects.equals(delegate, that.delegate);
        }

        @Override
        public int hashCode() {
            return delegate.hashCode();
        }

        @Override
        public String toString() {
            return "AssignmentMetadata{" +
                "delegate=" + delegate +
                '}';
        }
    }
}


It also moves several classes from the org.apache.kafka.consumer.clients.consumer.internals package to the org.apache.kafka.consumer.clients.consumer package, as detailed on the Proposed Changes section below.

Proposed Changes

For a list of the propose changes, please refer to the the pull-request: KIP-795 Make AbstractCoordinator part of the public API #11515

AbstractCoordinator

  • Move the AbstractCoordinator class from the o.a.k.c.c.consumer.internals to the o.a.k.c.c.consumer package
  • Extract its abstract methods into a new  org.apache.kafka.consumer.clients.consumer.Coordinator interface
  • Remove references to the protocol types from its public APIs
    •  org.apache.kafka.common.message.JoinGroupRequestData.JoinGroupRequestProtocolCollection is replaced by a java.util.Map
    • org.apache.kafka.common.message.JoinGroupResponseData.JoinGroupResponseMember is wrapped into a new AssignmentMetadata type
  • The visibility of some methods have been raised (from default to protected), so classes extending AbstractCoordinator (e.g. ConsumerCoordinator) can make use of them.

Other classes made public

A few other classes are also moved from org.apache.kafka.clients.consumer.internals package to org.apache.kafka.clients.consumer:

  • ConsumerNetworkClient
  • Heartbeat
  • RequestFuture / RequestFutureAdapter / RequestFutureListener

Other changes

Due to the removal of references to the protocol type classes from AbstractCoordinator, some adjustments needed to be made on the ConsumerCoordinator and Kafka Connect classes (ConnectAssignor, EagerAssignor, IncrementalCooperativeAssignor and WorkerCoordinator) to support the new interface.

These classes are all internal to Kafka, so clients should not be impacted at all.

Changes not included on this KIP changes

...

This KIP does not include any changes to the Admin APIs. Potential changes to the Admin/KafkaAdminClient classes (such as adding methods to query for group metadata from brokers) will be addressed in a separate KIP.


Compatibility, Deprecation, and Migration Plan

  • As all members of the new interface are public, clients who extend the existing AbstractCoordinator class will potentially have to change the visibility of the overridden abstract methods.
  • Also, as classes are being relocated, the package names of clients extending AbstractCoordinator will have to change accordingly.
  • No functional changes are planned as part of this KIP, so these changes won't impact the vast majority of clients.

...