You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 8 Next »

Status

Current state: "Under Discussion"

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

After Kafka moved the rebalancing responsibilities from brokers to clients, many other applications (Kafka Connect, Kafka Streams, Confluent Schema Registry) have relied on the Group Membership Protocol to implement resource allocation amongst distributed processes.

Kafka Connect's WorkerCoordinator class extends AbstractCoordinator, and uses the rebalance mechanism to distribute tasks to its workers. In the same spirit, Confluent Kafka Registry - and its SchemaRegistryCoordinator - relies in the AbstractCoordinator class for leadership election (which instance of a Schema Registry cluster can accept writes). These two are in addition to the ConsumerCoordinator that the Kafka Consumer uses internally, and indirectly Kafka Streams.

We've adopted this generic, extensible protocol to implement a framework that solves both the distributed resource management and leader election use cases our engineering teams face when building their systems. The powerful primitives exposed by the AbstractCoordinator class have made it relatively easy to build distributed resource management systems on top of Apache Kafka.

We think it's time for the AbstractCoordinator to become part of Kafka's public API, so we can ensure backwards compatibility in future versions of the client libraries. This idea was addressed by Gwen Shapira on her  talk at the strangeloop conference in '18 [https://www.youtube.com/watch?v=MmLezWRI3Ys]. Finally, advertising the Coordinator as one of the features Apache Kafka offers should be another goal, as the protocol is well designed and extensible, so it's applicable to many other use cases.

Public Interfaces

This KIP introduces a new Coordinator interface in the org.apache.kafka.consumer.clients.consumer package.

Coordinator.java
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.kafka.clients.consumer;

import java.io.Closeable;
import java.nio.ByteBuffer;
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.consumer.internals.ConsumerCoordinator;
import org.apache.kafka.common.message.JoinGroupRequestData;
import org.apache.kafka.common.message.JoinGroupResponseData;

/**
 * Coordinator interface implements group management for a single group member by interacting with
 * a designated Kafka broker (the coordinator). Group semantics are provided by extending this class.
 * See {@link ConsumerCoordinator} for example usage.
 *
 * From a high level, Kafka's group management protocol consists of the following sequence of actions:
 *
 * <ol>
 *     <li>Group Registration: Group members register with the coordinator providing their own metadata
 *         (such as the set of topics they are interested in).</li>
 *     <li>Group/Leader Selection: The coordinator select the members of the group and chooses one member
 *         as the leader.</li>
 *     <li>State Assignment: The leader collects the metadata from all the members of the group and
 *         assigns state.</li>
 *     <li>Group Stabilization: Each member receives the state assigned by the leader and begins
 *         processing.</li>
 * </ol>
 *
 * To leverage this protocol, an implementation must define the format of metadata provided by each
 * member for group registration in {@link #metadata()} and the format of the state assignment provided
 * by the leader in {@link #performAssignment(String, String, List)} and becomes available to members in
 * {@link #onJoinComplete(int, String, String, ByteBuffer)}.
 *
 * Note on locking: this class shares state between the caller and a background thread which is
 * used for sending heartbeats after the client has joined the group. All mutable state as well as
 * state transitions are protected with the class's monitor. Generally this means acquiring the lock
 * before reading or writing the state of the group (e.g. generation, memberId) and holding the lock
 * when sending a request that affects the state of the group (e.g. JoinGroup, LeaveGroup).
 */
public interface Coordinator extends Closeable {

    /**
     * Unique identifier for the class of supported protocols (e.g. "consumer" or "connect").
     * @return Non-null protocol type name
     */
    String protocolType();

    /**
     * Get the current list of protocols and their associated metadata supported
     * by the local member. The order of the protocols in the list indicates the preference
     * of the protocol (the first entry is the most preferred). The coordinator takes this
     * preference into account when selecting the generation protocol (generally more preferred
     * protocols will be selected as long as all members support them and there is no disagreement
     * on the preference).
     * @return Non-empty map of supported protocols and metadata
     */
    JoinGroupRequestData.JoinGroupRequestProtocolCollection metadata();

    /**
     * Invoked prior to each group join or rejoin. This is typically used to perform any
     * cleanup from the previous generation (such as committing offsets for the consumer)
     * @param generation The previous generation or -1 if there was none
     * @param memberId The identifier of this member in the previous group or "" if there was none
     */
    void onJoinPrepare(int generation, String memberId);

    /**
     * Perform assignment for the group. This is used by the leader to push state to all the members
     * of the group (e.g. to push partition assignments in the case of the new consumer)
     * @param leaderId The id of the leader (which is this member)
     * @param protocol The protocol selected by the coordinator
     * @param allMemberMetadata Metadata from all members of the group
     * @return A map from each member to their state assignment
     */
    Map<String, ByteBuffer> performAssignment(String leaderId,
                                                                 String protocol,
                                                                 List<JoinGroupResponseData.JoinGroupResponseMember> allMemberMetadata);

    /**
     * Invoked when a group member has successfully joined a group. If this call fails with an exception,
     * then it will be retried using the same assignment state on the next call to {@link #ensureActiveGroup()}.
     *
     * @param generation The generation that was joined
     * @param memberId The identifier for the local member in the group
     * @param protocol The protocol selected by the coordinator
     * @param memberAssignment The assignment propagated from the group leader
     */
    void onJoinComplete(int generation,
                             String memberId,
                             String protocol,
                             ByteBuffer memberAssignment);

    /**
     * Invoked prior to each leave group event. This is typically used to cleanup assigned partitions;
     * note it is triggered by the consumer's API caller thread (i.e. background heartbeat thread would
     * not trigger it even if it tries to force leaving group upon heartbeat session expiration)
     */
    default void onLeavePrepare() {}

}


It also moves several classes from the org.apache.kafka.consumer.clients.consumer.internals package to the org.apache.kafka.consumer.clients.consumer package, as detailed in the Proposed Changes section below.

Proposed Changes

This KIP proposes to:

  1. Pull the abstract methods from the AbstractCoordinator class into a new interface, which will be part of Kafka's public API.
  2. Change the visibility (from protected to public) of methods that have been added to this interface when needed.
  3. Move these classes from the org.apache.clients.consumer.internals package to org.apache.clients.consumer
    1. org.apache.kafka.clients.consumer.internals.AbstractCoordinator → org.apache.kafka.clients.consumer.AbstractCoordinator
    2. org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient → org.apache.kafka.clients.consumer.ConsumerNetworkClient
    3. org.apache.kafka.clients.consumer.internals.Heartbeat → org.apache.kafka.clients.consumer.Heartbeat
    4. org.apache.kafka.clients.consumer.internals.RequestFuture → org.apache.kafka.clients.consumer.RequestFuture
    5. org.apache.kafka.clients.consumer.internals.RequestFutureAdapter → org.apache.kafka.clients.consumer.RequestFutureAdapter
    6. org.apache.kafka.clients.consumer.internals.RequestFutureListener → org.apache.kafka.clients.consumer.RequestFutureListener
  4. Adjust the method visibility for the moved classes when required.
  5. This KIP does not include any changes to the Admin APIs. Potential changes to the Admin/KafkaAdminClient classes (such as adding methods to query for group metadata from brokers) will be addressed in a separate KIP.

Compatibility, Deprecation, and Migration Plan

  • As all members of the new interface are public, clients who extend the existing AbstractCoordinator class will potentially have to change the visibility of the overridden abstract methods.
  • Also, as classes are being relocated, the package names of clients extending AbstractCoordinator will have to change accordingly.
  • No functional changes are planned as part of this KIP, so these changes won't impact the vast majority of clients.

Rejected Alternatives

The obvious alternative is to keep AbstractCoordinator as an internal API. Clients who use these APIs might break in future versions of the library, and this is a risk that might not be and acceptable for many users.

  • No labels