You are viewing an old version of this page. View the current version.

Compare with Current View Page History

Version 1 Next »

The following is a draft design that uses a high-available consumer coordinator at the broker side to handle consumer rebalance. By migrating the rebalance logic from the consumer to the coordinator we can resolve the consumer split brain problem and help thinner the consumer client.

Overview:

One of the brokers is elected as the coordinator for all the consumer groups. It will be responsible for:

  1. Watch for new topics and new groups
  2. Watch for consumer group member changes and topic partition changes
  3. Rebalance logic for affected groups in response to watched change events
  4. Communicate the rebalance results to consumers

When a coordinator decides a rebalance is needed for certain group, it will first sends the stop-fetcher command to each consumers in the group (this communication channel is currently implemented using a ZK based queue), and then sends the start-fetcher command to each consumers with the assigned partitions. Each consumers will only receive the partition info of the partitions that are assigned to itself. The coordinator will finish the rebalance by waiting for all the consumers to finish starting the fetchers and respond (currently implemented as cleaning the queue).

Paths Stored in ZK:

Most of the original ZK paths storage are kept, in addition to the following paths:

  1. Coordinator path: stores the current coordinator info./consumers/coordinator --> brokerId (ephemeral; created by coordinator)
  2. ConsumerChannel path: stores the coordinator commands to the consumer./consumers/groups/group/channel/consumer/commandId --> command (sequential persistent; created by coordinator, removed by consumer)

Coordinator:

A. On Coordinator Startup

B. On Coordinator Change/Failover

C. On ZK Watcher Fires

D. On Rebalance Handling

Consumer:

A. On Consumer Startup

B. On Consumer Failover

Coordinator-Consumer Communication

A. Coordinator Commands Format:

CommandCommand {

  option                                  :  String                                                                                              // Can either be "start-fetcher" or "stop-fetcher"

  ownershipMap                  :  Map[topicPartition: String  =>  consumerThread: String]      // a map of owned partitions to consumer threads, only available for "start-fetcher"

}

B. Coordinator-Side Sending Commands to Consumer

C. Consumer-Side Handling of Coordinator Commands

  • No labels