Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  1. On startup or on co-ordinator failover, the consumer sends a ConsumerMetadataRequest to any of the brokers in the "bootstrap.brokers" list. In the ConsumerMetadataResponse, it receives the location of the co-ordinator for it's group and the co-ordinator's generation id.
  2. If the returned generation id of the co-ordinator is greater than the last known generation id, it indicates that the co-ordinator has initiated a rebalance. The consumer then stops fetching data, commits offsets and sends a JoinGroupRequest to it's co-ordinator broker. In the JoinGroupResponse, it receives the list of topic partitions that it should own.
  3. At this time, group management is done and the consumer starts fetching data and (optionally) committing offsets for the list of partitions it owns.

Co-ordinator

  1. In steady state, the co-ordinator tracks the health of each consumer in every group through it's failure detection protocol.
  2. Upon election or startup, the co-ordinator reads the list of groups it manages and their membership information from zookeeper. If there is no previous group membership information, it does nothing until the first consumer in some group registers with it.
  3. Upon election, if the co-ordinator finds previous group membership information in zookeeper, it waits for the consumers in each of the existing groups to re-register with itsend a HeartbeatRequest. Once all known and alive consumers register with in a group heartbeat the co-ordinator, it marks the rebalance completed. 
  4. Upon election or startup, the co-ordinator also starts failure detection for all consumers in a group. Consumers that are marked as dead by the co-ordinator's failure detection protocol are removed from the group, the group's generation id is incremented in zookeeper and the co-ordinator marks the triggers a rebalance operation for a group completed by communicating the new partition ownership to the remaining the consumer's group.
  5. Rebalance is triggered by killing the socket connection to the consumers in the group.
  6. In steady state, the co-ordinator tracks the health of each consumer in every group through it's failure detection protocol.
  7. If the co-ordinator marks a consumer as dead, it triggers a rebalance operation for the remaining consumers in the group. Rebalance is triggered by killing If the rebalance is caused due to a new consumer, the co-ordinator only breaks the socket connection to the remaining rest of the consumers in the group. Consumers notice the broken socket connection and trigger a co-ordinator discovery process (#1, #2 in the Consumer section above). Once all alive consumers re-register with the co-ordinator, it communicates the new partition ownership to each of the consumers in the RegisterConsumerResponse, thereby completing the rebalance operation.
  8. The co-ordinator tracks the changes to topic partition changes for all topics that any consumer group has registered interest for. If it detects a new partition for any topic, it triggers a rebalance operation (as described in #5 above). It is currently not possible to reduce the number of partitions for a topic. The creation of new topics can also trigger a rebalance operation as consumers can register for topics before they are created.

...