...

Given that Kafka is moving away from ZooKeeper with KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum, it seems worthwhile to introduce a new broker registration mechanism that is less reliant on ZooKeeper.


{{ TODO: We need the metadata update RPC/mechanism to rely on it as a heartbeat }}

Abstract

Introduce an inter-broker registration mechanism where brokers actively register themselves with the controller and maintain heartbeat sessions.

Public Interfaces

Register Broker Request

...

We are introducing a new, global state for each broker. To start with, each broker can be in one of four states:

  • Offline - when the broker process is in the Offline state, it is either not running at all, or in the process of performing the single-node tasks needed to start up, such as initializing the JVM or performing log recovery.
  • Fenced - when the broker is in the Fenced state, it will not respond to RPCs from clients.  The broker will be in the Fenced state when starting up and attempting to fetch the newest metadata.  It will re-enter the Fenced state if it can't contact the active controller.  Fenced brokers should be omitted from the metadata sent to clients.
  • Online - when a broker is online, it is ready to respond to requests from clients.
  • Stopping - brokers enter the Stopping state when they receive a SIGINT.  This indicates that the system administrator wants to shut down the broker. When a broker is stopping, it is still running, but we are trying to migrate the partition leaders off of the broker. Eventually, the active controller will ask the broker to go offline by returning a special result code in the MetadataFetchResponse.  Alternatively, the broker will shut down if the leaders can't be moved within a predetermined amount of time.
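
The four states above can be read as a small state machine. The following is a minimal sketch, not part of the proposal: the names BrokerState, ALLOWED_TRANSITIONS and can_transition are hypothetical, and the transition set is inferred from the descriptions above. In particular, Fenced to Online (after metadata catch-up) and Fenced to Stopping (a SIGINT received while fenced) are assumptions that the text does not state explicitly.

```python
from enum import Enum


class BrokerState(Enum):
    OFFLINE = "Offline"
    FENCED = "Fenced"
    ONLINE = "Online"
    STOPPING = "Stopping"


# Transitions inferred from the state descriptions; illustrative only.
ALLOWED_TRANSITIONS = {
    # Startup: single-node tasks finish, broker starts fetching metadata.
    BrokerState.OFFLINE: {BrokerState.FENCED},
    # Assumption: a fenced broker goes online once caught up, or can be
    # asked to stop by a SIGINT while it is still fenced.
    BrokerState.FENCED: {BrokerState.ONLINE, BrokerState.STOPPING},
    # Re-fenced if the active controller is unreachable; SIGINT stops it.
    BrokerState.ONLINE: {BrokerState.FENCED, BrokerState.STOPPING},
    # Controller tells the broker to go offline, or the timeout expires.
    BrokerState.STOPPING: {BrokerState.OFFLINE},
}


def can_transition(current: BrokerState, target: BrokerState) -> bool:
    """Return True if the sketch allows moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]
```

Under these assumptions, a broker can never jump straight from Offline to Online; it always passes through Fenced first.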

Register Flow

Once a broker starts up, it goes through its single-node tasks: initializing certain classes and performing log recovery.
After that is over, it sends a Register request to the controller. The controller creates a znode for the broker (or changes its state if a znode already exists), sets it to the Fenced state, and responds to the broker. Upon receiving the response, the broker checkpoints its current metadata and enters the Fenced state, where metadata catch-up begins.
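
The register flow described above could be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the proposed implementation: the Controller and Broker classes, the handle_register method, the response fields and the event names are all hypothetical, and a plain dictionary stands in for the per-broker znode.

```python
class Controller:
    """Toy controller; a dict stands in for the per-broker znodes."""

    def __init__(self):
        self.registrations = {}  # broker_id -> state name

    def handle_register(self, broker_id):
        # Create the entry, or reset the state if one already exists.
        created = broker_id not in self.registrations
        self.registrations[broker_id] = "FENCED"
        return {"broker_id": broker_id, "state": "FENCED", "created": created}


class Broker:
    """Toy broker that records the order of its startup steps."""

    def __init__(self, broker_id, controller):
        self.broker_id = broker_id
        self.controller = controller
        self.state = "OFFLINE"
        self.events = []

    def start(self):
        # 1. Single-node tasks (class initialization, log recovery).
        self.events.append("single_node_tasks")
        # 2. Send the Register request to the controller.
        response = self.controller.handle_register(self.broker_id)
        self.events.append("register")
        # 3. On response: checkpoint metadata, then enter the fenced state.
        self.events.append("checkpoint_metadata")
        self.state = response["state"]
        # 4. Metadata catch-up begins while the broker is fenced.
        self.events.append("begin_metadata_catch_up")
        return response


controller = Controller()
broker = Broker(1, controller)
first = broker.start()
second = Broker(1, controller).start()  # re-registration of the same id
```

In this sketch the second registration for the same broker id reuses the existing entry rather than creating a new one, mirroring the "changes its state if a znode already exists" case.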

...