Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Because the Kafka Controller as described in KIP-631 will use events/deltas to represent some of it its in-memory state changes the amount of data that needs to be persistent written to the log and replicated to all of the brokers can be reduced. As an example if a broker goes offline and we assume that we have 1 million topic 100K topics each with 10 partitions with a replica factor of 3, 100 brokers, an IsrChange message . In a well balanced cluster we can expect that no broker will have 2 or more partitions for the same topic. In this case the best we can do for IsrChange message and DeleteBroker message are the following message sizes:

  1. IsrChange has a size of 40 bytes

...

  1. . 4 bytes for partition id, 16 bytes for topic id, 8 bytes for ISR ids, 4 bytes for leader id, 4 bytes for leader epoch, 2 bytes for api key and 2 bytes for api version

...

  1. .
  2. DeleteBroker has a size of 8 bytes

...

  1. . 4 bytes for broker id, 2 bytes for api key and 2 bytes for api version

...

  1. .

When a broker is shutdown and if events/deltas (DeleteBroker) are not allowed the Kafka Controller uses IsrChange to include this information in the replicated log for the __cluster_metadata topic partition then the Kafka Controller and each broker needs to persist 40 bytes per topic partitions hosted by the broker. Each broker will be a replica for 30,000 partitions, that is  is 1.14 MB that needs to get persisted written in the replicated log by each broker. The Active Controller will need to replicate 1.14 MB to each broker in the cluster (99 brokers) or 113.29 MB.

...