Status
Current state: Under Discussion
Discussion thread: TBD
JIRA:
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Today the group coordinator accepts an unbounded number of join group requests into the membership metadata. There is a potential risk, described in the associated JIRA, that too many invalid joining members will exhaust broker memory before the session timeout garbage-collects them. To ensure broker stability, we propose enforcing a hard limit on consumer group size to prevent explosion of the server-side cache/memory.
Public Interfaces
We propose to add a new configuration to KafkaConfig.scala; its behavior will affect the following coordinator APIs:
def handleJoinGroup(...)
def handleSyncGroup(...)
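As a rough illustration of where the cap would apply, the sketch below shows join-group handling that rejects a brand-new member once the group is at capacity, while still admitting known members that are rejoining (e.g. during a rebalance). This is not the actual coordinator code; the result name GROUP_MAX_SIZE_REACHED and the method shape are assumptions for illustration only.

```java
import java.util.Set;

public class GroupSizeCheck {
    // Hypothetical stand-in for the coordinator's join-group outcome
    enum JoinResult { ACCEPTED, GROUP_MAX_SIZE_REACHED }

    // A brand-new member is rejected once the group has reached maxSize;
    // an existing member rejoining is still accepted so rebalances complete.
    static JoinResult handleJoinGroup(Set<String> members, String memberId, int maxSize) {
        if (!members.contains(memberId) && members.size() >= maxSize) {
            return JoinResult.GROUP_MAX_SIZE_REACHED;
        }
        return JoinResult.ACCEPTED;
    }
}
```

Checking membership before checking the size is what keeps the cap from evicting members that are already counted against it.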
Proposed Changes
We shall add a config called group.max.size on the coordinator side.
val GroupMaxSizeProp = "group.max.size"
...
val GroupMaxSize = 1000000
...
.define(GroupMaxSizeProp, INT, Defaults.GroupMaxSize, MEDIUM, GroupMaxSizeDoc)
The default value of 1_000_000 proposed here is based on a rough estimate of 100 bytes of metadata per member, so the maximum memory usage per group is 100 B * 1_000_000 = 100 MB, which should be a sufficiently large bound for most use cases we know of. Discussion on this number is welcome.
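An operator who wants a tighter bound would override the default in the broker configuration; the value below is purely illustrative, not a recommendation:

```
# server.properties (illustrative): cap each consumer group at 10,000 members
group.max.size=10000
```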
Compatibility, Deprecation, and Migration Plan
- This is a backward compatible change.
Rejected Alternatives
Earlier discussion proposed other approaches, such as enforcing a memory limit or changing the initial rebalance delay. We believe those approaches are "either not strict or not intuitive" (quoting Stanislav), compared with a group size cap, which is easy for end users to understand and configure to their own needs.