Status

Current state: Discussion

Discussion thread: here

JIRA: KAFKA-4682

...

A more viable solution for KAFKA-4682 is to change how group offset expiration works: committed offsets are preserved as long as the group is active (has consumers). The expiration timer starts the moment all group members are gone and the group transitions into the Empty state. With these semantics there is no longer a need to enforce individual offset retention times or to keep an individual expiration timestamp for each topic partition in the group, because all committed offsets in the group expire at the same time. As a result, the expireTimestamp field will be removed from the offset metadata message.

The group’s offsets expire at the time the group became Empty plus the retention period configured by offsets.retention.minutes, assuming the group does not become active again in the meantime. When the group is in the Empty state and the timer reaches that expiration time, the group transitions to the Dead state, and all of its offsets expire and are removed.
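The lifecycle described above can be sketched as a small state machine. This is an illustrative model only, not the broker's implementation: the `Group` class, its method names, and the single `ACTIVE` state (standing in for every state in which the group has members) are assumptions made for the example, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Tuple


class GroupState(Enum):
    ACTIVE = "Active"  # hypothetical stand-in for any state with members
    EMPTY = "Empty"
    DEAD = "Dead"


@dataclass
class Group:
    """Hypothetical model of a consumer group under the proposed semantics."""

    retention_ms: int  # offsets.retention.minutes expressed in milliseconds
    members: int = 0
    state: GroupState = GroupState.EMPTY
    # Time at which the group last became Empty; None while the timer is not
    # running (simplification: a brand-new group has no running timer).
    empty_since_ms: Optional[int] = None
    offsets: Dict[Tuple[str, int], int] = field(default_factory=dict)

    def join(self) -> None:
        if self.state is GroupState.DEAD:
            raise ValueError("a Dead group cannot be rejoined")
        self.members += 1
        self.state = GroupState.ACTIVE
        self.empty_since_ms = None  # group is active again: stop the timer

    def leave(self, now_ms: int) -> None:
        if self.members == 0:
            raise ValueError("group has no members")
        self.members -= 1
        if self.members == 0:
            self.state = GroupState.EMPTY
            self.empty_since_ms = now_ms  # timer starts when the group empties

    def commit(self, topic: str, partition: int, offset: int) -> None:
        # No per-partition expiration timestamp is stored.
        self.offsets[(topic, partition)] = offset

    def maybe_expire(self, now_ms: int) -> bool:
        """Expire all offsets at once if the group has been Empty long enough."""
        if self.state is not GroupState.EMPTY or self.empty_since_ms is None:
            return False
        if now_ms - self.empty_since_ms < self.retention_ms:
            return False
        self.offsets.clear()
        self.state = GroupState.DEAD
        return True
```

In this sketch a group that regains a member before the retention period elapses keeps every committed offset, and the timer restarts from zero the next time the group empties.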

...

As mentioned above, under the new semantics all offsets in a group expire at the same time. The broker config offsets.retention.minutes determines when they expire (measured from when the group becomes Empty). To keep honoring the individual retention overrides that older clients send through OffsetCommitRequest, offsets.retention.minutes will need to be set to the maximum of those overrides and its current value. This guarantees that offsets do not expire any earlier under the new semantics, because the expiration timer for a partition now starts no earlier, and perhaps later, than before. This also simplifies handling of clients that use different versions of the API.
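The rule for choosing the new broker setting is simple arithmetic, sketched below. The function name is hypothetical, and the example assumes that client overrides are expressed in milliseconds while the broker config is in minutes; overrides are rounded up to whole minutes so that no offset expires earlier than its override requested.

```python
from typing import Iterable


def required_retention_minutes(current_minutes: int,
                               client_overrides_ms: Iterable[int]) -> int:
    """Return the offsets.retention.minutes value that honors every override.

    Hypothetical helper: takes the current broker value (minutes) and the
    per-client retention overrides (assumed to be in milliseconds).
    """
    ms_per_minute = 60_000
    # Integer ceiling division, so a partial minute rounds up.
    override_minutes = [-(-ms // ms_per_minute) for ms in client_overrides_ms]
    return max([current_minutes, *override_minutes])


# Example: the broker currently retains offsets for 1 day, and two older
# clients request 2 days and 1 hour respectively.
new_value = required_retention_minutes(1440, [172_800_000, 3_600_000])
print(new_value)  # 2880
```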

Rejected Alternatives

...

  1. Preserving partition-level offset expiration and expiring offsets at different times once the group is empty: The alternative presented above seems cleaner and more intuitive.
  2. Starting the expiration timer for individual offsets once they are no longer being consumed: This likely requires rejected alternative #1 to be in place too. Having all offsets expire at once is simpler. If growth of the metadata cache is a concern, the proposal can change or we can think about other ways to reduce its size (example).