Status

Current state: Under Discussion

Discussion thread:

JIRA:


Motivation

People use Kafka, among other reasons, because they need to be sure that their applications process every message correctly. A classic configuration is to keep three replicas and to commit the offset of a message only once it has been processed correctly. Developers use this configuration because it is important not to lose any messages.

Nevertheless, there are some situations where messages are lost silently:

  • Message expires before being consumed due to topic retention time.
  • Message expires before being consumed due to topic size limit.

I propose building a mechanism that logs a warning when a message is about to be, or has been, removed because of the topic's time or size retention settings, for a set of consumer groups specified in the topic configuration.


The Kafka brokers already have the information needed to achieve this goal:

  • the offset of the message that will be removed.
  • the last offset consumed by each consumer group.

Public Interfaces

The kafka-topics.sh tool must accept a new topic-level property through its --config option:

  • notify.groups.on.expiration : comma-separated list of consumer groups for which a warning is logged when an offset expires before the group has consumed it.
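
As a minimal sketch of how the value of notify.groups.on.expiration might be interpreted: the helper parse_notify_groups below is hypothetical and not part of Kafka, and it assumes that whitespace around group names is ignored and that empty or duplicate entries are dropped, none of which the proposal specifies.

```python
def parse_notify_groups(value):
    """Split a comma-separated list of consumer group names.

    Hypothetical helper: trimming whitespace and dropping empty or
    duplicate entries are assumptions, not part of the proposal.
    """
    groups = []
    for raw in value.split(","):
        name = raw.strip()
        if name and name not in groups:
            groups.append(name)
    return groups


# Example value as it might be passed through --config
print(parse_notify_groups("billing, audit,,billing"))
```

Keeping the groups in declaration order makes the resulting log output easier to predict, but any stable order would do.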

Proposed Changes

The log cleanup sequence becomes the following. The steps marked (new) are the modifications introduced:

  • The scheduler is triggered.
  • The scheduler searches for the logs to be deleted.
  • (new) The scheduler reads the last offset consumed by each group specified in notify.groups.on.expiration.
  • The scheduler removes the log.
  • (new) For each group, if an offset that has been removed is higher than the last offset consumed by that group, a line is logged:
    • "message with offset %d partition %d topic %s has been removed without being consumed by group %s"
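
The check in the last step can be sketched as follows. This is an illustration only, not broker code: the function expiration_warnings and its parameters are hypothetical, a message is assumed to be unconsumed when its offset is greater than the group's last consumed offset, and a group with no recorded offset is assumed to have consumed nothing.

```python
# Log line format taken from the proposal.
WARNING = ("message with offset %d partition %d topic %s has been removed "
           "without being consumed by group %s")


def expiration_warnings(topic, partition, removed_offsets, last_consumed, groups):
    """Return one warning line per removed offset that a group never consumed.

    last_consumed maps a group name to the last offset that group consumed
    on this partition. A missing entry is treated as -1 (assumption).
    """
    lines = []
    for group in groups:
        consumed = last_consumed.get(group, -1)
        for offset in removed_offsets:
            # Assumption: offsets above the last consumed one were never read.
            if offset > consumed:
                lines.append(WARNING % (offset, partition, topic, group))
    return lines


# Offsets 10..12 are removed; "billing" consumed up to 10, "audit" nothing.
for line in expiration_warnings("orders", 0, range(10, 13),
                                {"billing": 10}, ["billing", "audit"]):
    print(line)
```

A real implementation would likely log once per group and offset range rather than once per message, to avoid flooding the broker log when a large log is removed.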

Compatibility, Deprecation, and Migration Plan

There is no impact on existing features.

Rejected Alternatives

