Status

Current state: Under Discussion

Discussion thread: here

JIRA: here

Pull Request: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Records in repartition topics are explicitly deleted once they have been fully consumed. Currently, this is done every time a Task is committed, so "delete records" requests are sent every commit.interval.ms milliseconds.

When commit.interval.ms is set very low, for example when processing.guarantee is set to exactly_once_v2, delete records requests are sent extremely frequently, which can reduce throughput and cause the brokers to log a high volume of messages.

Public Interfaces

A new configuration option, delete.interval.ms, will be added.
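As an illustration of how the option might be set alongside the existing ones, here is a minimal sketch using plain java.util.Properties. The keys commit.interval.ms and processing.guarantee are existing Kafka Streams settings; delete.interval.ms is the option proposed here and is not assumed to exist in any released version. The class name and the chosen values are hypothetical, and required settings such as application.id and bootstrap.servers are omitted for brevity.

```java
import java.util.Properties;

final class DeleteIntervalConfigSketch {
    // Proposed by this KIP; assumed name, not an existing Kafka Streams config.
    static final String DELETE_INTERVAL_MS_CONFIG = "delete.interval.ms";

    // Builds only the settings discussed in this KIP (not a complete Streams config).
    static Properties streamsProperties() {
        Properties props = new Properties();
        // EOS lowers the default commit interval to 100 ms; set explicitly here for clarity.
        props.put("processing.guarantee", "exactly_once_v2");
        props.put("commit.interval.ms", "100");
        // Tune record deletion separately from commits: at most once every 30 seconds.
        props.put(DELETE_INTERVAL_MS_CONFIG, "30000");
        return props;
    }
}
```

With settings like these, commits would still happen every 100 ms, while explicit record deletion would be attempted no more than once every 30 seconds.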

Proposed Changes

Adding a new configuration option, delete.interval.ms, which sets how often these explicit record deletions are sent, resolves the issue by letting users tune commit.interval.ms and delete.interval.ms separately.

We will still wait for a commit before explicitly deleting repartition records, but will only delete them if at least delete.interval.ms has elapsed since the last record deletion. Because deletion only happens at a commit, the effective lower bound on the deletion interval is commit.interval.ms: setting delete.interval.ms lower than commit.interval.ms has no further effect.
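The check described above can be sketched as a small time gate that is consulted after every commit. This is an illustrative sketch only, not the code in the pull request: the class RecordDeletionGate, its method names, and the evenly spaced commits in the simulation are all assumptions made for the example.

```java
final class RecordDeletionGate {
    private final long deleteIntervalMs;
    private long lastDeleteMs;

    RecordDeletionGate(long deleteIntervalMs, long startMs) {
        this.deleteIntervalMs = deleteIntervalMs;
        this.lastDeleteMs = startMs;
    }

    // Called after each commit. Returns true if at least deleteIntervalMs has
    // elapsed since the last deletion, and records that a deletion happens now.
    boolean shouldDeleteAfterCommit(long nowMs) {
        if (nowMs - lastDeleteMs >= deleteIntervalMs) {
            lastDeleteMs = nowMs;
            return true;
        }
        return false;
    }

    // Idealised simulation: commits occur exactly every commitIntervalMs,
    // starting at commitIntervalMs and ending at durationMs (inclusive).
    static int countDeletions(long commitIntervalMs, long deleteIntervalMs, long durationMs) {
        RecordDeletionGate gate = new RecordDeletionGate(deleteIntervalMs, 0L);
        int deletions = 0;
        for (long now = commitIntervalMs; now <= durationMs; now += commitIntervalMs) {
            if (gate.shouldDeleteAfterCommit(now)) {
                deletions++;
            }
        }
        return deletions;
    }
}
```

In this simulation, 60 seconds of commits at 100 ms intervals triggers 600 deletion rounds when the delete interval is also 100 ms, but only 2 when it is 30 seconds. A delete interval below the commit interval behaves the same as one equal to it, since the gate is only consulted at commits.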

Compatibility, Deprecation, and Migration Plan

  • The default value of delete.interval.ms will be 30 seconds, the current default value of commit.interval.ms. This ensures that users who do not modify either setting retain the existing behaviour.
    • For users of EOS, the default commit.interval.ms is automatically reduced to 100ms. The default value of delete.interval.ms will not be reduced when EOS is enabled, so that these users benefit from the performance improvement.
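To make the effect of these defaults concrete, the following sketch computes the spacing between deletions that would result from a given pair of settings. It assumes, as a simplification, that commits occur exactly every commit.interval.ms; the class and method names are hypothetical.

```java
final class EffectiveDeleteInterval {
    // Deletions only happen at a commit, so the effective spacing is
    // deleteIntervalMs rounded up to a whole number of commit intervals,
    // and never less than one commit interval.
    static long effectiveIntervalMs(long commitIntervalMs, long deleteIntervalMs) {
        if (commitIntervalMs <= 0 || deleteIntervalMs < 0) {
            throw new IllegalArgumentException("intervals must be positive");
        }
        long commitsPerDeletion =
            Math.max(1L, (deleteIntervalMs + commitIntervalMs - 1) / commitIntervalMs);
        return commitsPerDeletion * commitIntervalMs;
    }
}
```

Under that assumption, the non-EOS defaults (30000, 30000) give 30000 ms, matching the existing behaviour, and the EOS defaults (100, 30000) also give 30000 ms rather than 100 ms. A value such as delete.interval.ms = 250 with 100 ms commits would round up to 300 ms.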

Rejected Alternatives

Making the explicit deletion of records completely independent of commits, so that delete.interval.ms is strictly adhered to irrespective of the value of commit.interval.ms, was not explored: the increased complexity of the changes may introduce bugs, with little additional benefit.
