Status

Current state: Adopted

Discussion thread: https://lists.apache.org/thread.html/4ed8071a77cd8ddf7a7fe1feb4473134b2e7272a45ec180a23a6ab1e@%3Cdev.kafka.apache.org%3E 

...

The automated consumer offsets sync will be enabled by a new config in the MM 2.0 configuration file, called "sync.consumergroup.offsets.enabled", together with "emit.checkpoints.enabled". Setting both to true will make the existing "MirrorCheckpointTask" additionally sync the selected and translated consumer group offsets (e.g. those that are not active in the target cluster) to the target cluster. The frequency of offset sync is the same as the frequency of emitting checkpoints.

By default, the automated consumer offsets sync is not enabled. Here is an example of how to enable the one-way sync from the cluster labelled "primary" to the cluster labelled "backup":

Enable automated consumer offset sync:
primary->backup.sync.consumergroup.offsets.enabled = true
primary->backup.sync.consumer.offsets.interval.seconds = 20
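
As a rough illustration of how the two flags combine, the sketch below parses a properties-style fragment and reports whether offset sync would be active for one replication flow. It is not MM 2.0 code: the helper names `parse_properties` and `offset_sync_enabled` are hypothetical, the fallback to an unprefixed key is an assumption, and so is the default of true for "emit.checkpoints.enabled". Only the off-by-default behavior of the sync flag comes from the text above.

```python
def parse_properties(text):
    """Parse 'key = value' lines into a dict, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def offset_sync_enabled(props, source, target):
    """Return True only if both flags are true for the source->target flow."""
    prefix = f"{source}->{target}."

    def flag(name, default):
        # Assumption: a flow-prefixed key wins over an unprefixed (global) key.
        raw = props.get(prefix + name, props.get(name, default))
        return str(raw).strip().lower() == "true"

    # The sync flag is off by default (stated in the text).
    sync_on = flag("sync.consumergroup.offsets.enabled", "false")
    # Assumption: checkpoints are emitted unless explicitly disabled.
    checkpoints_on = flag("emit.checkpoints.enabled", "true")
    return sync_on and checkpoints_on


if __name__ == "__main__":
    config = "primary->backup.sync.consumergroup.offsets.enabled = true\n"
    print(offset_sync_enabled(parse_properties(config), "primary", "backup"))
```

Because the check is per flow, enabling the sync for primary->backup leaves backup->primary untouched, which matches the one-way example above.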

Proposed Changes

Overall, this KIP will not change the existing behaviors and functionalities of MM 2.0.

...

This is the new implementation introduced by this KIP. MM 2.0 already provides an interface to read and translate consumer offsets. The next step is to write the translated consumer offsets to the target cluster each time the sync task runs. Only selected consumer offsets are written, and the initial criteria are: (1) Only write offsets for consumers that are inactive in the target cluster. This avoids the situation where two consumer instances (with the same consumer group ID) are running at both the primary and backup clusters and the offsets at the target cluster are overwritten by the sync task. (2) If the "watermark" of the consumer offsets at the target cluster is higher than the offsets at the primary cluster, do not write the lower "watermark" to the target cluster. This avoids the situation where consumption at the primary cluster is slower than at the backup cluster and writing the lower "watermark" rewinds the consumer to earlier offsets, leading it to consume duplicate messages.
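
The two criteria above can be sketched as a small filter over the translated offsets. This is a minimal sketch under stated assumptions, not the KIP's implementation: the function name `offsets_to_sync` and its inputs are hypothetical, partitions are modeled as `(topic, partition)` tuples, and whether the group is active in the target cluster is passed in as a precomputed boolean.

```python
def offsets_to_sync(translated, target_committed, group_active_on_target):
    """Select which translated offsets may be written to the target cluster.

    translated:        {(topic, partition): offset} translated from the primary
    target_committed:  {(topic, partition): offset} currently committed on target
    group_active_on_target: True if the group has running consumers on target
    """
    # Criterion (1): never overwrite offsets of a group that is active on target.
    if group_active_on_target:
        return {}

    to_write = {}
    for tp, offset in translated.items():
        current = target_committed.get(tp)
        # Criterion (2): skip if the target is already ahead, to avoid a rewind.
        if current is not None and current > offset:
            continue
        to_write[tp] = offset
    return to_write


if __name__ == "__main__":
    translated = {("orders", 0): 120, ("orders", 1): 40, ("orders", 2): 7}
    target = {("orders", 0): 100, ("orders", 1): 55}
    print(offsets_to_sync(translated, target, group_active_on_target=False))
```

In this example partition 1 is skipped because the target (55) is ahead of the translated offset (40), while partition 2 is written because the target has no committed offset for it yet.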

A new background task

MM 2.0 can already launch multiple background tasks, e.g. checkpoint and heartbeat. The KIP proposes to launch this background task in the same way, with a configurable interval.

Compatibility, Deprecation, and Migration Plan

...