Status
Current state: Under Discussion
Discussion thread: here
JIRA:
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
MirrorMaker currently inherits the default value for `auto.offset.reset`, which is `latest`.
While for most consumers this is a sensible default, MirrorMakers are specifically designed for replication, so they should default to replicating topics from the beginning.
A specific scenario where this matters is when a MirrorMaker is subscribed to a regex pattern. If auto-topic creation is enabled on the cluster and you start producing to a non-existent topic that matches the regex, there is a window during which the producer is already writing but the MirrorMaker has not yet picked up the new topic's partitions. The messages produced in that window will never be consumed by the MirrorMaker, because it starts from `latest` and skips everything already written.
Proposed Changes
This would add a MirrorMaker default consumer property of `auto.offset.reset=earliest`. Users can still override this in the MirrorMaker consumer config file.
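The intended precedence (proposed default first, user-supplied consumer config on top) can be sketched as follows. This is a minimal illustration only: the class `MirrorMakerConsumerDefaults` and the method `withDefaults` are hypothetical names, not MirrorMaker's actual code, and the sketch assumes the consumer config file has already been loaded into a `java.util.Properties` object.

```java
import java.util.Properties;

// Hypothetical helper illustrating "default unless overridden" semantics.
// It is not part of MirrorMaker; names are chosen for illustration only.
class MirrorMakerConsumerDefaults {

    // Returns the effective consumer config: the proposed default is applied
    // first, then every user-supplied property overwrites it.
    static Properties withDefaults(Properties userConfig) {
        Properties effective = new Properties();
        effective.setProperty("auto.offset.reset", "earliest"); // proposed default
        effective.putAll(userConfig); // user-supplied values take precedence
        return effective;
    }

    public static void main(String[] args) {
        // No override in the consumer config: the proposed default applies.
        Properties noOverride = new Properties();
        System.out.println(withDefaults(noOverride).getProperty("auto.offset.reset"));

        // Explicit override in the consumer config: the user's value is kept.
        Properties override = new Properties();
        override.setProperty("auto.offset.reset", "latest");
        System.out.println(withDefaults(override).getProperty("auto.offset.reset"));
    }
}
```

Under these assumptions, a user who wants the previous behavior would only need to keep `auto.offset.reset=latest` in the MirrorMaker consumer config file.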
Compatibility, Deprecation, and Migration Plan
This will be a silent breaking change, since it flips the default starting position from the latest offset to the earliest.
Anyone who starts a MirrorMaker on an existing topic, using a consumer group that has no committed offsets, will start replicating the partitions from the beginning rather than from each partition's current high watermark.
If this behavior is unexpectedly applied to a very large partition/topic, it will replicate far more data than expected.
Rejected Alternatives
Leaving it as-is. As noted in the Motivation section, the existing behavior produces data gaps for anyone replicating topics using a regex pattern.