Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • Enhance log compaction to support more than just offset comparison, so the insertion order isn't always dictating which records to keep (in effect, allowing for a form of OCC);
  • The current behavior should remain as the default in order to minimize impact on already existing clients and avoid any migration efforts;
  • New Configurations:Add new Kafka configuration
    • Global:
          • "log.cleaner.compaction.strategy"
        to toggle the compaction strategy to this approach;Add new Topic configuration
            • The active compaction strategy to use;
            • Accepts values "offset", "timestamp" and "header", allowing for further strategies to be added in the future as needed;
          • "log.cleaner.compaction.strategy.header"
            • Configuration sub-set to use when the strategy is set to "header";
        • Topic:
          • "compaction.strategy"
        representing
            • Represents the same as
        above, with reserved values "offset" and "timestamp";
      • The default value of these configurations should be "offset", which represents the previous behavior;
            • "log.cleaner.compaction.strategy", but for a specific topic;
          • "compaction.strategy.header"
            • Represents the same as "log.cleaner.compaction.strategy.header", but for a specific topic;
      • Compaction Strategies:
        • "offset"
          • The previous behavior is active, compacting the logs purely based on offset;
          • Also used when the configuration is either empty or not present, making this the default strategy
        • "timestamp"
          • The
        When this configuration is set to "timestamp", then the
          • record timestamp will be used to determine which record to keep, in a 'keep-highest' approach;
          • When
        this configuration is set to anything other than "offset" or "timestamp", then the record headers are scanned for a key matching this value. If this header is found then this
          • both records being compared contain an equal timestamp, then the record with the highest offset will be kept;
        • "header"
          • Searches the record for a header key that matches the configured value on "compaction.strategy.header";
          • If the "compaction.strategy.header" configuration is not set (or is blank), then the compaction strategy will fallback to "offset";
          • If a header key that matches the configuration exists, then the header value (which must be of type "long") will be used to determine which record to keep, in a 'keep-highest' approach;
        When
          • If both records being compared
        contain a "compaction value" that is considered equal, and the configuration is set to either "timestamp" or using a custom header
          • do not have a matching header key, then the record with the highest offset will be kept;
        When
          • If both records being compared
        do not have a "compaction value" at all
          • contain an equal header value, then the
        latest
          • record with the highest offset will be kept;
        When
          • If only one of the records being compared has a
        "compaction value"
          • matching header, then this record is kept
        (
          • , as the other record is considered to be anomalous
        )
          • ;

      ...

      Compatibility, Deprecation, and Migration Plan

      ...