...

  • Enhance log compaction to support more than just offset comparison, so that insertion order does not always dictate which record is kept (in effect, allowing a form of optimistic concurrency control, OCC);
  • The current behavior should remain the default, in order to minimize the impact on existing clients and avoid any migration effort;
  • New Configurations:
    • Global:
      • "log.cleaner.compaction.strategy"
        • The active compaction strategy to use;
        • Accepts values "offset", "timestamp" and "header", allowing for further strategies to be added in the future as needed;
      • "log.cleaner.compaction.strategy.header"
        • Configuration sub-set to use when the strategy is set to "header";
    • Topic:
      • "compaction.strategy"
        • Represents the same as "log.cleaner.compaction.strategy", but for a specific topic;
      • "compaction.strategy.header"
        • Represents the same as "log.cleaner.compaction.strategy.header", but for a specific topic;
  • Compaction Strategies:
    • "offset"
      • The previous behavior is active, compacting the logs purely based on offset;
      • Also used when the configuration is either empty or not present, making this the default strategy;
    • "timestamp"
      • The record timestamp will be used to determine which record to keep, in a 'keep-highest' approach;
      • When both records being compared have the same timestamp, the record with the highest offset will be kept;
    • "header"
      • Searches the record for a header key that matches the value configured in "compaction.strategy.header";
      • If the "compaction.strategy.header" configuration is not set (or is blank), then the compaction strategy will fall back to "offset";
      • If a header key that matches the configuration exists, then the header value (which must be of type "long") will be used to determine which record to keep, in a 'keep-highest' approach;
      • If neither of the records being compared has a matching header key, then the record with the highest offset will be kept;
      • If both records being compared have the same header value, then the record with the highest offset will be kept;
      • If only one of the records being compared has a matching header, then this record is kept, as the other record is considered to be anomalous;
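
The selection rules listed above can be sketched in a few lines of Java. This is only an illustration of the rules as written, not the log cleaner's actual code: the names `CompactionSketch`, `Strategy`, `Rec` and `pick` are hypothetical, and header values are assumed to be already decoded to `Long`.

```java
import java.util.Map;

// Hypothetical sketch of the proposed "which record to keep" rules.
final class CompactionSketch {

    enum Strategy { OFFSET, TIMESTAMP, HEADER }

    // Minimal stand-in for a record: only the fields the rules look at.
    static final class Rec {
        final long offset;
        final long timestamp;
        final Map<String, Long> headers; // header values assumed pre-decoded as long

        Rec(long offset, long timestamp, Map<String, Long> headers) {
            this.offset = offset;
            this.timestamp = timestamp;
            this.headers = headers;
        }
    }

    // "offset" strategy, also the tie-breaker for the other strategies.
    static Rec byOffset(Rec a, Rec b) {
        return a.offset >= b.offset ? a : b;
    }

    // Returns the record that survives compaction for a given key.
    static Rec pick(Strategy strategy, String headerKey, Rec a, Rec b) {
        switch (strategy) {
            case TIMESTAMP:
                if (a.timestamp != b.timestamp) {
                    return a.timestamp > b.timestamp ? a : b; // keep-highest
                }
                return byOffset(a, b); // equal timestamps: highest offset wins
            case HEADER:
                if (headerKey == null || headerKey.trim().isEmpty()) {
                    return byOffset(a, b); // header not configured: fall back to "offset"
                }
                Long va = a.headers.get(headerKey);
                Long vb = b.headers.get(headerKey);
                if (va == null && vb == null) {
                    return byOffset(a, b); // neither record has the header
                }
                if (va == null) {
                    return b; // only b has the header; a is treated as anomalous
                }
                if (vb == null) {
                    return a; // only a has the header; b is treated as anomalous
                }
                if (!va.equals(vb)) {
                    return va > vb ? a : b; // keep-highest
                }
                return byOffset(a, b); // equal header values: highest offset wins
            default:
                return byOffset(a, b); // OFFSET, the default strategy
        }
    }
}
```

Under this sketch, a record written later (higher offset) with a lower header value loses to an earlier record with a higher header value, which is the behavior that makes the OCC-style use case possible.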

...