Status

Current state: Under discussion

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

MirrorMaker2 uses a few internal topics to store some of its metadata. Records in these topics use binary formats, so they must be parsed and formatted before they can be displayed. An easy way to parse and format these records would be handy when investigating issues.


Kafka already has the concept of MessageFormatters, which are used for other internal topics that store records in a binary format. Existing implementations include OffsetsMessageFormatter, GroupMetadataMessageFormatter and TransactionLogMessageFormatter. The console consumer tool can use them to dump the content of __consumer_offsets, for example.

The proposal is to provide similar Formatters for the MirrorMaker2 topics: checkpoints, heartbeats and offset-syncs. At the same time, I propose publicly exposing a MessageFormatter interface that replaces the internal MessageFormatter trait we currently have. That way, the new Formatters can live in the mirror project.

A public interface will also enable users to build their own formatters that can be reused with the console-consumer tool. For example, one could create a formatter that works with a schema registry, or a formatter that hides some fields based on the user identity.

Public Interfaces

1) The MessageFormatter interface


It makes sense to reuse the existing MessageFormatter interface. However, at the moment it is not public and it lives in the core project. I propose making this interface public and moving it to the org.apache.kafka.common package in clients:

package org.apache.kafka.common;

import java.io.PrintStream;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;

/**
 * This interface allows defining formatters that parse and format records read by a
 * Consumer instance for display.
 * The kafka-console-consumer tool has built-in support for MessageFormatter via the --formatter flag.
 * 
 * Kafka provides a few implementations to display records of internal topics such as __consumer_offsets, 
 * __transaction_state and the MirrorMaker2 topics.
 * @author mickael
 *
 */
public interface MessageFormatter {

    /**
     * Initializes the MessageFormatter
     * @param props Properties to configure the formatter
     */
    default void init(Properties props) {}

    /**
     * Parses and formats a record for display
     * @param consumerRecord the record to format
     * @param output the print stream used to display the record
     */
    void writeTo(ConsumerRecord<byte[], byte[]> consumerRecord, PrintStream output);

    /**
     * Closes the formatter
     */
    default void close() {}
}
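
As an illustration of how a user-defined formatter could look, here is a minimal sketch of the "hide some fields" use case mentioned in the motivation. It is not part of the proposal. So that it compiles without the kafka-clients jar, it declares a local copy of the interface and uses a simplified stand-in, SimpleRecord, in place of ConsumerRecord<byte[], byte[]>. The names SimpleRecord, MaskingFormatter and FormatterDemo, and the "separator" and "hide.value" properties, are all assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Local copy of the proposed interface. Assumption: the real one takes
// ConsumerRecord<byte[], byte[]> from kafka-clients instead of SimpleRecord.
interface MessageFormatter {
    default void init(Properties props) {}
    void writeTo(SimpleRecord record, PrintStream output);
    default void close() {}
}

// Hypothetical, simplified stand-in for ConsumerRecord<byte[], byte[]>.
final class SimpleRecord {
    final String topic;
    final byte[] key;
    final byte[] value;

    SimpleRecord(String topic, byte[] key, byte[] value) {
        this.topic = topic;
        this.key = key;
        this.value = value;
    }
}

// Hypothetical formatter: prints topic, key and value as UTF-8 text and can mask the value.
class MaskingFormatter implements MessageFormatter {
    private String separator = "\t";
    private boolean hideValue = false;

    @Override
    public void init(Properties props) {
        // Both property names are invented for this sketch.
        separator = props.getProperty("separator", separator);
        hideValue = Boolean.parseBoolean(props.getProperty("hide.value", "false"));
    }

    @Override
    public void writeTo(SimpleRecord record, PrintStream output) {
        String key = decode(record.key);
        String value = hideValue ? "***" : decode(record.value);
        output.println(record.topic + separator + key + separator + value);
    }

    private static String decode(byte[] bytes) {
        return bytes == null ? "null" : new String(bytes, StandardCharsets.UTF_8);
    }
}

class FormatterDemo {
    // Runs the init -> writeTo -> close lifecycle once and returns what was printed.
    static String render(MessageFormatter formatter, Properties props, SimpleRecord record) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream output = new PrintStream(buffer, true);
        formatter.init(props);
        try {
            formatter.writeTo(record, output);
        } finally {
            formatter.close();
        }
        output.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("separator", " | ");
        props.setProperty("hide.value", "true");
        SimpleRecord record = new SimpleRecord(
            "demo-topic",
            "user-42".getBytes(StandardCharsets.UTF_8),
            "secret".getBytes(StandardCharsets.UTF_8));
        // Prints: demo-topic | user-42 | ***
        System.out.print(render(new MaskingFormatter(), props, record));
    }
}
```

With the real interface, such a class would presumably be passed to kafka-console-consumer through the --formatter flag, in the same way as the existing built-in formatters.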

2) MirrorMaker2 formatters

The three formatters will live in a new package named org.apache.kafka.connect.mirror.formatters in the mirror project:

  • HeartbeatFormatter
  • CheckpointFormatter
  • OffsetSyncFormatter

Proposed Changes

The existing Scala MessageFormatter trait will be deleted (it is not public) and the existing implementations will be updated to implement the new Java interface.

Compatibility, Deprecation, and Migration Plan

None, as existing MessageFormatters are not changing and will continue to be used the same way.

Rejected Alternatives

  • Provide a tool or a new mechanism to format binary topics: While this may not require adding a new class to the public API, it would not be consistent with the tools users and administrators are already used to.