Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Current state: Under Discussion

Discussion thread: here

JIRA: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

At the same time, diagnosing issues in MM2 distributed mode can be quite challenging. Each flow has a dedicated Connect group, and for each flow, the Connectors have the same name. This causes the following monitoring and diagnostic issues with MM2:

  1. The producers and admins managed by the Connect framework for the tasks will get the exact same client.ids in each flow. In the MM2 logs, the logs of such Kafka clients cannot be distinguished easily/at all.The Connector log context provided by Connect contains the same Connector name/Task ID in each flow. In the MM2 logs, lines coming from inside the context cannot be distinguished easily/at all.
  2. The toString method of some entities (such as WorkerConnector, WorkerTask) used in logs internal Connect thread names only contain the Connector name/Task ID, and those logs cannot be distinguished easily/at all.

MM2 distributed mode should also provide the replication flow in the the client.ids logs and the logs thread names to address this problem.

Public Interfaces

MM2 distributed mode

...

logging change

  • New configuration in MM2 distributed mode: add.MDC context key available: flow.context (boolean, default: false) - when enabled the following changes are applied

MM2 distributed mode client.id and log change

  • The client.id configuration of the Connect-managed Kafka clients (used by the Tasks) will be prefixed with the flow, using the following pattern: <SOURCE>-><TARGET>| e.g. "primary→backup|".Entities having a toString implementation used in logs and relying on Connector name/Task ID will also contain the flow . (WorkerConnector, WorkerTask and all their subclasses)The Connector context (named "connectorflow.context" in the MDC) will be also prefixed with exported, and it will contain the flow , using the following pattern: <SOURCE>-><TARGET>| e.g. "primary→backup|".
  • The Connect internal thread names will contain the flow next to the Connector name/Task ID.

Proposed Changes

Connect code changes

  • LoggingContext: all a new context-creator methods will now accept a nullable "String prefix" parameter, which if not null, will be prepended to the connector.context value with the "|" separatorstatic method: LoggingContext customContext(Map<String, String> contexts) which exports an arbitrary number of key-value pairs into MDC.
  • WorkerConfig: add a 2 new method methods with the following signature: String contextPrefixsignatures:
    • Map<String, String> customContexts()
    • String threadNamePrefix()
    • The
    implementation
    • implementations in WorkerConfig will return null, DistributedConfig will not override it either.
  • Worker: if WorkerConfig.contextPrefix customContexts returns non-null, use export it to prefix the client.ids of the managed Kafka clients with the "|" separator.in the MDC in all places where LoggingContext is used
  • Worker: if WorkerConfig.threadNamePrefix returns non-null, add it to all internal thread names.WorkerConnector, WorkerTask and their subclasses: store contextPrefix, and if not null, toString also contains the field
  • Important note: since all of the above relies on WorkerConfig.contextPrefix customContexts and WorkerConfig.threadNamePrefix returning non-null, Connect itself will not change.

...

  • New class: MirrorMakerWorkerConfig, subclass of DistributedConfig - the only added logic is that MirrorMakerWorkerConfig has a String contextPrefix flowContext field (set in its constructor), and
    • in
    its String contextPrefix
    • the customContexts() implementation, returns
    the field value.MirrorMakerConfig: new supported config flag add.flow.context
    • "flow.context" → this.flowContext.
    • in the threadNamePrefix() implementation, returns "flowContext|"
  • MirrorMaker: instead of using DistributedConfig, use MirrorMakerWorkerConfig. If add.flow.context is true, pass Pass the flow (SourceAndTarget.toString) to the MirrorMakerWorkerConfig constructor.

...

  • Connect is not impacted at all.
  • MM2 distributed mode is not impacted, as the new functionality is protected with a feature flag. Users need to opt-in after the upgrade by setting add.flow.context to trueimpact is minimal:
    • Connect internal thread names will be changed.
    • The flow context is only added if the user opts-in by referring to the flow.context MDC key in their logging configuration.

Test Plan

Unit tests.

Rejected Alternatives

...

In the context of Connect, this doesn't make sense - in a single Connect cluster, Connector names are unique, and the existing client.id and logging context ensures that the diagnostic information is distinguishable.

Tracking the flow under a separate MDC context key

Instead of modifying the existing "connector.context", an extra context key could be introduced. This extra context value could be exported in the same scope as the connector.context. This would allow opting in through changing the logger configuration instead of changing the MM2 configs.

This would take the same amount of work as the proposed solution, only the configuration method would be different.