...

MM2 distributed mode should also provide the replication flow in the log context and the thread names to address this problem.

Public Interfaces

MM2

...

dedicated mode logging change

  • New MDC context key available: flow.context

MM2 distributed mode client.id and log change

  • The flow context (named "flow.context" in the MDC) will be exported, and it will contain the replication flow using the following pattern: [<SOURCE>-><TARGET>], e.g. "[primary->backup]".
  • flow.context will be exported in all locations where connector context information is currently exported.
  • The Connect internal thread names will contain the flow next to the Connector name/Task ID.

Proposed Changes

Connect code changes

  • LoggingContext: a new context-creator static method: LoggingContext customContext(Map<String, String> contexts) which exports an arbitrary number of key-value pairs into MDC.
  • WorkerConfig: add 2 new methods with the following signatures:
    • Map<String, String> customContexts()
    • String threadNamePrefix()
    • The implementations in WorkerConfig will return null, and DistributedConfig will not override them either.
  • Worker: if WorkerConfig.customContexts returns non-null, export it in the MDC in all places where LoggingContext is used
  • Worker: if WorkerConfig.threadNamePrefix returns non-null, add it to all internal thread names.
  • Important note: since all of the above relies on WorkerConfig.customContexts and WorkerConfig.threadNamePrefix returning non-null, Connect itself will not change.
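The opt-in behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the actual Connect code: MdcStandIn is a hypothetical replacement for SLF4J's MDC so that the example runs without dependencies, and the SketchWorkerConfig and SketchLoggingContext class names are invented for this sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for org.slf4j.MDC, so the sketch has no dependencies.
final class MdcStandIn {
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) { CONTEXT.get().put(key, value); }
    static String get(String key) { return CONTEXT.get().get(key); }
    static void remove(String key) { CONTEXT.get().remove(key); }
}

// Assumed shape of the base config: both new methods return null by default,
// so plain Connect exports nothing extra.
class SketchWorkerConfig {
    Map<String, String> customContexts() { return null; }
    String threadNamePrefix() { return null; }
}

// Sketch of the proposed context creator; it restores the previous values on close.
final class SketchLoggingContext implements AutoCloseable {
    private final Map<String, String> previous = new HashMap<>();

    private SketchLoggingContext(Map<String, String> contexts) {
        for (Map.Entry<String, String> entry : contexts.entrySet()) {
            // Remember the old value (possibly null) before overwriting it.
            previous.put(entry.getKey(), MdcStandIn.get(entry.getKey()));
            MdcStandIn.put(entry.getKey(), entry.getValue());
        }
    }

    static SketchLoggingContext customContext(Map<String, String> contexts) {
        // A null map means "no custom context": nothing is exported.
        Map<String, String> safe =
                contexts == null ? Collections.<String, String>emptyMap() : contexts;
        return new SketchLoggingContext(safe);
    }

    @Override
    public void close() {
        for (Map.Entry<String, String> entry : previous.entrySet()) {
            if (entry.getValue() == null) {
                MdcStandIn.remove(entry.getKey());
            } else {
                MdcStandIn.put(entry.getKey(), entry.getValue());
            }
        }
    }
}
```

Because the base implementation returns null, a caller that passes the result of customContexts() straight into customContext(...) leaves the logging context untouched unless a subclass overrides the method.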

MM2 code changes

...

  • in the customContexts() implementation, returns "flow.context" → this.flowContext.
  • in the threadNamePrefix() implementation, returns "flowContext|"
  • connect-log4j.properties file will be changed by adding a comment about the flow.context key, and its usage in MM2:
# The `%X{connector.context}` parameter in the layout includes connector-specific and task-specific information
# in the log messages, where appropriate. This makes it easier to identify those log messages that apply to a
# specific connector.
#
# The `%X{flow.context}` parameter can be used in the layout in MM2 dedicated mode. flow.context includes flow-specific information
# in the log messages, where appropriate. This makes it easier to identify those log messages that apply to a
# specific replication flow.
#
connect.log.pattern=[%d] %p %X{connector.context}%m (%c:%L)%n
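On the MM2 side, the customContexts() override could look roughly like the sketch below. The SketchFlowConfig class, its constructor, and the flowContext() helper are hypothetical names used only for illustration; the proposal itself only fixes the "flow.context" key and the [<SOURCE>-><TARGET>] pattern.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical MM2-side config for a single replication flow.
final class SketchFlowConfig {
    private final String source;
    private final String target;

    SketchFlowConfig(String source, String target) {
        this.source = source;
        this.target = target;
    }

    // Builds the flow context in the [<SOURCE>-><TARGET>] pattern.
    String flowContext() {
        return "[" + source + "->" + target + "]";
    }

    // Exposes the flow under the "flow.context" MDC key.
    Map<String, String> customContexts() {
        return Collections.singletonMap("flow.context", flowContext());
    }
}
```

A logging layout that references %X{flow.context} would then pick up this value wherever the map has been exported into the MDC.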

Thread name changes

The Connect internal thread names will contain a |A->B suffix in MM2 mode. The following Connect internal thread names will change:

Only in MM2 dedicated mode:

  • SourceTaskOffsetCommitter.commitExecutorService (current: SourceTaskOffsetCommitter-0, proposed: SourceTaskOffsetCommitter-0|primary->backup)
  • WorkerTask thread name (current: task-thread-MyConnector-0, proposed: task-thread-MyConnector-0|primary->backup)
  • KafkaStatusBackingStore.sendRetryExecutor (current: status-store-retry-mm2-status.primary.internal, proposed: status-store-retry-mm2-status.primary.internal|primary->backup)
  • KafkaBasedLog.thread (current: KafkaBasedLog Work Thread - mm2-configs.primary.internal, proposed: KafkaBasedLog Work Thread - mm2-configs.primary.internal|primary->backup)

Both in Connect and in MM2 dedicated mode:

  • AbstractHerder.connectorExecutor (current: unnamed, proposed for Connect: connector-executor, proposed for MM2: connector-executor|primary->backup)
  • Worker.executor (current: unnamed, proposed for Connect: worker-executor, proposed for MM2: worker-executor|primary->backup)

Sidenote: the following Connect internal thread names will NOT change:

  • DistributedHerder internal threads - their names already contain the client.id, which contains the flow in MM2 dedicated mode.
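The naming scheme listed above can be sketched with a small ThreadFactory helper. SketchThreadNames and its methods are hypothetical, and the sketch assumes the flow marker is appended after a "|" separator, as in the proposed names; a null marker stands for plain Connect, where only the base name is used.

```java
import java.util.concurrent.ThreadFactory;

// Hypothetical helper that applies the proposed thread naming scheme.
final class SketchThreadNames {
    private SketchThreadNames() { }

    // Appends "|<flow>" to the base name; a null flow leaves the name unchanged.
    static String decorate(String baseName, String flow) {
        return flow == null ? baseName : baseName + "|" + flow;
    }

    // Returns a factory whose threads all carry the decorated name.
    static ThreadFactory namedFactory(String baseName, String flow) {
        String name = decorate(baseName, flow);
        return runnable -> new Thread(runnable, name);
    }
}
```

For example, SketchThreadNames.namedFactory("worker-executor", "primary->backup") could be passed to an executor to give a previously unnamed thread the proposed MM2 name, while passing null as the flow yields the proposed Connect name.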

Proposed Changes

TBD: log line examples before/after change

...

Compatibility, Deprecation, and Migration Plan

  • The impact on Connect is minimal: only internal thread names are changed.

  • MM2 distributed mode impact is minimal:

    • Connect internal thread names will be changed.

    • The flow context is only added if the user opts in by referring to the flow.context MDC key in their logging configuration.

Test Plan

Unit tests focusing on the new MDC key flow.context.

Thread names are not tested, as they are not part of any contract (user-facing or programmatic).

Rejected Alternatives

Supporting the same feature in Connect, making it configurable

To have a unified feature set, we could also support the same prefixing strategy in Connect. This would require an extra Connect configuration to specify the prefix. In the context of Connect, this doesn't make sense: in a single Connect cluster, Connector names are unique, and the existing client.id and logging context ensure that the diagnostic information is distinguishable.

Updating the existing connector.context MDC value with the flow information

Instead of exporting a separate flow.context MDC value, the existing connector.context could be updated (prefixed/suffixed with the flow). This would require extra configurations to allow users to opt in to the new feature. Additionally, it is less flexible, as it would be tied to the connectors/tasks, while the flow information can be used in a broader context (e.g. in Connect worker internal logging).