...

Nevertheless, as Kafka Connect adoption grows, runtime exceptions due to version mismatches might arise more often, and therefore could become a burden in the development and deployment of connectors with Kafka Connect.

...

The only publicly visible change that is proposed in order to implement class loading isolation in Kafka Connect is the addition of a new Connect worker config property, module.path.

This is a framework level configuration property that will affect all the connectors running within a Connect worker.

The new configuration property module.path will accept a list of locations, represented as strings and separated by commas. The strings representing locations should be transformable into URLs, in order to offer maximum flexibility in terms of module discovery. Nevertheless, in the first implementation of class loading isolation, such locations will be paths on the local filesystem, which, in turn, can be easily transformed into URLs.

Examples of valid values for module.path:

  • module.path=/usr/local/share/java
  • module.path=/usr/share/java, /usr/local/share/java, /opt/connectors
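
To illustrate how a comma-separated module.path value could be turned into URLs, here is a minimal sketch. It is not the proposed implementation: the class name ModulePathParser and its parse method are hypothetical, and it assumes every entry is a local filesystem path, as in the first implementation described above.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kafka Connect: parses a module.path value.
final class ModulePathParser {

    // Splits the comma-separated value, trims whitespace around each entry,
    // skips empty entries, and converts each local path into a file: URL.
    static List<URL> parse(String modulePath) {
        List<URL> locations = new ArrayList<>();
        if (modulePath == null) {
            return locations; // unset module.path: no locations, isolation not in effect
        }
        for (String entry : modulePath.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                // Assumption: each entry is a path on the local filesystem.
                locations.add(Paths.get(trimmed).toAbsolutePath().toUri().toURL());
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Invalid module.path entry: " + trimmed, e);
            }
        }
        return locations;
    }
}
```

Calling ModulePathParser.parse("/usr/share/java, /usr/local/share/java, /opt/connectors") would yield three file: URLs, one per entry, with the whitespace after each comma ignored.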

When a filesystem path is used as a location for imported modules (connectors, transformations and converters), the convention is that each module stores all its required dependencies in a single directory under a location path listed in module.path. For instance, if the location /usr/local/share/java is given in module.path, then modules such as my-kafka-source-connector, my-kafka-sink-connector, my-connect-smt and my-converter should store all of their dependencies (usually in the form of jar files, but possibly as raw class files with the appropriate java package structure) in directories immediately below the module.path entry, as follows:

/usr/local/share/java
    /my-kafka-source-connector/(connector jars and dependency jars)
    /my-kafka-sink-connector/(connector jars and dependency jars)
    /my-connect-smt/(smt jars and dependency jars)
    /my-converter/(converter jars and dependency jars)

Module and dependency jars stored at a deeper nesting level will not be discoverable.

Again, with module.path set to /usr/local/share/java, an incorrect example would be:

/usr/local/share/java
    /more-modules/another-kafka-source-connector/(connector jars and dependency jars)

In this case, in order to make this module discoverable, /usr/local/share/java/more-modules should be added to module.path instead.
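
The directory convention above can be sketched as a simple discovery routine. This is only an illustration under stated assumptions: the class name ModuleDiscovery and its discover method are hypothetical, and the sketch treats every immediate subdirectory of a module.path location as one module, without inspecting its contents.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Kafka Connect: finds module directories.
final class ModuleDiscovery {

    // Returns the immediate subdirectories of each location, sorted by path.
    // Directories nested more deeply are intentionally not visited.
    static List<Path> discover(List<Path> locations) {
        List<Path> modules = new ArrayList<>();
        for (Path location : locations) {
            if (!Files.isDirectory(location)) {
                continue; // ignore missing locations and plain files
            }
            try (DirectoryStream<Path> children =
                     Files.newDirectoryStream(location, Files::isDirectory)) {
                for (Path child : children) {
                    modules.add(child);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // Directory listing order is unspecified, so sort for a stable result.
        Collections.sort(modules);
        return modules;
    }
}
```

With the incorrect layout above, this sketch would report /usr/local/share/java/more-modules as a module and never reach another-kafka-source-connector, which is why the nested directory has to be listed in module.path itself.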

Additionally, regarding Connect modules specifically, each connector, transformation or converter jar should be listed only once across the module path. If multiple copies of the same version of a module exist under module.path, the selection of which copy will be loaded will be deterministic but implementation specific.

While the introduction of the new configuration property module.path is the only change proposed to the public interfaces of Kafka Connect, the main implementation steps that will be carried out within the framework to support class loading isolation are also described next.

Proposed Changes

Add the module.path configuration property for workers running both in standalone and distributed mode. By default module.path is empty. In this case isolation is not active and loading of connectors depends on the CLASSPATH. In general, when a user wants to run a connector in isolation, its packages along with its dependencies should be stored under a directory that is listed in module.path. Otherwise, if non-isolated execution is desired, the connector jars should be listed in the CLASSPATH.

Given that module.path is set appropriately, the Connect framework will be able to instantiate a custom module classloader for each module under the list of locations supplied in module.path. The main characteristics of such a module classloader are that:

  • it delegates loading of Connect framework classes as well as java library classes to the appropriate classloader. Optionally, warnings could be issued when a module path contains Connect framework classes.
  • it applies a child-first policy for the rest of the classes, aiming to load module specific dependencies directly.
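
The two characteristics above can be sketched as a URLClassLoader subclass. This is a minimal illustration, not the proposed implementation: the class name ModuleClassLoader is hypothetical, and the package prefixes used to recognize java library and Connect framework classes are assumptions chosen for the example.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical sketch of a per-module classloader with a child-first policy.
final class ModuleClassLoader extends URLClassLoader {

    // Assumption: classes under these prefixes are always delegated to the parent.
    private static final String[] PARENT_FIRST_PREFIXES = {
        "java.", "javax.", "org.apache.kafka.connect."
    };

    ModuleClassLoader(URL[] moduleUrls, ClassLoader parent) {
        super(moduleUrls, parent);
    }

    // True when the class belongs to the java library or the Connect framework.
    static boolean isParentFirst(String className) {
        for (String prefix : PARENT_FIRST_PREFIXES) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null && !isParentFirst(name)) {
                try {
                    // Child-first: look in the module's own jars before the parent.
                    loaded = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the module; fall through to the parent below.
                }
            }
            if (loaded == null) {
                // Standard parent-first delegation for everything else.
                loaded = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

One design note on this sketch: because framework and java library classes always come from the parent, every module sees the same Connect interfaces, while each module's own dependencies resolve from its own directory first.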

...

  • the Connect framework controls the threads that run module code (e.g. connector tasks).
  • module classes and dependencies are required to be supplied explicitly through module.path.

Compatibility, Deprecation, and Migration Plan

  • Existing users will not be impacted, since a non-existent or empty module.path means that isolation is not in effect.
  • Users enabling class loading isolation might experience higher demands in memory usage due to additional loading of otherwise common classes. However this increase is not expected to be prohibitive in most cases.

...

  • Targeted Unit tests will be developed to test the components that will implement class isolation in the framework. Additionally, all other tests will be set to run with class loading isolation turned on. 
  • System tests will be written to test class loading isolation when explicitly conflicting dependencies are introduced by connectors. 
  • Microbenchmarks will be designed to make sure the effect of classloader context switching is negligible. 

Future Work

In a subsequent version of class loading isolation, the following enhancements will be targeted: 

  • Support versioned modules. Users will have the ability to load and run different versions of the same module. Although connectors are versioned already, this capability needs to be enabled for transformations and converters as well. 
  • Extend module.path to support module locations beyond the local filesystem. Such locations might include web addresses, package repositories (e.g. maven repositories) and other locations.

Rejected Alternatives

  • Use OSGi. Along with its much broader scope, an OSGi implementation would bring significant implementation complexity, both in the framework and in connector development.

  • Design for project Jigsaw and wait. With this KIP we describe an implementation path whose execution is expected to be focused and efficient. Major upgrades, such as an upgrade to the next Java version, are not expected to move faster than the proposed implementation.

...