Target release: 1.2.0
Epic: NIFI-3380
Document status: DRAFT
Document owner: Matt Gilman
Designer:
Developers:
QA:

Goals

Run multiple versions of the same processor, controller service, or reporting task in the same NiFi instance

Background and strategic fit

Currently, NiFi does not capture the version of a component (processor, controller service, or reporting task). If two NARs are placed in the lib directory and each has a component with the same class name in the same package, only one of them will be loaded and a warning will be logged for the other. Which of the two is loaded is currently non-deterministic.

In order to eventually support an extension registry, it will become important to know the version of a component and to support running different versions of the same component at the same time.

NAR Maven Plugin

The NAR Maven Plugin currently produces a MANIFEST file with the following content, using the nifi-hadoop-nar as an example:

Manifest-Version: 1.0
Archiver-Version: Plexus Archiver
Built-By: bbende
Nar-Id: nifi-hadoop-nar
Nar-Dependency-Id: nifi-hadoop-libraries-nar
Created-By: Apache Maven 3.3.3
Build-Jdk: 1.8.0_74

The Nar-Id is the id of the NAR where the MANIFEST exists, and the Nar-Dependency-Id is an optional id of a NAR that the given NAR depends on. In this case, the nifi-hadoop-nar contains the HDFS processors and depends on the nifi-hadoop-libraries-nar, which contains the Maven dependencies for the Hadoop client.

The NAR Maven Plugin will be updated to include additional versioning information:

Manifest-Version: 1.0
Archiver-Version: Plexus Archiver
Built-By: bbende
Nar-Group-Id: org.apache.nifi
Nar-Id: nifi-hadoop-nar
Nar-Version: 1.1.0
Nar-Dependency-Id: nifi-hadoop-libraries-nar
Nar-Dependency-Version: 1.1.0
Created-By: Apache Maven 3.3.3
Build-Jdk: 1.8.0_74
Build-Tag: HEAD
Build-Branch: master
Build-Revision: 31ec01b
Build-Timestamp: 2017-01-20T14:32:56Z


The NAR Maven Plugin will populate Nar-Group-Id, Nar-Id, and Nar-Version from their corresponding Maven properties, and will provide a way to specify a different version for cases where someone would like to publish multiple “versions” of the same Maven version.

The combination of the Nar-Group-Id, Nar-Id, and Nar-Version will be considered the component coordinates for an instance of a component created from the given NAR.
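As a rough illustration of how these three attributes could be read into a single coordinates value, the following sketch uses the standard java.util.jar.Manifest API. The ComponentCoordinates class and its parse method are hypothetical names chosen for this example, not existing NiFi classes, and failing on a missing attribute is an assumption; the document does not say how NARs built without these attributes should be handled.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical value type for the coordinates triple; not an existing NiFi class.
final class ComponentCoordinates {
    final String groupId;
    final String id;
    final String version;

    ComponentCoordinates(String groupId, String id, String version) {
        this.groupId = groupId;
        this.id = id;
        this.version = version;
    }

    // Parses manifest text and reads the three Nar-* attributes from its main section.
    // The text must end with a newline, as manifest files do.
    static ComponentCoordinates parse(String manifestText) {
        try {
            Manifest manifest = new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            Attributes attrs = manifest.getMainAttributes();
            return new ComponentCoordinates(
                    require(attrs, "Nar-Group-Id"),
                    require(attrs, "Nar-Id"),
                    require(attrs, "Nar-Version"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Assumption: a missing or blank attribute is treated as an error.
    private static String require(Attributes attrs, String name) {
        String value = attrs.getValue(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing manifest attribute: " + name);
        }
        return value.trim();
    }

    // Renders the coordinates as groupId:id:version.
    @Override
    public String toString() {
        return groupId + ":" + id + ":" + version;
    }
}
```

In practice the manifest would come from the NAR file itself (for example via java.util.jar.JarFile#getManifest) rather than from a string.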

Class Loading

NarClassLoaders identifies all the NARs in the lib directory and creates a map from Nar-Id to a ClassLoader for the given NAR. It is also responsible for creating the link between a NAR’s ClassLoader and the dependent NAR’s ClassLoader, if one exists.

NarClassLoaders will be updated to read the Nar-Group-Id, Nar-Id, and Nar-Version from the MANIFEST of each NAR, and to use this information both when creating the mapping to class loaders and when creating the link to a dependent NAR’s ClassLoader.
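The linking step could look roughly like the sketch below, in which each NAR's class loader gets the class loader of the NAR it depends on as its parent. NarDescriptor and NarLoaderLinker are hypothetical names for this example and do not reflect the real NarClassLoaders code. Because the proposed manifest lists only Nar-Dependency-Id and Nar-Dependency-Version, the sketch keys NARs on id and version alone, which is an assumption.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical descriptor holding the manifest values relevant to linking.
final class NarDescriptor {
    final String id;
    final String version;
    final String dependencyId;      // null when the NAR has no dependency
    final String dependencyVersion; // null when the NAR has no dependency

    NarDescriptor(String id, String version, String dependencyId, String dependencyVersion) {
        this.id = id;
        this.version = version;
        this.dependencyId = dependencyId;
        this.dependencyVersion = dependencyVersion;
    }

    String key() {
        return id + ":" + version;
    }
}

final class NarLoaderLinker {
    // Builds one class loader per NAR, keyed by "id:version".
    static Map<String, ClassLoader> link(List<NarDescriptor> nars, ClassLoader root) {
        Map<String, NarDescriptor> byKey = new HashMap<>();
        for (NarDescriptor nar : nars) {
            byKey.put(nar.key(), nar);
        }
        Map<String, ClassLoader> loaders = new HashMap<>();
        for (NarDescriptor nar : nars) {
            create(nar, byKey, loaders, root, new HashSet<>());
        }
        return loaders;
    }

    // Creates the loader for a NAR, creating the loader of its dependency first.
    private static ClassLoader create(NarDescriptor nar, Map<String, NarDescriptor> byKey,
            Map<String, ClassLoader> loaders, ClassLoader root, Set<String> inProgress) {
        ClassLoader existing = loaders.get(nar.key());
        if (existing != null) {
            return existing;
        }
        if (!inProgress.add(nar.key())) {
            throw new IllegalStateException("Circular NAR dependency at " + nar.key());
        }
        ClassLoader parent = root;
        if (nar.dependencyId != null) {
            String depKey = nar.dependencyId + ":" + nar.dependencyVersion;
            NarDescriptor dependency = byKey.get(depKey);
            if (dependency == null) {
                throw new IllegalStateException(nar.key() + " depends on missing NAR " + depKey);
            }
            parent = create(dependency, byKey, loaders, root, inProgress);
        }
        // A real implementation would pass the URLs of the NAR's bundled jars here.
        ClassLoader loader = new URLClassLoader(new URL[0], parent);
        loaders.put(nar.key(), loader);
        inProgress.remove(nar.key());
        return loader;
    }
}
```

Resolving the dependency by id and version together is what would let nifi-hadoop-nar 1.0.0 and 1.1.0 each link to their own version of nifi-hadoop-libraries-nar.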

The ExtensionManager is given all of the class loaders from NarClassLoaders and uses the Java ServiceLoader to find all classes implementing the extension points. In addition, it maintains a map of class names to class loaders.

The ExtensionManager will be updated to receive the Nar-Group-Id, Nar-Id, and Nar-Version from NarClassLoaders, and will use this information to map component coordinates to a class loader. Additional methods will also be added to allow obtaining a class and component coordinates.

If NiFi detects multiple NARs with the same exact Nar-Group-Id, Nar-Id, and Nar-Version, then start-up will fail.
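A minimal sketch of the bookkeeping described in this section follows. It rejects a second NAR with identical coordinates and lets a class loader be looked up by class name plus coordinates. ExtensionIndex and its method names are hypothetical and are not the actual ExtensionManager API; coordinates are represented as plain "groupId:id:version" strings for brevity.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical index; illustrates the proposed mapping, not the real ExtensionManager.
final class ExtensionIndex {
    private final Set<String> narCoordinates = new HashSet<>();
    // class name -> (coordinates -> class loader)
    private final Map<String, Map<String, ClassLoader>> loadersByClass = new HashMap<>();

    // Records a NAR and fails if a NAR with identical coordinates was already seen.
    void registerNar(String coordinates) {
        if (!narCoordinates.add(coordinates)) {
            throw new IllegalStateException(
                    "Duplicate NAR coordinates, failing start-up: " + coordinates);
        }
    }

    // Associates a component class found in a NAR with that NAR's class loader.
    void registerComponent(String className, String coordinates, ClassLoader loader) {
        loadersByClass
                .computeIfAbsent(className, k -> new LinkedHashMap<>())
                .put(coordinates, loader);
    }

    // Returns the class loader for one specific version of a component, or null.
    ClassLoader loaderFor(String className, String coordinates) {
        return loadersByClass
                .getOrDefault(className, Collections.<String, ClassLoader>emptyMap())
                .get(coordinates);
    }

    // Returns every set of coordinates under which the class is available.
    Set<String> coordinatesFor(String className) {
        return Collections.unmodifiableSet(loadersByClass
                .getOrDefault(className, Collections.<String, ClassLoader>emptyMap())
                .keySet());
    }
}
```

With a structure like this, the same class name can map to several class loaders, one per deployed version, which a single class-name-to-loader map cannot express.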

Creating Components

When creating components, the user will be able to see the component coordinates for each processor name. For example, if two versions of the nifi-hadoop-bundle were deployed to the lib directory, the user would be able to choose between different versions of the PutHdfs processor:

  • PutHdfs org.apache.nifi:nifi-hadoop-nar:1.0.0

  • PutHdfs org.apache.nifi:nifi-hadoop-nar:1.1.0
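The entries in the list above could be produced from a small value object that carries the type name together with its coordinates, as in this sketch. ComponentTypeChoice and labelsFor are hypothetical names for illustration only; the actual REST DTOs are not specified in this document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for one selectable component type and its coordinates.
final class ComponentTypeChoice {
    final String type;
    final String groupId;
    final String artifactId;
    final String version;

    ComponentTypeChoice(String type, String groupId, String artifactId, String version) {
        this.type = type;
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
    }

    // Formats the choice as "<type> <groupId>:<artifactId>:<version>".
    String label() {
        return type + " " + groupId + ":" + artifactId + ":" + version;
    }

    // Returns the labels of all available choices for the given type, in input order.
    static List<String> labelsFor(String type, List<ComponentTypeChoice> available) {
        List<String> labels = new ArrayList<>();
        for (ComponentTypeChoice choice : available) {
            if (choice.type.equals(type)) {
                labels.add(choice.label());
            }
        }
        return labels;
    }
}
```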

In addition, once a component has been created there will be some way to see component coordinates, possibly from the configuration screen.

This will require updating REST endpoints and DTOs to pass along additional information.

Serialization

The component coordinates will be captured in the serialized representation of the flow.xml, templates, and versioned flows. For example, if we had a PutHdfs processor from nifi-hadoop-nar 1.0.0 and a PutHdfs processor from nifi-hadoop-nar 1.1.0, the flow.xml would show something like the following:

<processor>
  <id>1</id>
  <name>PutHdfs</name>
  <class>org.apache.nifi.hadoop.PutHdfs</class>
  <componentCoordinates>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-hadoop-nar</artifactId>
    <version>1.0.0</version>
  </componentCoordinates>
</processor>

<processor>
  <id>2</id>
  <name>PutHdfs</name>
  <class>org.apache.nifi.hadoop.PutHdfs</class>
  <componentCoordinates>
    <groupId>org.apache.nifi</groupId>
    <artifactId>nifi-hadoop-nar</artifactId>
    <version>1.1.0</version>
  </componentCoordinates>
</processor>

This will help to solve NIFI-925, which requested the ability to run multiple versions of the HDFS processors in order to talk to two different Hadoop distributions.
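To show how serialized coordinates could be read back when a flow is loaded, the sketch below uses the JDK's DOM parser to extract the class name and coordinates of each processor element. FlowCoordinateReader is a hypothetical name, the element names follow the example in this section, and the surrounding <flow> root element used when calling it is an assumption, since the document only shows the processor fragments. Skipping processors that have no componentCoordinates element is also an assumption.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader; not part of NiFi's flow serialization code.
final class FlowCoordinateReader {
    // Returns one "<class> <groupId>:<artifactId>:<version>" entry per processor.
    static List<String> readCoordinates(String flowXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(flowXml.getBytes(StandardCharsets.UTF_8)));
            NodeList processors = doc.getElementsByTagName("processor");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < processors.getLength(); i++) {
                Element processor = (Element) processors.item(i);
                Element coords = (Element) processor
                        .getElementsByTagName("componentCoordinates").item(0);
                if (coords == null) {
                    continue; // assumption: processors without coordinates are skipped
                }
                result.add(text(processor, "class") + " "
                        + text(coords, "groupId") + ":"
                        + text(coords, "artifactId") + ":"
                        + text(coords, "version"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read flow XML", e);
        }
    }

    // Returns the trimmed text of the first descendant element with the given tag.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }
}
```

Each returned entry pairs a class name with the coordinates needed to select the matching class loader, so two PutHdfs processors from different NAR versions can be told apart.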

Assumptions

  • Different versions of the same extension cannot be bundled in the same NAR.
  • Multiple versions will only be supported for processors, controller services, and reporting tasks. If someone deploys two NARs that contain the same class in the same package for another extension point (repositories, authorizers, etc.), this should be considered a failure during start-up.

  • Processors with custom UIs will need to be updated to handle the context path to the custom UI.

Requirements

# | Title | User Story | Importance | Notes
1    

User interaction and design

Questions

Below is a list of questions to be addressed as a result of this requirements document:

Question | Outcome

Not Doing