Overview

New contributors to NiFi often have the same question: Where can I start?

...

the different components that make up the platform interact with one another.

 

FlowFile

We will begin the discussion with the FlowFile. This is the abstraction that NiFi provides around a single piece of data.

...

about without parsing the content.

 

Processor

This is the most commonly used component in NiFi and tends to be the easiest place for newcomers to jump in.

...

The Processor is an extension point, and its API will not change from one minor release of NiFi to another but may change with a new major release of NiFi.

 

Processor Node

The Processor Node is essentially a wrapper around a Processor and maintains state about the Processor itself. The Processor

...

managed by the framework.

 

Reporting Task

A Reporting Task is a NiFi extension point that is capable of reporting and analyzing NiFi's internal metrics in order to provide the information to external

...

The Reporting Task is an extension point, and its API will not change from one minor release of NiFi to another but may change with a new major release of NiFi.

 

Controller Service

The Controller Service is a mechanism that allows state or resources to be shared across multiple components in the flow. The SSLContextService, for instance,

...

The Controller Service is an extension point, and its API will not change from one minor release of NiFi to another but may change with a new major release of NiFi.

 

Process Session

The Process Session (often referred to simply as a "session") provides Processors access to FlowFiles and provides transactional behavior across

...

a relationship for which multiple connections have been established). 
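The transactional behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not NiFi's implementation: SessionSketch and its fields are hypothetical, FlowFiles are reduced to plain strings, and the method names get, transfer, commit, and rollback merely mirror the operations a session is generally understood to offer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of a transactional session. All names here are hypothetical.
final class SessionSketch {
    private final Deque<String> inputQueue;   // stands in for an incoming FlowFile Queue
    private final List<String> destination;   // stands in for a downstream connection
    private final List<String> pulled = new ArrayList<>();  // taken from the queue in this transaction
    private final List<String> pending = new ArrayList<>(); // transfers not yet visible downstream

    SessionSketch(Deque<String> inputQueue, List<String> destination) {
        this.inputQueue = inputQueue;
        this.destination = destination;
    }

    // Takes the next item from the input queue, or returns null if the queue is empty.
    String get() {
        String item = inputQueue.poll();
        if (item != null) {
            pulled.add(item);
        }
        return item;
    }

    // Records a transfer; nothing reaches the destination until commit().
    void transfer(String item) {
        pending.add(item);
    }

    // Makes all pending transfers visible downstream and ends the transaction.
    void commit() {
        destination.addAll(pending);
        pulled.clear();
        pending.clear();
    }

    // Discards pending transfers and puts pulled items back at the head of the queue, in original order.
    void rollback() {
        for (int i = pulled.size() - 1; i >= 0; i--) {
            inputQueue.addFirst(pulled.get(i));
        }
        pulled.clear();
        pending.clear();
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(Arrays.asList("a", "b"));
        List<String> downstream = new ArrayList<>();
        SessionSketch session = new SessionSketch(queue, downstream);

        session.transfer(session.get().toUpperCase());
        session.rollback(); // "a" returns to the queue, nothing is sent

        session.transfer(session.get().toUpperCase());
        session.commit();   // the transfer becomes visible downstream

        System.out.println(queue + " " + downstream);
    }
}
```

The point of the sketch is that a component sees either all of a session's effects or none of them; the real session additionally has to coordinate with the repositories described later on this page.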

 

Process Context

The Process Context provides a bridge between a Processor and its associated Processor Node. It provides information about the Processor's current

...

so that Processors are able to take advantage of shared logic or shared resources.

 

FlowFile Repository

The FlowFile Repository is responsible for storing the FlowFiles' attributes and state, such as creation time and which FlowFile Queue

...

minor versions of NiFi. It is, therefore, not recommended that implementations be developed outside of the NiFi codebase.

 

Content Repository

The Content Repository is responsible for storing the content of FlowFiles and providing mechanisms for reading the contents

...

minor versions of NiFi. It is, therefore, not recommended that implementations be developed outside of the NiFi codebase.

 

Provenance Repository

The Provenance Repository is responsible for storing, retrieving, and querying all Data Provenance Events. Each time that a FlowFile is

...

minor versions of NiFi. It is, therefore, not recommended that implementations be developed outside of the NiFi codebase.

 

Process Scheduler

In order for a Processor or a Reporting Task to be invoked, it needs to be scheduled to do so. This responsibility belongs to the Process Scheduler. In addition to

...

determine which Scheduling Strategy to use (Cron Driven, Timer Driven, or Event Driven), as well as the scheduling frequency.

 

FlowFile Queue

Though it sounds sufficiently simple, the FlowFile Queue is responsible for implementing quite a bit of logic. In addition to queuing the FlowFiles for another component

...

with the ability to prioritize the data so that the most important data is always sent first and the less important data eventually expires.

 

FlowFile Prioritizer

A core tenet of NiFi is that of data prioritization. The user should have the ability to prioritize the data in whatever order makes sense for

...

The Prioritizer is an extension point, and its API will not change from one minor release of NiFi to another but may change with a new major release of NiFi.
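As a rough illustration of prioritization, the sketch below treats a prioritizer as a comparator that orders queued items. It is a minimal example under stated assumptions, not NiFi's API: SketchFlowFile, its priority field, and the rule "lower value first, then oldest first" are all hypothetical and chosen only for demonstration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch only: SketchFlowFile is a hypothetical stand-in, not NiFi's FlowFile.
final class PrioritizerSketch {

    static final class SketchFlowFile {
        private final String id;
        private final long queuedAtMillis;
        private final int priority; // assumption: a lower value means more important

        SketchFlowFile(String id, long queuedAtMillis, int priority) {
            this.id = id;
            this.queuedAtMillis = queuedAtMillis;
            this.priority = priority;
        }

        String getId() { return id; }
        long getQueuedAtMillis() { return queuedAtMillis; }
        int getPriority() { return priority; }
    }

    // Hypothetical ordering rule: most important first, ties broken by the oldest item.
    static final Comparator<SketchFlowFile> BY_PRIORITY_THEN_AGE =
            Comparator.comparingInt(SketchFlowFile::getPriority)
                      .thenComparingLong(SketchFlowFile::getQueuedAtMillis);

    // Queues the items under the comparator and returns their ids in the order they would be sent.
    static List<String> drainOrder(List<SketchFlowFile> incoming) {
        PriorityQueue<SketchFlowFile> queue = new PriorityQueue<>(BY_PRIORITY_THEN_AGE);
        queue.addAll(incoming);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().getId());
        }
        return order;
    }

    public static void main(String[] args) {
        List<SketchFlowFile> incoming = Arrays.asList(
                new SketchFlowFile("low", 1L, 9),
                new SketchFlowFile("high-new", 3L, 1),
                new SketchFlowFile("high-old", 2L, 1));
        System.out.println(drainOrder(incoming));
    }
}
```

Expressing the rule as a comparator keeps the ordering policy separate from the queue that applies it, which is what allows different orderings to be swapped in without touching the queueing logic.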

 

Flow Controller

In order for NiFi's User Interface to display the wealth of information that it renders, it must have some place to gather that information.

...

is responsible for coordinating connection and participation in a NiFi cluster.

 

Cluster Manager

Whereas the Flow Controller is responsible for maintaining system-wide state about a particular node, the Cluster Manager is responsible

...

configuration found in the nifi.properties file.

 

Authority Provider

NiFi can be configured to run in secure mode, using SSL to access the web endpoints, or to run in non-secure mode,

...

The Authority Provider is an extension point, and its API will not change from one minor release of NiFi to another but may change with a new major release of NiFi.

 

*-Resources

NiFi's User Interface shows only information that is available via the NiFi RESTful API. This is accomplished by accessing the different endpoints

...

All of the Resource components are found in the nifi-web-api module, in the org.apache.nifi.web.api package.

 

Bootstrap

In order for an organization to be able to depend on NiFi to handle their dataflows in an automated fashion, the organization needs to be able

...

to ensure that the application continues to provide reliable dataflow.

 

NarClassLoader

In a containerized environment like NiFi, it is important to allow different extension points to have arbitrary dependencies without those dependencies

...