Overview
New contributors to NiFi often have the same question: Where can I start?
...
the different components that make up the platform interact with one another.
FlowFile
We will begin the discussion with the FlowFile. This is the abstraction that NiFi provides around a single piece of data.
...
about without parsing the content.
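As a rough illustration of attributes that can be inspected without touching the content, the following minimal sketch models a piece of data as a set of key/value attributes plus a content size. The class `SketchFlowFile` and its methods are hypothetical names chosen for this example; they are not NiFi's FlowFile interface, and returning a new instance on change is a design choice of the sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration only; this is not NiFi's FlowFile interface.
final class SketchFlowFile {
    private final Map<String, String> attributes; // key/value metadata about the data
    private final long contentSize;               // size of the content in bytes

    SketchFlowFile(Map<String, String> attributes, long contentSize) {
        // Defensive copy so later changes to the caller's map cannot leak in.
        this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
        this.contentSize = contentSize;
    }

    String getAttribute(String key) {
        return attributes.get(key);
    }

    long getContentSize() {
        return contentSize;
    }

    // Returns a new instance with one attribute added or replaced; the original is unchanged.
    SketchFlowFile withAttribute(String key, String value) {
        Map<String, String> copy = new HashMap<>(attributes);
        copy.put(key, value);
        return new SketchFlowFile(copy, contentSize);
    }
}
```

A component could then make a routing decision from, say, a `filename` attribute alone, without reading any of the content bytes.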
Processor
This is the most commonly used component in NiFi and tends to be the easiest place for newcomers to jump in.
...
The Processor is an extension point, and its API will not change from one minor release of NiFi to another but may change with a new
major release of NiFi.
Processor Node
The Processor Node is essentially a wrapper around a Processor and maintains state about the Processor itself. The Processor
...
managed by the framework.
Reporting Task
A Reporting Task is a NiFi extension point that is capable of reporting and analyzing NiFi's internal metrics in order to provide the information to external
...
The Reporting Task is an extension point, and its API will not change from one minor release of NiFi to another but may change with a new
major release of NiFi.
Controller Service
The Controller Service is a mechanism that allows state or resources to be shared across multiple components in the flow. The SSLContextService, for instance,
...
The Controller Service is an extension point, and its API will not change from one minor release of NiFi to another but may change with a new
major release of NiFi.
Process Session
The Process Session (often referred to simply as a "session") provides Processors access to FlowFiles and provides transactional behavior across
...
a relationship for which multiple connections have been established).
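The transactional behavior of a session can be pictured with a small sketch in which transfers are only buffered until the session is committed, and a rollback discards them. `SketchSession` and its methods are hypothetical and greatly simplified; FlowFiles are represented by plain string identifiers, and this is not the NiFi ProcessSession API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified session; not NiFi's ProcessSession.
final class SketchSession {
    // Transfers that have been committed, keyed by relationship name.
    private final Map<String, List<String>> committed = new HashMap<>();
    // Transfers made since the last commit or rollback.
    private final Map<String, List<String>> pending = new HashMap<>();

    // Records the transfer, but nothing becomes visible until commit().
    void transfer(String flowFileId, String relationship) {
        pending.computeIfAbsent(relationship, k -> new ArrayList<>()).add(flowFileId);
    }

    // Makes all pending transfers visible at once.
    void commit() {
        pending.forEach((rel, ids) ->
                committed.computeIfAbsent(rel, k -> new ArrayList<>()).addAll(ids));
        pending.clear();
    }

    // Discards all pending transfers, as if they had never happened.
    void rollback() {
        pending.clear();
    }

    List<String> committedFor(String relationship) {
        return committed.getOrDefault(relationship, Collections.emptyList());
    }
}
```

In this sketch, a caller that fails partway through its work can call `rollback()` and leave no partial transfers behind.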
Process Context
The Process Context provides a bridge between a Processor and its associated Processor Node. It provides information about the Processor's current

...
so that Processors are able to take advantage of shared logic or shared resources.
FlowFile Repository
The FlowFile Repository is responsible for storing the FlowFiles' attributes and state, such as creation time and which FlowFile Queue
...
minor versions of NiFi. It is, therefore, not recommended that implementations be developed outside of the NiFi codebase.
Content Repository
The Content Repository is responsible for storing the content of FlowFiles and providing mechanisms for reading the contents
...
minor versions of NiFi. It is, therefore, not recommended that implementations be developed outside of the NiFi codebase.
Provenance Repository
The Provenance Repository is responsible for storing, retrieving, and querying all Data Provenance Events. Each time that a FlowFile is
...
minor versions of NiFi. It is, therefore, not recommended that implementations be developed outside of the NiFi codebase.
Process Scheduler
In order for a Processor or a Reporting Task to be invoked, it needs to be scheduled to do so. This responsibility belongs to the Process Scheduler. In addition to
...
determine which Scheduling Strategy to use (Cron Driven, Timer Driven, or Event Driven), as well as the scheduling frequency.
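To make the Timer Driven strategy and its scheduling frequency concrete, here is a minimal sketch built on the standard `ScheduledExecutorService`. The class `SketchTimerScheduler` is hypothetical, and using a fixed delay between invocations is an assumption of this example rather than a description of how the Process Scheduler is implemented.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical timer-driven scheduler; not NiFi's ProcessScheduler.
final class SketchTimerScheduler {
    private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);

    // Invokes the component immediately, then again periodMillis after each run completes.
    ScheduledFuture<?> schedule(Runnable component, long periodMillis) {
        return executor.scheduleWithFixedDelay(component, 0L, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Stops all scheduled components and releases the worker thread.
    void shutdown() {
        executor.shutdownNow();
    }
}
```

The returned `ScheduledFuture` can be cancelled to stop a single component without shutting down the whole scheduler.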
FlowFile Queue
Though it sounds sufficiently simple, the FlowFile Queue is responsible for implementing quite a bit of logic. In addition to queuing the FlowFiles for another component
...
with the ability to prioritize the data so that the most important data is always sent first and the less important data eventually expires.
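The combination of prioritization and expiration described above can be sketched with a priority queue that drops items older than a configured age when they are polled. `QueuedItem` and `SketchFlowFileQueue` are hypothetical names, and checking expiration lazily at poll time is an assumption made to keep the example short; it is not a description of NiFi's FlowFile Queue implementation.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical stand-in for a queued FlowFile; not a NiFi class.
final class QueuedItem {
    final String id;
    final int priority;          // lower value means more important in this sketch
    final long enqueuedAtMillis; // time at which the item entered the queue

    QueuedItem(String id, int priority, long enqueuedAtMillis) {
        this.id = id;
        this.priority = priority;
        this.enqueuedAtMillis = enqueuedAtMillis;
    }
}

// Hypothetical queue that orders by a caller-supplied comparator and expires old items.
final class SketchFlowFileQueue {
    private final PriorityQueue<QueuedItem> queue;
    private final long expirationMillis;

    SketchFlowFileQueue(Comparator<QueuedItem> prioritizer, long expirationMillis) {
        this.queue = new PriorityQueue<>(prioritizer);
        this.expirationMillis = expirationMillis;
    }

    void put(QueuedItem item) {
        queue.add(item);
    }

    // Returns the most important item that has not expired, or null if none remains.
    QueuedItem poll(long nowMillis) {
        QueuedItem item;
        while ((item = queue.poll()) != null) {
            if (nowMillis - item.enqueuedAtMillis < expirationMillis) {
                return item;
            }
            // The item has expired, so it is dropped and the next one is examined.
        }
        return null;
    }
}
```

Passing the current time into `poll` keeps the sketch deterministic; a real queue would more likely read the clock itself.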
FlowFile Prioritizer
A core tenet of NiFi is that of data prioritization. The user should have the ability to prioritize the data in whatever order makes sense for
...
The Prioritizer is an extension point, and its API will not change from one minor release of NiFi to another but may change with a new
major release of NiFi.
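One natural way to express a user-chosen ordering in Java is a `Comparator`. The sketch below orders items by an explicit priority value and breaks ties by age. `PrioritizedItem` and `SketchPrioritizers` are hypothetical names introduced for illustration, and the rule "lower number is more important" is an assumption of the example.

```java
import java.util.Comparator;

// Hypothetical stand-in for a FlowFile; not a NiFi class.
final class PrioritizedItem {
    final String id;
    final int priority;     // lower value is treated as more important here
    final long entryMillis; // when the item entered the flow

    PrioritizedItem(String id, int priority, long entryMillis) {
        this.id = id;
        this.priority = priority;
        this.entryMillis = entryMillis;
    }
}

final class SketchPrioritizers {
    private SketchPrioritizers() {
    }

    // Most important first; ties are broken by oldest first.
    static final Comparator<PrioritizedItem> PRIORITY_THEN_OLDEST =
            Comparator.comparingInt((PrioritizedItem i) -> i.priority)
                      .thenComparingLong(i -> i.entryMillis);
}
```

Because comparators compose, several simple orderings can be chained into whatever overall order makes sense for a particular flow.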
Flow Controller
In order for NiFi's User Interface to display the wealth of information that it renders, it must have some place to gather that information.
...
is responsible for coordinating connection and participation in a NiFi cluster.
Cluster Manager
Whereas the Flow Controller is responsible for maintaining system-wide state about a particular node, the Cluster Manager is responsible
...
configuration found in the nifi.properties file.
Authority Provider
NiFi can be configured to run in secure mode, using SSL to access the web endpoints, or in non-secure mode,
...
The Authority Provider is an extension point, and its API will not change from one minor release of NiFi to another but may change with a new
major release of NiFi.
*-Resources
NiFi's User Interface shows only information that is available via the NiFi RESTful API. This is accomplished by accessing the different endpoints
...
All of the Resource components are found in the nifi-web-api module, in the org.apache.nifi.web.api package.
Bootstrap
In order for an organization to be able to depend on NiFi to handle their dataflows in an automated fashion, the organization needs to be able
...
to ensure that the application continues to provide reliable dataflow.
NarClassLoader
In a containerized environment like NiFi, it is important to allow different extension points to have arbitrary dependencies without those dependencies
...