...

  • Establishes a two-process mechanism similar to that of existing NiFi:
    1. Bootstrap Process: controls the instantiation and execution of the flow process and aids in receiving configuration changes (the products of the design and deploy approach)
    2. Flow Process: handles the actual collection and transmission of data
  • Makes use of a configured state to drive the process of starting a flow. This should be extensible to allow various implementations of inputs.
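
As an illustration of the two-process split, the following is a minimal sketch of how a bootstrap process might build the command for the flow process and launch it as a child JVM. The class name `BootstrapSketch`, the main class `example.FlowMain`, and the configuration path are assumptions made for this example and are not taken from the agent's code base.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bootstrap-side helper; all names here are illustrative only. */
final class BootstrapSketch {

    /** Builds the command line used to run the flow process in a separate JVM. */
    static List<String> flowProcessCommand(String mainClass, Path configPath) {
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        List<String> command = new ArrayList<>();
        command.add(javaBin);
        command.add("-cp");
        command.add(System.getProperty("java.class.path"));
        command.add(mainClass);             // entry point of the flow process (assumed)
        command.add(configPath.toString()); // the configured state that drives the flow
        return command;
    }

    /** Starts the flow process; the bootstrap keeps the handle so it can stop or restart it. */
    static Process startFlowProcess(String mainClass, Path configPath) throws IOException {
        return new ProcessBuilder(flowProcessCommand(mainClass, configPath))
                .inheritIO()
                .start();
    }

    /** Stops the running flow process and starts a new one, e.g. after a configuration change. */
    static Process restart(Process running, String mainClass, Path configPath)
            throws IOException, InterruptedException {
        running.destroy();
        running.waitFor();
        return startFlowProcess(mainClass, configPath);
    }
}
```

Keeping the configuration path as an argument to the child process is one way to let the configured state drive startup; other input implementations could supply the same state by different means.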

Jira (ASF JIRA) query: project = MINIFI AND component = "Agent Configuration/Installation" AND resolution = Unresolved ORDER BY key DESC

Registration/Announcement

Agents will have a defined taxonomy and capabilities associated with them. These properties will help an agent communicate what it can do and help flow designers create flows for various agent classes. Capabilities will be communicated to a manager so that it understands what is possible with various agents. Capabilities and capacities may change over time, and this information will be continually registered with associated systems.

Agent Classes

Longer term, agents should be able to convey their capabilities as determined by factors such as environment, software version, networking, and hardware. From a manager's perspective, this information supports establishing the configuration of flows and of the data collected.

...

  • ConfigurationChangeListener - Provides the handling of updates to the agent from an external source
    • In the simplest case, this would be evaluating changes to a configuration file
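
The simplest case above, evaluating changes to a configuration file, can be sketched as follows. This is a minimal illustration and not the agent's actual interface: `ConfigChangeListenerSketch`, `FileChangeDetectorSketch`, and the digest-based comparison are assumptions chosen for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical listener shape; the real ConfigurationChangeListener may differ. */
interface ConfigChangeListenerSketch {
    void onConfigurationChanged(byte[] newConfiguration);
}

/** Checks a configuration file and notifies listeners when its content digest changes. */
final class FileChangeDetectorSketch {
    private final Path configFile;
    private final List<ConfigChangeListenerSketch> listeners = new ArrayList<>();
    private byte[] lastDigest;

    FileChangeDetectorSketch(Path configFile) throws IOException {
        this.configFile = configFile;
        // Remember the digest of the configuration the agent started with.
        this.lastDigest = digest(Files.readAllBytes(configFile));
    }

    void addListener(ConfigChangeListenerSketch listener) {
        listeners.add(listener);
    }

    /** Returns true if the file content changed since the last check and listeners were notified. */
    boolean checkOnce() throws IOException {
        byte[] content = Files.readAllBytes(configFile);
        byte[] current = digest(content);
        if (Arrays.equals(current, lastDigest)) {
            return false;
        }
        lastDigest = current;
        for (ConfigChangeListenerSketch listener : listeners) {
            listener.onConfigurationChanged(content);
        }
        return true;
    }

    private static byte[] digest(byte[] content) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(content);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }
}
```

A caller would typically invoke `checkOnce()` on a schedule. Comparing digests rather than modification times avoids reacting to a file that was rewritten with identical content.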

 

Configuration - Processing Flow

  • Design and Deploy driven, where the associated flow is provided via the bootstrap process

Jira (ASF JIRA) query: project = MINIFI AND component = "Agent Configuration/Installation" AND resolution = Unresolved AND priority = Major ORDER BY key DESC

Data Format

The FlowFile format has been the core serialization format of NiFi. It provides structure that eases the movement of files through a given flow and exploits pass-by-reference semantics in routing operations. Of interest is how information is handled with the core FlowFile format as metadata is transmitted from the agent to a receiving node or system. This may be done out of band or as an augmentation to the FlowFile format.

Provenance

Agent Statistics/Heartbeat

Jira (ASF JIRA) query: project = MINIFI AND component = "Data Format" AND resolution = Unresolved AND priority = Major ORDER BY key DESC

Data Ingress

Provides a means for introducing data into the system; ingress currently maps to existing processors. Given the desire to make use of existing libraries and functionality when developing the initial agent offering, the focus will be on the core use cases that map to existing processors. These comprise:

  • Files (Tail, Get)
  • Logs (Listen Syslog, UDP)

Data Egress

Egress is a high-level term for getting data from an agent to an associated system. The complexity and needs of this functionality may vary across environments, and some environments may require complex networking schemes.

Communication and Protocols

For the existing proof of concept, and to establish an agent on which larger architectural decisions can be based, the Java agent can make use of the existing Site-to-Site protocol and functionality to communicate with an endpoint system.

Jira (ASF JIRA) query: project = MINIFI AND component = "Data Transmission" AND resolution = Unresolved AND priority = Major ORDER BY key DESC


Glossary

Agent

A lightweight process that can be constructed to acquire information from one or more host systems and provide this information to another system for consumption. This process provides provenance, a directed graph of processing, and extensibility to map to various data formats, schemas, and protocols.

Capability

Functionality that a given agent is able to perform. In some contexts, this may be communicating with specific devices, handling data of a certain nature or complexity, providing compute power, or serving specific roles in the data ingress/egress process from generation to consumer.

Class

An aggregation of one or more capabilities that allows specific agents to carry out a given processing graph. For example, a high-level view of a "File Forwarder" class would require both the capability to interact with the file system to get files and one or more egress methods to return information to a desired consumer.
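
The relationship between classes and capabilities can be sketched as a set check. The following is a minimal illustration under stated assumptions: the capability names, the `AgentClassSketch` type, and the split into "all of" and "any of" requirements are invented for this example to model the "File Forwarder" case and are not part of any defined taxonomy.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

/** Illustrative capability names; the actual taxonomy is not defined here. */
enum CapabilitySketch { FILE_SYSTEM_READ, SITE_TO_SITE_EGRESS, HTTP_EGRESS, SYSLOG_LISTEN }

/** Hypothetical agent class: capabilities that are all required, plus a group where one suffices. */
final class AgentClassSketch {
    private final String name;
    private final Set<CapabilitySketch> requiredAll = EnumSet.noneOf(CapabilitySketch.class);
    private final Set<CapabilitySketch> requiredAnyOf = EnumSet.noneOf(CapabilitySketch.class);

    AgentClassSketch(String name, Set<CapabilitySketch> requiredAll, Set<CapabilitySketch> requiredAnyOf) {
        this.name = name;
        this.requiredAll.addAll(requiredAll);
        this.requiredAnyOf.addAll(requiredAnyOf);
    }

    String name() {
        return name;
    }

    /** True if an agent with the given capabilities can carry out this class's processing graph. */
    boolean canBeRunBy(Set<CapabilitySketch> agentCapabilities) {
        boolean hasAllRequired = agentCapabilities.containsAll(requiredAll);
        // An empty "any of" group imposes no additional requirement.
        boolean hasOneOfGroup = requiredAnyOf.isEmpty()
                || !Collections.disjoint(agentCapabilities, requiredAnyOf);
        return hasAllRequired && hasOneOfGroup;
    }
}
```

With this shape, a "File Forwarder" would list file system access under the required capabilities and its acceptable egress methods under the "any of" group.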

Egress

A generic term for providing information from an agent to one or more consumers. In its simplest form, this is a direct line through networking to send data to a desired target. In more complex environments, this may require an n-hop network relying on several other agents to relay the data throughout the network traversal.
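
To make the n-hop case concrete, the sketch below finds a relay path with the fewest hops from an agent to a consumer using a breadth-first search. This is only an illustration: this page does not specify a routing strategy, and the node names, the link map, and the class `RelayRouteSketch` are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical route finder over a map of "node -> nodes it can send to". */
final class RelayRouteSketch {

    /** Returns the path with the fewest hops from source to target, or an empty list if unreachable. */
    static List<String> fewestHops(Map<String, List<String>> links, String source, String target) {
        Map<String, String> previous = new HashMap<>(); // node -> node it was reached from
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(source);
        visited.add(source);

        while (!queue.isEmpty()) {
            String node = queue.remove();
            if (node.equals(target)) {
                // Walk back from the target to the source to rebuild the path.
                LinkedList<String> path = new LinkedList<>();
                for (String at = target; at != null; at = previous.get(at)) {
                    path.addFirst(at);
                }
                return path;
            }
            for (String next : links.getOrDefault(node, Collections.<String>emptyList())) {
                if (visited.add(next)) {
                    previous.put(next, node);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

In the direct case the returned path contains only the agent and the consumer; in the n-hop case it also lists each relaying agent in order.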