This article walks through some of the configuration options in MiNiFi C++. As discussed in the Readme, MiNiFi C++ is configured through two files: a config YAML file that contains the flow configuration, and minifi.properties, which holds general properties for the MiNiFi process. Users can use the MiNiFi toolkit converter to help create flow configurations from a template exported from a NiFi instance, but must check the generated processors against PROCESSORS.md, because some properties are not supported or are named differently. This document assumes that the YAML configuration is generated by the toolkit. The Flow Configuration Options section, below, describes each section of the flow configuration file. Note that the examples below are taken from the GitHub Readme file. System Properties is a child page that defines all system properties that are specified in minifi.properties.

System Properties

  The System Properties table defines each system property and gives its default value, if one exists.

Flow Configuration Options

Processors

Processors are defined within the YAML configuration, as shown below. You must specify the number of concurrent tasks, the scheduling strategy, the NiFi class, and any properties that need to be defined. The id is a unique identifier that must be defined for each processor. When specifying connections, you reference the success relationship and this identifier as the source ID, the destination ID, or both. Note that if the specified properties do not exist for the processor, it will fail to run.

Processors:
    - name: GetFile
      id: 471deef6-2a6e-4a7d-912a-81cc17e3a206
      class: org.apache.nifi.processors.standard.GetFile
      max concurrent tasks: 1
      scheduling strategy: TIMER_DRIVEN
      scheduling period: 1 sec
      penalization period: 30 sec
      yield period: 1 sec
      run duration nanos: 0
      auto-terminated relationships list:
      Properties:
          Input Directory: /tmp/getfile
          Keep Source File: true
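
The auto-terminated relationships list entry is left empty in the example above. As a minimal sketch, a processor at the end of a flow might list the relationships it terminates. The LogAttribute processor, its UUID, and the Log Level property shown here are illustrative assumptions rather than part of the Readme example, and should be checked against PROCESSORS.md before use.

```yaml
Processors:
    - name: LogAttribute
      # hypothetical UUID; any identifier unique within the flow will do
      id: 471deef6-2a6e-4a7d-912a-81cc17e3a209
      class: org.apache.nifi.processors.standard.LogAttribute
      max concurrent tasks: 1
      scheduling strategy: TIMER_DRIVEN
      scheduling period: 1 sec
      penalization period: 30 sec
      yield period: 1 sec
      run duration nanos: 0
      # flowfiles routed to success are dropped here instead of being passed on
      auto-terminated relationships list:
          - success
      Properties:
          # assumed property name; verify against PROCESSORS.md
          Log Level: info
```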


Connections

Connections define the relationships between sources and destinations, including input and output ports, within the defined flow. The example below takes the output of the processor defined in the previous section and passes it to a remote processing group upon success. Options within the connection limit the maximum work queue size and the amount of time flowfiles may spend in the queue.

Connections:
    - name: TransferFilesToRPG
      id: 471deef6-2a6e-4a7d-912a-81cc17e3a207
      source name: GetFile
      source id: 471deef6-2a6e-4a7d-912a-81cc17e3a206 
      source relationship name: success
      destination id: 471deef6-2a6e-4a7d-912a-81cc17e3a204
      max work queue size: 0
      max work queue data size: 1 MB
      flowfile expiration: 60 sec
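
A connection can also join two processors directly, in which case the destination id is the id of the downstream processor. The sketch below assumes that a second processor with the hypothetical id 471deef6-2a6e-4a7d-912a-81cc17e3a209 is defined under Processors. The connection name, its id, and the queue limits are illustrative values only.

```yaml
Connections:
    - name: GetFileToLogAttribute
      # hypothetical UUID for this connection
      id: 471deef6-2a6e-4a7d-912a-81cc17e3a20a
      source name: GetFile
      source id: 471deef6-2a6e-4a7d-912a-81cc17e3a206
      source relationship name: success
      # id of the downstream processor (hypothetical)
      destination id: 471deef6-2a6e-4a7d-912a-81cc17e3a209
      # illustrative queue limits: 100 flowfiles or 1 MB of data
      max work queue size: 100
      max work queue data size: 1 MB
      # flowfiles may spend up to 60 seconds in the queue
      flowfile expiration: 60 sec
```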


Remote Processing Groups

Remote processing groups are processors that define input and output ports for a NiFi instance using the Site-to-Site protocol. In this example, the URL points to a local instance of NiFi, and a single input port is defined. The id of this input port is the UUID of the input port on the NiFi instance. In this case, the port and host name are defined as properties within the input port. The number of concurrent tasks can be configured to limit or increase the concurrency with which data is sent to NiFi.

Remote Processing Groups:
    - name: NiFi Flow
      id: 471deef6-2a6e-4a7d-912a-81cc17e3a208
      url: http://localhost:8080/nifi
      timeout: 30 secs
      yield period: 10 sec
      Input Ports:
          - id: 471deef6-2a6e-4a7d-912a-81cc17e3a204
            name: From Node A
            max concurrent tasks: 1
            Properties:
                Port: 10001
                Host Name: localhost


Provenance Reporting

To add Provenance Reporting to config.yml, define a configuration block like the one below. Specify the host, port, and port UUID of your RPG, which is reached using the Site-to-Site protocol. The batch size limits the number of provenance reports that are sent to that NiFi instance in one batch.

Provenance Reporting:
  scheduling strategy: TIMER_DRIVEN
  scheduling period: 1 sec
  port: 10001
  host: localhost
  port uuid: 471deef6-2a6e-4a7d-912a-81cc17e3a204
  batch size: 100

Controller Services

If you need to reference a controller service in your config.yml file, use the following template. In the example below, ControllerServiceClass is the name of the class defining the controller service. ControllerService1 is linked to ControllerService2, and requires the latter to be started before ControllerService1 can start.

Controller Services:
  - name: ControllerService1
    id: 2438e3c8-015a-1000-79ca-83af40ec1974
    class: ControllerServiceClass
    Properties:
      Property one: value
      Linked Services:
        - value: ControllerService2
  - name: ControllerService2
    id: 2438e3c8-015a-1000-79ca-83af40ec1992
    class: ControllerServiceClass
    Properties:
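
A processor typically refers to a controller service by name through one of its own properties. The sketch below illustrates the idea with an SSL context service used by an InvokeHTTP processor. The class names, the SSL Context Service, HTTP Method, and Remote URL properties, the processor UUID, and the URL are all assumptions made for illustration; confirm them against PROCESSORS.md and the controller service documentation for your MiNiFi C++ version.

```yaml
Controller Services:
    - name: SSLServiceName
      id: 2438e3c8-015a-1000-79ca-83af40ec1974
      # assumed controller service class
      class: SSLContextService
      # certificate and key properties omitted from this sketch
      Properties:

Processors:
    - name: InvokeHTTP
      # hypothetical UUID
      id: 471deef6-2a6e-4a7d-912a-81cc17e3a20b
      class: org.apache.nifi.processors.standard.InvokeHTTP
      max concurrent tasks: 1
      scheduling strategy: TIMER_DRIVEN
      scheduling period: 1 sec
      penalization period: 30 sec
      yield period: 1 sec
      run duration nanos: 0
      auto-terminated relationships list:
      Properties:
          # assumed property name; its value matches the controller service name above
          SSL Context Service: SSLServiceName
          HTTP Method: GET
          # hypothetical endpoint
          Remote URL: https://localhost:8443/example
```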