Q: Where can I find information about upgrading to a new NiFi version?

  • A: A dedicated upgrade guide still needs to be put together; in the meantime, there are some notes on upgrading in the System Properties section of the Administrator Guide. Those tips cover setting up your current NiFi installation in a way that makes future upgrades easier.

Q: Where can I find information about the REST API?

  • A: The REST API documentation is included in the "help" documentation within the application and also on our web site here. To get to the documentation within the application, click on the "help" link in the upper-right corner of the NiFi user interface. Then, in the pane on the left-hand side, scroll down to the very bottom, where you will see a Developer section, with links to the Developer Guide and the REST API documentation.

Q: What is the base endpoint for the NiFi REST API?

  • A: http://[server]:8080/nifi-api
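
    For example, a quick way to confirm the endpoint is reachable is to request the /flow/about resource, which returns basic version information as JSON. Below is a minimal sketch using Java 11's built-in HttpClient; it assumes an unsecured NiFi instance on localhost:8080 (a secured instance would additionally require authentication, for example a token obtained from /access/token).

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class NiFiApiExample {
        public static void main(String[] args) throws Exception {
            // Base endpoint; replace localhost with your NiFi host if needed.
            String baseUrl = "http://localhost:8080/nifi-api";

            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(baseUrl + "/flow/about"))
                    .header("Accept", "application/json")
                    .GET()
                    .build();

            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("Status: " + response.statusCode());
            System.out.println(response.body());
        }
    }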

Q: How do I select multiple items on the graph at the same time, such as if I want to select and move a group of processors?

  • A: You can either select one item and then hold down the Shift key while you select other items, or you can click on a blank part of the canvas outside what you want to select, hold down the Shift key, and drag a selection box around the items you want to select.

Q: Am I correct in assuming that I can transit large volumes of data through NiFi flows in and out of Hadoop?

  • A: Yes, large payloads can be moved through NiFi. As data moves through NiFi, what is actually passed around is a pointer to the data, referred to as a FlowFile; the content of a FlowFile is only accessed as needed.
    The key for large payloads is to operate on them in a streaming fashion, so that you never read entire large payloads into memory and exhaust the JVM's memory. As an example, a typical pattern for bringing data into HDFS from NiFi is to use a MergeContent processor right before a PutHDFS processor. MergeContent can take many small and medium-sized files and merge them together to form an appropriately sized file for HDFS. It does this by copying all of the input streams from the original files to a new output stream, and can therefore merge a large number of files without exceeding the memory of the JVM. A rough sketch of this streaming-copy idea is shown below.
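
    The following is a minimal, hypothetical sketch of that streaming-copy idea in plain Java (it is not NiFi's actual MergeContent implementation): each source file is copied to a single output through a small fixed-size buffer, so memory use stays constant no matter how many files are merged or how large they are. NiFi's processor API works on the same principle, exposing FlowFile content as streams so a processor never has to hold an entire payload in memory.

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.List;

    public class StreamingMergeSketch {

        // Copy each source file into a single destination using a small, fixed-size buffer.
        static void merge(List<Path> sources, Path destination) throws IOException {
            byte[] buffer = new byte[8192]; // constant memory, independent of payload size
            try (OutputStream out = Files.newOutputStream(destination)) {
                for (Path source : sources) {
                    try (InputStream in = Files.newInputStream(source)) {
                        int read;
                        while ((read = in.read(buffer)) != -1) {
                            out.write(buffer, 0, read);
                        }
                    }
                }
            }
        }
    }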

Q: What happens to my data if there is a power loss and the system goes down?

  • A: NiFi stores the data in its repositories as it traverses the system. There are three key repositories: the FlowFile Repository, the Content Repository, and the Provenance Repository. As a Processor writes data to a FlowFile, that content is streamed directly to the Content Repository. When the Processor finishes, it commits the session (essentially marking a transaction as complete). This triggers the Provenance Repository to be updated to include the events that occurred for that Processor, and the FlowFile Repository is then updated to keep track of where in the flow the FlowFile is. Finally, the FlowFile can be moved to the next queue in the flow. This way, if power is lost at any point, NiFi is able to resume where it left off.
    This, however, glosses over one detail: by default, when the repositories are updated, the information is written to disk, but the write is often cached by the operating system. If you truly have a complete loss of power, it is possible to lose those updates to the repositories. This can be avoided by configuring the repositories in the nifi.properties file to always sync to disk, though doing so can be a significant hindrance to performance. Simply killing the NiFi process, though, is not problematic, as the operating system is still responsible for flushing that data to disk.
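
    For reference, syncing the repositories to disk is controlled in nifi.properties with entries roughly like the following (the property names below reflect recent NiFi releases; confirm the exact names in the System Properties section of the Administrator Guide for your version):

    # Force repository updates to be synced to disk: safer on power loss, slower overall
    nifi.flowfile.repository.always.sync=true
    nifi.content.repository.always.sync=true
    nifi.provenance.repository.always.sync=true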

Q: How do I enable debug logging for a specific processor, rather than system-wide?

  • A: The logging is configured in <NIFI_HOME>/conf/logback.xml. In that file, you can specify the fully qualified class name for the processor you want to enable specific logging for. For example:

    <logger name="org.apache.nifi.processors.standard.GetFile" level="DEBUG"/>

    This would enable DEBUG-level logging for every GetFile processor in your flow.
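
    Because logback loggers are hierarchical, you can also raise the level for an entire package rather than a single class. For example, the following (shown as a sketch; adjust the package name to match the processors you are interested in) enables DEBUG-level logging for all of the standard processors:

    <logger name="org.apache.nifi.processors.standard" level="DEBUG"/>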

...