Target release:
Epic:
Document status: DRAFT
Document owner: Aldrin Piri
Designer:
Developers:
QA:

Goals

  • Provide a feature-rich environment for aiding in the design, deployment, and management of flows in MiNiFi instances

Background and strategic fit

Being responsive to the evolving needs of data collection and aggregation requires that flows adapt as an organization's requirements for the information being collected change.

This need has two facets. The first is a user experience and interface for designing and versioning flows. The second is a means of making flows available for instances to receive, so that updated processing takes effect.

User Experience and Flow Design

This could be an extension of the core NiFi interface, but with a separate workspace and feel.  At its core, a minifi-api could be introduced which functions similarly to the nifi-api: a REST API that drives the user interface and core design functionality.  The reason for a separate module is to allow arbitrary enabling/disabling of the MiNiFi functionality in a NiFi instance.  While a similar user experience to NiFi in terms of design is extremely valuable, the context and palette available are quite distinct.  To that end, the workspace approach would allow a separate context for users to carry out the task of managing their MiNiFi flows, with tooling unique to that workflow.

Users could create flows on a per-MiNiFi-class basis.  A class is defined as a group of MiNiFi instances that share a common flow.  Versioning of these flows could use an approach similar to that outlined in Configuration Management of Flows.
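To make the notion concrete, a class might be modeled as little more than a named grouping of instances with a reference to its active flow.  The sketch below is purely illustrative; the class and field names are invented and do not reflect any existing module.

    import java.util.List;

    /**
     * Illustrative sketch only: one way to model a MiNiFi "class" -- a named
     * group of instances that share a common, versioned flow.
     */
    public class MiNiFiClass {
        private final String name;              // e.g. "retail-store-sensors"
        private final List<String> instanceIds; // instances belonging to the class
        private String activeFlowId;            // flow currently assigned to the class
        private int activeFlowVersion;          // version made available for deployment

        public MiNiFiClass(String name, List<String> instanceIds) {
            this.name = name;
            this.instanceIds = instanceIds;
        }

        /** Select the current, or active, flow for every instance in the class. */
        public void activateFlow(String flowId, int version) {
            this.activeFlowId = flowId;
            this.activeFlowVersion = version;
        }
    }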

Users would also be able to select the current, or active, flow for a given class of instances and make it available for deployment.  At minimum, metadata would include a hash or signature of the flow as well as an identifier.
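As a sketch of that minimum metadata, a SHA-256 digest over the serialized flow paired with an identifier would let consumers verify exactly which flow an instance is running.  The identifier scheme shown is hypothetical.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    /** Illustrative sketch: the minimum metadata for a deployable flow. */
    public class FlowMetadata {

        /** Compute a SHA-256 digest over the serialized flow content. */
        public static String sha256Hex(String serializedFlow) throws NoSuchAlgorithmException {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(serializedFlow.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        }

        public static void main(String[] args) throws Exception {
            String serializedFlow = "...";  // the serialized flow for a given class
            // A stable identifier plus a content digest ties data back to an
            // exact flow version (the identifier format here is hypothetical).
            System.out.println("flow-id: store-sensors/3");
            System.out.println("sha256:  " + sha256Hex(serializedFlow));
        }
    }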

Command & Control – Flow Deployment/Updating

The other scenario to be supported for MiNiFi is more application focused and provides the needed infrastructural components.  At its core, this introduces a Command and Control API (C&C API) which is inherently a defined set of REST endpoints and resources that could be implemented in any language of choice.  An initial implementation could be created in Java in a manner analogous to that of the aforementioned nifi-api and minifi-api modules.
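To give a feel for the shape of such an API, the fragment below sketches two resources as JAX-RS interfaces, assuming a Java implementation.  The paths and parameters are invented for illustration and would ultimately be dictated by the C&C API definition.

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.QueryParam;
    import javax.ws.rs.core.Response;

    /** Hypothetical sketch of a slice of the C&C API; not a defined contract. */
    @Path("/c2")
    public interface CommandAndControlResource {

        /** Fetch the active flow for a class of instances. */
        @GET
        @Path("/classes/{className}/flow")
        Response getActiveFlow(@PathParam("className") String className);

        /** Lightweight poll an instance can use to learn whether its flow is current. */
        @GET
        @Path("/classes/{className}/flow/metadata")
        Response getActiveFlowMetadata(@PathParam("className") String className,
                                       @QueryParam("currentHash") String currentHash);
    }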

An important note is the positioning and nature in which the C&C API could be deployed and utilized.  As systems extend farther from core infrastructure and networking, the means by which communication occurs grows in complexity, encompassing concerns such as availability, bandwidth, NAT traversal, and organizational and security policies.  As a result, there may be varying tiers of access and the need for a common API to be available and consumable in a distributed and possibly localized manner.  In NiFi environments, the idea of the Flow Persistence Provider could provide a façade to a more canonical repository of flows or cache and provide a subset of those flows locally.
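A minimal sketch of that façade idea follows, with the FlowRepository interface invented for illustration: a local provider serves a cached subset of flows and only reaches back to the canonical repository when needed.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    /** Invented for illustration: the canonical, possibly remote, store of flows. */
    interface FlowRepository {
        byte[] getFlow(String className, int version);
    }

    /** Sketch of a façade that serves a locally cached subset of flows. */
    public class CachingFlowRepository implements FlowRepository {
        private final FlowRepository canonical;
        private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

        public CachingFlowRepository(FlowRepository canonical) {
            this.canonical = canonical;
        }

        @Override
        public byte[] getFlow(String className, int version) {
            // Serve locally when possible; fall back to the canonical repository,
            // which may only be intermittently reachable from the network edge.
            return cache.computeIfAbsent(className + "/" + version,
                    key -> canonical.getFlow(className, version));
        }
    }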

Specific implementations of the C&C API could provide sophisticated provisioning of flows to subgroups of classes, akin to split testing, based upon individual MiNiFi instance metadata.
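As one simple, hypothetical selection rule, an implementation could hash a per-instance identifier into a percentage bucket so that only a subgroup of a class receives a canary flow.

    import java.util.Map;

    /** Hypothetical sketch of metadata-driven flow selection for split testing. */
    public class FlowVariantSelector {

        /** Route roughly percentCanary percent of a class to the canary flow. */
        public static String selectFlowId(Map<String, String> instanceMetadata,
                                          String stableFlowId, String canaryFlowId,
                                          int percentCanary) {
            String instanceId = instanceMetadata.get("instance.id"); // invented key
            int bucket = Math.floorMod(instanceId.hashCode(), 100);
            return bucket < percentCanary ? canaryFlowId : stableFlowId;
        }
    }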

Command & Control – Flow Consumption & Data Tagging

Flows could be consumed through various means, driven by the Configuration Change Notifier/Listener approach for which an initial implementation and design exists in the MiNiFi codebase.  This keeps MiNiFi agnostic to the mechanism by which flows are transferred to a given set of instances.  The preferred mechanism would be to make use of the C&C API directly, but some cases may require a file to be delivered to a specific directory.  While there may be advantageous paths as default means of transport, the C&C API in conjunction with extensible Configuration Change Notifiers allows instances to adapt to the realities of an organization's network and compute infrastructure.
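For orientation, a simplified sketch of the notifier/listener shape follows; the actual interfaces in the MiNiFi bootstrap module differ in their details, so treat this as an outline of the extension point rather than its real signatures.

    import java.io.InputStream;

    /** Simplified sketch; not the actual MiNiFi interface. */
    interface ConfigurationChangeListener {
        /** Invoked with the new flow configuration when a change is detected. */
        void handleChange(InputStream newFlowConfig);
    }

    /** Simplified sketch; a notifier watches one source of flow updates. */
    interface ConfigurationChangeNotifier {
        /** Register a listener to be fired when the watched source changes,
         *  whether that source is a C&C API poll, a watched directory, or
         *  some other transport suited to the local network. */
        void registerListener(ConfigurationChangeListener listener);
        void start();
        void stop();
    }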

Making use of immutably versioned flows provided to instances would allow FlowFile data and/or generated provenance events to be tagged as tied to a specific flow version.  This empowers the destination systems of MiNiFi data to make determinations about the inherent worth of the data received.  Where data is collected or generated by a system running an outdated flow, it may be of little or no value, or may require additional or separate processing.
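As an illustration, the tagging could be as simple as stamping a pair of attributes onto emitted data; the attribute names below are invented, not an established convention.

    import java.util.HashMap;
    import java.util.Map;

    /** Illustrative sketch: tying emitted data to an exact flow version. */
    public class FlowVersionTagger {

        public static Map<String, String> versionAttributes(String flowId, String flowHash) {
            Map<String, String> attrs = new HashMap<>();
            attrs.put("minifi.flow.id", flowId);       // hypothetical attribute name
            attrs.put("minifi.flow.sha256", flowHash); // hypothetical attribute name
            return attrs;
        }
    }

A destination NiFi flow could then route on these attributes, for example diverting data stamped with an outdated hash into separate processing.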

Assumptions

Requirements

# | Title | User Story | Importance | Notes
1 |       |            |            |
2 |       |            |            |

User interaction and design

Questions

Below is a list of questions to be addressed as a result of this requirements document:

Question | Outcome

Not Doing


Flow Authorship Details

10 Comments

  1. I think this writeup is a great start! I do have a few questions so far though.

    1. How different will nifi-api and minifi-api be? I can see them having different processors, slightly different capabilities, and deployment models but does this warrant a completely separate api layer? Would it be possible instead to keep the nifi-api but have it use appropriate design-time features based on context?
    2. Will the first iteration of the C&C api have high availability as a goal? If we have peer listing as a defined endpoint, each MiNiFi instance could curate a list of reachable peers in case of failure or a change in network topology.
    3. Do we intend for the Configuration Change Notifier/Listener interfaces to be extensible by dropping in a NAR? This would be more flexible for users of MiNiFi. The C&C implementation could still poll REST endpoints, but a hierarchical or peer-to-peer push model could be desirable if near-real-time updates are needed and the network topology supports it.

      1. I think there may be a good bit of overlap, but found it cleaner to just keep that endpoint separate from nifi-api so that it could be enabled/disabled separately and, if needed, have its own configuration in terms of network access.  No strong preferences otherwise.
      2. I think that seems fair.  When we start getting into this distributed process, it begins to be not just about the software but also about the networking and infrastructure supporting it.  Certainly having several endpoints that can support a given instance would be fair, but I could see this being something that is pushed more onto the client initially, where it fails over to another point.
      3. Not sure.  While I do like them as an extension point, I wonder if it needs that same level of extension.  I think the cases you mention are certainly good targets, but wonder if maybe there is a kind of fixed set of protocols that make sense and that is enough.  The NAR is primarily useful for the classloader isolation, which I feel could get heavy in the context of smaller instances.  

      Thanks!


      1. In regards to number 3, on the surface I think it's an interesting idea, but I don't think extensibility of notifiers is worth the downsides of the classloader isolation (i.e. increased footprint). Also, I don't foresee the ability to drop in new notifiers in arbitrary MiNiFi versions being particularly useful (kinda like queue prioritizers in NiFi). Lastly, if we kept it how it is, it would also allow us to remain flexible in regards to refactoring the notifier API between minor versions, which could be very helpful given how early we still are in terms of design.

  2. Great start Aldrin!

    one small feedback:

    It would be great to distinguish command and control from a management point of view. I can envision a lot of shops using something like ansible, puppet, chef, whatever to deploy MiNiFi. In such cases, users should be able to still edit the flow but not to push it, see the status but not to manipulate it and so it goes.

    Such a strategy would ease managing staged rollouts in alignment with host changes (instead of solely based on flow or NAR changes).

    1. Hey Andre,

      Thanks for the feedback, and I agree entirely on this front.  I would also invite you to scope out Configuration Management of Flows#FlowVersioning if you have not had the opportunity to do so yet.  That notion is definitely included there: a flow could be designed/saved but not necessarily deployed.  The case you call out, though, is certainly one to keep in mind and track.

      Thanks!

  3. Just echoing Andre and Bryan's praise, thanks for getting this started Aldrin!


    One thing that isn't mentioned, but would go a long way in terms of updating agents on the edge, is MiNiFi version and NAR updates. After deploying 1000s+ of agents on the edge, you don't want to have to manually update each one when a new version comes out. Being able to automatically update the MiNiFi version as well as deploy new NARs would be a game-changer.


    From the MiNiFi agent perspective, NAR updates probably wouldn't be that hard since NARs are designed to be isolated in a logical package. All it would require is receiving/pulling them (like with a new flow), adding them to the lib dir, wiping the work dir, and restarting the underlying instance, as sketched below. Then on the centralized side, it would probably be part of the extension registry?
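    A rough sketch of that sequence (the paths and restart hook are placeholders, not actual MiNiFi internals):

        import java.io.IOException;
        import java.nio.file.Files;
        import java.nio.file.Path;
        import java.nio.file.StandardCopyOption;
        import java.util.Comparator;
        import java.util.stream.Stream;

        /** Placeholder sketch of the NAR update sequence described above. */
        public class NarUpdater {

            public static void applyNarUpdate(Path downloadedNar, Path libDir, Path workDir)
                    throws IOException {
                // 1. Add the newly received NAR to the lib directory.
                Files.copy(downloadedNar, libDir.resolve(downloadedNar.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);

                // 2. Wipe the work directory so NARs are unpacked fresh on startup.
                if (Files.exists(workDir)) {
                    try (Stream<Path> paths = Files.walk(workDir)) {
                        paths.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
                    }
                }

                // 3. Restart the underlying instance (left to the bootstrap process).
                requestRestart();
            }

            private static void requestRestart() {
                // Placeholder: in practice the bootstrap would restart the JVM.
            }
        }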


    Updating the MiNiFi version would be much harder though as it would require updating the bootstrap module itself (which currently handles flow changes). On the centralized side, it probably makes sense to have it in the extension registry as well. 


    We should focus first on Flow C&C, but NAR and MiNiFi version updating are good longer-term goals to keep in mind.

  4. All,

    I have played with MiNiFi in a slightly more production-like environment and here's my feedback regarding C&C:

    1. It is VERY likely MiNiFi will use TLS to link with secure NiFi clusters. Therefore, simplified provisioning of certificates to enable mutual authentication should be paramount.
      I believe we should consider the following use cases:
      1. Certificate lifecycles are managed externally (e.g. FreeIPA). No need to reinvent the wheel; let people do it.
      2. Integrated certificate lifecycle. We should possibly look at implementing RFC 7030 or jscep servers in NiFi and respective clients in MiNiFi.
        1. Support a basic CA (by CA I mean an interface that can be used to sign, renew, and revoke certificates).
        2. In some cases clients will want to use an existing CA, in these cases, NiFi should be able to direct the RFC 7030 / jscep messages to the final CA using suitable protocols.

    2. Flow update:
      I like the idea of giving great freedom to the DFM to define how to update the agents, preferably using the NiFi canvas itself. The "C&C canvas" could be a special instance of a process group, requiring just a way for clients to communicate. To this point, perhaps we could extend the Site-to-Site protocol or simply extend HandleHttpRequest / HandleHttpResponse to provide a series of pre-canned endpoints such as:

      1. GET /minificc/TargetFlowVersion (to ship a template for example)
      2. GET /minificc/TargetMiNiFiVersion (to ship a new MiNiFi version)
      3. GET /minificc/TargetNarVersion (to ship a particular NAR bundle)
      4. POST /minificc/AgentStatus (to ping with status)
      5. anything else (to allow MiNiFi to be extended).

      And literally let the DFM use NiFi dataflows to manage the agent fleet? 

      1. Client Registry is important? No issues: write the data provided in AgentStatus into the registry using a processor (PutCouchbase, for example). A standard registry and UI may be offered, but instead of being prescriptive, we give the DFM the ability to create their own using the dataflow.
      2. Updating flows is important? Well, point GET TargetFlowVersion to GetFile $NIFI_HOME/templates/${some_identifier} and ship a template we just exported.
    3. This way we could focus exclusively on the agent-side behaviour that is critical:

      1. Acquiring connectivity with the cluster (i.e. provisioning certs if necessary)
      2. Ensure there's always a version of the flow that is operational (e.g. bootstrap looking for an "upgrade_required" status, plus current and new flow version history)
      3. Provide properties (via registry) that can be used by the control endpoints to classify the agent (e.g. uuid, device name, UserDefinedValue1, UserDefinedValue2, UserDefinedValue3).
      4. Something more?

    To be honest, I suspect I oversimplified things (the devil always lies in the details), but this idea doesn't seem too far away from what I understand AWS has implemented as part of AWS IoT (MQTT backed by a rules engine, Lambda, and DynamoDB). I would suggest the difference is that we provide users with an API but also a flexible agent.


    Thoughts?

    1. "At its core, this introduces a Command and Control API (C&C API) which is inherently a defined set of REST endpoints and resources that could be implemented in any language of choice.  An initial implementation could be created in Java in a manner analogous to that of the aforementioned nifi-api and minifi-api modules. "


      Implies to me a rather heavyweight protocol, but much less so than TLS.

      Would it be useful to define classes of MiNiFi agents? 

      I imagine a situation where you have some MiNiFi agents that could support TLS, while there are others whose throughput may be impacted by the addition of TLS. Has there been discussion of how we view different classes of MiNiFi agents and when/if security can be viewed differently?

      Further, C2 typically has some investment in routing protocols. I don't see much in terms of a direction for how routing will occur across the API, or whether this is something to be left to an external agent.

  5. With a centralized C2, would you support decentralizing the distribution of C2 commands, such that you limit entry points into a network with a single node (or nodes) distributing C2 commands on behalf of the central C2 server?

  6. Can you define "In NiFi environments, the idea of the Flow Persistence Provider could provide a façade to a more canonical repository of flows or cache and provide a subset of those flows locally." in regards to MiNiFi. It makes complete sense in relation to NiFi and installations thereof, but it begs the question of survivability of said data when evaluating the variability of agents.