
Target release
Epic
Document status: DRAFT
Document owner

Andy LoPresto

Designer
Developers
QA

Goals

  • Document planned security feature roadmap
  • Solicit community feedback on goals, obstacles, user experience, and solutions

**Note: I wanted to document my thoughts here, but all of this is at a very early stage and I welcome community feedback. What challenges do users and developers face with security, what trade-offs are people willing to make, what do they not understand, etc.? I will clean up the formatting, and as these features become better described and captured, I will break them out into individual feature proposals. I also need to correlate the existing Jiras and link them here.**

Community Security Features Roadmap

TLS
* Refactor and consolidate {{SSLContextFactory}}, {{SSLContextService}}
* Refactor internals of TLS Toolkit
- "1-click" deployment with scripts for client certificate import to browser, etc.
* Individual configurations for
- UI/API
- Cluster
- Site to Site
- Processors ({{SSLContextService}})
* Mozilla Labs integration
- Easy assessment
- Easy config
- Automatic cipher suite upgrades/deprecations

Encrypted Config
* Login Identity Provider coverage
* Integration with Ambari
- CET read password/key from file descriptor
- Clean up original properties file after encryption
- Reversibility?
* Integration with Variable Registry
* Remove master key from {{bootstrap.conf}}
* Process monitor opacity
* Other provider integrations
- HSM
- Hadoop {{CredentialsProvider}} API
- Vault
- KeyWhiz
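
The "read password/key from file descriptor" item above is meant to keep the key off the command line and out of files on disk. A minimal sketch of the idea follows, assuming the calling process (e.g. Ambari) hands the key over an inherited pipe. The helper name `read_key_from_fd`, the hex encoding, and the size limit are assumptions made for illustration, not part of any existing tool.

```python
import os


def read_key_from_fd(fd: int, max_len: int = 4096) -> bytes:
    """Read a hex-encoded key from an already-open file descriptor.

    Hypothetical helper: the name, hex encoding, and size limit are
    illustrative assumptions.
    """
    chunks = []
    remaining = max_len
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:  # EOF: the writer closed its end of the pipe
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    os.close(fd)
    return bytes.fromhex(b"".join(chunks).decode("ascii").strip())


def demo() -> bytes:
    # Simulate a parent process handing the key over a pipe.
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"00112233445566778899aabbccddeeff\n")
    os.close(write_fd)
    return read_key_from_fd(read_fd)
```

With this approach the key never appears in the process arguments, which also relates to the "process monitor opacity" item.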

Provenance
* Signature on each event record (all metadata of event record, content claim ID/hash of content -> {{HMAC/SHA-256}} with per-node unique secret)
* Signature on chain (concatenation of all ER sigs -> {{HMAC/SHA-256}})
- CR key could be unique per-node
+ Node A -> Node B requires chain sigs on Node B to be {{S(S(A1|A2, KAC)|B1, KBC)}}; can't verify {{S(A1|A2, KAC)}} on Node B
+ {{KAC}}, {{KBC}}, etc. could be public/private key pairs derived from constant master key across all nodes (would allow cross-node verification)
- CR key could be same across all nodes in cluster/S2S deployment
+ Verifiability extends across entire lifespan of lineage data
- Investigate Axolotl (Double Ratchet), blockchain/alternative chain
* UI display of provenance trust
- Per-event and per-chain
- Alerts for manipulated/modified provenance data
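
The per-event and per-chain signatures described above could be sketched roughly as follows, using HMAC/SHA-256 from the Python standard library. The function names, the JSON serialization of event metadata, and the sample keys are assumptions for illustration; the real event record format would differ.

```python
import hashlib
import hmac
import json


def sign_event(event: dict, content_hash: str, node_key: bytes) -> bytes:
    """HMAC/SHA-256 over the event metadata plus the hash of the content."""
    # Canonical JSON (sorted keys) so the same event always serializes identically.
    payload = json.dumps(
        {"event": event, "content": content_hash},
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    return hmac.new(node_key, payload, hashlib.sha256).digest()


def sign_chain(signatures: list, chain_key: bytes) -> bytes:
    """HMAC/SHA-256 over the concatenation of signatures, in lineage order."""
    # Every signature is 32 bytes, so plain concatenation is unambiguous.
    return hmac.new(chain_key, b"".join(signatures), hashlib.sha256).digest()


def verify_chain(signatures: list, chain_key: bytes, expected: bytes) -> bool:
    """Constant-time comparison of a recomputed chain signature."""
    return hmac.compare_digest(sign_chain(signatures, chain_key), expected)


# Hypothetical per-node secrets (illustrative values only).
k_a, k_ac = b"node-A-event-key", b"node-A-chain-key"
k_b, k_bc = b"node-B-event-key", b"node-B-chain-key"

# Node A signs two events, then the chain: S(A1|A2, KAC).
a1 = sign_event({"id": 1, "type": "RECEIVE"}, "hash-1", k_a)
a2 = sign_event({"id": 2, "type": "CONTENT_MODIFIED"}, "hash-2", k_a)
chain_a = sign_chain([a1, a2], k_ac)

# Node B extends the chain: S(S(A1|A2, KAC)|B1, KBC).
b1 = sign_event({"id": 3, "type": "SEND"}, "hash-2", k_b)
chain_b = sign_chain([chain_a, b1], k_bc)
```

Note that Node B can verify `chain_b` only by treating `chain_a` as an opaque input; it cannot recompute `chain_a` without `k_ac`, which is the cross-node verification limitation noted above.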

Repositories
* Transparent encryption of repository before persistence to file system
- {{Provenance}}
- {{Content}}
- {{Flowfile}} (attributes)
- {{Log}}? (intercept, filtering, hash/obfuscation?)
- {{Bulletin}}? (currently volatile impl only)
- {{ComponentStatus}}? (currently volatile impl only)
- {{Counter}}? (currently volatile impl only)
* Balance performance with security (retrieval has high cost)
* At which layer does the encryption/decryption occur (closest to file system, in the actual implementation of the {{*Repository}} interface, or via AOP)?

Sensitive Attributes
* Mark attributes as sensitive (i.e. masked in UI, restricted to specific user access policy)
- Per-processor (e.g. all attributes originating from any {{EncryptContent}} processor)
- Per-instance (e.g. attributes originating from a specific UpdateAttribute processor which is handling PII)
* Encrypt before persisting even if not using {{EncryptedProvenanceRepository}}/{{EncryptedFlowfileRepository}}
* Existence of attribute could even be sensitive (e.g. SSN attribute, "security level" attribute)
* Cannot be modified/updated/removed by other processors?
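
One way to picture the masking and "existence is sensitive" behavior above is a per-user view over the attribute map. This is only a sketch under assumed names: `visible_attributes`, the rule format, and the `view-pii` policy are hypothetical and do not correspond to an existing API.

```python
MASK = "********"


def visible_attributes(attributes: dict, sensitive: dict, user_policies: set) -> dict:
    """Return the attribute view a given user should see.

    `sensitive` maps an attribute name to (required_policy, hide_existence).
    Both the rule format and the policy names are illustrative assumptions.
    """
    view = {}
    for name, value in attributes.items():
        rule = sensitive.get(name)
        if rule is None:
            view[name] = value  # not marked sensitive: show as-is
            continue
        required_policy, hide_existence = rule
        if required_policy in user_policies:
            view[name] = value  # authorized user sees the real value
        elif not hide_existence:
            view[name] = MASK  # name is visible, value is masked
        # otherwise the attribute is omitted entirely
    return view


attrs = {"filename": "a.csv", "card": "4111", "ssn": "123-45-6789"}
rules = {"card": ("view-pii", False), "ssn": ("view-pii", True)}
```

Here `card` is masked for unauthorized users while `ssn` is dropped from the view altogether, since its presence alone may reveal something.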

Sensitive Content
* Data that enters system and needs to be immediately encrypted/anonymized/filtered
* Provenance provides access to raw input (e.g. before/after {{EncryptContent}})
* Ability to mark a processor so that the input of any flowfile entering it is restricted/masked, with access provided only to users with a restricted access control policy
- Extends to provenance/content repositories

Dangerous Processors
* Processors which can directly affect behavior/configuration of NiFi/other services
- {{GetFile}}
- {{PutFile}}
- {{ExecuteScript}}
- {{InvokeScriptedProcessor}}
- {{ExecuteProcess}}
- {{ExecuteStreamCommand}}
* These processors should only be creatable/editable by users with special access control policy
* Marked by {{@Restricted}} annotation on processor class
* All flowfiles originating/passing through these processors have special attribute/protection
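
The marker-plus-policy check above can be illustrated with a small sketch. The proposal itself uses a Java {{@Restricted}} annotation; the Python class decorator below only stands in for it, and the `restricted-components` policy name and `can_create` function are assumptions for illustration.

```python
def restricted(reason: str):
    """Class decorator standing in for the proposed @Restricted annotation."""
    def mark(cls):
        cls.restricted_reason = reason
        return cls
    return mark


@restricted("Can run arbitrary commands as the NiFi OS user")
class ExecuteProcess:
    pass


class LogAttribute:
    pass


def can_create(processor_cls, user_policies: set) -> bool:
    """Allow creation unless the class is marked and the user lacks the policy."""
    if getattr(processor_cls, "restricted_reason", None) is None:
        return True
    # Hypothetical policy name; the real access policy model may differ.
    return "restricted-components" in user_policies
```

Carrying the reason on the marker would let the UI explain why a component is restricted when a user is denied.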

Flow Sensitivity Analysis
* Application-level intelligence to analyze flows (based on flow graph or flowfile provenance lineage) and determine existence of "dangerous processors" or "security processors" and proactively enable encrypted repositories/sensitive attributes for data traversing that flow
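
A graph-based version of this analysis could look roughly like the sketch below: find "dangerous" or "security" processors and mark everything downstream of them as needing protection. The flow representation, the function name, and the choice to propagate downstream only are simplifying assumptions; a real analysis might also need to consider data upstream of a processor such as {{EncryptContent}}.

```python
from collections import deque

DANGEROUS = {
    "GetFile", "PutFile", "ExecuteScript",
    "InvokeScriptedProcessor", "ExecuteProcess", "ExecuteStreamCommand",
}
SECURITY = {"EncryptContent"}


def sensitive_components(flow: dict, types: dict) -> set:
    """Return ids of trigger processors and every component downstream of them.

    flow:  component id -> list of downstream component ids
    types: component id -> processor type name
    """
    triggers = DANGEROUS | SECURITY
    queue = deque(cid for cid, ptype in types.items() if ptype in triggers)
    marked = set(queue)
    # Breadth-first walk; the `marked` set also guards against cycles.
    while queue:
        current = queue.popleft()
        for nxt in flow.get(current, []):
            if nxt not in marked:
                marked.add(nxt)
                queue.append(nxt)
    return marked


flow = {"a": ["b"], "b": ["c"], "x": ["y"]}
types = {
    "a": "GenerateFlowFile", "b": "ExecuteScript", "c": "LogAttribute",
    "x": "GenerateFlowFile", "y": "LogAttribute",
}
```

In this example only `b` and `c` are marked; the independent `x -> y` flow is left alone, so encrypted repositories would be enabled only where needed.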

Visual indicators of security state
* UI panel displaying current server security state ("dashboard"/"quick view")
- TLS config for UI/API
+ Authentication mechanisms in place (client certificate, Kerberos, LDAP)
- TLS config for cluster
- TLS config for S2S
- Encryption status of repositories
- Users with admin/DFM/"dangerous" access

Extension Repository
* Cryptographic signatures of extensions verified by application to ensure no malicious interception/replacement of installed packages

Schema Repository
* Cryptographic signatures of schema handlers verified by application to ensure no malicious interception/replacement of installed packages

Variable Registry
* Detection of VR EL before template export (e.g. variable token {{${mysql_password}}} vs. a literal password value)
* Prevent enumeration of available values by unauthorized processors (e.g. {{RouteOnAttribute}} processor does not need access to {{mysql_password}})
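
The pre-export check in the first item could be approximated by testing whether each sensitive property value is a variable token or a literal. The sketch below treats a value as a token only if the whole value is a single `${...}` expression, which is a deliberate simplification of Expression Language; the function name and property names are hypothetical.

```python
import re

# Whole value must be one ${...} expression (assumed simplification of EL).
EL_TOKEN = re.compile(r"\$\{[^{}]+\}")


def literal_sensitive_values(properties: dict, sensitive_names: set) -> list:
    """Return names of sensitive properties whose value is a literal."""
    return sorted(
        name
        for name, value in properties.items()
        if name in sensitive_names
        and value
        and not EL_TOKEN.fullmatch(value.strip())
    )


safe = {"Password": "${mysql_password}", "Username": "nifi"}
unsafe = {"Password": "hunter2", "Username": "nifi"}
```

A template export could then warn or refuse when the returned list is non-empty, since those literal values would otherwise leave the instance inside the template.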
