Apache Ranger is gaining a lot of momentum in the open source community, and the 0.5 release promises to be a big step towards the vision of providing comprehensive security for Hadoop. The 0.5 release focuses on the following areas:

Release Theme | Description | Benefit to users | Apache JIRA# |
---|---|---|---|
Extensibility - Ranger Stacks | Complete re-architecture of Ranger so that new plugins can be added easily | Easily add custom plugins and use Ranger to support multiple datastores | |
Hooks for dynamic access control | Users need to support dynamic access control conditions such as geography and time | Users can add dynamic rules in addition to static RBAC policy evaluation | |
Authorization and auditing support for YARN | Provide the ability to manage queue-level authorization within YARN and also audit it | Users can manage YARN ACLs along with other Hadoop components in a single UI | |
Authorization and auditing support for Kafka | Manage Kafka authorization policies in Ranger and also audit Kafka | As with YARN, users can manage Kafka security through the same centralized security console used for other Hadoop components | |
Audit Optimization | A couple of things | Ranger audit would expand into newer components. With audit summarization, we would be able to manage audit volumes for large event systems like Kafka while still maintaining the traceability required by auditors and compliance teams | |
Metadata tags and tag-based policies | As the complexity of data increases, it is important to classify and tag data as it comes into Hadoop. This feature provides a method to create security policies based on the metadata tags | Users can classify data as "sensitive" or "PII" and then create policies in Ranger at the tag level. Ranger can then enforce those policies for any resource classified under that tag | |
Ranger support for HDFS Transparent Encryption | HDFS Transparent Encryption was introduced in Hadoop 2.6. The encryption feature included a key provider interface and an open source KMS. Ranger would provide an implementation of the open source KMS, with credentials and keys stored in a server | Users can potentially use HDFS encryption integrated with Ranger KMS in a production scenario, enabling them to identify sensitive data and encrypt it. Encryption adds a layer of security and is a must in many compliance-driven environments | |
Query audit stored in HDFS | Currently, the Ranger portal provides audit queries for audit data stored in an RDBMS. Ranger introduced storage of audit logs in HDFS as part of the 0.4 release. In this release, Ranger is moving away from storing audit logs in an RDBMS and enabling audit queries directly over HDFS data | HDFS storage provides a scalable model for storing audit data and has security built in to protect it. This feature gives users an easy method to query audit logs in HDFS and removes the dependency on an RDBMS | |
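
To make the tag-based policy and dynamic-condition ideas from the table more concrete, here is a minimal Python sketch. It is not Ranger's API or policy model: the `TagPolicy` class, the `is_allowed` function, the sample paths, and the user names are all hypothetical, and the time window simply stands in for any dynamic condition. It only illustrates the general idea that a policy attaches to a tag rather than to a resource, and that a condition can be checked on top of the static rule.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional, Tuple


@dataclass(frozen=True)
class TagPolicy:
    """Hypothetical policy: lets `users` perform `actions` on any resource carrying `tag`."""
    tag: str
    users: frozenset
    actions: frozenset
    # Optional dynamic condition (assumed for illustration): an inclusive time-of-day window.
    window: Optional[Tuple[time, time]] = None


def is_allowed(user, action, resource, at, resource_tags, policies):
    """Return True if some policy on one of the resource's tags permits the access."""
    tags = resource_tags.get(resource, set())
    for policy in policies:
        if policy.tag not in tags:
            continue  # policy does not apply to this resource
        if user not in policy.users or action not in policy.actions:
            continue  # static rule does not match
        if policy.window is not None:
            start, end = policy.window
            if not (start <= at <= end):
                continue  # dynamic condition not satisfied
        return True
    return False  # default deny when no policy matches


# Hypothetical sample data: one resource is classified as PII, the other is untagged.
RESOURCE_TAGS = {"/data/customers": {"PII"}, "/data/logs": set()}
POLICIES = [
    TagPolicy(
        tag="PII",
        users=frozenset({"alice"}),
        actions=frozenset({"read"}),
        window=(time(9, 0), time(17, 0)),
    )
]
```

With this sample data, `is_allowed("alice", "read", "/data/customers", time(10, 0), RESOURCE_TAGS, POLICIES)` is permitted, while the same request at `time(20, 0)` is denied because the window condition fails. Tagging another resource as `"PII"` would bring it under the same policy without writing a new one, which is the benefit the table describes.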