Versions Compared

Comment: restricted commands/statements

...

Hive also supports storage based authorization, which is commonly used to add authorization to metastore server API calls. It can now (as of Hive 0.12?) be used on the client side as well. While it can protect the metastore against changes by malicious users, it does not support fine-grained access control (column or row level).

...

The SQL standards based authorization option (introduced in Hive 0.13) provides a third option for authorization in Hive. This is recommended because it allows Hive to be fully SQL compliant in its authorization model without causing backward compatibility issues for current users. As users migrate to this more secure model, the current default authorization could be deprecated. This authorization mode can be used in conjunction with storage based authorization on the metastore server. Like the current default authorization in Hive, this will also be enforced at query compilation time. To provide security through this option, the client will have to be secured. This can be done by allowing users access only through HiveServer2, and by restricting the user code and non-SQL commands that can be run. The checks will happen against the user who submits the request, but the query will run as the Hive server user. The directories and files for input data would have read access for this Hive server user. For users who don't need to protect against malicious users, this could potentially be supported through the Hive command line as well.

The goal of this work has been to comply with the SQL standard as far as possible, but there are deviations from the standard in the implementation. Some deviations were made to make it easier for existing Hive users to migrate to this authorization model; others were made for ease of use (in such cases we also looked at what many widely used databases do).

Under this authorization model, users who have access to hive-cli, hdfs commands, the Pig command line, the 'hadoop jar' command, etc. are considered privileged users. In an organization, it is typically only the teams that work on ETL workloads that need such access. These tools don't access the data through HiveServer2, and as a result their access is not authorized through this model. For hive-cli, Pig and MapReduce users, access to Hive tables can be controlled using storage based authorization enabled on the metastore server.

Most users, such as business analysts, tend to use SQL and ODBC/JDBC through HiveServer2, and their access can be controlled using this authorization model.
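As a rough illustration of controlling such a user's access, the statements below sketch how read access might be granted and then inspected from a HiveServer2 session. The table name sales and the user name analyst1 are hypothetical, and the sketch assumes the statements are run by a user who is allowed to grant privileges on the table (for example its owner).

```sql
-- Hypothetical table and user names, for illustration only.
-- Assumes the session user may grant privileges on this table.
GRANT SELECT ON TABLE sales TO USER analyst1;

-- Inspect the privileges the user now holds on the table.
SHOW GRANT USER analyst1 ON TABLE sales;
```

Once the grant is in place, a query submitted by analyst1 through HiveServer2 is checked against analyst1, even though the query itself runs as the Hive server user.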

Restrictions on Hive commands and statements

Commands such as dfs, add, delete, compile and reset are disabled when this authorization is enabled.

The set commands used to change Hive configuration are restricted to a smaller safe set. This is controlled using the hive.security.authorization.sqlstd.confwhitelist configuration parameter. If this set needs to be customized, the HiveServer2 admin can set a value for this configuration parameter in its hive-site.xml.

The privileges to add or drop functions and macros are restricted to the admin user.

To enable users to use functions, the ability to create permanent functions has been added.
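A minimal sketch of how this might look in practice, assuming the session belongs to a user in the admin role. The function name, class name, jar location and table name are all hypothetical and stand in for whatever a deployment actually uses.

```sql
-- Admin privileges are not active by default; a user who belongs to the
-- admin role switches to it first.
SET ROLE ADMIN;

-- Hypothetical function name, class name, and jar location.
CREATE FUNCTION my_upper AS 'com.example.udf.MyUpper'
  USING JAR 'hdfs:///path/to/my-udfs.jar';

-- Other users can then call the permanent function in their queries
-- (table name hypothetical; assumes the same current database).
SELECT my_upper(name) FROM sales;
```

Because the function is registered in the metastore rather than in a single session, users do not need the add command or the privilege to create functions themselves.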

Privileges

● SELECT privilege - gives read access to an object

...