Status of Hive Authorization before Hive 0.13

The default authorization in Hive is not designed to protect against malicious users accessing data they should not access. It only helps prevent users from accidentally performing operations they are not supposed to. It is also incomplete: many operations, including the GRANT statement, have no authorization checks. The authorization checks happen during Hive query compilation. However, because users are allowed to execute dfs commands, user-defined functions, and shell commands, they can bypass these client-side security checks.

Hive also supports storage based authorization, which is commonly used to add authorization to metastore server API calls. It can now (as of Hive 0.12?) be used on the client side as well. While it can protect the metastore against changes by malicious users, it does not support fine-grained access control (column or row level).

The default authorization model in Hive can be used to provide fine-grained access control by creating views and granting access to the views instead of the underlying table.
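The view-based approach can be sketched as follows. This is a minimal illustration, not a prescribed setup: the table `orders`, its columns, the view `orders_us`, and the user `analyst` are all hypothetical names, and the example assumes that `analyst` has been given no privileges on the underlying table.

```sql
-- Hypothetical table, columns, view, and user names, for illustration only.

-- Expose only selected columns (column-level restriction) and
-- only selected rows (row-level restriction) through a view.
CREATE VIEW orders_us AS
SELECT order_id, order_date, amount   -- sensitive columns are left out
FROM orders
WHERE region = 'US';                  -- only rows for one region are visible

-- Grant read access on the view, not on the underlying table.
GRANT SELECT ON TABLE orders_us TO USER analyst;
```

With this arrangement, `analyst` queries `orders_us` and sees only the columns and rows the view exposes.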

SQL Standards based Hive authorization

The SQL standards based authorization option (introduced in Hive 0.13) provides a third option for authorization in Hive. It is recommended because it allows Hive to be fully SQL compliant in its authorization model without causing backward compatibility issues for current users. As users migrate to this more secure model, the current default authorization could be deprecated. This authorization mode can be used in conjunction with storage based authorization on the metastore server.

Like the current default authorization in Hive, this model is enforced on the client side. To provide security through this option, the client therefore has to be secured. This can be done by allowing users access only through HiveServer2, and by restricting the user code and non-SQL commands that can be run. The checks happen against the user who submits the request, but the query runs as the HiveServer2 user. The directories and files for input data must therefore be readable by this HiveServer2 user. For users who don’t need protection against malicious users, this model could potentially be supported through the Hive command line as well.

The goal of this work has been to comply with the SQL standard as far as possible, but the implementation deviates from the standard in places. Some deviations were made to make it easier for existing Hive users to migrate to this authorization model. Others were made for ease of use; in such cases we also looked at what many widely used databases do.

Privileges

● SELECT privilege - gives read access to object

● INSERT privilege - gives ability to add data to object (table)

● UPDATE privilege - gives ability to run update queries on object (table)

● DELETE privilege - gives ability to delete data in object (table)

● ALL PRIVILEGES - gives all privileges
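As a rough sketch of how these privileges are granted and revoked, the statements below follow the GRANT/REVOKE syntax documented for Hive's SQL standards based authorization. The table `sales`, the users `alice` and `bob`, and the role `etl_role` are hypothetical names chosen for illustration, and the role is assumed to exist already.

```sql
-- Hypothetical table, user, and role names, for illustration only.

-- Read access to the table for one user.
GRANT SELECT ON TABLE sales TO USER alice;

-- Ability to add and delete data, given to a role (assumed to exist).
GRANT INSERT, DELETE ON TABLE sales TO ROLE etl_role;

-- All of the privileges listed above for one user.
GRANT ALL ON TABLE sales TO USER bob;

-- Take a privilege back from the role.
REVOKE DELETE ON TABLE sales FROM ROLE etl_role;

-- Inspect what a principal currently holds on the table.
SHOW GRANT USER alice ON TABLE sales;
```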

Objects

● The privileges apply to tables and views. The above privileges are not supported on databases.

Object ownership

For certain actions, the ownership of the object (table/view/database) determines if you are authorized to perform the action.

Users and Roles

Configuration

References

For information on the SQL standard for security see:

● ISO 9075 Part 1 Framework sections 4.2.6, 4.6.11

● ISO 9075 Part 2 Foundation sections 4.35 and 12
