Status

Page properties


State: In Progress
Discussion Thread
Vote Thread
Vote Result Thread
Progress Tracking (PR/GitHub Project/Issue Label)
Date Created

Created

Version Released
Authors: vincbeck



Motivation

Today, user management is part of core Airflow. Users, roles and permissions are stored in the Airflow metastore and managed through Flask-AppBuilder (FAB). Any additional user management feature means modifying core Airflow and, more importantly, verifying that it fits everyone's needs, from individuals to teams within enterprises.

For context, this was brought up in a discussion regarding multi-tenancy. It was suggested that, instead of adding new features (such as tenants) to the user management part of Airflow, this part be extracted from core Airflow and moved to a new component: the user manager. The goal is, as with executors, to have a generic interface defining the common API/functions that all user managers need to implement. This way, Airflow would offer a pluggable/extensible way to define and use the user manager that suits the user's needs. Unlike the generic interface, the user manager implementations are not part of core Airflow; depending on the service used underneath, each resides in its respective provider (e.g. AWS, Google) if one exists, or in a new provider if not.

Proposal

The proposal is to extract the whole user management part out of core Airflow and introduce the user manager. The goal of the user manager is to manage all features and resources related to users, roles and permissions. This way, users could simply choose between a very minimalist/simple user manager and a more advanced one with a notion of groups/tenants. Everything under the FAB security manager as it exists today is extracted from core Airflow and handled by the user manager.

The user manager interface (or base user manager) is an interface each user manager needs to inherit from. This interface defines the common API of a user manager and is the only integration point with core Airflow. In other words, any action related to user management is done through classes inheriting from this interface.
Since it is impossible to forecast what feature/view each user manager is going to offer, the “Security” tab in the nav bar will be configured by each user manager.
User managers are “pluggable”, meaning you can swap them based on your installation needs. Airflow can only have one user manager configured at a time; this is set by the user_manager option in the [core] section of the configuration file.
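As an illustration of how such an option could be resolved, here is a minimal sketch. It assumes the value of user_manager is a dotted path to a class, which the AIP does not specify; the helper name load_user_manager_class is hypothetical, and a standard-library class stands in for a real user manager so that the snippet runs anywhere.

```python
import configparser
import importlib


def load_user_manager_class(config_text: str):
    """Resolve the class named by the [core] user_manager option.

    Assumption: the option holds a dotted "module.ClassName" path.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    dotted_path = parser.get("core", "user_manager")
    # Split "package.module.ClassName" into module path and class name.
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in value: a real installation would point at a user manager class
# shipped by a provider package instead of a standard-library class.
EXAMPLE_CONFIG = """
[core]
user_manager = collections.OrderedDict
"""

manager_class = load_user_manager_class(EXAMPLE_CONFIG)
print(manager_class.__name__)
```

Because only one option is read, swapping user managers in this sketch amounts to changing a single configuration value.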

Implementations

To explain in more detail how user managers work, I decided to describe two different user manager implementations:

  • FAB user manager. This user manager offers the exact same features and experience as the current user management in Airflow. The implementation of this user manager is part of this AIP.
  • KeyCloak user manager. This user manager leverages KeyCloak to manage users and roles. The implementation of this user manager is not part of this AIP. I still decided to include diagrams and explanations about it in this AIP to clarify what different user manager implementations could look like.

Minimalist FAB user manager (backward compatible)

The target of the FAB user manager is to offer a backward compatible experience to users. Put simply, it moves the FAB security manager out of core Airflow to a new provider: the FAB provider. All the different pages are still served through the web server. The “Security” tab is configured to be as it is today. End users should see no difference before and after the introduction of user managers.

KeyCloak user manager

The target of the KeyCloak user manager is to delegate user management to KeyCloak. The whole user management part is delegated to KeyCloak, and admins have to configure roles and permissions in KeyCloak directly. A new KeyCloak provider needs to be created, containing only the KeyCloak user manager. This user manager will not be part of this AIP.

Authentication flow

The authentication flow allows a user to log in to Airflow. The flow follows the OAuth 2.0 protocol.

To simplify the example diagrams below, we assume the user is not logged in and the authentication on the backend side succeeds.

FAB user manager

FAB user manager is different from the other user managers. Instead of delegating the login experience to an external service, it includes and defines the login page within the manager. The page is still served through the web server. The goal is to have the login page as it is today.

KeyCloak user manager
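
For a user manager that delegates login to an external service, the first step of the flow is a redirect to that service. The sketch below assumes the OAuth 2.0 authorization code grant (the AIP only states that the flow follows OAuth 2.0) and builds the kind of URL such a manager might return from get_url_login(). The function name build_login_url, the endpoint, the client id and the callback URL are all hypothetical placeholders, not KeyCloak specifics.

```python
import secrets
from urllib.parse import urlencode


def build_login_url(
    authorize_endpoint: str, client_id: str, redirect_uri: str, state: str
) -> str:
    """Build the URL the browser is redirected to in order to start the login."""
    query = urlencode(
        {
            "response_type": "code",  # authorization code grant (RFC 6749, section 4.1)
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "state": state,  # opaque value echoed back by the server, used against CSRF
        }
    )
    return f"{authorize_endpoint}?{query}"


# All values below are placeholders for illustration only.
state = secrets.token_urlsafe(16)
login_url = build_login_url(
    "https://idp.example.com/authorize",
    "airflow",
    "https://airflow.example.com/callback",
    state,
)
print(login_url)
```

After a successful login, the external service redirects back to the callback URL with an authorization code, which the backend exchanges for an access token.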

Authorization flow

The is_authorized API is the API each user manager needs to implement to check whether the current user has permission to perform a specific action. Here are some usage examples:

  • Does the current user have permission to list variables? is_authorized([(permissions.ACTION_CAN_READ, permissions.RESOURCE_VARIABLE)])
  • Does the current user have permission to read a specific DAG? is_authorized([(permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG)], "dag_id")

In order to understand how this API is implemented in different user managers, let’s take the use case of “User clicks on Variables in the Admin menu”.

FAB user manager

The is_authorized API in the FAB user manager checks if the current user has the specified permissions. The implementation is really close to check_authorization in the security manager.
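
To make the call shapes above concrete, here is a minimal in-memory sketch of a permission check. It is not the FAB implementation: the class InMemoryUserManager, its constructor arguments and the constant values are illustrative assumptions, and the constants are local stand-ins for the permissions.* names used in the examples.

```python
from typing import Dict, Iterable, Optional, Sequence, Set, Tuple

# Local stand-ins for the permissions.* constants; the values are illustrative.
ACTION_CAN_READ = "can_read"
RESOURCE_VARIABLE = "Variables"
RESOURCE_DAG = "DAGs"

Permission = Tuple[str, str]  # (action, resource)


class InMemoryUserManager:
    """Hypothetical manager holding the current user's permissions in memory."""

    def __init__(
        self,
        granted: Iterable[Permission],
        allowed_ids: Optional[Dict[str, Set[str]]] = None,
    ) -> None:
        self._granted = set(granted)
        # Optional per-resource restriction, e.g. {"DAGs": {"example_dag"}}.
        # A resource missing from this mapping is not restricted by id.
        self._allowed_ids = allowed_ids or {}

    def is_authorized(
        self, permissions: Sequence[Permission], resource_id: Optional[str] = None
    ) -> bool:
        """Return True only if every requested permission is granted."""
        for action, resource in permissions:
            if (action, resource) not in self._granted:
                return False
            allowed = self._allowed_ids.get(resource)
            if resource_id is not None and allowed is not None:
                if resource_id not in allowed:
                    return False
        return True


viewer = InMemoryUserManager(
    granted={(ACTION_CAN_READ, RESOURCE_VARIABLE), (ACTION_CAN_READ, RESOURCE_DAG)},
    allowed_ids={RESOURCE_DAG: {"example_dag"}},
)
can_list_variables = viewer.is_authorized([(ACTION_CAN_READ, RESOURCE_VARIABLE)])
can_read_dag = viewer.is_authorized([(ACTION_CAN_READ, RESOURCE_DAG)], "example_dag")
```

In the “User clicks on Variables” use case, the view would call the first form before rendering the list and refuse access when it returns False.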

KeyCloak user manager

When logging in using KeyCloak, users are issued an access token stored in the metastore. This access token is used by KeyCloak to figure out if the current user has permissions to access a given resource.

Airflow Rest API

As part of the Rest API, some resources are no longer managed by core Airflow but by user managers: roles and users. Therefore, these APIs will be removed:

However, some user managers might need to define additional Rest APIs for their own needs. The FAB user manager is an example: in order to be backward compatible, the APIs listed above that are removed from core Airflow need to be redefined in (moved to) the FAB user manager. By default, no additional Rest API is defined in the base user manager.

Airflow CLI

Among the sub-commands exposed by the Airflow CLI, roles and users need to be removed from core Airflow, as with the Rest API. Likewise, some user managers might need to define additional CLI commands (e.g. FAB user manager).

UI

The different UI pages used to manage users and roles are no longer part of core Airflow and are moved to user managers. Depending on the user manager and the service/tool used underneath, two options are possible:

  • Use the UI provided by the service/tool directly to manage users and roles. This is the preferred option.
  • Create UI pages in the user manager to manage users and roles. This is the option chosen for the FAB user manager.

Even though the preferred option is to delegate user management entirely to the underlying service/tool, the second option is necessary to implement the FAB user manager.

User manager API

All user managers have a common API defined in the user manager interface. You can find in the table below the common API needed from all user managers. The different categories are just for documentation and grouping purposes but might not be reflected in the architecture/code.

Category | Name | Description
UI | get_tab_configuration() | Returns the tab configuration (see Example 1 and Example 2 below the table)
UI | get_url_account() | Returns the URL to access the current user's account/profile
UI | get_user_name() | Returns the user name
Core | get_url_login() | Returns the URL to sign in
Core | get_url_logout() | Returns the URL to sign out
Core | is_logged_in() | Returns true if the current user is logged in
Core | post_login() | Post-login operations needed depending on the user manager used, e.g. storing the access token
Core | is_authorized() | Is the user authorized to perform an action on a given resource? See section "Authorization flow" for more details
Additional resources (legacy) | rest_apis() | Define additional Rest APIs
Additional resources (legacy) | cli_commands() | Define additional CLI commands
Additional resources (legacy) | views() | Define additional views

Example 1 (FAB): tab configuration returned by get_tab_configuration()

{
	"title": "Security",
	"action": [
		{
			"title": "List Users",
			"action": "/users/list/"
		},
		...
	],
}

Example 2 (KeyCloak): tab configuration returned by get_tab_configuration()

{
	"title": "Users",
	"action": "<KeyCloak console url>"
}

The methods in the "Additional resources (legacy)" category are needed to build the backward compatible FAB user manager. The long-term plan is, once the FAB user manager is deprecated, to deprecate these methods as well.
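
The common API above could be expressed as an abstract base class along the following lines. This is a sketch only: the method names come from the table, but the class names BaseUserManager and StaticUserManager, all type hints, the empty-list defaults for the legacy methods and every value returned by the example subclass are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional, Sequence, Tuple


class BaseUserManager(ABC):
    """Hypothetical interface every user manager would inherit from."""

    # UI
    @abstractmethod
    def get_tab_configuration(self) -> Dict[str, Any]: ...

    @abstractmethod
    def get_url_account(self) -> str: ...

    @abstractmethod
    def get_user_name(self) -> str: ...

    # Core
    @abstractmethod
    def get_url_login(self) -> str: ...

    @abstractmethod
    def get_url_logout(self) -> str: ...

    @abstractmethod
    def is_logged_in(self) -> bool: ...

    @abstractmethod
    def post_login(self) -> None: ...

    @abstractmethod
    def is_authorized(
        self, permissions: Sequence[Tuple[str, str]], resource_id: Optional[str] = None
    ) -> bool: ...

    # Additional resources (legacy): nothing extra is defined by default.
    def rest_apis(self) -> List[Any]:
        return []

    def cli_commands(self) -> List[Any]:
        return []

    def views(self) -> List[Any]:
        return []


class StaticUserManager(BaseUserManager):
    """Toy implementation with a single, always logged-in user (placeholder values)."""

    def get_tab_configuration(self) -> Dict[str, Any]:
        return {"title": "Users", "action": "https://idp.example.com/console"}

    def get_url_account(self) -> str:
        return "https://idp.example.com/account"

    def get_user_name(self) -> str:
        return "admin"

    def get_url_login(self) -> str:
        return "https://idp.example.com/login"

    def get_url_logout(self) -> str:
        return "https://idp.example.com/logout"

    def is_logged_in(self) -> bool:
        return True

    def post_login(self) -> None:
        pass  # e.g. store an access token; nothing to do in this toy example

    def is_authorized(self, permissions, resource_id=None) -> bool:
        return True  # the single toy user may do anything


manager = StaticUserManager()
```

Keeping the legacy methods non-abstract means that a user manager relying on an external UI only has to implement the UI and Core methods.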

Future work

Here are some examples of tasks that are not part of the AIP but can be done as follow-ups once the AIP is completed.

  • Create KeyCloak provider and KeyCloak user manager within it
  • Additional providers (e.g. AWS user manager, Google user manager)

Considerations

What problem does it solve?

It makes the user management component of Airflow pluggable and extensible by introducing a user manager interface in core Airflow that can be extended by any provider package that wants to support user management natively. Extensible and pluggable user management would open up the potential for more advanced user management features than exist in Airflow today, such as groups of users (or tenants).

Why is it needed?

Having user management that fits everyone's needs (from individuals to teams within enterprises) is impossible. Users need an extensible and pluggable way to define and use the user management they want.

Native user management support in cloud providers means that roles can be mapped directly to the identity provider. Currently, Airflow operators have to work around this and are unable to provide seamless RBAC in Airflow when running it on different cloud platforms.

Which users are affected by the change?

All users are impacted by the change. However, by default Airflow would use the FAB user manager, which is backward compatible, so users should not see any difference. Of course, if an admin decides to switch to another user manager, the whole user management experience of the environment would change.

What defines this AIP as "done"?

  • The user manager interface is defined
  • A new provider, the FAB provider, is created
  • The FAB user manager, inheriting from the user manager interface, is defined. This FAB user manager is part of the new FAB provider. By default, Airflow uses this user manager