Status

State: Draft
Discussion Thread: https://github.com/apache/airflow/discussions/29986
Vote Thread:
Vote Result Thread:
Progress Tracking (PR/GitHub Project/Issue Label):
Date Created:
Version Released:
Authors: vincbeck

Motivation

Today, user management is part of core Airflow. Users, roles and permissions are stored in the Airflow metastore and managed through Flask-AppBuilder (FAB). Any additional user management feature means modifying core Airflow and, more importantly, verifying that it fits everyone's needs, from individuals to teams within enterprises.

For context, this was brought up in a discussion about multi-tenancy. Instead of adding new features (such as tenants) to the user management part of Airflow, it was suggested to extract that part from core Airflow and move it to a new component: the user manager. The goal is, as with executors, to have a generic interface that defines the common API every user manager must implement. This way, Airflow would offer a pluggable, extensible way to define and use the user manager that suits each user's needs. Unlike the generic interface, the user manager implementations are not part of core Airflow; depending on the underlying service, each resides in its respective provider (e.g. AWS, Google) if one exists, or in a new provider if not.

Proposal

The proposal is to extract all user management from core Airflow and introduce the user manager. The goal of the user manager is to manage all features and resources related to users, roles and permissions. This way, users could simply choose between a minimalist user manager and a more advanced one with a notion of groups/tenants. Everything under the FAB security manager as it exists today is extracted from core Airflow and handled by the user manager.

The user manager interface (or base user manager) is an interface each user manager needs to inherit from. This interface defines the common API of a user manager and is the only integration point with core Airflow. In other words, any action related to user management is done through classes inheriting from this interface.
Since it is impossible to forecast what feature/view each user manager is going to offer, the “Security” tab in the nav bar will be configured by each user manager.
User managers are “pluggable”, meaning you can swap them based on your installation needs. Airflow can only have one user manager configured at a time; this is set by the user_manager option in the [core] section of the configuration file.
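As a rough illustration of the "pluggable" idea, the sketch below reads the user_manager option from the [core] section and resolves it to a class by its dotted path. Only the option name and section come from this proposal; the function name load_user_manager_class is hypothetical, and the configured value is a standard-library class used purely as a stand-in so the sketch runs without any provider installed.

```python
import configparser
import importlib

# Hypothetical configuration. The proposal only names the `user_manager`
# option in `[core]`; the dotted path below is a stand-in class, not a real
# user manager.
EXAMPLE_CFG = """
[core]
user_manager = collections.OrderedDict
"""


def load_user_manager_class(cfg_text: str) -> type:
    """Resolve the dotted path stored in [core] user_manager to a class."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    dotted_path = parser.get("core", "user_manager")
    # Split "package.module.ClassName" into module path and class name.
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


manager_class = load_user_manager_class(EXAMPLE_CFG)
```

Because a single option selects the class, only one user manager can be active at a time, which matches the constraint stated above.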

Implementations

Minimalist FAB user manager (backward compatible)

The goal of the FAB user manager is to offer users a backward compatible experience. Put simply, it moves the FAB security manager out of core Airflow to a new provider: the FAB provider. All pages are still served through the web server. The “Security” tab is configured to be as it is today. End users should see no difference before and after the introduction of user managers.

KeyCloak user manager

The goal of the KeyCloak user manager is to delegate the whole of user management to KeyCloak; admins have to configure roles and permissions in KeyCloak directly. A new KeyCloak provider needs to be created, containing only the KeyCloak user manager.

Common API

All user managers share a common API defined in the user manager interface. The table below lists the methods every user manager must provide.

Category | Name              | Description
Nav bar  | get_tab_title()   | Returns the tab title in the nav bar (currently "Security")
Nav bar  | get_tab_menu()    | Returns the items shown when hovering over the tab
URLs     | get_url_login()   | Returns the URL to sign in
URLs     | get_url_logout()  | Returns the URL to sign out
URLs     | get_url_account() | Returns the URL to access my account/profile
APIs     | is_logged_in()    | Returns true if the current user is logged in
APIs     | get_user_name()   | Returns the user name
APIs     | post_login()      | Post-login operations needed by the user manager in use, e.g. storing the access token
APIs     | is_authorized()   | Whether the user is authorized to perform an action on a given resource. See section "Authorization flow" for more details
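The common API could be expressed as an abstract base class, as in the minimal sketch below. The method names come from the table; the class name BaseUserManager, every parameter, and every return type are assumptions made for illustration and are not defined by this proposal.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Sequence, Tuple

# (action, resource) pair, mirroring the is_authorized examples in this document.
Permission = Tuple[str, str]


class BaseUserManager(ABC):
    """Hypothetical sketch of the user manager interface."""

    # Nav bar
    @abstractmethod
    def get_tab_title(self) -> str:
        """Return the tab title shown in the nav bar."""

    @abstractmethod
    def get_tab_menu(self) -> List[Tuple[str, str]]:
        """Return menu items as (label, URL) pairs; this shape is an assumption."""

    # URLs
    @abstractmethod
    def get_url_login(self) -> str:
        """Return the URL to sign in."""

    @abstractmethod
    def get_url_logout(self) -> str:
        """Return the URL to sign out."""

    @abstractmethod
    def get_url_account(self) -> str:
        """Return the URL of the account/profile page."""

    # APIs
    @abstractmethod
    def is_logged_in(self) -> bool:
        """Return True if the current user is logged in."""

    @abstractmethod
    def get_user_name(self) -> str:
        """Return the name of the current user."""

    @abstractmethod
    def post_login(self) -> None:
        """Run post-login operations, e.g. storing an access token."""

    @abstractmethod
    def is_authorized(
        self, permissions: Sequence[Permission], resource_id: Optional[str] = None
    ) -> bool:
        """Return True if the current user holds all the given permissions."""
```

Declaring every method abstract means an incomplete user manager fails at instantiation time rather than when a missing method is first called.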

Authentication flow

The authentication flow allows a user to log in to Airflow. The flow follows the OAuth 2.0 protocol.

To simplify the example diagrams below, we assume that the user is not logged in and that authentication on the backend side succeeds.

FAB user manager

FAB user manager is different from the other user managers. Instead of delegating the login experience to an external service, it includes and defines the login page within the manager. The page is still served through the web server. The goal is to have the login page as it is today.

KeyCloak user manager

Authorization flow

The is_authorized API is the API each user manager needs to implement to check whether the current user has permission to perform a specific action. Here are some usage examples:

  • Has the current user permissions to list variables? is_authorized([(permissions.ACTION_CAN_READ, permissions.RESOURCE_VARIABLE)])
  • Has the current user permissions to read a specific DAG? is_authorized([(permissions.ACTION_CAN_READ, permissions.RESOURCE_DAG)], "dag_id")

In order to understand how this API is implemented in different user managers, let’s take the use case of “User clicks on Variables in the Admin menu”.
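Seen from core Airflow, this use case might look like the sketch below: the view only asks the configured user manager whether the action is allowed and does not know how the answer is produced. The function variables_view, the IsAuthorized type alias, and the constant values are placeholders for illustration, not the actual Airflow view code.

```python
from typing import Callable, Optional, Sequence, Tuple

Permission = Tuple[str, str]

# Placeholder constants standing in for the `permissions` module used in the
# examples above.
ACTION_CAN_READ = "can_read"
RESOURCE_VARIABLE = "Variables"

# Shape of is_authorized as used in this document: a list of
# (action, resource) pairs plus an optional resource identifier.
IsAuthorized = Callable[[Sequence[Permission], Optional[str]], bool]


def variables_view(is_authorized: IsAuthorized) -> Tuple[int, str]:
    """Return an (HTTP status, body) pair for the Variables page."""
    if not is_authorized([(ACTION_CAN_READ, RESOURCE_VARIABLE)], None):
        return 403, "Forbidden"
    return 200, "variables list"
```

The two subsections below differ only in what sits behind is_authorized.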

FAB user manager

The is_authorized API in the FAB user manager checks if the current user has the specified permissions. The implementation is really close to check_authorization in the security manager.
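A local permission check of this kind can be sketched as follows. This is not the FAB security manager code: the class RoleBasedUserManager, the in-memory role mapping, and the "resource:resource_id" naming used for per-resource permissions are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Sequence, Set, Tuple

Permission = Tuple[str, str]


class RoleBasedUserManager:
    """Hypothetical manager that checks permissions against local role data."""

    def __init__(
        self,
        role_permissions: Dict[str, Set[Permission]],
        current_user_roles: Iterable[str],
    ) -> None:
        # Union of the permissions granted by every role of the current user.
        self._granted: Set[Permission] = set()
        for role in current_user_roles:
            self._granted |= role_permissions.get(role, set())

    def is_authorized(
        self, permissions: Sequence[Permission], resource_id: Optional[str] = None
    ) -> bool:
        """Return True only if every requested permission is granted."""
        for action, resource in permissions:
            if (action, resource) in self._granted:
                continue  # granted for the whole resource type
            scoped = (action, f"{resource}:{resource_id}")
            if resource_id is not None and scoped in self._granted:
                continue  # granted for this specific resource only
            return False
        return True


roles = {
    "Viewer": {("can_read", "Variables")},
    "TeamA": {("can_read", "DAGs:etl")},
}
manager = RoleBasedUserManager(roles, ["Viewer", "TeamA"])
```

With this data, the current user can list variables and read the DAG "etl", but no other DAG.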

KeyCloak user manager

When logging in using KeyCloak, users are issued an access token stored in the metastore. This access token is used by KeyCloak to figure out if the current user has permissions to access a given resource.
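The delegation pattern can be sketched without any real KeyCloak call: the manager stores the token at login and forwards each decision to an injected function that stands in for the external service. The class DelegatingUserManager, the Decide callable, the dictionary used in place of the metastore, and the "resource/resource_id" naming are all hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Permission = Tuple[str, str]
# (access_token, action, resource) -> allowed? Stands in for a request to the
# external service; no real KeyCloak API is used here.
Decide = Callable[[str, str, str], bool]


class DelegatingUserManager:
    """Hypothetical manager that delegates every decision to a service."""

    def __init__(self, decide: Decide) -> None:
        self._decide = decide
        self._tokens: Dict[str, str] = {}  # stand-in for the metastore
        self._current_user: Optional[str] = None

    def post_login(self, user_name: str, access_token: str) -> None:
        """Store the access token issued at login."""
        self._tokens[user_name] = access_token
        self._current_user = user_name

    def is_logged_in(self) -> bool:
        return self._current_user is not None

    def is_authorized(
        self, permissions: Sequence[Permission], resource_id: Optional[str] = None
    ) -> bool:
        """Ask the external service about each requested permission."""
        if self._current_user is None:
            return False  # no token, so nothing to ask the service about
        token = self._tokens[self._current_user]
        for action, resource in permissions:
            target = resource if resource_id is None else f"{resource}/{resource_id}"
            if not self._decide(token, action, target):
                return False
        return True
```

In this sketch Airflow holds no role or permission data at all, which is why admins would configure them in the external service directly.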

Additional providers

Here are some examples of additional user managers that could be offered in Airflow:

  • AWS user manager. This user manager would be part of the Amazon provider package
  • Google user manager. This user manager would be part of the Google provider package
  • ...

Considerations

What problem does it solve?

It extracts user management from core Airflow and follows the "Airflow as a platform" approach. User management would be extensible and pluggable, allowing more advanced user management features than exist in Airflow today, such as groups of users (or tenants).

Why is it needed?

Having a user management that fits everyone's needs (from individuals to teams within enterprises) is impossible. Users need an extensible and pluggable way to define and use the user management they want.

Which users are affected by the change?

All users are impacted by the change. However, by default Airflow would use the backward compatible FAB user manager, so users should not see any difference. Of course, if an admin switches to another user manager, the whole user management experience of the environment would change.

What defines this AIP as "done"?

  • The user manager interface is defined
  • The new FAB provider is created
  • The FAB user manager, inheriting from the user manager interface, is defined. This FAB user manager is part of the new FAB provider. By default Airflow uses this user manager
  • The new KeyCloak provider is created
  • The KeyCloak user manager, inheriting from the user manager interface, is defined. This KeyCloak user manager is part of the new KeyCloak provider
  • POCs of user managers from existing providers (e.g. AWS, Google, ...) exist. The purpose of these POCs is to verify that the user manager interface lets the different providers define new user managers