Status
This document captures the design of the REST API for Apache Airflow.
Summary of Changes
No. | Version | Date | Description
1 | v20200226 | 2020-02-26 | First version
Background
We currently have one experimental API, but despite existing for two years it has never reached a stable level. The implementation has many critical shortcomings, including: no defined schema, a very narrow range of functions, an unsafe default configuration, no HATEOAS, and many others.
In the past, Drewstone began work on a REST API based on OpenAPI, described in AIP-13: OpenAPI 3 based API definition. However, it was not completed, due to a lack of interest from the author and the Kerberos configuration problem (this was at a time when Breeze was still being developed, so configuring all dependencies, including Kerberos, was difficult). It also covered only a narrow range of the features users expect, e.g. access to XCom data and management of connections were missing.
The Polidea and Google teams, together with the community, want to make another attempt based on our own and the community's experience. Airflow deserves a new, stable solution.
Goal and non-goals
This chapter sets out the success criteria and the limits of this change. It applies only to this AIP, which does not mean that the project cannot have additional goals that depend on the API. The API is a complex component, so determining the scope of work is very important.
Goal
Create solid fundamentals
We want to develop a solution that meets the following basic requirements:
- It should be easy to maintain - it provides unified methods for major technical issues, e.g. CRUD operations on database objects are carried out in the same way, and the libraries used are widely supported.
- It should be trustworthy - requests and responses will be validated against the schema file.
- It should be extensible - the API schema should not change between Airflow versions and should not limit the development of the project.
- It should be secure - to invoke an API action, the client must have the required permissions.
Complete and universal
The API will allow you to perform all operations that are available through the Web UI and the experimental API, as well as those CLI commands that are used by typical users. For example, we will not provide an API to change the Airflow configuration (this is possible via the CLI), but we will provide an API to read the current configuration (this is possible via the Web UI).
The API will not provide operations that are completely new, even if they are expected by users, e.g. isolation of workers from the scheduler.
The API will be intended for use by any third party. It should not be tied to a specific application, e.g. a React UI.
Develop a privilege model that will be usable by Web UI and API
- Update the permission model for use by the new API.
- Update the Web UI and the experimental API to use the new permission model.
Create the extension points for authorization
We want to create extension points that will enable authorization mechanisms, e.g. OpenID, Kubernetes, LDAP, to be developed in an independent manner. Specific implementations may emerge during development, but they will not be discussed in this document.
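To make the extension point concrete, one possible shape is sketched below, modeled on the auth backends of the existing experimental API. All names and signatures here are assumptions to be settled during development, not a final contract.

    # Sketch of a pluggable authorization backend; the module-level contract
    # (init_app + requires_authentication) is an assumption, modeled on the
    # auth backends of the existing experimental API.
    from functools import wraps

    from flask import Response, request


    def init_app(app):
        """Called once at startup so the backend can configure itself."""


    def requires_authentication(function):
        """Wrap an endpoint so that only authenticated clients can reach it."""

        @wraps(function)
        def decorated(*args, **kwargs):
            # Placeholder check; a real backend would verify Kerberos,
            # LDAP, OpenID, etc. here.
            if request.headers.get("Authorization") is None:
                return Response("Unauthorized", 401)
            return function(*args, **kwargs)

        return decorated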
Support Python and SQLAlchemy objects
The objects in Airflow are divided into two types:
- SQLAlchemy objects - they always have a known structure and are permanently saved to the database.
- Python objects, e.g. DAG/BaseOperator - they can be created dynamically and reside only in memory. They have no direct matches in the database; there, they have only simplified equivalents.
We want to build an API that provides information about both the objects in the database and the simplified Python objects - in the second case, read-only.
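To illustrate the read-only case, the sketch below shows a hypothetical simplified representation of an in-memory DAG. The names DagSummary and dag_to_summary are assumptions made for illustration, not the final API.

    # Hypothetical read-only view of a Python-level DAG object; only fields
    # that can be safely serialized are exposed.
    from dataclasses import dataclass
    from typing import List


    @dataclass(frozen=True)
    class DagSummary:
        dag_id: str
        task_ids: List[str]


    def dag_to_summary(dag) -> DagSummary:
        # `dag` is assumed to be an airflow.models.DAG instance; dag_id and
        # tasks are its attributes.
        return DagSummary(
            dag_id=dag.dag_id,
            task_ids=[task.task_id for task in dag.tasks],
        )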
Built with the community
We do not want a solution built by one person; we want to work with all interested people to develop the best solution. After reaching consensus on the mailing list, we will create tickets in JIRA.
Non-goal
API Client
The API will not depend on a specific client implementation. Clients can use any language and technology to send requests to the API server. It will even be possible to use curl/Bash to send API requests, as shown below.
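For illustration, an equally valid client is a few lines of Python with the requests library; the host, port and endpoint below are assumptions:

    # Any HTTP client works; this hypothetical call lists connections.
    import requests

    response = requests.get("http://localhost:8080/api/v1/connections")
    response.raise_for_status()
    print(response.json())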
Update WebUI or CLI to use the API
While the ultimate goal is for the Web UI and CLI to use the API, updating these components will require a lot of work and is not necessary to achieve the other goals.
Delete the current API
This API has existed for a very long time and has become part of a large number of solutions. For this reason, its deprecation should take place with a transitional period.
Aggregated data
The goal is not to develop an API tailored to a specific use case. Instead, we will develop solutions that enable new endpoints to be added independently.
API for optional components
The goal is not to develop an API for non-fundamental components, although there are expectations for APIs that provide such additional features. For example, an API that:
- allows access to node information when using CeleryExecutor,
- enables deeper integration between Airflow and Kubernetes when KubernetesExecutor is used,
- supports monitoring.
Integration with these components will be covered in other documents.
Create authorization plugins
The goal is only to develop the extension points; the plugins themselves will be developed independently. Each user has different requirements, so it is not possible for us to choose a single best solution - instead, we should support all common mechanisms. There are also users who do not need any authorization at all, and they are the first candidate users.
Technology
We will use HTTP and JSON, as these are the most common technologies. Protobuf is also popular, but it has compatibility limitations, e.g. it cannot easily be used with curl.
OpenAPI specification
OpenAPI specification is available on Github:
https://github.com/PolideaInternal/airflow/pull/653
It is not fully complete yet, but it contains the most important elements. For example, we would like to add HATEOAS, but we can only define it after the endpoints are specified.
The collection identifier segments in a resource name use the plural form of the noun used for the resource. (For example, a collection of Connection resources is called connections in the resource name.)
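To make the naming rule concrete, here is a minimal, hypothetical fragment of the specification expressed as a Python dictionary (the real specification lives in the pull request above):

    # Hypothetical spec fragment: the collection uses the plural name
    # "connections", and a single resource is addressed inside it.
    spec_fragment = {
        "paths": {
            "/connections": {
                "get": {"operationId": "get_connections"},
            },
            "/connections/{connection_id}": {
                "get": {"operationId": "get_connection"},
            },
        }
    }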
Permission model
In this chapter, we'll consider how to limit permissions in the API as well as in Web UI.
Current approach
Airflow currently uses view-based permissions based on Flask App Builder. We have the following permissions:
can_add, can_blocked, can_chart, can_clear, can_code, can_conf, can_dag_details, can_dag_edit, can_dag_read, can_dag_stats, can_dagrun_clear, can_delete, can_duration, can_edit, can_gantt, can_get_logs_with_metadata, can_graph, can_index, can_landing_times, can_list, can_log, can_paused, can_refresh, can_rendered, can_run, can_show, can_success, can_task, can_task_instances, can_task_stats, can_tree, can_tries, can_trigger, can_varimport, can_version, can_xcom, clear, menu_access, muldelete, set_failed, set_running, set_success
New approach
We keep the integration with Flask App Builder. New permissions are defined according to the following pattern:
can_{action}_{resource}
where:
- action - describes the operation to perform, e.g. view, edit, create
- resource - describes the resource name, using the snake_case convention
Example:
can_edit_variable
can_edit_user
There will also be special permissions that allow more detailed permission control, e.g. can_api_access or can_web_access. However, granting one of these permissions alone will not allow any operation to be performed; permissions for specific resources will always be required, as sketched below.
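A minimal sketch of this two-level check, assuming hypothetical helper and permission names:

    # The client needs both the gate permission (can_api_access) and the
    # permission for the specific resource; either one alone is not enough.
    def can_invoke(user_permissions: set, action: str, resource: str) -> bool:
        required = {"can_api_access", f"can_{action}_{resource}"}
        return required <= user_permissions


    assert can_invoke({"can_api_access", "can_edit_variable"}, "edit", "variable")
    assert not can_invoke({"can_api_access"}, "edit", "variable")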
The old approach makes adding new views difficult: it always requires adding new permissions or reusing a permission from another view. For example, if we wanted to add a view displaying the dependencies between DAGs, we would have to add a new permission or reuse an existing one (in this case can_index). With the new approach, a user who has access to DAGs can view the DAG list as well as the dependencies between DAGs. The user will have the same permissions in the CLI, once we update the CLI to use the new API.
We also need to prepare a tool for automatic migration to the new permission model; a possible mapping is sketched below.
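A hypothetical sketch of such a tool; the actual old-to-new mapping will be defined during implementation, and the entries below are assumptions:

    # Assumed examples of mapping old view-based permissions onto the new
    # can_{action}_{resource} pattern; unknown names pass through unchanged.
    OLD_TO_NEW = {
        "can_dag_read": "can_read_dag",
        "can_dag_edit": "can_edit_dag",
    }


    def migrate(old_permissions):
        return {OLD_TO_NEW.get(name, name) for name in old_permissions}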
Implementation
We will use Connexion. It is the most stable and mature solution, and it supports Flask, which makes it compatible with our application.
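A minimal sketch of how the API server could be wired up with Connexion, assuming the specification file is named openapi.yaml and lives next to the application module:

    # Connexion routes requests to handlers via the operationId fields in
    # the spec and validates traffic against the schema.
    import connexion

    app = connexion.FlaskApp(__name__, specification_dir=".")
    app.add_api(
        "openapi.yaml",
        strict_validation=True,   # reject requests that do not match the schema
        validate_responses=True,  # also validate our own responses
    )

    if __name__ == "__main__":
        app.run(port=8080)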
Other alternatives presented in the "Rejected Proposals" section were also considered.
Documentation
We will have the following documentation:
- OpenAPI specification
- REST API Reference
- Guide "How to use API"
- Migration guide from the experimental API to the REST API
- Migration guide for new permission model
The API Reference will be generated from the openapi.yaml file. The same file will also be used in tests, so the documentation will always be correct and easy to keep in good condition. The guides will explain basic operations and facilitate first use of the API.
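A hypothetical test sketch showing how the specification file itself can be kept valid (the openapi-spec-validator package and the file path are assumptions):

    # If the spec is invalid, this test fails, so the generated reference
    # documentation cannot silently drift out of shape.
    import yaml
    from openapi_spec_validator import validate_spec


    def test_openapi_spec_is_valid():
        with open("openapi.yaml") as spec_file:
            validate_spec(yaml.safe_load(spec_file))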
Contribution
We invite everyone to contribute. Most design discussions will take place on the mailing list - dev@airflow.apache.org
Discussion thread: TODO
We also have the #sig-api Slack channel to talk about non-key decisions and to coordinate our work.
Registration link: https://apache-airflow-slack.herokuapp.com/
All changes are tracked under Epic Issue:
Rejected proposals
This section summarizes discussions about solutions that were not adopted. It has been moved here to improve the readability of the document.
API generator based on the database model
There are ready-made solutions that allow you to quickly create an API based on a database model. They have the following advantages:
- they allow us to create an API quickly, with a small amount of code;
- they allow flexible filtering;
- they have built-in permission control.
However, these are not the features that matter most to us. Airflow is a very mature product used in complex solutions, including integration platforms, and other systems often integrate with it via the API. This makes API stability very important: we cannot afford to break backwards compatibility, and that would not be possible with this approach. If the API allowed arbitrary filters, then changing the structure of the database, e.g. by dividing tables or completely changing the data storage (Redis vs. SQL), would break those filters.
If, however, the filtering is done on the client side, this is not a problem. It is also not a big problem for users, because they do not expect an immediate response when Airflow is part of a complex platform.
This type of generator is useful if you are building a two-component application - frontend and backend - and can guarantee that the two applications are deployed simultaneously. In such cases, the API is rarely used by third parties. However, we want to build an API that will mostly be used by third parties.
FlaskAppBuilder
ModelRestApi has the limitations presented in the section above. We could also use BaseApi, which does not have those problems, but it lacks support for OpenAPI schema verification. Most importantly for me, it is less popular for building APIs: new contributors would need to learn FAB to make changes, and since people are lazy, changes would be made based on a partial understanding of FAB. There are not many FAB REST API experts. On the other hand, Connexion is a stable, reliable and trustworthy solution, and there is a high probability that contributors already know it from other projects. Connexion also has support for Flask and Tornado, which will reduce our dependence on a single framework and, in the future, enable running the API server in asynchronous mode. However, this is not part of this AIP.
Disclaimer
This document assumes you are already familiar with the Airflow codebase, and it may change over time based on feedback.