Status

State: Draft

Discussion thread:

JIRA:

Motivation

Airflow runs arbitrary code across different workers. Currently, every task has full access to the Airflow database, including connection details such as usernames and passwords. This makes it hard to deploy Airflow in multi-tenant or semi-multi-tenant environments. In addition, no mechanism ensures that what the scheduler believes it is scheduling is what actually runs on the worker. Besides the security risk of a malicious task, this creates the operational risk of running an out-of-date task that does something other than expected.

Challenges

DAGs can be generated by DAG factories, which essentially create a sub-DSL in which DAGs are defined. From Airflow’s point of view this is a challenge, because DAGs built this way can pull in arbitrary dependencies.
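To make the challenge concrete, here is a minimal sketch of what a DAG factory might look like. It deliberately avoids Airflow's own classes: `SketchDag`, `build_dag`, and the layout of `spec` are hypothetical names invented for this illustration, not part of Airflow or of this proposal. The point is only that the declarative spec, not the factory code, decides which modules get imported.

```python
import importlib
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class SketchDag:
    """Hypothetical stand-in for a DAG; this is not Airflow's DAG class."""
    dag_id: str
    tasks: Dict[str, Callable[[], object]] = field(default_factory=dict)


def build_dag(spec: dict) -> SketchDag:
    """Hypothetical factory: turn a declarative spec into a DAG-like object."""
    dag = SketchDag(dag_id=spec["dag_id"])
    for task in spec["tasks"]:
        # The callable is resolved by name while the DAG is being defined,
        # so the spec alone determines which modules are imported.
        module = importlib.import_module(task["module"])
        dag.tasks[task["task_id"]] = getattr(module, task["callable"])
    return dag


# Assumed example spec; the modules named here are arbitrary stdlib choices.
spec = {
    "dag_id": "example",
    "tasks": [
        {"task_id": "cwd", "module": "os", "callable": "getcwd"},
        {"task_id": "now", "module": "time", "callable": "time"},
    ],
}
dag = build_dag(spec)
```

Because the imports are driven by data, inspecting the factory file alone does not reveal the full set of dependencies that a worker needs in order to run the resulting tasks.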

Envisioned workflow

DAG submission

The DAG directory should be fully managed by Airflow. DAGs should be submitted by authorized users and ownership 

