Status

State: Draft

Discussion thread:

JIRA:

Motivation

By default, users launch a single scheduler instance for Airflow. This raises a few concerns, including:

High Availability: what happens if the single scheduler goes down?
Scheduling Performance: the scheduling latency for each DAG may be long if there are many DAGs.

It would be ideal for Airflow to support multiple schedulers, to address these concerns.

Considerations

1. `scheduler_lock` already exists in DagModel, but it is not used in the current implementation of Airflow (as of https://github.com/apache/airflow/tree/45d24e79eab98589b1b0509e920811cbf778048b). We should leverage it and modify the scheduler code accordingly.

2. To avoid the leader-selection problem, we may not want to use a master-slave architecture for schedulers. Instead, we simply start multiple schedulers.

The probability of schedulers competing for the same DAG is easy to calculate, since this is a typical Birthday Problem. If m is the number of DAGs and n is the number of schedulers, and each scheduler is assumed to pick a DAG independently and uniformly at random, the probability that at least two schedulers compete for the same DAG is 1 - m!/((m-n)! * m^n). This is reasonably low as long as the ratio of DAGs to schedulers is not too small.

Let’s say we have 200 DAGs and we start 2 schedulers. At any moment, the probability that the schedulers are competing for the same DAG is only 0.5%. If we run 2 schedulers against 300 DAGs, this probability is only 0.33% (https://lists.apache.org/thread.html/389287b628786c6144c0b8e6abf74a040890cd9410a5abe6e968eb55@%3Cdev.airflow.apache.org%3E).

3. To avoid "correlation" between schedulers, we may want to randomly shuffle the list of DAG files before it is passed to the scheduler process (https://lists.apache.org/thread.html/e21d028944092b588295112acb9a3e203c4aea7fae50978f288c2af1@%3Cdev.airflow.apache.org%3E).

4. An important part of this AIP's scope is to test intensively whether running multiple schedulers causes any issues (after all the concerns above are addressed).
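
The collision probability from consideration 2 and the shuffling idea from consideration 3 can be illustrated with a minimal Python sketch. The function names `collision_probability` and `shuffled_dag_files` are hypothetical and are not part of Airflow. The probability model assumes each scheduler independently picks one DAG uniformly at random, which is a simplification of real scheduler behavior.

```python
import random
from math import perm


def collision_probability(num_dags: int, num_schedulers: int) -> float:
    """Probability that at least two schedulers pick the same DAG.

    Implements 1 - m!/((m-n)! * m^n), with m = num_dags and
    n = num_schedulers, under the uniform-random-pick assumption.
    """
    if num_schedulers > num_dags:
        return 1.0  # more schedulers than DAGs: a collision is certain
    # perm(m, n) == m!/(m-n)!, the number of collision-free assignments
    no_collision = perm(num_dags, num_schedulers) / num_dags ** num_schedulers
    return 1.0 - no_collision


def shuffled_dag_files(dag_files, rng=None):
    """Return a shuffled copy of the DAG file list (hypothetical helper).

    Each scheduler would call this on its own, so that different
    schedulers are unlikely to walk the files in the same order.
    """
    rng = rng or random.Random()
    files = list(dag_files)  # copy, so the caller's list is not mutated
    rng.shuffle(files)
    return files


if __name__ == "__main__":
    print(f"{collision_probability(200, 2):.2%}")  # 2 schedulers, 200 DAGs
    print(f"{collision_probability(300, 2):.2%}")  # 2 schedulers, 300 DAGs
    print(shuffled_dag_files(["a.py", "b.py", "c.py"]))
```

For two schedulers the formula reduces to 1/m, which matches the 0.5% and 0.33% figures quoted above for 200 and 300 DAGs.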
