Status

State: Draft
Discussion Thread: Support for integration tests in Airflow
JIRA: AIRFLOW-3081

Part of this AIP is being implemented in AIP-47 New design of Airflow System Tests.

Motivation

In the Apache Airflow project, contributors need to run system tests against external systems (for example Google Cloud Platform) automatically, specifically before merging any pending changes to the main repository. Such tests are being worked on as part of the effort to add multiple GCP operators to Airflow. Currently they are being added with the GCF deploy/delete implementation in this pull request and in airflow-breeze (an easy-to-set-up Dockerized environment for Airflow) in this pull request. Both PRs are in the internal review stage and support running the tests from the command line, but this is not yet integrated into the CI scripts.

We already have a community-shared way to run unit tests automatically for Apache Airflow. The approach for contributing to Airflow (as described in the CONTRIBUTING documentation) is to create your own fork with its own copy of the Travis CI project, which runs unit tests automatically.

There are CI scripts and an environment in Airflow that allow Travis CI to run unit tests automatically, but no system tests are executed, nor any other tests that require communication with a real external system such as a Google Cloud Platform project.

Currently there is no community-shared way to run such System Tests automatically.

Examples of such System Test DAGs are those developed during the development of the Google Cloud Platform operators (currently in the CLOUD_BUILD branch, which will hopefully soon be merged to master).

Those DAGs are used for two purposes:

  • they are used as example sources for the documentation. For example, the documentation of the Google Compute Engine operators is generated using the examples.
  • they are actually runnable examples, provided that the environment variables are configured properly and authentication works.

The tests can be run through Airflow, and they should succeed by performing the full lifecycle of the service in question (Compute Instance, Cloud Function etc.). Running those examples has been wrapped in unit-test-like system test classes that are ignored by default, but when the proper variables are set they can be run automatically. They also have helpers that set up and tear down the costly environment for such service tests automatically.
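The pattern described above can be sketched as follows. This is a minimal illustration and not the actual test classes from the operators' branch: the variable name `EXAMPLE_GCP_SERVICE_ACCOUNT_KEY`, the class name, and the placeholder setup/teardown bodies are all assumptions made for the example.

```python
import os
import unittest

# Hypothetical variable name; the real system tests may use a different one.
CREDENTIALS_ENV_VAR = "EXAMPLE_GCP_SERVICE_ACCOUNT_KEY"


def system_tests_enabled(environ=None):
    """Return True only when the credentials variable is set and non-empty."""
    environ = os.environ if environ is None else environ
    return bool(environ.get(CREDENTIALS_ENV_VAR, "").strip())


class ExampleComputeSystemTest(unittest.TestCase):
    """Unit-test-like wrapper around an example DAG; skipped by default."""

    def setUp(self):
        if not system_tests_enabled():
            self.skipTest(f"{CREDENTIALS_ENV_VAR} is not set")
        # Placeholder for a helper that would create the costly environment.
        self.resources = ["example-instance"]

    def tearDown(self):
        # Placeholder for a helper that would delete the created resources.
        self.resources = []

    def test_run_example_dag(self):
        # A real test would trigger the example DAG here and check it succeeds.
        self.assertEqual(self.resources, ["example-instance"])
```

With the variable unset, a normal unit test run reports this test as skipped; exporting the variable turns the same class into a runnable system test.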

As part of the Google Cloud operators implementation, a Cloud Build configuration was also implemented that allows all the System Tests to be run automatically, using a privately owned/billed Google Cloud Platform project. Such a build also requires integration with the Airflow Breeze development environment, which was developed for this specific purpose: to help with faster development of Google Cloud related operators. The design of the Breeze environment is here, and it covers two usages of the environment: support for Cloud Build, and support for a local development workflow, which might become the base for, or be merged with, the AIP-7 Simplified development workflow work.

It would be a great improvement in the quality of the library if such system tests were executed automatically before any merge to the main project.

Running System Tests for Google Cloud Platform mandates the use of a Google Cloud Platform project with billing enabled and the creation of appropriate service accounts that have the necessary permissions to perform those operations. More than 30 operators for GCP will soon be added to Airflow (including GCE operators, which can start and stop machines), and virtually all of them could benefit from such automated test execution. The project can be either a private account of the developer/team developing the operators, or eventually the Apache Airflow community could have a shared GCP project to run such tests automatically on approved pull requests before merge.

A similar approach could be reused for other cloud/external service operators, not only for Google Cloud Platform.

Considerations

Requirements/Constraints

  • For now we can focus only on Google Cloud Platform operators and later reuse the learnings for other clouds/external services.

  • There are several services (and more coming) for Google Cloud Platform. Sharing this project (and the associated service accounts) is potentially dangerous if anyone can get the credentials and use the service accounts. This means that forked/private repositories should use their own GCP projects and service accounts, and set up Travis CI to use those for test executions. This should be configurable but easy to share within the team working on the same fork.

  • Eventually, however, a shared GCP project/service account might be used to run tests for the main repository before the merge to master happens. That would be a sanity check verifying that no special or forgotten setup in the personal GCP projects prevents those tests from running for others.

  • The tests in the main GCP project should only be run after at least a code review and possibly some kind of automated “vulnerability” inspection that could prevent attempts to abuse the GCP environment. Attacks on open-source infrastructure have recently been recognised as a powerful attack vector, and they are traditionally difficult to prevent: community/open-source projects are often rather relaxed about security, yet they are sometimes used in millions of commercial installations. Attacking open-source infrastructure is usually much simpler than attacking a commercial installation directly. The threat is real and is actively exploited. One high-profile example is the recent Gentoo repo hack. There is a nice short write-up about looming dangers in open-source infrastructure.

  • System tests tend to run much slower than unit tests. There should be very few of those tests, but even a few of them will take several minutes rather than the seconds that are usual for unit tests.

Proposed changes to the workflow/infrastructure

  • Possible even now, without a special shared Google Cloud Platform project and without big changes to the workflow:

    • System Tests automation as implemented with Airflow Breeze can be run by anyone who has a billable GCP project

    • Cloud Build integration with GCP is an optional step, needed only if the team working on a fork has its own GCP project and sets up the Cloud Build integration

    • System Tests execution is already conditional and disabled by default, unless credentials are properly set up for Cloud Build

    • There is already a bootstrapping process that creates appropriate service accounts and sets up the GCP project to be able to run the tests automatically

    • System tests automation should be added as an extra step in the CI job

    • It should be possible to run the CI job with or without integration tests

    • Integration tests execution should be conditional, based on whether credentials are properly set up in the CI environment

    • There should be an easy way to create an appropriate service account and generally make the GCP project ready to become a target for Integration tests

    • Unit tests should run for all pushes to the repo, but Integration tests should only be a prerequisite for pull requests to become mergeable, because they take a lot of time and resources to run

  • Requires common GCP project/service account and workflow adaptation

    • System tests using shared credentials in the main repository of Airflow should only be run after code from forks has been reviewed and approved, but before the merge happens, to verify that they will be runnable by everyone.
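The gating rules in the list above could be expressed as a small decision function that a CI script consults before choosing which steps to run. This is only a sketch under stated assumptions: `BuildContext`, `planned_steps`, and the step names are hypothetical and are not part of Airflow's CI scripts or of the Travis CI or Cloud Build configuration formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildContext:
    """Hypothetical description of a CI build, for illustration only."""
    event: str                        # "push" or "pull_request"
    credentials_present: bool         # are GCP credentials set up in the CI environment?
    uses_shared_project: bool = False # would the tests use the shared project?
    review_approved: bool = False     # has the change been reviewed and approved?


def planned_steps(ctx):
    """Return the list of test steps that should run for the given build."""
    steps = ["unit-tests"]  # unit tests run for every push
    if ctx.event != "pull_request":
        return steps  # slow system tests only gate pull requests
    if not ctx.credentials_present:
        return steps  # system tests stay disabled without credentials
    if ctx.uses_shared_project and not ctx.review_approved:
        return steps  # shared credentials are used only after approval
    steps.append("system-tests")
    return steps
```

A fork with its own GCP project would leave `uses_shared_project` as `False`, so its system tests run on any pull request build that has credentials, while the main repository would additionally require `review_approved`.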