Below you can find the most frequently asked questions about the testing infrastructure and performance tests in Beam.

Do we have a common GCP project for testing purposes?

Yes. Beam has a dedicated GCP project with the id "apache-beam-testing". It is currently used to run Google Cloud Dataflow jobs, host testing infrastructure (Google Cloud Storage, Google Kubernetes Engine, Google Cloud Dataproc), store metrics (BigQuery), and host dashboards of any kind (community metrics dashboards, performance test dashboards).

How do I set up a Flink cluster for Beam tests? Where can I find the scripts to do this?

There's a Google Cloud Dataproc setup available that you might find interesting. It consists of several bash "init actions" that install the necessary software (Flink, Docker) on the virtual machines. There's also a "flink_setup.sh" script that lets you bootstrap the whole cluster without digging into its implementation details. See flink_setup.sh for running instructions.

How do I set up a database or a file system for Beam integration tests? Where can I find the scripts to do this easily?

The easiest way to set up a database or file system for testing purposes is to use Kubernetes. It lets you set up and tear down databases and file systems easily and in an isolated way, so you can use them in your tests without worrying that they are shared resources or that they contain "garbage" data from previous runs.

There are several scripts in Beam's repo for setting up various kinds of data stores and using them in tests on a daily basis. You can find them in the .test-infra/kubernetes folder. If you want to use them, you might find the kubernetes.sh script useful: it wraps common kubectl operations and sets up the data store in its own namespace, which is handy, especially on Jenkins.
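Once a data store is up, a test pipeline can talk to it directly. The snippet below is a minimal Java sketch assuming a hypothetical Postgres instance (such as one created by the Kubernetes scripts) reachable at 10.0.0.5; the address, credentials, and table name are placeholders, and in real tests they are usually passed in as pipeline options.

```java
import java.sql.ResultSet;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.io.jdbc.JdbcIO;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.values.PCollection;

public class KubernetesPostgresReadIT {

  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create();

    // Hypothetical endpoint of a Postgres instance created with the
    // .test-infra/kubernetes scripts; in real tests the service IP is
    // looked up with kubectl and passed in as a pipeline option.
    String jdbcUrl = "jdbc:postgresql://10.0.0.5:5432/postgres";

    PCollection<String> names =
        pipeline.apply(
            JdbcIO.<String>read()
                .withDataSourceConfiguration(
                    JdbcIO.DataSourceConfiguration.create("org.postgresql.Driver", jdbcUrl)
                        .withUsername("postgres")
                        .withPassword("postgres")) // placeholder credentials
                .withQuery("SELECT name FROM test_table")
                .withRowMapper((ResultSet rs) -> rs.getString(1))
                .withCoder(StringUtf8Coder.of()));

    // Verify that the data written during the test's setup phase is all there.
    PAssert.thatSingleton(names.apply(Count.globally())).isEqualTo(1000L);

    pipeline.run().waitUntilFinish();
  }
}
```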

What runner setups are available?

As of this writing (25 November 2019), we have successfully run tests on the following infrastructure (see the sketch after this list for configuring a portable job):

  • portable Flink jobs using Dataproc Flink setup
  • non-portable Flink jobs using the same Dataproc Flink setup
  • non-portable Dataflow jobs
  • non-portable Direct runner jobs
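The portable and non-portable variants differ mainly in their pipeline options. Below is a minimal Java sketch, assuming a Flink job server is listening at the hypothetical address localhost:8099 (the real endpoint comes from the Dataproc setup):

```java
import org.apache.beam.runners.portability.PortableRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.PortablePipelineOptions;

public class PortableFlinkExample {

  public static void main(String[] args) {
    PortablePipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).as(PortablePipelineOptions.class);
    // Hypothetical address of the Flink job server; for a non-portable job
    // you would instead set FlinkRunner (or DataflowRunner, DirectRunner).
    options.setJobEndpoint("localhost:8099");
    options.setRunner(PortableRunner.class);

    Pipeline pipeline = Pipeline.create(options);
    // ... build the test pipeline here ...
    pipeline.run().waitUntilFinish();
  }
}
```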

How do I collect runtime metrics from my testing pipeline?

You can use the Metrics API and a distribution metric to do that. We've created special DoFns that can be attached anywhere between the other transforms of a pipeline to collect execution times. After collecting the metric, we take its min() and max() values and subtract them to get the execution time of the job.
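As an illustration, here is a simplified sketch of such a DoFn and the post-run computation. The class and metric names are illustrative, not the exact helpers from Beam's repo:

```java
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.Distribution;
import org.apache.beam.sdk.metrics.DistributionResult;
import org.apache.beam.sdk.metrics.MetricNameFilter;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.metrics.MetricsFilter;
import org.apache.beam.sdk.transforms.DoFn;

/** Passes elements through unchanged while recording the current time in a distribution. */
public class TimeMonitoringDoFn<T> extends DoFn<T, T> {

  private final String namespace;
  private final String name;

  public TimeMonitoringDoFn(String namespace, String name) {
    this.namespace = namespace;
    this.name = name;
  }

  @ProcessElement
  public void processElement(ProcessContext c) {
    // Each processed element contributes a timestamp, so min() of the
    // distribution is the time of the first element and max() the last.
    Distribution distribution = Metrics.distribution(namespace, name);
    distribution.update(System.currentTimeMillis());
    c.output(c.element());
  }

  /** After the run, the job's execution time is approximately max() - min(). */
  public static long runtimeMillis(PipelineResult result, String namespace, String name) {
    MetricQueryResults metrics =
        result.metrics().queryMetrics(
            MetricsFilter.builder()
                .addNameFilter(MetricNameFilter.named(namespace, name))
                .build());
    long runtime = 0;
    for (MetricResult<DistributionResult> distribution : metrics.getDistributions()) {
      DistributionResult attempted = distribution.getAttempted();
      runtime = Math.max(runtime, attempted.getMax() - attempted.getMin());
    }
    return runtime;
  }
}
```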


In the future, it might be worth utilizing portable system metrics for these jobs. Ticket for the investigation: BEAM-8826

Where do we store metrics collected from periodically running tests (i.e. Jenkins jobs)?

Currently, all collected metrics are stored in a BigQuery database hosted in the apache-beam-testing GCP project. You can use the dedicated helper classes in Beam's repository to save your test results.
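If you're curious about the mechanics, the sketch below writes a single result row directly with the google-cloud-bigquery client. The dataset, table, and schema here are hypothetical stand-ins; the helper classes in the repo define their own.

```java
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.InsertAllRequest;
import com.google.cloud.bigquery.InsertAllResponse;
import com.google.cloud.bigquery.TableId;
import java.util.HashMap;
import java.util.Map;

public class PublishResultExample {

  public static void main(String[] args) {
    // Assumes application-default credentials with access to the project.
    BigQuery bigquery =
        BigQueryOptions.newBuilder().setProjectId("apache-beam-testing").build().getService();

    // Hypothetical dataset, table, and schema for a single test result.
    TableId table = TableId.of("beam_performance", "my_test_results");
    Map<String, Object> row = new HashMap<>();
    row.put("timestamp", System.currentTimeMillis() / 1000.0);
    row.put("runtime_ms", 123456L);

    InsertAllResponse response =
        bigquery.insertAll(InsertAllRequest.newBuilder(table).addRow(row).build());
    if (response.hasErrors()) {
      throw new RuntimeException("Failed to save results: " + response.getInsertErrors());
    }
  }
}
```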

How to add a new Jenkins job for running tests?

You can add a new job definition using the Jenkins Job DSL and the helper classes located in the .test-infra/jenkins directory. A new job can be added either to one of the existing job_* files or to a new file you add.

Please keep in mind that Jenkins does not automatically detect newly added or modified job definitions on your PR branch. To run a new or modified job in a PR, you first have to reload it using the "seed job" (see below).

What is a "seed job" and when should I invoke it?

The seed job is a special "meta" job used for reloading the existing job configurations. It runs periodically (every 6 hours) and can also be started on demand in a Pull Request (by typing the comment: "Run seed job"). Whenever you add or modify a Jenkins job, you should invoke the seed job in your PR to be able to test it and to see if it compiles.

Important note: our Jenkins job configuration is currently centralized and kept in one place (one centralized config, not per branch). The seed job reloads all jobs. Therefore:

  • Invoking the seed job reloads all jobs for all branches, including the master branch. To undo the changes you introduced by running the seed job, either wait for Jenkins to invoke the seed job automatically or run it against master.
  • Two developers modifying the same job files and running seed jobs at the same time is a race condition. You've been warned. :)

How to test Jenkins jobs that I'm currently developing before creating a Pull Request?

Although we do not have automated tests for Jenkins jobs, there's a dockerized Jenkins instance that mimics Apache's one and can be used for testing a job before submitting a PR. It's an easy way of tracking down typos or errors in job definitions. You also have admin access in the dockerized instance, so you can see more than you would in Apache's Jenkins.

Look for more detailed instructions on how to run and configure the dockerized Jenkins instance in .test-infra/dockerized-jenkins.

Are there any dashboards with test metrics? Where are they?

We currently use Perfkit Explorer to display dashboards with metrics for load tests, IO tests, and Nexmark suites. You can find them here: https://apache-beam-testing.appspot.com/dashboard-admin

How do I add a new dashboard for my tests?

Just click "create" in the dashboard administrator. 

For more info, see the Perfkit Explorer docs, and the demo project in particular (Perfkit Explorer demo).

Is there any anomaly detection mechanism to automatically detect regressions/improvements? 

No. Currently, we do not detect anomalies in an automated fashion. We do it manually by looking at the graphs in Perfkit Explorer.

Note that we're planning to change this. We introduced a proposal to use the Grafana + InfluxDB + Kapacitor stack. This setup will allow us to define alerts and trigger them via Slack and email. Moreover, the goal is to keep all of the above as code (an infrastructure-as-code approach).

Link to the proposal: https://s.apache.org/test-metrics-storage-corrected

What kinds of performance tests do we have in Beam?

Most notable examples of performance test types that we currently have continuously running in Beam are:

  • load tests
  • IO tests
  • Nexmark suites

Other than that, please see the Contribution Testing Guide in general.

Can I write a custom testing pipeline that does not fit in the categories of tests that we have so far?

Yes! The testing infrastructure and framework in Beam do not impose any opinionated way of creating tests. It is up to you how you write them. If any of the building blocks we provide are helpful to you, that's great! :)

Where can I find video resources to learn about integration/performance testing in Beam?

There's a presentation from the Warsaw Apache Beam Meetup that is a great intro to the tools we have in Beam for testing purposes (starts at 1:05:15):





