Source of benchmarks: https://github.com/apache/flink-benchmarks

Benchmarks WebUI: http://flink-speed.xyz

If you merge a performance-critical change (e.g. code paths executed per record, per state entry, etc.) to master, you should verify that it did not cause a regression compared to the previous state of master. An existing performance test suite runs periodically on the master branch, so you can check the timeline of its results in this UI some time after merging. If you are unsure about your changes and want to check for a regression before merging, contact a PMC member to submit a benchmark request and check the results in the comparison UI.
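
As a rough illustration of what "no regression compared to the previous state of master" can mean numerically, the sketch below compares a baseline score with a candidate score. It assumes throughput-style scores (higher is better); the function names and the 5% threshold are hypothetical and are not taken from the benchmark infrastructure.

```python
def relative_change(baseline: float, candidate: float) -> float:
    """Fractional change of the candidate score relative to the baseline.

    A negative value means the candidate scored lower than the baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return (candidate - baseline) / baseline


def is_regression(baseline: float, candidate: float, threshold: float = 0.05) -> bool:
    """Return True if the score dropped by more than `threshold`.

    Assumption: higher scores are better (throughput-style benchmarks).
    The default threshold of 5% is an illustrative value only.
    """
    return relative_change(baseline, candidate) < -threshold
```

With these assumptions, a drop from 1000 to 900 counts as a regression, while a drop from 1000 to 980 stays within the tolerated noise band.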

See the mailing list announcement for more details.

If you want to know how to execute the benchmarks locally, please take a look at the benchmarks' README.

Submitting a benchmark request

Currently, the self-service method of triggering benchmarks is unavailable due to a lack of resources and potential vulnerabilities in Jenkins. Please contact one of the Apache Flink PMC members to submit a benchmark request, following these steps:

  1. Push your changes to a branch in a GitHub clone of the Flink repository.
  2. Select as small a set of benchmarks to execute as possible.
  3. Contact one of the Apache Flink PMC members and provide your repository, branch, the Java version (8, 11 or 17) and the benchmarks you want to run.
  4. Once the benchmark finishes, the result will be pushed to the comparison UI.
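
The information requested in the steps above can be collected and sanity-checked before contacting a PMC member. The following is a minimal sketch only: the `BenchmarkRequest` class and its methods are hypothetical helpers, not part of any Flink tooling, and the repository URL and branch name in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import List

# Java versions listed in the steps above.
SUPPORTED_JAVA_VERSIONS = (8, 11, 17)


@dataclass
class BenchmarkRequest:
    """Hypothetical container for the details a benchmark request needs."""
    repository: str
    branch: str
    java_version: int
    benchmarks: List[str]

    def validate(self) -> None:
        """Reject requests with an unlisted Java version or no benchmarks."""
        if self.java_version not in SUPPORTED_JAVA_VERSIONS:
            raise ValueError(f"unsupported Java version: {self.java_version}")
        if not self.benchmarks:
            raise ValueError("select at least one benchmark")

    def summary(self) -> str:
        """Render the request as plain text to paste into a message."""
        self.validate()
        return (
            f"Repository: {self.repository}\n"
            f"Branch: {self.branch}\n"
            f"Java version: {self.java_version}\n"
            f"Benchmarks: {', '.join(self.benchmarks)}"
        )
```

For example, `BenchmarkRequest("https://github.com/example-user/flink", "my-branch", 11, ["checkpointSingleInput.UNALIGNED"]).summary()` produces a four-line text block with the repository, branch, Java version and benchmark list.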

How to handle a benchmark regression

When a benchmark regression is detected, the following steps help to deal with it:

  1. Create a Jira ticket (one per group of related benchmarks). Set the affects and fix versions to the current Flink version, component=Benchmarks, type=Bug, priority=Blocker.
  2. Post the ticket in the #flink-dev-benchmarks Slack channel (replying in a thread).
  3. Verify that the regression is real and investigate the cause. Take FLINK-30623 as an example:
    1. Inspect the timeline by following the link (http://flink-speed.xyz/timeline/#/?exe=1&ben=checkpointSingleInput.UNALIGNED&extr=on&quarts=on&equid=off&env=2&revs=200) from the notification. Suspicious commit ranges can be read from the figure; in this example, the suspicious range is 13ef498172b...fb272D2cdebf.
    2. Narrow down the commit range via git log. You can either locate a specific commit directly based on experience, or compare the benchmark results of each commit in the range; if the regression is real, a responsible commit will be found. See the instructions for using benchmark-request; you can also try to benchmark locally. The http://flink-speed.xyz benchmarking infrastructure is hosted on resources provided by Ververica (Alibaba) and maintained by PMC members and Ververica, so please contact one of the Apache Flink PMC members to submit a benchmark. For example, two benchmark requests were submitted to verify whether FLINK-30533 caused the regression.
    3. Changes in flink-benchmarks may also cause a regression, so don't forget to check whether flink-benchmarks has changed recently.
    4. If a regression cannot be reproduced stably, for example because it was caused by an error in the results or by issues with the physical machines (like FLINK-18614[9]), the regression is not real.
  4. Post the benchmark results on the Jira ticket. If the regression is real, ping the authors of the commit (or relevant developers) to investigate it. Otherwise, set the resolution of the Jira ticket to "Not a bug", post the conclusion and close the ticket.
  5. If a regression is not fixed within a week of confirming that a commit is its root cause, contact the release manager to revert that commit (after confirming, using benchmark-request, that reverting the change resolves the issue).
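
Comparing the benchmark results of each commit in a suspicious range, as described in the investigation step above, can be done with fewer runs by bisecting the range. The sketch below is one possible approach under stated assumptions, not the project's prescribed procedure: `score_of` is a hypothetical callback standing in for a benchmark request or a local run, scores are throughput-style (higher is better), and every commit after the culprit is assumed to stay regressed.

```python
from typing import Callable, Sequence


def first_regressed_commit(
    commits: Sequence[str],
    score_of: Callable[[str], float],
    baseline: float,
    threshold: float = 0.05,
) -> str:
    """Binary-search `commits` (ordered oldest first) for the first commit
    whose score is more than `threshold` below `baseline`.

    Assumes the newest commit is regressed and that the regression persists
    for all commits after the one that introduced it.
    """
    def is_bad(commit: str) -> bool:
        return score_of(commit) < baseline * (1.0 - threshold)

    if not commits or not is_bad(commits[-1]):
        raise ValueError("the newest commit in the range shows no regression")

    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is regressed
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # culprit is at mid or earlier
        else:
            lo = mid + 1    # culprit is after mid
    return commits[lo]
```

Because benchmark results can be noisy, it is worth repeating the measurement for the commit this returns and for its parent before treating the regression as confirmed.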


