...
We plan to split the implementation into 4 phases:
- Add a test suite for basic operations, and a WebUI for checking the throughput and latency data, much like our existing Flink speed center.
- Add more software and hardware metrics for the benchmark.
- Add a test suite for the state backend, and more monitoring of hardware metrics.
- Add a test suite for the shuffle service.
...
This section describes the detailed design of each test suite.
Test Suite for Basic Operations
This test suite uses the default state backend (heap) and the default shuffle service, to guard against regressions in the basic end-to-end performance of Flink jobs.
...
The following dimensions are taken into account when setting the test scenarios:
Topology | Logical Attributes of Edges | Schedule Mode | Checkpoint Mode |
OneInput | Broadcast | Lazy from Source | ExactlyOnce |
TwoInput | Rescale | Eager | AtLeastOnce |
 | Rebalance | | |
 | KeyBy | | |
Test Job List
The above dimensions combine into 32 test jobs (2 topologies × 4 edge types × 2 schedule modes × 2 checkpoint modes), as shown below:
...
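As an illustration of how the 32 jobs arise, the following minimal sketch enumerates the cartesian product of the four dimensions. The class name `PerfTestMatrix`, the method `enumerateJobs`, and the hyphen-joined job names are hypothetical and only serve this example; they are not the naming used by the actual test jobs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: enumerates every combination of the four test dimensions.
class PerfTestMatrix {
    // Dimension values taken from the scenario table in this section.
    static final String[] TOPOLOGIES = {"OneInput", "TwoInput"};
    static final String[] EDGES = {"Broadcast", "Rescale", "Rebalance", "KeyBy"};
    static final String[] SCHEDULE_MODES = {"LazyFromSource", "Eager"};
    static final String[] CHECKPOINT_MODES = {"ExactlyOnce", "AtLeastOnce"};

    // Builds one illustrative job name per combination (cartesian product).
    static List<String> enumerateJobs() {
        List<String> jobs = new ArrayList<>();
        for (String topology : TOPOLOGIES) {
            for (String edge : EDGES) {
                for (String schedule : SCHEDULE_MODES) {
                    for (String checkpoint : CHECKPOINT_MODES) {
                        jobs.add(String.join("-", topology, edge, schedule, checkpoint));
                    }
                }
            }
        }
        return jobs;
    }

    public static void main(String[] args) {
        List<String> jobs = enumerateJobs();
        jobs.forEach(System.out::println);
        System.out.println("total: " + jobs.size());
    }
}
```

Generating the matrix this way, rather than listing jobs by hand, would keep the job count in step with the table if a dimension value is added later.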
In this initial stage we will saturate the system until back-pressure occurs, so we will mainly monitor and display the throughput of the jobs.
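For reference, throughput here can be read as records processed per second over a measurement window. The sketch below shows that arithmetic only; the class `ThroughputSample` and the method `recordsPerSecond` are hypothetical names for this example and do not describe how the benchmark or Flink's metric system actually collects the numbers.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: converts a record count and a time window into records/second.
class ThroughputSample {
    static double recordsPerSecond(long recordCount, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            throw new IllegalArgumentException("elapsedNanos must be positive");
        }
        // Scale the count by (nanoseconds per second) / (elapsed nanoseconds).
        return recordCount * (double) TimeUnit.SECONDS.toNanos(1) / elapsedNanos;
    }

    public static void main(String[] args) {
        // Example window: 5,000,000 records observed over 2 seconds.
        long records = 5_000_000L;
        long windowNanos = TimeUnit.SECONDS.toNanos(2);
        System.out.println(recordsPerSecond(records, windowNanos) + " records/s");
    }
}
```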
Add More Metrics for the Benchmark
This includes software metrics such as job scheduling and task launching, and hardware metrics such as CPU usage and network/disk IO consumption. We plan to implement this at stage 2, and will write down the detailed design in a separate child FLIP of this one.
Test Suite for State Backend
This test suite mainly guards the performance of IO-intensive applications. We plan to implement this at stage 3, along with more monitoring of hardware metrics, and will write down the detailed design in a separate child FLIP of this one.
Test Suite for Shuffle Service
This test suite mainly guards the performance of batch applications. We plan to implement this at stage 4, and will write down the detailed design in a separate child FLIP of this one.
Implementation
The test cases are written in Java and the scripts in Python. We propose a separate directory/module alongside flink-end-to-end-tests, named flink-end-to-end-perf-tests.
...