Authors: Aihua Li, George Chen, Yu Li
Status
Current state: "Under Discussion"
Discussion thread: https://lists.apache.org/thread.html/5aac294120d93b418bd6900eeb2416f4f49010241a847830e6ea2ff1@%3Cdev.flink.apache.org%3E
JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)
...
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
...
We plan to split the implementation into 4 phases:
- Add a test suite for basic operations, plus a visible WebUI to check the throughput and latency data, much like our existing flink speed center.
- Add more software and hardware metrics for the benchmark.
- Add a test suite for the state backend, and more monitoring of hardware metrics.
- Add a test suite for the shuffle service.
...
The detailed design of each test suite will be illustrated in this section.
Test Suite for Basic Operations
In this test suite we will use the default state backend (heap) and shuffle service, to ensure there is no regression in the basic end-to-end performance of Flink jobs.
...
Test Scenarios
The following dimensions of Flink job are taken into account when setting the test scenarios:
| Topology | Logical Attributes of Edges | Schedule Mode | Checkpoint Mode |
|---|---|---|---|
| OneInput | Broadcast | Lazy from Source | ExactlyOnce |
| TwoInput | Rescale | Eager | AtLeastOnce |
| | Rebalance | | |
| | KeyBy | | |
There are also dimensions beyond Flink characteristics, including:
- Record size: to check both processing (records/s) and data (bytes/s) throughput, we will test 10B, 100B and 1KB record sizes for each test job.
- Resource for each task: we will use the Flink default settings to cover the most common cases.
- Job parallelism: we will increase the parallelism to saturate the system until back-pressure occurs.
- Source and Sink: to focus on Flink performance, we generate the source data randomly and use a blackhole consumer as the sink.
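To make the source/sink dimension concrete, the sketch below shows one way a fixed-size random source and a counting blackhole sink might look. This is a plain-Java illustration, not the FLIP's actual implementation; the class and method names are hypothetical, and a real test job would wire these into Flink's source/sink interfaces.

```java
import java.util.Random;

/** Hypothetical sketch: a random source emitting fixed-size records
 *  and a blackhole sink that only counts what it receives. */
public class PerfJobSketch {

    /** Generates byte[] records of the configured size (10B / 100B / 1KB). */
    static final class RandomSource {
        private final int recordSize;
        private final Random random = new Random(42); // fixed seed for repeatable runs

        RandomSource(int recordSize) { this.recordSize = recordSize; }

        byte[] next() {
            byte[] record = new byte[recordSize];
            random.nextBytes(record);
            return record;
        }
    }

    /** Discards records, tracking the inputs for processing and data throughput. */
    static final class BlackholeSink {
        long records;
        long bytes;

        void invoke(byte[] record) {
            records++;
            bytes += record.length;
        }
    }

    public static void main(String[] args) {
        RandomSource source = new RandomSource(100); // the 100B scenario
        BlackholeSink sink = new BlackholeSink();
        for (int i = 0; i < 1_000; i++) {
            sink.invoke(source.next());
        }
        // prints "1000 records, 100000 bytes"
        System.out.println(sink.records + " records, " + sink.bytes + " bytes");
    }
}
```

Keeping the sink trivial ensures the measured throughput reflects Flink itself rather than an external system.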
Test Job List
The above test scenarios could form 32 test jobs as shown below:
...
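The count of 32 follows from the cross product of the table's dimensions: 2 topologies x 4 edge attributes x 2 schedule modes x 2 checkpoint modes. A small sketch (enum and method names are illustrative, not part of the proposal) that enumerates the matrix:

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: enumerate the cross product of the test-scenario dimensions. */
public class TestJobMatrix {
    enum Topology { ONE_INPUT, TWO_INPUT }
    enum EdgeAttribute { BROADCAST, RESCALE, REBALANCE, KEY_BY }
    enum ScheduleMode { LAZY_FROM_SOURCE, EAGER }
    enum CheckpointMode { EXACTLY_ONCE, AT_LEAST_ONCE }

    static List<String> allJobs() {
        List<String> jobs = new ArrayList<>();
        for (Topology t : Topology.values())
            for (EdgeAttribute e : EdgeAttribute.values())
                for (ScheduleMode s : ScheduleMode.values())
                    for (CheckpointMode c : CheckpointMode.values())
                        jobs.add(t + "-" + e + "-" + s + "-" + c);
        return jobs;
    }

    public static void main(String[] args) {
        System.out.println(allJobs().size()); // 2 * 4 * 2 * 2 = 32
    }
}
```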
In this initial stage we will saturate the system until back-pressure occurs, so we mainly monitor and display the throughput of the jobs.
Add More Metrics for the Benchmark
This includes software metrics such as job-scheduling and task-launching time, and hardware metrics such as CPU usage and network/disk IO consumption. We plan to implement this in stage 2, and will write down the detailed design in a separate child FLIP of this one.
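As a rough illustration of what a hardware-metric probe could look like, the sketch below samples a few values that the standard `java.lang.management` beans already expose. This is only a plausible starting point, not the design of the child FLIP; the metric names in the map are placeholders.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: sample a few JVM-visible hardware metrics for a benchmark report. */
public class HardwareMetricsProbe {

    static Map<String, Double> sample() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        Map<String, Double> metrics = new LinkedHashMap<>();
        metrics.put("availableProcessors", (double) os.getAvailableProcessors());
        // getSystemLoadAverage() returns -1 on platforms that do not support it
        metrics.put("systemLoadAverage", os.getSystemLoadAverage());
        Runtime rt = Runtime.getRuntime();
        metrics.put("usedHeapBytes", (double) (rt.totalMemory() - rt.freeMemory()));
        return metrics;
    }

    public static void main(String[] args) {
        sample().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Metrics like network and disk IO consumption are not visible through these beans and would need OS-level collection (e.g. reading /proc on Linux) in the actual implementation.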
Test Suite for State Backend
This test suite is mainly for verifying the performance of IO-intensive applications. We plan to implement this in stage 3, along with more monitoring of hardware metrics, and will write down the detailed design in a separate child FLIP of this one.
Test Suite for Shuffle Service
This test suite is mainly for verifying the performance of batch applications. We plan to implement this in stage 4, and will write down the detailed design in a separate child FLIP of this one.
Implementation
The test cases are written in Java and the scripts in Python. We propose a separate directory/module in parallel with flink-end-to-end-tests, with the name flink-end-to-end-perf-tests.
...