Authors: Aihua Li, George Chen, Yu Li

Status

Current state: "Under Discussion"

Discussion thread: https://lists.apache.org/thread.html/5aac294120d93b418bd6900eeb2416f4f49010241a847830e6ea2ff1@%3Cdev.flink.apache.org%3E

JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)

...

Page properties


Discussion thread:
Vote thread:
JIRA: FLINK-14917
Release:

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

  1. Add a test suite for basic operations, plus a WebUI for viewing the throughput and latency data, much like the existing Flink speed center.
  2. Add more software and hardware metrics for the benchmark.
  3. Add a test suite for the state backend.
  4. Add a test suite for the shuffle service.

...

Test Scenarios

The following dimensions of a Flink job are taken into account when setting the test scenarios:

| Topology | Logical Attributes of Edges | Schedule Mode | Checkpoint Mode |
|---|---|---|---|
| OneInput | Broadcast | Lazy from Source | ExactlyOnce |
| TwoInput | Rescale | Eager | AtLeastOnce |
| | Rebalance | | |
| | KeyBy | | |
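Assuming every value of one dimension can be combined with every value of the others, the four dimensions above form a cross product. The sketch below enumerates those combinations in plain Python; the constant names, the dictionary keys, and the `enumerate_test_jobs` helper are hypothetical and are not part of Flink or of the proposed framework.

```python
from itertools import product

# Dimension values taken from the table above; identifiers are illustrative only.
TOPOLOGIES = ["OneInput", "TwoInput"]
EDGE_TYPES = ["Broadcast", "Rescale", "Rebalance", "KeyBy"]
SCHEDULE_MODES = ["Lazy from Source", "Eager"]
CHECKPOINT_MODES = ["ExactlyOnce", "AtLeastOnce"]


def enumerate_test_jobs():
    """Return one dict per combination of the four dimensions."""
    return [
        {
            "topology": topology,
            "edge": edge,
            "schedule_mode": schedule_mode,
            "checkpoint_mode": checkpoint_mode,
        }
        for topology, edge, schedule_mode, checkpoint_mode in product(
            TOPOLOGIES, EDGE_TYPES, SCHEDULE_MODES, CHECKPOINT_MODES
        )
    ]


if __name__ == "__main__":
    jobs = enumerate_test_jobs()
    print(len(jobs))  # 2 * 4 * 2 * 2 = 32
```

Keeping each dimension as its own list means a new value in any one of them changes the job count without touching the enumeration logic.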



Beyond these Flink-specific characteristics, other dimensions are also considered:

  • Record size: to check both the processing (records/s) and data (bytes/s) throughput, we will test record sizes of 10B, 100B and 1KB for each test job.
  • Resources for each task: we will use the Flink default settings to cover the most common cases.
  • Job parallelism: we will increase the parallelism until the system is saturated and back-pressure occurs.
  • Source and sink: to focus on Flink performance, we generate the source data randomly and use a blackhole consumer as the sink.
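To illustrate how record size links the two throughput figures, here is a minimal sketch in plain Python rather than Flink code. It pairs a random source with a sink that discards every record and reports both records/s and bytes/s. The names `random_source`, `BlackholeSink`, and `measure` are hypothetical, and treating 1KB as 1024 bytes is an assumption, since the text does not say whether 1000 or 1024 is meant.

```python
import os
import time

# Record sizes from the list above; 1KB = 1024 bytes is an assumption.
RECORD_SIZES = {"10B": 10, "100B": 100, "1KB": 1024}


def random_source(record_size, count):
    """Yield `count` random payloads of exactly `record_size` bytes."""
    for _ in range(count):
        yield os.urandom(record_size)


class BlackholeSink:
    """Discards every record, keeping only counters."""

    def __init__(self):
        self.records = 0
        self.bytes = 0

    def consume(self, record):
        self.records += 1
        self.bytes += len(record)


def measure(record_size, count):
    """Push `count` records through the sink and report both throughputs."""
    sink = BlackholeSink()
    start = time.perf_counter()
    for record in random_source(record_size, count):
        sink.consume(record)
    # Guard against a zero interval on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return {
        "records": sink.records,
        "bytes": sink.bytes,
        "records_per_s": sink.records / elapsed,
        "bytes_per_s": sink.bytes / elapsed,
    }


if __name__ == "__main__":
    for label, size in RECORD_SIZES.items():
        result = measure(size, 10_000)
        print(label, round(result["records_per_s"]), round(result["bytes_per_s"]))
```

With fixed-size records, bytes/s is simply records/s multiplied by the record size. Testing several sizes therefore shows whether a job is limited by per-record overhead or by data volume.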

Test Job List

Combining the four Flink dimensions above (2 topologies × 4 edge types × 2 schedule modes × 2 checkpoint modes) yields 32 test jobs, as shown below:

...