...

The Flink optimizer works similarly to a relational database optimizer, but it applies its optimizations to Flink programs written in general-purpose languages rather than to SQL queries.

 

Optimizations

The following optimizations are performed:

...

  • Join reordering (or operator reordering in general): Flink does not currently reorder joins, filters, or reducers. Reordering offers large potential gains, but it is risky without good estimates of the data characteristics.
  • Index vs. table scan selection: In Flink, all data sources are always scanned. The data source (the input format) may apply clever mechanisms to avoid reading all the data, by pre-selecting and projecting at the source. Examples are the RCFile / ORCFile / Parquet input formats.
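
The pre-select and project idea from the last item can be illustrated with a small, self-contained sketch. It is not Flink code and not the API of any real input format: the names `PushdownSourceSketch`, `RowGroup`, `ScanResult`, and `scan` are hypothetical, and the layout (blocks of rows with min/max statistics on the first column) is only an assumption loosely modeled on how columnar formats keep per-block statistics. The optimizer still sees a plain scan; the skipping happens inside the source.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a source that pre-selects and projects. Not a Flink API. */
final class PushdownSourceSketch {

    /** A block of rows plus min/max statistics on column 0 (assumed layout). */
    static final class RowGroup {
        final int[][] rows; // rows[r][c]
        final int min;
        final int max;

        RowGroup(int[][] rows) {
            this.rows = rows;
            int lo = Integer.MAX_VALUE;
            int hi = Integer.MIN_VALUE;
            for (int[] row : rows) {
                lo = Math.min(lo, row[0]);
                hi = Math.max(hi, row[0]);
            }
            this.min = lo;
            this.max = hi;
        }
    }

    /** The rows produced by a scan and the number of row groups that were actually read. */
    static final class ScanResult {
        final List<int[]> rows = new ArrayList<>();
        int groupsRead = 0;
    }

    /**
     * Returns the projected columns of all rows whose column 0 is >= lowerBound.
     * Row groups whose statistics rule out any match are skipped without being read.
     */
    static ScanResult scan(List<RowGroup> groups, int lowerBound, int[] projection) {
        ScanResult result = new ScanResult();
        for (RowGroup group : groups) {
            if (group.max < lowerBound) {
                continue; // pre-selection: no row in this group can match
            }
            result.groupsRead++;
            for (int[] row : group.rows) {
                if (row[0] >= lowerBound) {
                    // projection: copy only the requested columns
                    int[] projected = new int[projection.length];
                    for (int i = 0; i < projection.length; i++) {
                        projected[i] = row[projection[i]];
                    }
                    result.rows.add(projected);
                }
            }
        }
        return result;
    }
}
```

Calling `scan` with a lower bound of 6 and the projection `{0, 2}` on two groups whose first column ranges over 1..2 and 5..7 reads only the second group and emits only columns 0 and 2 of the matching rows. Because this toy stores rows rather than columns, the projection saves copying but not I/O; a real columnar format would also avoid reading the unprojected columns.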

 

Data Structures

Optimizer DAG

Execution Plan and Execution Plan candidates

 

Properties and property matching

Global properties and local properties

Operators defined via properties

 

Role of Semantic Properties

 

Iterations

Static vs. dynamic paths