Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Status

Current stateUnder DiscussionAccepted

Discussion thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-146-Improve-new-TableSource-and-TableSink-interfaces-td45161.html

JIRA:   

Jira
serverASF JIRA
serverId5aa69414-a9e9-3523-82ec-879b028fb15b
keyFLINK-19719

Released: <Flink Version>

...

The user specifies the customized parallelism through connector options:

Option

Type

Default value

sink.parallelism

Integer

None

- Chained: Use upstream parallelism

- Non-Chained: Use global parallelism setting

SourceProvider

ScanRuntimeProvider for FLIP-27:

...

The user specifies the customized parallelism through connector options:

Option

Type

Default value

scan.parallelism

Integer

None (Use global parallelism setting)

Infer Scan parallelism

Connector

Can infer Source or Sink

How to infer

Kafka

Unbounded Source

Infer by partitions

Filesystem / Hive / Iceberg

Bounded Source

Infer by split numbers

JDBC/HBase

Bounded Source

Infer by split numbers

Elasticsearch

None



As can be seen, most connectors infer parallelism according to split numbers, and only infer source parallelism. But it can't rule out that users have customized parallelism inference mode.

User can control inference logical by connector options:

Option

Type

Default value

scan.infer-parallelism.enabled

Boolean

True

scan.infer-parallelism.max

Integer

None (Use global parallelism setting)


(The global parallelism setting is StreamExecutionEnvironment.getParallelism, in table, it can be configured by “table.exec.resource.default-parallelism”)

...