Status

Current state: Accepted

Discussion thread: https://lists.apache.org/thread/y5owjkfxq3xs9lmpdbl6d6jmqdgbjqxo

JIRA: FLINK-33581

Released: <Flink Version>

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Currently, the job configuration in FLINK is spread out across different components, including StreamExecutionEnvironment, CheckpointConfig, and ExecutionConfig. This leads to inconsistencies between the configurations stored in these components. For example, the 'execution.checkpointing.interval' in the StreamExecutionEnvironment configuration may differ from the checkpoint interval specified in CheckpointConfig. This can confuse developers, and higher-level components such as the Table layer have to retrieve configuration from multiple sources.

Furthermore, the approaches used to configure these components differ: some configurations use complex Java objects, while others use ConfigOption, which is a key-value configuration approach. This makes it difficult to manage job configuration effectively. For example, validating non-ConfigOption job configuration is challenging, as seen in StreamContextEnvironment#checkCheckpointConfig. Additionally, passing complex Java objects (e.g., state backend and checkpoint storage) between the environment, streamGraph, and jobGraph adds complexity to development.

To address these issues, it is necessary to standardize the configuration approach by migrating non-ConfigOption objects to use ConfigOption. Additionally, adopting a single Configuration object to host all the configuration can also help resolve these challenges.

However, there is a significant blocker to implementing the proposed solution. The non-ConfigOption objects in StreamExecutionEnvironment, CheckpointConfig, and ExecutionConfig have already been exposed to users through the public API, which makes it difficult to modify the existing implementation. Therefore, this FLIP aims to deprecate these Java objects and their corresponding getter/setter interfaces, ultimately removing them in FLINK-2.0.

Please note that this FLIP does not include deprecating fields related to serialization. The deprecation work for the serialization part will be carried out in conjunction with the relevant work in the FLINK-2.0 serialization section.

Public Interfaces

Deprecate the following classes, fields, and methods:

  • RestartStrategy:

Class | Annotation
org.apache.flink.api.common.restartstrategy.RestartStrategies | @PublicEvolving
org.apache.flink.api.common.restartstrategy.RestartStrategies.RestartStrategyConfiguration | @PublicEvolving
org.apache.flink.api.common.restartstrategy.RestartStrategies.FixedDelayRestartStrategyConfiguration | @PublicEvolving
org.apache.flink.api.common.restartstrategy.RestartStrategies.ExponentialDelayRestartStrategyConfiguration | @PublicEvolving
org.apache.flink.api.common.restartstrategy.RestartStrategies.FailureRateRestartStrategyConfiguration | @PublicEvolving
org.apache.flink.api.common.restartstrategy.RestartStrategies.FallbackRestartStrategyConfiguration | @PublicEvolving

Method | Annotation
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration) | @Public
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#getRestartStrategy() | @Public
org.apache.flink.api.common.ExecutionConfig#getRestartStrategy() | @Public
org.apache.flink.api.common.ExecutionConfig#setRestartStrategy(RestartStrategies.RestartStrategyConfiguration restartStrategyConfiguration) | @Public

Field | Annotation
org.apache.flink.api.common.ExecutionConfig#restartStrategyConfiguration | @Public

Suggested alternative: Users can configure the RestartStrategyOptions related ConfigOptions, such as "restart-strategy.type", in the configuration, instead of passing a RestartStrategyConfiguration object.
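As a rough illustration of this alternative, a fixed-delay restart strategy could be expressed as configuration entries along the following lines. This is only a sketch: it assumes the options are placed in the Flink configuration file, and the attempt count and delay are arbitrary example values, not recommendations.

```yaml
# Illustrative values only: select the restart strategy by name
# instead of passing a RestartStrategyConfiguration object.
restart-strategy.type: fixed-delay
# Number of restart attempts before the job is declared failed (example value).
restart-strategy.fixed-delay.attempts: 3
# Delay between two consecutive restart attempts (example value).
restart-strategy.fixed-delay.delay: 10 s
```

The same keys can also be set on a Configuration object that is handed to the execution environment, which keeps the job's restart behavior in the same key-value form as the rest of its configuration.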

  • CheckpointStorage

Method | Annotation
org.apache.flink.streaming.api.environment.CheckpointConfig#setCheckpointStorage(CheckpointStorage storage) | @Public
org.apache.flink.streaming.api.environment.CheckpointConfig#setCheckpointStorage(String checkpointDirectory) | @Public
org.apache.flink.streaming.api.environment.CheckpointConfig#setCheckpointStorage(URI checkpointDirectory) | @Public
org.apache.flink.streaming.api.environment.CheckpointConfig#setCheckpointStorage(Path checkpointDirectory) | @Public
org.apache.flink.streaming.api.environment.CheckpointConfig#getCheckpointStorage() | @Public

Suggested alternative: Users can configure "state.checkpoint-storage" in the configuration as the fully qualified name of the checkpoint storage or use some FLINK-provided checkpoint storage shortcut names such as "jobmanager" and "filesystem", and provide the necessary configuration options for building that storage, instead of passing a CheckpointStorage object.
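For example, a file-system checkpoint storage might be configured roughly as follows. This is a minimal sketch: the checkpoint directory is a hypothetical path chosen for illustration, and "state.checkpoints.dir" is assumed here to be the option that supplies the directory the "filesystem" storage needs.

```yaml
# Choose the checkpoint storage by shortcut name
# instead of passing a CheckpointStorage object.
state.checkpoint-storage: filesystem
# Directory where checkpoints are written (hypothetical example path).
state.checkpoints.dir: file:///tmp/flink-checkpoints
```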

  • StateBackend

Method | Annotation
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#setStateBackend(StateBackend backend) | @Public
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#getStateBackend() | @Public

Field | Annotation
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#defaultStateBackend | @Public

Suggested alternative: Users can configure "state.backend.type" in the configuration as the fully qualified name of the state backend or use some FLINK-provided state backend shortcut names such as "hashmap" and "rocksdb", and provide the necessary configuration options for building that StateBackend, instead of passing a StateBackend object.
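A state backend selection in this style might look like the sketch below. The choice of "rocksdb" is only an example, and any backend-specific options the chosen backend needs would be added alongside it.

```yaml
# Choose the state backend by shortcut name
# instead of passing a StateBackend object.
state.backend.type: rocksdb
```

Setting the value to "hashmap" instead would select the heap-based backend through the same key.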

Proposed Changes

We propose deprecating the classes, methods, and fields listed above and updating the documentation on the Flink website accordingly.

...