Status

Current state: [One of "Under Discussion", "Accepted", "Rejected"]

Discussion thread: here (<- link to https://mail-archives.apache.org/mod_mbox/flink-dev/)

JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)

Released: <Flink Version>

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

The current flink-conf.yaml parser in Flink is not a standard YAML parser, and this has several shortcomings. First, it supports only flat key-value pairs and not nested configuration items, which hurts readability. Second, if a value is a collection type, such as a List or Map, users must write it in a Flink-specific pattern, which is inconvenient. Additionally, the syntax accepted by the Flink parser differs from standard YAML in some places, such as how comments and null values are parsed. These inconsistencies can confuse users, as seen in FLINK-15358 and FLINK-32740.

By supporting standard YAML, these issues can be resolved, and users can create a Flink configuration file using third-party tools and leverage some advanced YAML features. Therefore, this FLIP aims to introduce a standard YAML parser for parsing the FLINK configuration file.

Public Interfaces

  • Introduce the flink-config.yaml configuration file as the next generation of Flink configuration files. When flink-config.yaml exists in the Flink conf directory, Flink will use the standard YAML parser to parse it as the Flink configuration.

  • Modify the Flink packaging process so that when generating flink-dist, the flink-conf.yaml file is no longer generated in the conf directory. Instead, a flink-config.yaml file that conforms to the standard YAML syntax will be generated.

  • For compatibility reasons, in Flink 1.x, if the old configuration file flink-conf.yaml exists in the Flink conf directory, Flink will ignore the new configuration file flink-config.yaml and use the old parser to parse flink-conf.yaml as the Flink configuration. In Flink 2.x, Flink will no longer support parsing the old configuration file flink-conf.yaml.
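The file precedence described above can be sketched as follows. This is a minimal illustration, not Flink's implementation: the helper name `select_config_file`, its `major_version` parameter, and the returned parser labels are hypothetical, and the behavior when no configuration file exists is left open because the text does not specify it.

```python
import tempfile
from pathlib import Path

LEGACY_FILE = "flink-conf.yaml"      # parsed by the old Flink-specific parser
STANDARD_FILE = "flink-config.yaml"  # parsed by a standard YAML parser


def select_config_file(conf_dir, major_version=1):
    """Return (file_name, parser_kind) for a conf directory.

    Hypothetical helper that mirrors the precedence described in the text.
    """
    conf_dir = Path(conf_dir)
    # Flink 1.x: the old file wins whenever it exists, and the new file is ignored.
    if major_version < 2 and (conf_dir / LEGACY_FILE).is_file():
        return LEGACY_FILE, "legacy"
    # Otherwise the new file is parsed with the standard YAML parser.
    if (conf_dir / STANDARD_FILE).is_file():
        return STANDARD_FILE, "standard-yaml"
    # The text does not say what happens when no usable file is found.
    return None, None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as conf_dir:
        (Path(conf_dir) / STANDARD_FILE).write_text("rest.port: 8081\n")
        print(select_config_file(conf_dir))
        (Path(conf_dir) / LEGACY_FILE).write_text("rest.port: 8081\n")
        print(select_config_file(conf_dir))
```

With both files present, the sketch picks the old file for a 1.x version and the new file for a 2.x version, which matches the compatibility rule stated above.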

Compatibility, Deprecation, and Migration Plan

  • Compatibility:

    There is no compatibility issue because the two parsers read differently named configuration files. The default configuration file will be changed to "flink-config.yaml" and parsed by the standard YAML parser. If users prefer to use the old parser, they can create a "flink-conf.yaml" file in the conf directory.

    Note that the standard YAML parser behaves differently from the current Flink parser in the following ways:


    ConfigOption key

      • Standard YAML parser: The key of a ConfigOption cannot be a prefix of another option's key.
      • Flink parser: No requirements.

    Comment

      • Standard YAML parser: Comments must be separated from other tokens by white space characters.
      • Flink parser: Anything after the '#' symbol is considered a comment.

    Null value

      • Standard YAML parser: Null or blank values are parsed as a null value.
      • Flink parser: Anything after the first ':' symbol is treated as a string value, except for blank values.

    Special characters

      • Standard YAML parser: The indicator characters of standard YAML are listed below; for more details, see https://yaml.org/spec/1.2.2/. To use any of them as part of a string value, escape it by enclosing the value in quotation marks.
        • - ? : , [ ] { } # & * ! | > ' " % @
      • Flink parser: The following character sequences currently have a special meaning:
        • ": " (a colon and a whitespace): the first ": " in a key-value pair is the delimiter between the key and the value; any remaining ": " is treated as part of the value.
        • '#': anything after the first '#' is considered a comment, so the hash sign (#) cannot be part of a key or a value.
        • ';': when a ConfigOption has the List type, the semicolon (;) is the delimiter between list elements. To include ';' in an element value, escape it using quotation marks.
        • ',': when a ConfigOption has the Map type, the comma (,) is the delimiter between map entries. To include ',' in a value, escape it using quotation marks.
        • ':': when a ConfigOption has the Map type, the colon (:) is the delimiter between the key and the value of a map entry. To include ':' in a value, escape it using quotation marks.

    Duplicate key

      • Standard YAML parser: Standard YAML does not allow duplicate keys in a map. The specification states: "The content of a mapping node is an unordered set of key/value node pairs, with the restriction that each of the keys is unique." For more details, see https://yaml.org/spec/1.2.2/#nodes.
      • Flink parser: Duplicate keys are allowed; the key-value pair that appears later in the file overrides the earlier one.

    Sequences style

      • Standard YAML parser: Standard YAML has two styles for sequences:
        • Flow style: a comma-separated list of entries within square brackets, for example: [A, B, C].
        • Block style: a series of nodes, each denoted by a leading "-" indicator, which must be separated from the node by white space. For more details, see https://yaml.org/spec/1.2.2/#821-block-sequences.
      • Flink parser: Sequence elements are separated by ";" (semicolon), for example: A;B;C.

    Mapping style

      • Standard YAML parser: Standard YAML has two styles for mappings:
        • Flow style: comma-separated key-value pairs within curly braces, for example: {k1: v1, k2: v2, k3: v3}.
        • Block style: one "key: value" pair per line, with nesting expressed by indentation.
      • Flink parser: The key and value of a pair are separated by ":" (colon), and different pairs are separated by "," (comma), for example: k1:v1, k2:v2, k3:v3.

  • Deprecation:

    The old configuration file "flink-conf.yaml" will be deprecated, and the deprecation will be announced in the release notes and user documentation.

  • Migration Plan:

    In Flink 2.0, the old parser will no longer be supported, and the flink-conf.yaml file will no longer be used as a configuration file.
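To make the old parser's behavior in the comparison above concrete, here is a simplified Python sketch of those rules. It is not Flink's code: the function names are hypothetical, quotation-mark escaping is deliberately left out, and skipping entries that lack a key or a value is an assumption, since the text only says blank values are an exception.

```python
def parse_legacy_conf(text):
    """Parse flink-conf.yaml content following the old parser's rules as described."""
    conf = {}
    for line in text.splitlines():
        # Anything after the first '#' is a comment.
        content = line.split("#", 1)[0].strip()
        if not content:
            continue
        # The first ": " separates key and value; later ": " stay in the value.
        key, sep, value = content.partition(": ")
        key, value = key.strip(), value.strip()
        if not sep or not key or not value:
            # Assumption: entries without a key or a value are skipped.
            continue
        # A later duplicate key overrides the earlier one.
        conf[key] = value
    return conf


def split_list(value):
    """Split a List value such as 'A;B;C' (quote escaping not handled)."""
    return [item.strip() for item in value.split(";") if item.strip()]


def split_map(value):
    """Split a Map value such as 'k1:v1, k2:v2' (quote escaping not handled)."""
    result = {}
    for pair in value.split(","):
        if not pair.strip():
            continue
        k, _, v = pair.partition(":")
        result[k.strip()] = v.strip()
    return result


if __name__ == "__main__":
    sample = "\n".join([
        "# a full-line comment",
        "key1: value1#comment1",
        "url: host: 8081",
        "key1: value2",
        "names: A;B;C",
        "labels: k1:v1, k2:v2, k3:v3",
    ])
    conf = parse_legacy_conf(sample)
    print(conf)
    print(split_list(conf["names"]))
    print(split_map(conf["labels"]))
```

A standard YAML parser would treat several of these lines differently: the duplicate `key1` violates the uniqueness rule, and `value1#comment1` would be read as a single value.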

Test Plan

The change will be tested via unit tests and end-to-end (e2e) tests.

Rejected Alternatives

Using a standard YAML parser to parse the existing flink-conf.yaml file was rejected because it could cause unexpected behavior changes for Flink users. Parsing the same configuration file with a different parser can change the parsing results in ways that are difficult for users to notice. The following examples would cause a breaking change:

  1. Comment

     The Flink parser treats anything after the '#' symbol as a comment, while the standard YAML parser requires that "Comments must be separated from other tokens by white space characters". If the user configures "key1: value1#comment1" in "flink-conf.yaml", the old parser regards "value1" as the value and "#comment1" as a comment, whereas the standard YAML parser regards "value1#comment1" as the value.

  2. Null value

     The Flink parser treats anything after the first ':' symbol as a string value. The standard YAML parser, however, parses null or a blank value as null unless it is enclosed in double quotes, such as "null".
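The two breaking changes above can be reproduced with a small Python sketch. Both functions are hypothetical and handle single-line entries only: `legacy_value` follows the old parser's rules as described in this document, and `yaml_like_value` emulates just the two standard YAML rules under discussion (comment separation and null handling). It is not a YAML parser, and a real implementation would use an actual YAML library.

```python
import re

# Plain scalars that standard YAML resolves to null, besides a blank value.
NULL_TOKENS = {"", "null", "Null", "NULL", "~"}


def legacy_value(line):
    """Value as seen by the old parser; None means the line yields no entry."""
    # Anything after the first '#' is a comment.
    content = line.split("#", 1)[0]
    # The first ": " separates key and value.
    _, sep, value = content.partition(": ")
    value = value.strip()
    return value if sep and value else None


def yaml_like_value(line):
    """Value under the two standard YAML rules discussed above (simplified)."""
    # '#' starts a comment only at line start or after white space.
    content = re.split(r"(?:^|\s)#", line, maxsplit=1)[0]
    # The key ends at a colon followed by white space or the end of the line.
    parts = re.split(r":(?:\s+|$)", content, maxsplit=1)
    value = parts[1].strip() if len(parts) == 2 else ""
    if value in NULL_TOKENS:
        return None
    # A double-quoted scalar stays a string, even if it spells "null".
    if len(value) >= 2 and value[0] == value[-1] == '"':
        return value[1:-1]
    return value


if __name__ == "__main__":
    for line in ("key1: value1#comment1", "key1: value1 #comment1",
                 "key1: null", 'key1: "null"'):
        print(repr(line), "->", repr(legacy_value(line)), "vs", repr(yaml_like_value(line)))
```

The first and third sample lines are the ones where the two parsers disagree, which is why reusing the file name flink-conf.yaml with a new parser would silently change existing configurations.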
