
Discussion thread: https://lists.apache.org/thread/cflonyrfd1ftmyrpztzj3ywckbq41jzg
Vote thread: https://lists.apache.org/thread/5qlr0xl0oyc9dnvcjr0q39pcrzyx4ohb
JIRA: FLINK-33221
Release: 1.19


Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation


In production environments, users typically develop and operate their Flink jobs through a managed platform. Platform administrators will set a series of default options, and users can override them using dynamic properties specified via platform UI (e.g. using Apache StreamPark) or config files (e.g. using Flink Kubernetes Operator).

Users may need to add JVM options to their Flink applications (e.g. to tune GC options), and they typically use the env.java.opts.x series of options to do so. Platform administrators also have a set of JVM options to apply by default, e.g. to use JVM 17, enable GC logging, or apply pre-tuned GC options. Because both use cases set the same series of options, they clobber one another. Similar issues have been described in SPARK-23472.

In the past, we worked around this by prepending the administrator JVM options to the user-specified JVM options in Java code when generating the start command for the JM/TM. However, this has proven difficult to maintain. Therefore, I propose adding a set of default JVM options for administrator use that are prepended to the user-set extra JVM options. We can mark the existing env.java.opts.x series as user-set extra JVM options and add a new set of env.java.default-opts.x options for administrator use.

Public Interfaces

Add an administrator JVM option for each of the existing JVM options, except for the client and SQL gateway. Values of the administrator JVM options listed below will be prepended to the user-set option values.

Existing JVM Option          Administrator JVM Option (newly added)
env.java.opts.all            env.java.default-opts.all
env.java.opts.jobmanager     env.java.default-opts.jobmanager
env.java.opts.taskmanager    env.java.default-opts.taskmanager
env.java.opts.client         N/A
env.java.opts.sql-gateway    N/A

N/A: Since the only way to specify Flink Client / SQL Gateway JVM options so far is by setting the corresponding options in the flink-conf.yaml file, it is meaningless to support an administrator version at this point.
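The prepend semantics described in this section can be sketched as follows. This is only an illustration under stated assumptions, not the actual Flink implementation: the class JvmOptsSketch, its methods, and the example option values are hypothetical, and the configuration is modeled as a plain map rather than Flink's own configuration classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not Flink code: illustrates the proposed prepend semantics. */
final class JvmOptsSketch {

    /** Prepends administrator defaults to user-set options, skipping empty values. */
    static String effectiveOpts(String adminDefaults, String userOpts) {
        String admin = adminDefaults == null ? "" : adminDefaults.trim();
        String user = userOpts == null ? "" : userOpts.trim();
        if (admin.isEmpty()) {
            return user;
        }
        if (user.isEmpty()) {
            return admin;
        }
        // Administrator options come first, user options last.
        return admin + " " + user;
    }

    /** Looks up both keys for one component suffix: "all", "jobmanager" or "taskmanager". */
    static String effectiveOptsFor(Map<String, String> conf, String component) {
        return effectiveOpts(
                conf.get("env.java.default-opts." + component),
                conf.get("env.java.opts." + component));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        // Set by the platform administrator (example values only).
        conf.put("env.java.default-opts.all", "-XX:+UseG1GC -Xlog:gc*");
        // Set by the user for one job (example values only).
        conf.put("env.java.opts.all", "-XX:MaxGCPauseMillis=200");

        System.out.println(effectiveOptsFor(conf, "all"));
    }
}
```

With these example values, the administrator options appear before the user option in the resulting string. How the "all" options are ordered relative to the per-component options in the final start command is not specified here, so the sketch handles each pair separately.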

Proposed Changes

Already covered above.

Dealing with JVM option conflicts

When the user-set JVM options conflict with the administrator-set options, the following resolution will be applied:

  • Duplicate options: the user-set options will win, as the administrator-set options are prepended and therefore have lower precedence.

  • Incompatible JVM options: this may happen, for example, when administrators set options that are supported only in a lower JVM version while the user wants to use a higher JVM version. In that case, the user has no choice but to override the administrator JVM options as well.
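The duplicate-option rule above can be illustrated with a small sketch. It assumes the JVM lets a later occurrence of an option override an earlier one, which is the behavior this rule relies on but is not verified here for every flag. The class JvmOptConflictSketch and the example values are hypothetical, and the whitespace splitting is deliberately naive (it does not handle quoted values).

```java
import java.util.Optional;

/** Hypothetical illustration, not Flink code. */
final class JvmOptConflictSketch {

    /**
     * Returns the last option starting with the given prefix, modeling the
     * assumption that later occurrences override earlier ones.
     */
    static Optional<String> lastWinning(String combinedOpts, String prefix) {
        Optional<String> winner = Optional.empty();
        for (String token : combinedOpts.trim().split("\\s+")) {
            if (token.startsWith(prefix)) {
                winner = Optional.of(token); // a later match replaces an earlier one
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        String adminDefaults = "-Xss512k -XX:+UseG1GC"; // administrator-set (example)
        String userOpts = "-Xss1m";                      // user-set (example)
        // Administrator options are prepended, so user options come last.
        String combined = adminDefaults + " " + userOpts;

        System.out.println(lastWinning(combined, "-Xss").orElse("<unset>"));
    }
}
```

Under that assumption, the user's -Xss1m takes effect because it follows the administrator's -Xss512k in the combined string.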

Compatibility, Deprecation, and Migration Plan


N/A.

Test Plan

Simple manual tests will do.

Rejected Alternatives

Prepend Administrator JVM Options to User JVM Options by Platform

An alternative is to maintain administrator JVM options in the platform's own config store and prepend them to the Flink JVM options when submitting the job. However, this approach is disfavored for two reasons:

  1. It is more difficult for users to resolve incompatible JVM options, as the administrator JVM options can no longer be overridden by users directly like a normal Flink option.
  2. It is more difficult for platform administrators to maintain administrator JVM options, as an extra config assembly and management mechanism is needed.