Reason: Discarded because discussion stalled and there is no intention to work on this for the moment.


Status

Current state: Discarded

Discussion thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-161-Configuration-through-environment-variables-td48094.html (discussion prior to FLIP)

...

Flink currently requires configuration to be written to a file. Allowing environment variables to override this configuration makes it much more flexible. This becomes particularly useful in Kubernetes scenarios where some of the configuration, e.g. access keys, can be defined through secrets exposed as environment variables. Furthermore, Flink can benefit from this internally as well, since this mechanism provides an easy way to randomize end-to-end test configuration, see FLINK-19520.

The specific approach proposed here is inspired by, and follows in large parts, the design of the equivalent feature of the Spring framework. This provides confidence, as the feature has already been used extensively, and familiarity for developers who already know Spring.

Public Interfaces

No public interfaces apart from configuration are affected. Flink configuration is not affected directly either, but indirectly, by virtue of allowing environment variables to override entries in the Flink configuration.

...

For example, with an environment

FLINK_CONFIG_KEY_A="Environment=A"
FLINK_CONFIG_KEY_B="Environment=B"

and a configuration file

...

Due to this, a convention needs to be established on how configuration keys are looked up in the environment. We propose that the environment variable for a Flink configuration key (e.g. key.A-b) is derived using the following schema:

  1. Prefix "FLINK_CONFIG_" → FLINK_CONFIG_key.A-b
  2. Replace "." (period) with "_" (underscore) → FLINK_CONFIG_key_A-b
  3. Replace "-" (dash) with "__" (double underscore) → FLINK_CONFIG_key_A__b
  4. Uppercase → FLINK_CONFIG_KEY_A__B
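
The naming steps (prefix, periods to underscores, dashes to double underscores, uppercase) can be sketched as a small helper. This is only an illustration: the class and method names are hypothetical and not part of Flink.

```java
import java.util.Locale;

/** Hypothetical helper illustrating the proposed naming schema; not a Flink class. */
final class EnvVarNames {
    static final String PREFIX = "FLINK_CONFIG_";

    /** Derives the environment variable name for a configuration key such as "key.A-b". */
    static String toEnvVarName(String configKey) {
        String name = PREFIX + configKey;      // FLINK_CONFIG_key.A-b
        name = name.replace(".", "_");         // FLINK_CONFIG_key_A-b
        name = name.replace("-", "__");        // FLINK_CONFIG_key_A__b
        return name.toUpperCase(Locale.ROOT);  // FLINK_CONFIG_KEY_A__B
    }

    public static void main(String[] args) {
        System.out.println(toEnvVarName("key.A-b")); // prints FLINK_CONFIG_KEY_A__B
    }
}
```

Locale.ROOT is used so that uppercasing does not depend on the default locale of the machine.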

This provides a (semi-)bijective function between environment variable name and configuration key. More specifically, it allows parsing configuration keys from the environment without prior knowledge of the available configuration keys. Given an environment, we can look for all environment variables starting with the FLINK_CONFIG_ prefix and map them to their configuration key counterpart by following the inverse procedure:

  1. Remove FLINK_CONFIG_ prefix → KEY_A__B
  2. Replace "__" (double underscore) with "-" (dash) → KEY_A-B
  3. Replace "_" (underscore) with "." (period) → KEY.A-B

...

  1. Lowercase → key.a-b 

As we can see, this yields the original (intended) configuration key, with the only difference being the casing. Configuration currently treats keys case-sensitively, but we propose to relax this requirement and treat them case-insensitively during the lookup of a specific key.
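
The inverse procedure (remove the prefix, turn double underscores into dashes, turn the remaining underscores into periods, lowercase) can be sketched as follows. The class and method names are hypothetical; variables without the prefix are simply ignored, which is an assumption about how unrelated environment entries would be handled.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper illustrating the inverse mapping; not a Flink class. */
final class EnvVarParser {
    static final String PREFIX = "FLINK_CONFIG_";

    /** Maps e.g. "FLINK_CONFIG_KEY_A__B" back to "key.a-b"; empty if the prefix is missing. */
    static Optional<String> toConfigKey(String envVarName) {
        if (!envVarName.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String key = envVarName.substring(PREFIX.length()); // KEY_A__B
        key = key.replace("__", "-");                       // KEY_A-B
        key = key.replace("_", ".");                        // KEY.A-B
        return Optional.of(key.toLowerCase(Locale.ROOT));   // key.a-b
    }

    public static void main(String[] args) {
        System.out.println(toConfigKey("FLINK_CONFIG_KEY_A__B").orElse("<none>")); // prints key.a-b
    }
}
```

The double underscore has to be replaced before the single underscore; otherwise every dash would be read back as two periods.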

This mapping is not strictly bijective, but cases with consecutive periods or dashes in the key name are not considered here and should not (reasonably) be allowed. This should therefore be enforced in the implementation as well, to prevent future development from running into such scenarios.
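
One way such an enforcement could look is a simple check on the key name. The sketch below reads "consecutive periods or dashes" as any run of two or more of these characters (including a period next to a dash), which is an assumption; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical validation sketch; not a Flink class. */
final class ConfigKeyValidator {
    // Two or more adjacent separators ('.' or '-') cannot be recovered from the
    // underscore-only environment variable name, so such keys are rejected.
    private static final Pattern CONSECUTIVE_SEPARATORS = Pattern.compile("[.-]{2,}");

    /** Returns true if the key contains no run of adjacent periods and/or dashes. */
    static boolean hasNoConsecutiveSeparators(String configKey) {
        return !CONSECUTIVE_SEPARATORS.matcher(configKey).find();
    }

    public static void main(String[] args) {
        System.out.println(hasNoConsecutiveSeparators("key.a-b")); // prints true
        System.out.println(hasNoConsecutiveSeparators("key..a"));  // prints false
    }
}
```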

Implementation Notes

This proposal affects two code paths:

  1. In GlobalConfiguration, the environment is parsed following the procedure above. Values obtained this way override any entries from the configuration file.
  2. In Configuration, looking up a key in the internal data structure is changed to become case-insensitive.
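
How the two code paths interact can be sketched with plain maps standing in for the configuration file and the environment. This is not the actual GlobalConfiguration or Configuration code; the class name is hypothetical, and using a TreeMap with String.CASE_INSENSITIVE_ORDER is just one possible way to obtain a case-insensitive lookup.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of overlaying the environment onto file configuration; not Flink code. */
final class EnvOverlay {
    static final String PREFIX = "FLINK_CONFIG_";

    static Map<String, String> apply(Map<String, String> fileConfig, Map<String, String> env) {
        // Case-insensitive key comparison, so "key.a-b" addresses the entry "key.A-b".
        Map<String, String> merged = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        merged.putAll(fileConfig);
        for (Map.Entry<String, String> entry : env.entrySet()) {
            String name = entry.getKey();
            if (!name.startsWith(PREFIX)) {
                continue; // unrelated environment variable
            }
            String key = name.substring(PREFIX.length())
                    .replace("__", "-")
                    .replace("_", ".")
                    .toLowerCase(Locale.ROOT);
            merged.put(key, entry.getValue()); // environment overrides the file
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> merged = apply(
                Map.of("key.A-b", "File"),
                Map.of("FLINK_CONFIG_KEY_A__B", "Environment=A"));
        System.out.println(merged.get("key.A-b")); // prints Environment=A
    }
}
```

In practice the second argument would be System.getenv(); passing it in as a parameter keeps the sketch easy to unit test.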

Compatibility, Deprecation, and Migration Plan

No impact on existing users is expected. Hypothetically, an environment that already contains a variable with a matching name would see a change in behavior. However, we consider this risk to be negligibly low due to the chosen naming schema.

Test Plan

These changes can be covered entirely through unit tests against Configuration and GlobalConfiguration.

Rejected Alternatives

Substitution

Initially, we also discussed a substitution solution where users modify their Flink configuration to use an environment variable substitution syntax such as 

...

  1. It changes the syntax of the configuration and would require additional details to be added to the syntax, i.e. to define default/fallback values and to escape variables so that they are not replaced.
  2. It requires the introduction of a new set of environment variables users have to memorize.
  3. If users want full flexibility to override any value, a configuration file would have to be maintained which simply maps all keys to some environment variable.


Lazy Evaluation

We initially proposed a naming schema for environment variables which closely follows how Spring does it, which includes several different alternatives per configuration key. However, this approach requires either complete knowledge of all configuration keys upfront, or lazy evaluation of the environment when a configuration key is looked up. During the discussion it was decided that neither approach seems favorable or very feasible.