Discussion thread: https://lists.apache.org/thread/gv0ro4jsq4o206wg5gz9z5cww15qkvb9
Vote thread: https://lists.apache.org/thread/7xfc5pwlcs29vr4c8bm7kbx3lq8n7r4m
JIRA: FLINK-32775
Release: -

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

In a YARN environment, the current handling of classpath configuration through the yarn.provided.lib.dirs property has a limitation. The yarn.ship-files property adds specific files to the classpath by including their parent directories, which is useful when resources such as hive-site.xml need to be accessible to the application. Files supplied through yarn.provided.lib.dirs do not get the same treatment: when the application calls Thread.currentThread().getContextClassLoader().getResource("hive-site.xml"), the resource cannot be found because its parent directory is not included in Flink's classpath. As a result, the application cannot access the required configuration files.
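The lookup behavior described above can be reproduced outside Flink with plain JDK classes. The following is a minimal sketch, not Flink code: the class name ResourceLookupDemo, the helper isVisible, and the temporary directories are illustrative assumptions. It shows that a class loader resolves hive-site.xml only when the file's parent directory is one of its classpath entries.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: this class is not part of Flink.
public class ResourceLookupDemo {

    /** Returns true if a class loader built from the given directories can see the resource. */
    public static boolean isVisible(String resourceName, Path... classpathDirs) throws IOException {
        URL[] urls = new URL[classpathDirs.length];
        for (int i = 0; i < classpathDirs.length; i++) {
            // For an existing directory, toUri() yields a URI ending in '/',
            // which URLClassLoader treats as a directory entry.
            urls[i] = classpathDirs[i].toUri().toURL();
        }
        // A null parent keeps the application classpath out of the lookup.
        try (URLClassLoader loader = new URLClassLoader(urls, null)) {
            return loader.getResource(resourceName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical layout: one directory holding hive-site.xml, one unrelated directory.
        Path confDir = Files.createTempDirectory("conf");
        Files.write(confDir.resolve("hive-site.xml"),
                "<configuration/>".getBytes(StandardCharsets.UTF_8));
        Path otherDir = Files.createTempDirectory("other");

        // Parent directory not on the classpath: the resource is not found.
        System.out.println(isVisible("hive-site.xml", otherDir)); // false
        // Parent directory on the classpath: the resource is found.
        System.out.println(isVisible("hive-site.xml", confDir));  // true
    }
}
```

Adding a file's parent directory as a classpath entry is what makes a getResource call by bare file name succeed.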

To address this limitation, this proposal suggests extending the current approach. The goal is to also include the parent directory of files specified in yarn.provided.lib.dirs in the classpath. This extension will ensure that the resources contained within these directories are accessible within the application's classpath, allowing access to the required configuration files. This change will not modify the inclusion of JAR files designated in yarn.provided.lib.dirs, which will continue to be added to the classpath as-is.

Public Interfaces

This proposal introduces no new public interfaces and no programming interface changes. However, the behavior of yarn.provided.lib.dirs will change: for non-JAR resource files, the parent directory will be included in the classpath. JAR files will continue to be added to the classpath as they are.

Proposed Changes

The proposed change involves enhancing the classpath configuration mechanism when using the yarn.provided.lib.dirs property. Specifically, the current method by which yarn.ship-files adds the parent directories of specific files to the classpath will be extended to also include the parent directories of non-JAR files designated in yarn.provided.lib.dirs. This fixes the issue where resources required for application configuration are inaccessible because their parent directories are not included in Flink's classpath. JAR files designated in yarn.provided.lib.dirs will still be added to the classpath as they are, without modification.

This FLIP also addresses a concern about treating all files other than the Flink dist JAR and plugin files as resource files. That approach might be considered excessive and could lead to resource overloading. A more accurate and explicit alternative is to follow a pattern similar to the handling of plugins: introduce a reserved directory named resources, and explicitly treat all files placed within it as resources. This offers a structured and controlled way to manage resources while avoiding potential overloading.
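The classpath rule described above can be sketched as follows. This is an illustration under stated assumptions, not Flink's implementation: the class ProvidedLibClasspathSketch, both method names, and the example file names are hypothetical, and paths are treated as plain '/'-separated strings. The first method applies the basic rule (JARs as-is, parent directory for everything else); the second applies the stricter variant in which only files inside a reserved directory named resources are treated as resources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the actual Flink YARN classpath code.
public class ProvidedLibClasspathSketch {

    /** JAR files are added as-is; any other file contributes its parent directory once. */
    public static List<String> classpathEntries(List<String> providedFiles) {
        Set<String> entries = new LinkedHashSet<>(); // keeps order, drops duplicates
        for (String file : providedFiles) {
            entries.add(file.endsWith(".jar") ? file : parentOf(file));
        }
        return new ArrayList<>(entries);
    }

    /** Variant: non-JAR files count only if they sit directly in a directory named "resources". */
    public static List<String> classpathEntriesReservedDir(List<String> providedFiles) {
        Set<String> entries = new LinkedHashSet<>();
        for (String file : providedFiles) {
            String parent = parentOf(file);
            if (file.endsWith(".jar")) {
                entries.add(file);
            } else if (parent.equals("resources") || parent.endsWith("/resources")) {
                entries.add(parent);
            }
        }
        return new ArrayList<>(entries);
    }

    private static String parentOf(String file) {
        int slash = file.lastIndexOf('/');
        if (slash < 0) {
            return ".";
        }
        return slash == 0 ? "/" : file.substring(0, slash);
    }

    public static void main(String[] args) {
        // Example file names are made up for illustration.
        List<String> files = Arrays.asList(
                "lib/example-udf.jar",
                "resources/hive-site.xml",
                "resources/core-site.xml",
                "misc/notes.txt");

        System.out.println(classpathEntries(files));
        // [lib/example-udf.jar, resources, misc]
        System.out.println(classpathEntriesReservedDir(files));
        // [lib/example-udf.jar, resources]
    }
}
```

With the basic rule, every directory that holds a non-JAR file ends up on the classpath. The reserved-directory variant limits that to directories explicitly named resources, which is the more controlled behavior the paragraph above argues for.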

Compatibility, Deprecation, and Migration Plan

  • No impact on existing users.
  • The proposed change does not introduce new configuration parameters.
  • No special migration tools are required.
  • The change is an enhancement to the existing behavior.

Test Plan

  • Unit Test: New unit tests will be introduced to verify that the parent directories of non-JAR files specified in yarn.provided.lib.dirs are added to the classpath, along with the JAR files themselves.
  • Manual Test: Add the following configuration to flink-conf.yaml and submit a Flink job.

    yarn.provided.lib.dirs: hdfs:///user/argoyaltestuser/flinkTest/mapreduce/;hdfs:///user/argoyaltestuser/flinkTest/lib/;hdfs:///user/argoyaltestuser/flinkTest/plugins/;hdfs:///user/argoyaltestuser/flinkTest/opt/;hdfs:///user/argoyaltestuser/resources/;


Rejected Alternatives

None. No other ways to accomplish the same goal were identified.