...

Proposed Changes

We will provide a new config loader interface, which will be used by the AM to fetch the config directly. This simplifies job submission to Yarn, which no longer involves any complex logic. The AM will invoke the config loader to fetch the job config, perform planning, generate the DAG and persist the final config back to the coordinator stream.

The job runner will only submit the job to Yarn with the provided submission-related configs. These configs include:

  • configs directly related to job submission, such as yarn.package.path, job.name etc.
  • configs needed by the config loader on the AM to fetch the config, such as the path to the property file in the tarball.
  • configs that we would like to override.
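The division of work described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed Samza API: the shape of the `ConfigLoader` interface, the `PropertiesConfigLoader` and `AmConfigSketch` classes, and the rule that submission configs take precedence over loaded ones are all hypothetical, and a plain `Map<String, String>` stands in for Samza's config type.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical interface: the AM calls this to obtain the job config.
interface ConfigLoader {
    Map<String, String> getConfig();
}

// Hypothetical loader that parses Java properties from a Reader.
// In a real job the Reader would be opened on the property file in the tarball.
final class PropertiesConfigLoader implements ConfigLoader {
    private final Reader source;

    PropertiesConfigLoader(Reader source) {
        this.source = source;
    }

    @Override
    public Map<String, String> getConfig() {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> config = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            config.put(name, props.getProperty(name));
        }
        return config;
    }
}

final class AmConfigSketch {
    // Loads the config through the loader, then applies the submission
    // configs on top (assumption: submission configs act as overrides).
    // Planning, DAG generation and persisting to the coordinator stream
    // would follow this step on the AM and are not shown here.
    static Map<String, String> resolve(ConfigLoader loader,
                                       Map<String, String> submissionConfigs) {
        Map<String, String> full = new TreeMap<>(loader.getConfig());
        full.putAll(submissionConfigs);
        return full;
    }
}
```

With this shape, the runner never needs to read the property file itself; it only forwards the small set of submission configs, and the AM resolves the rest.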

In addition to this, we will take this opportunity to revamp the Samza job start-up approach as well, such that all job-submission-related configs will be provided with --config. This is consistent with other stream processing projects, such as Flink, Spark and Dataflow.
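To illustrate how a start-up script could collect these flags, the sketch below parses repeated `--config key=value` arguments into a map. The `key=value` form, the `ConfigArgsSketch` class and the choice to ignore other arguments are assumptions made for illustration; the actual runner may parse its command line differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ConfigArgsSketch {
    // Collects every "--config key=value" pair from the command line.
    // Arguments that are not a --config flag are ignored in this sketch.
    static Map<String, String> parse(String[] args) {
        Map<String, String> configs = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (!"--config".equals(args[i])) {
                continue;
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("--config requires a key=value argument");
            }
            String pair = args[++i];
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + pair);
            }
            // Split on the first '=' only, so values may themselves contain '='.
            configs.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return configs;
    }
}
```

For example, `--config job.name=my-job --config job.config.loader.class=com.example.MyLoader` would yield a two-entry map; both values here are placeholders.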

Public Interfaces

The following job config will be introduced to configure the loader class used on the AM to fetch the config:

  • job.config.loader.class

...

All the configs provided in the start-up script will be passed to the AM through an environment variable and used by the designated config loader to load the complete config.
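One way to carry a set of configs in a single environment variable is to serialize the map to a string on the runner side and parse it back on the AM side. The sketch below uses the Java properties text format for this. The encoding, the `SubmissionConfigEnvSketch` class and the variable name `SUBMISSION_CONFIG` are assumptions for illustration only; the proposal does not specify them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

final class SubmissionConfigEnvSketch {
    // Hypothetical environment variable name.
    static final String ENV_NAME = "SUBMISSION_CONFIG";

    // Runner side: turn the submission configs into one string value.
    static String encode(Map<String, String> configs) {
        Properties props = new Properties();
        props.putAll(configs);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // AM side: parse the string read from the environment back into a map.
    static Map<String, String> decode(String encoded) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(encoded));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> configs = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            configs.put(name, props.getProperty(name));
        }
        return configs;
    }
}
```

In this sketch the runner would place `encode(configs)` under `ENV_NAME` in the environment of the AM container, and the AM would call `decode(System.getenv(SubmissionConfigEnvSketch.ENV_NAME))` before instantiating the class named by job.config.loader.class.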

...