...

  • configs directly related to job submission, such as yarn.package.path and job.name.
  • configs that the config loader on the AM needs in order to fetch the config, such as the path to the properties file in the tarball.
  • configs that users would like to override.

As this changes how the runner starts a job, we will take this opportunity to revamp the Samza job start-up approach as well, such that all job submission related configs will be provided with --config. This is consistent with other stream processing projects, such as Flink, Spark and Dataflow.

We will force users to update how they start their Samza jobs.

Public Interfaces

The following job config will be introduced to configure the loader class that the AM uses to fetch the config:

...

All configs provided in the start-up script will be passed to the AM through an environment variable. The AM then uses the designated config loader to load the complete config. Configs provided by the start-up script override those read by the loader.
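
The override rule can be illustrated with a minimal sketch. It assumes both config sources are available as plain string maps. The class ConfigOverrideSketch and its merge method are hypothetical names used for illustration only and are not part of Samza.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Samza: illustrates override precedence only.
class ConfigOverrideSketch {
    // Returns a new map holding the loader's configs with the start-up script's
    // configs applied on top, so start-up values win on key conflicts.
    static Map<String, String> merge(Map<String, String> loaded, Map<String, String> submitted) {
        Map<String, String> merged = new LinkedHashMap<>(loaded);
        merged.putAll(submitted);
        return merged;
    }

    public static void main(String[] args) {
        // Configs as they might be read by the loader (illustrative values).
        Map<String, String> loaded = new LinkedHashMap<>();
        loaded.put("job.name", "name-from-properties-file");
        loaded.put("task.class", "example.MyTask");

        // Configs as they might be passed with --config at start-up.
        Map<String, String> submitted = new LinkedHashMap<>();
        submitted.put("job.name", "wikipedia-stats");

        System.out.println(merge(loaded, submitted));
    }
}
```

In this sketch neither input map is modified, and keys that appear only in the loader's config are kept unchanged.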

The full list of configs can be found in References#Complete list of job submission configs.

...

Code Block
deploy/samza/bin/run-app.sh \
  --config job.name=wikipedia-stats \
  --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
  --config yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-dist.tar.gz \
  --config job.config.loader.class=org.apache.samza.config.loader.PropertiesConfigLoader \
  --config job.config.loader.path=/__package/config/wikipedia-feed.properties
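
As a rough illustration of how repeated --config key=value arguments like the ones above could be collected into a map, here is a minimal sketch. The class ConfigArgsSketch and its parse method are hypothetical. This is not Samza's actual command-line parser, which may behave differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for illustration; Samza's own command-line handling may differ.
class ConfigArgsSketch {
    // Collects every "--config key=value" pair into an ordered map.
    static Map<String, String> parse(String[] args) {
        Map<String, String> configs = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (!"--config".equals(args[i])) {
                continue; // ignore anything that is not a --config flag
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("--config requires a key=value argument");
            }
            String pair = args[++i];
            // Split on the first '=' only, so values may themselves contain '='.
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + pair);
            }
            configs.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return configs;
    }

    public static void main(String[] args) {
        String[] example = {
            "--config", "job.name=wikipedia-stats",
            "--config", "job.config.loader.path=/__package/config/wikipedia-feed.properties"
        };
        System.out.println(parse(example));
    }
}
```

Because the sketch splits on the first = only, everything after it, including any further = characters, becomes part of the value.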

Rejected Alternatives

The above approach requires existing users to update the way they start a Samza job. Alternatively, we may keep the ability for the runner to read from a local config, and the AM will load the config again using the loader.

...