...

Current state: [ UNDER DISCUSSION | ACCEPTED | REJECTED  ]

Discussion thread: <link to mailing list DISCUSS thread>

...

We will provide a pluggable config retrieval interface on the AM. This simplifies job submission to YARN, since the submitter no longer needs any complex logic. The AM, in turn, will read the job config using the provided config loader, perform planning, generate the DAG, and persist the final config back to the coordinator stream.

We will also change the start-up script, run-app.sh, so that it no longer reads local config files. In addition, we will take this opportunity to revamp the Samza job start-up approach so that all configs related to job submission must be provided explicitly with --config. This is consistent with other stream processing projects, such as Flink, Spark and Dataflow.

Public Interfaces

We will introduce one new job config that specifies the config loader class the AM uses to fetch the job config:

  • job.config.loader.class
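To make the role of job.config.loader.class concrete, here is a minimal sketch of how an AM could use it. The names ConfigLoaderSketch, ConfigLoader, PropertiesFileLoader and loadFullConfig are hypothetical and are not the interface proposed here or any existing Samza API. The sketch assumes the loader has a no-argument constructor, reads the file named by config.path, and that explicit --config values override values loaded from the file.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class ConfigLoaderSketch {

  /** Hypothetical loader contract; not the actual Samza interface. */
  public interface ConfigLoader {
    Map<String, String> getConfig(Map<String, String> submissionConfig) throws IOException;
  }

  /** Hypothetical loader that reads the .properties file named by config.path. */
  public static class PropertiesFileLoader implements ConfigLoader {
    @Override
    public Map<String, String> getConfig(Map<String, String> submissionConfig) throws IOException {
      String path = submissionConfig.get("config.path");
      if (path == null) {
        throw new IllegalArgumentException("config.path is required");
      }
      Properties props = new Properties();
      try (Reader reader = Files.newBufferedReader(Paths.get(path))) {
        props.load(reader);
      }
      Map<String, String> loaded = new HashMap<>();
      for (String name : props.stringPropertyNames()) {
        loaded.put(name, props.getProperty(name));
      }
      return loaded;
    }
  }

  /** Instantiates the loader named by job.config.loader.class and builds the full config. */
  public static Map<String, String> loadFullConfig(Map<String, String> submissionConfig)
      throws Exception {
    String loaderClass = submissionConfig.get("job.config.loader.class");
    if (loaderClass == null) {
      throw new IllegalArgumentException("job.config.loader.class is required");
    }
    // Assumption: the loader class exposes a public no-argument constructor.
    ConfigLoader loader =
        (ConfigLoader) Class.forName(loaderClass).getDeclaredConstructor().newInstance();
    Map<String, String> full = new HashMap<>(loader.getConfig(submissionConfig));
    // Assumption: values passed explicitly at submission win over loaded values.
    full.putAll(submissionConfig);
    return full;
  }
}
```

Because the loader is selected by class name, the submitter only needs to pass the loader class and whatever the loader itself requires (here, a path), while the AM does the actual reading.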

These changes are backward incompatible, as we will be removing the use of --config-factory and --config-path in the start-up script, run-app.sh. Instead, we will require the configurations related to job submission, such as the job name and the YARN package path, to be passed explicitly using --config.

...

Code Block
deploy/samza/bin/run-app.sh \
  --config job.name=wikipedia-stats \
  --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
  --config yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-dist.tar.gz \
  --config job.config.loader.class=org.apache.samza.config.loader.PropertiesConfigLoader \
  --config config.path=/__package/config/wikipedia-feed.properties
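As an illustration of how repeated --config key=value arguments like those in the command above could be collected into a job config, here is a minimal sketch. ConfigArgsSketch and its parse method are hypothetical and do not describe how run-app.sh or Samza's command-line handling is actually implemented. The sketch splits each pair on the first "=" only, so values that themselves contain "=" are preserved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConfigArgsSketch {

  /** Collects every "--config key=value" pair from args, in order of appearance. */
  public static Map<String, String> parse(String[] args) {
    Map<String, String> config = new LinkedHashMap<>();
    for (int i = 0; i < args.length; i++) {
      if (!"--config".equals(args[i])) {
        continue; // ignore anything that is not a --config flag
      }
      if (i + 1 >= args.length) {
        throw new IllegalArgumentException("--config requires a key=value argument");
      }
      String pair = args[++i];
      int eq = pair.indexOf('='); // split on the first '=' only
      if (eq <= 0) {
        throw new IllegalArgumentException("Expected key=value but got: " + pair);
      }
      config.put(pair.substring(0, eq), pair.substring(eq + 1));
    }
    return config;
  }
}
```

If the same key is passed more than once, the last value wins in this sketch; whether that is the desired behavior is a design choice left open here.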

Alternatives

The above approach requires existing users to change the way they start a Samza job. Alternatively, we may keep the ability for the runner to read from a local config file, and the AM will then load the config again using the loader.

Take wikipedia-feed in Hello Samza as an example:

Code Block
deploy/samza/bin/run-app.sh \
  --config job.config.loader.class=org.apache.samza.config.loader.PropertiesConfigLoader \
  --config local.config.path=/config/wikipedia-feed.properties \
  --config config.path=/__package/config/wikipedia-feed.properties

We need to either provide multiple config paths so that PropertiesConfigLoader can load the file from the corresponding path, or implement PropertiesConfigLoader in a way that works on both the runner and the AM with a single path.

This approach is not favored: it does provide a certain level of ease of migration, but it introduces much more complex config-handling logic and thus more confusion.

Implementation and Test Plan

...