...

The above approach requires existing users to change how they start a Samza job. Alternatively, we may keep the ability for the runner to read from a local config, and the AM will load the config again with the loader.

Option 1 - Coexist ConfigFactory and ConfigLoader

ConfigFactory will be used to read configs during startup, providing the startup configs as it does today.

ConfigLoader will be used on the AM to fetch the complete configs that the job needs to run.

This option is rejected because having both interfaces coexist creates confusion about when to use each. In addition, reading configs multiple times introduces extra complexity in the workflow.

Option 2 - Launch aware ConfigLoader

ConfigLoader takes in a signal that tells it whether it is being invoked on the runner or on the AM, and fetches configs accordingly based on the input properties. For example, when the input config path is "/config/wikipedia-feed.properties", ConfigLoader will read from "/config/wikipedia-feed.properties" on the runner and from "/__package/config/wikipedia-feed.properties" on the AM, as all Samza job tarballs are unzipped under the "__package" folder.

This approach is rejected because the assumption it relies on is too rigid and leaves little flexibility. In addition, the implementation of ConfigLoader would depend on how a Samza job is deployed, whereas the two should be independent and completely decoupled.
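
To make the coupling in Option 2 concrete, the path mapping can be sketched as follows. This is only an illustration under stated assumptions: the class LaunchAwarePathResolver, the LaunchSite enum, and the resolve method are hypothetical names and are not part of Samza. The "__package" folder name comes from the example above.

```java
// Hypothetical helper, not a Samza API: illustrates the Option 2 path mapping.
final class LaunchAwarePathResolver {

  // Folder under which the job tarball is unzipped on the AM (from the example above).
  static final String PACKAGE_DIR = "__package";

  // The "signal" that tells the loader where it is being invoked.
  enum LaunchSite { RUNNER, AM }

  // Returns the path the loader would read from for the given launch site.
  static String resolve(String configPath, LaunchSite site) {
    if (site == LaunchSite.RUNNER) {
      // On the runner the input path is used unchanged.
      return configPath;
    }
    // On the AM the same path is re-rooted under the unpacked package folder.
    String relative = configPath.startsWith("/") ? configPath.substring(1) : configPath;
    return "/" + PACKAGE_DIR + "/" + relative;
  }
}
```

Note that the loader has to hard-code the layout of the deployed package, which is exactly the dependency on deployment that leads to this option being rejected.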

Option 3 - Launch aware ConfigLoader with additive properties

ConfigLoader takes in a signal that tells it whether it is being invoked on the runner or on the AM, and fetches the corresponding configs from the input properties. Take wikipedia-feed in Hello Samza as an example:

Code Block
deploy/samza/bin/run-app.sh \
  --config job.config.loader.class=org.apache.samza.config.loader.PropertiesConfigLoader \
  --config job.config.loader.properties.local.path=/config/wikipedia-feed.properties \
  --config job.config.loader.properties.remote.path=/__package/config/wikipedia-feed.properties

ConfigLoader will use "job.config.loader.properties.local.path" when running on the runner and "job.config.loader.properties.remote.path" when running on the AM.

This approach is rejected because it places excessive responsibility on users to configure multiple properties. In addition, the implementation of ConfigLoader would depend on how a Samza job is deployed, whereas the two should be independent and completely decoupled.
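
For illustration, the property selection in Option 3 could look roughly like the sketch below. The class AdditivePathSelector and the selectPath method are hypothetical names and are not part of Samza; only the two property keys are taken from the example above.

```java
import java.util.Map;

// Hypothetical helper, not a Samza API: illustrates the Option 3 property selection.
final class AdditivePathSelector {

  // Property keys taken from the example above.
  static final String LOCAL_PATH_KEY = "job.config.loader.properties.local.path";
  static final String REMOTE_PATH_KEY = "job.config.loader.properties.remote.path";

  // Picks the config file path depending on whether the loader runs on the AM.
  static String selectPath(Map<String, String> props, boolean onAm) {
    String key = onAm ? REMOTE_PATH_KEY : LOCAL_PATH_KEY;
    String path = props.get(key);
    if (path == null) {
      // Users must supply both properties, which is the burden noted above.
      throw new IllegalArgumentException("Missing required property: " + key);
    }
    return path;
  }
}
```

With both properties from the run-app.sh example supplied, selectPath returns the local path on the runner and the remote path on the AM. Omitting either one fails at the corresponding stage.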

Implementation and Test Plan

...