Released:

Problem

Samza Yarn follows a multi-stage deployment model. The Job Runner, which runs on the submission host, reads the configuration, performs planning, and persists the config in the coordinator stream before submitting the job to the Yarn cluster. In Yarn, the Application Master (AM) then reads the config from the coordinator stream before spinning up containers to execute the job. This split of responsibility between the Job Runner and the AM is operationally confusing, and it makes the pipeline difficult to debug because there are multiple points of failure. In addition, because planning invokes user code, the runner usually requires isolation to guard the framework against malicious user code; otherwise, a malicious user can gain access to other users' jobs running on the same runner.

...

The full list of configs can be found in References, under "Complete list of job submission configs".

Take wikipedia-feed in Hello Samza as an example:

deploy/samza/bin/run-app.sh \
  --config job.name=wikipedia-stats \
  --config job.factory.class=org.apache.samza.job.yarn.YarnJobFactory \
  --config yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-dist.tar.gz \
  --config job.config.loader.class=org.apache.samza.config.loader.PropertiesConfigLoader \
  --config config.path=/__package/config/wikipedia-feed.properties
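
Each `--config` argument in the command above is a single `key=value` pair. The sketch below is not part of Samza: `split_override` is a hypothetical helper, written on the assumption that a pair is split at its first `=`, which can be used to sanity-check overrides before submitting a job.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Samza): split a key=value override
# at the first '=' and print "key<TAB>value".
split_override() {
  local pair="$1"
  # Reject arguments that contain no '=' at all.
  if [[ "$pair" != *=* ]]; then
    echo "not a key=value pair: $pair" >&2
    return 1
  fi
  local key="${pair%%=*}"   # everything before the first '='
  local value="${pair#*=}"  # everything after the first '='
  printf '%s\t%s\n' "$key" "$value"
}

split_override "job.name=wikipedia-stats"
split_override "config.path=/__package/config/wikipedia-feed.properties"
```

Under this assumption, any extra `=` after the first one ends up at the start of the value, so a doubled `=` would silently change the configured value rather than fail.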

...
