...

Problem

Samza on Yarn follows a split, multi-stage deployment model. The Job Runner, which runs on the submission host, reads the configuration, performs planning, and persists the config in the coordinator stream before submitting the job to the Yarn cluster. In Yarn, the Application Master (AM) reads the config from the coordinator stream before spinning up containers to execute the job. This split of responsibility between the Job Runner and the AM is operationally confusing and makes debugging the pipeline difficult. In addition, because planning invokes user code, the runner must be isolated, from a security perspective, to guard the framework against malicious user code. Finally, since the config file is already packed in the tarball submitted to Yarn, it would be simpler for the AM to pick up the config locally.

...