Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Current state: [ UNDER DISCUSSION ]

Discussion thread: <link to mailing list DISCUSS thread> [Discuss] SEP-23: Simplify Job Runner

JIRASAMZA-2405

Released: 

...

Samza Yarn follows a multi-stage deployment model, where Job Runner, which runs on the submission host, reads configuration, performs planning and persist config in the coordinator stream before submitting the job to Yarn cluster. In Yarn, Application Master (AM) reads config from coordinator stream before spinning up containers to execute. Split of responsibility between job runner and AM is operationally confusing, and makes debugging the pipeline difficult with multiple points of failure. In addition, since planning invokes user code, it usually requires isolation on the runner from security perspective to guard the framework from malicious user code, or a malicious user can gain access to other user jobs running on the same runner. 

Proposed Changes

We will provide a new config loader interface, which will be used by AM to fetch config directly. AM will invoke config loader to fetch config, perform planning, generate DAG and persist the final config back to coordinator stream for backward compatibility.In the proposed workflow, job runner will simplify job runner to only submit the Samza job to Yarn with the provided submission related configs, without invoking other complex logic such as config retrieval, planning, DAG generation, coordinator stream persisting etc. 

These configs include

  • configs directly related to job submission, such as yarn.package.path, job.name etc.
  • configs needed by the config loader on AM to fetch config from, such as path to the property file in the tarball, all of such configs will have a job.config.loader.properties prefix.
  • configs that users would like to override.

As this changes how the runner starts a job, we will take this opportunity to revamp Samza job start up approach as well, such that we In addition, these configs can only be supplied by --config, job runner will not read configs from local file anymore. This is in favor because we don't need to maintain the old launch workflow and we can eliminate the need to read configs multiple times. Instead, all job submission related configs will provided with --config. In addition, this is This is also consistent with other stream processing projectsframeworks, such as Flink, Spark and Dataflow.

In order to simplify job runner, we need to move the config retrieval logic from runner to AM, which is a prerequisite of planning, DAG generation. We will achieve this by providing a new config loader interface, which will be used by AM to fetch config directly. AM will invoke config loader to fetch config, perform planning, generate DAG and persist the final config back to coordinator stream for backward compatibility. This suggests that AM may need extra CPU and/or memory compared to existing flow. 

We will force users to update how they start their Samza jobs.

...