...

JobModel is the data model that logically represents a Samza job. In the JobModel hierarchy, a Samza job has one or more containers, and each container has one or more tasks. Each data model contains relevant information, such as an id, partition information, etc. In the standalone deployment model, JobModel is stored in ZooKeeper. In the YARN deployment model, JobModel is stored in the coordinator stream (a Kafka topic).
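The job, container, task hierarchy described above can be pictured with a minimal Java sketch. The type names (`TaskSketch`, `ContainerSketch`, `JobSketch`) and their fields are hypothetical simplifications chosen for illustration; they are not Samza's actual classes and carry only an id and partition information.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-ins for illustration; not Samza's actual classes.
// A task carries a name and the input partitions it is responsible for.
record TaskSketch(String taskName, Set<Integer> partitions) {}

// A container has an id and one or more tasks, keyed by task name.
record ContainerSketch(String containerId, Map<String, TaskSketch> tasks) {}

// A job has one or more containers, keyed by container id.
record JobSketch(Map<String, ContainerSketch> containers) {
    // Total number of tasks across all containers in the job.
    int taskCount() {
        return containers.values().stream().mapToInt(c -> c.tasks().size()).sum();
    }
}

class JobModelHierarchyExample {
    // Builds a job with two containers holding three tasks in total.
    static JobSketch build() {
        TaskSketch t0 = new TaskSketch("Partition 0", Set.of(0));
        TaskSketch t1 = new TaskSketch("Partition 1", Set.of(1));
        TaskSketch t2 = new TaskSketch("Partition 2", Set.of(2));
        ContainerSketch c0 = new ContainerSketch("0", Map.of(t0.taskName(), t0, t1.taskName(), t1));
        ContainerSketch c1 = new ContainerSketch("1", Map.of(t2.taskName(), t2));
        return new JobSketch(Map.of(c0.containerId(), c0, c1.containerId(), c1));
    }

    public static void main(String[] args) {
        JobSketch job = build();
        System.out.println("containers=" + job.containers().size() + ", tasks=" + job.taskCount());
    }
}
```

Keying containers and tasks by id in maps is one possible choice here; it makes lookups by processor or task name direct, which is the access pattern the rest of this page relies on.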

In the existing implementation, host affinity in YARN is accomplished through the following two phases:
A. In the YARN deployment model, the ApplicationMaster (JobCoordinator) generates the JobModel (the optimal processor-to-task assignment) and saves it in the coordinator stream (a Kafka topic).
B. The ContainerAllocator phase (which happens after JobModel generation) requests physical hosts (resources) from the cluster manager to run the processors in the JobModel.
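The ordering of the two phases can be sketched as follows. This is an illustrative outline only: the class and method names, the round-robin task assignment, the `lastKnownHost` map, and the `ANY_HOST` fallback value are all assumptions made for the example, not the actual ApplicationMaster or ContainerAllocator logic. The point it shows is that the processor-to-task assignment is fixed in phase A, before any host is requested in phase B.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two phases; not Samza's actual implementation.
class HostAffinityPhasesSketch {
    // Placeholder used when no preferred host is known (an assumption for this sketch).
    static final String ANY_HOST = "ANY_HOST";

    // Phase A: decide which tasks each processor runs.
    // Round-robin is used here purely for illustration.
    static Map<String, List<String>> generateJobModel(List<String> tasks, int processorCount) {
        Map<String, List<String>> model = new LinkedHashMap<>();
        for (int i = 0; i < processorCount; i++) {
            model.put(String.valueOf(i), new ArrayList<>());
        }
        for (int i = 0; i < tasks.size(); i++) {
            model.get(String.valueOf(i % processorCount)).add(tasks.get(i));
        }
        return model;
    }

    // Phase B: build one host request per processor in the already generated model,
    // preferring the host the processor last ran on when one is recorded.
    static Map<String, String> requestHosts(Map<String, List<String>> model,
                                            Map<String, String> lastKnownHost) {
        Map<String, String> requests = new LinkedHashMap<>();
        for (String processorId : model.keySet()) {
            requests.put(processorId, lastKnownHost.getOrDefault(processorId, ANY_HOST));
        }
        return requests;
    }

    public static void main(String[] args) {
        Map<String, List<String>> model =
            generateJobModel(List.of("task-0", "task-1", "task-2"), 2);
        Map<String, String> requests = requestHosts(model, Map.of("0", "host-a"));
        System.out.println("model=" + model);
        System.out.println("requests=" + requests);
    }
}
```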

...

Here is the list of notable differences in processor and JobModel generation semantics between the YARN and standalone deployment models:

...