...

JobModel is the data model that represents a Samza job. It is hierarchical: a job has containers, and each container has tasks. Each level of the model carries its own information, such as an id and partition information. In standalone deployments the JobModel is stored in ZooKeeper; in YARN deployments it is stored in the coordinator stream (a Kafka topic).
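
To make the hierarchy concrete, here is a minimal sketch of a job that holds containers, which in turn hold tasks. The class names (JobModelSketch, Container, Task) and their fields are hypothetical, simplified stand-ins chosen for illustration; they are not Samza's actual classes, and storage in ZooKeeper or the coordinator stream is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the job -> container -> task hierarchy.
// Not Samza's actual JobModel API.
final class JobModelSketch {

    /** A task: identified by name and bound to a set of input partitions. */
    static final class Task {
        final String taskName;
        final List<Integer> partitions;

        Task(String taskName, List<Integer> partitions) {
            this.taskName = taskName;
            // Defensive copy so the model stays immutable after construction.
            this.partitions = Collections.unmodifiableList(new ArrayList<>(partitions));
        }
    }

    /** A container: identified by id, holds the tasks assigned to it. */
    static final class Container {
        final String id;
        final Map<String, Task> tasks = new LinkedHashMap<>();

        Container(String id) {
            this.id = id;
        }

        Container addTask(Task task) {
            tasks.put(task.taskName, task);
            return this;
        }
    }

    /** The job is the root of the hierarchy: containers keyed by id. */
    final Map<String, Container> containers = new LinkedHashMap<>();

    JobModelSketch addContainer(Container container) {
        containers.put(container.id, container);
        return this;
    }

    /** Total number of tasks across all containers. */
    int taskCount() {
        int count = 0;
        for (Container container : containers.values()) {
            count += container.tasks.size();
        }
        return count;
    }
}
```

A job with two containers, the first holding two tasks and the second holding one, would report three tasks from taskCount().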

In the existing implementation, host affinity in YARN is accomplished in two phases:
A. The ApplicationMaster (JobCoordinator) in the YARN deployment model generates the JobModel (the optimal processor-to-task assignment) and saves it in the coordinator stream (a Kafka topic).
B. The ContainerAllocator phase, which runs after JobModel generation, requests physical hosts (resources) from the cluster manager to execute the processors in the JobModel.
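
The ordering of the two phases can be sketched as follows. This is an illustration under stated assumptions, not the actual JobCoordinator or ContainerAllocator code: the class and method names are hypothetical, round-robin stands in for whatever assignment strategy the job uses, and the "ANY_HOST" placeholder and the lastKnownHosts map are invented for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two phases; not Samza's actual implementation.
final class TwoPhaseSketch {

    /**
     * Phase A: build the processor-to-task assignment.
     * Round-robin is an assumption made for this example only.
     */
    static Map<String, List<String>> generateJobModel(List<String> tasks, int processorCount) {
        Map<String, List<String>> assignment = new LinkedHashMap<>();
        for (int p = 0; p < processorCount; p++) {
            assignment.put("processor-" + p, new ArrayList<>());
        }
        for (int i = 0; i < tasks.size(); i++) {
            assignment.get("processor-" + (i % processorCount)).add(tasks.get(i));
        }
        return assignment;
    }

    /**
     * Phase B: runs after the model is fixed and produces one host request
     * per processor. A processor with a known previous host asks for that
     * host; otherwise it asks for any host (placeholder value "ANY_HOST").
     */
    static Map<String, String> requestHosts(Map<String, List<String>> jobModel,
                                            Map<String, String> lastKnownHosts) {
        Map<String, String> requests = new LinkedHashMap<>();
        for (String processorId : jobModel.keySet()) {
            requests.put(processorId, lastKnownHosts.getOrDefault(processorId, "ANY_HOST"));
        }
        return requests;
    }
}
```

Note that generateJobModel takes no host information as input: in this sketch, as in the two-phase flow described above, the task assignment is already decided by the time hosts are requested.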

...