...

JobModel is the data model in Samza that logically represents a Samza job. The hierarchy is as follows: a Samza job has one or more containers (ContainerModel), and each container has one or more tasks (TaskModel). Each data model carries the relevant information, such as its logical id and partition information. The existing host affinity implementation in YARN is accomplished through the following two phases:

  • The ApplicationMaster (JobCoordinator) in the YARN deployment model generates the JobModel (the optimal processor-to-task assignment) and persists it in the coordinator stream (a Kafka topic) associated with the Samza job.

...

  • The ContainerAllocator phase (which happens after JobModel generation) schedules each processor to run on a physical host by coordinating with the underlying ClusterManager, and orchestrates the execution of the processor.
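The hierarchy and the two phases described above can be illustrated with a small sketch. This is not Samza code: the classes below are simplified stand-ins that only borrow the names JobModel, ContainerModel and TaskModel, and the functions `generate_job_model` and `allocate_hosts` are hypothetical. The round-robin grouping and the "prefer the last known host, otherwise take any free host" rule are assumptions chosen for illustration, and the coordinator stream and ClusterManager are not modelled at all.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaskModel:
    """Simplified stand-in: a task and the input partitions it owns."""
    task_name: str
    partitions: List[int]


@dataclass
class ContainerModel:
    """Simplified stand-in: a container (processor) holding one or more tasks."""
    container_id: str
    tasks: Dict[str, TaskModel] = field(default_factory=dict)


@dataclass
class JobModel:
    """Simplified stand-in: a job holding one or more containers."""
    containers: Dict[str, ContainerModel]


def generate_job_model(partitions: List[int], container_count: int) -> JobModel:
    """Phase 1 (hypothetical): one task per partition, grouped round-robin."""
    containers = {str(i): ContainerModel(str(i)) for i in range(container_count)}
    for index, partition in enumerate(sorted(partitions)):
        container = containers[str(index % container_count)]
        name = f"Partition {partition}"
        container.tasks[name] = TaskModel(name, [partition])
    return JobModel(containers)


def allocate_hosts(job_model: JobModel,
                   last_known_host: Dict[str, str],
                   available_hosts: List[str]) -> Dict[str, str]:
    """Phase 2 (hypothetical): place each container on a physical host."""
    free = list(available_hosts)
    placement: Dict[str, str] = {}
    # First pass: honour the previously used host when it is still free.
    for container_id in sorted(job_model.containers):
        preferred = last_known_host.get(container_id)
        if preferred in free:
            free.remove(preferred)
            placement[container_id] = preferred
    # Second pass: remaining containers take any free host.
    for container_id in sorted(job_model.containers):
        if container_id not in placement:
            if not free:
                raise RuntimeError(f"no host left for container {container_id}")
            placement[container_id] = free.pop(0)
    return placement


job_model = generate_job_model(partitions=[0, 1, 2, 3], container_count=2)
placement = allocate_hosts(job_model, {"0": "host-b"}, ["host-a", "host-b"])
```

Keeping the two functions separate mirrors the ordering in the text: the task-to-container grouping is fixed first, and only afterwards is each container mapped to a host.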

Here is a list of the important differences in processor and JobModel generation semantics between the YARN and standalone deployment models:

...