...

  1. In the embedded Samza library model, users are expected to manually garbage-collect unused local state stores on nodes (to reduce the disk footprint).
  2. Monitoring and handling increases or decreases in the number of input stream partitions of a stateful job is outside the scope of this feature.

Proposed Changes

JobModel is the data model that represents a Samza job. It is hierarchical: a job has containers, and each container has tasks. Each level of the model carries relevant information, such as an id and partition information. In standalone deployments the JobModel is stored in ZooKeeper; in YARN deployments it is stored in the coordinator stream (a Kafka topic).
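
The job, container, task hierarchy described above can be pictured with a minimal sketch. The class and field names below (JobSketch, ContainerSketch, TaskSketch, inputPartitions) are hypothetical and chosen only for illustration; they are not Samza's actual classes, and the sketch leaves out how the model is written to ZooKeeper or the coordinator stream.

```java
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins for the hierarchy; not the real Samza classes.
final class TaskSketch {
    final String taskName;                // id of the task
    final List<Integer> inputPartitions;  // partitions this task consumes (assumed shape)

    TaskSketch(String taskName, List<Integer> inputPartitions) {
        this.taskName = taskName;
        this.inputPartitions = inputPartitions;
    }
}

final class ContainerSketch {
    final String containerId;             // id of the container
    final Map<String, TaskSketch> tasks;  // task name -> task

    ContainerSketch(String containerId, Map<String, TaskSketch> tasks) {
        this.containerId = containerId;
        this.tasks = tasks;
    }
}

final class JobSketch {
    final Map<String, ContainerSketch> containers;  // container id -> container

    JobSketch(Map<String, ContainerSketch> containers) {
        this.containers = containers;
    }

    // Total number of tasks across all containers in the job.
    int taskCount() {
        int total = 0;
        for (ContainerSketch c : containers.values()) {
            total += c.tasks.size();
        }
        return total;
    }
}

class JobModelSketchDemo {
    // Builds a job with one container that owns two tasks, one per input partition.
    static JobSketch example() {
        TaskSketch t0 = new TaskSketch("Partition 0", List.of(0));
        TaskSketch t1 = new TaskSketch("Partition 1", List.of(1));
        ContainerSketch c0 =
            new ContainerSketch("0", Map.of(t0.taskName, t0, t1.taskName, t1));
        return new JobSketch(Map.of(c0.containerId, c0));
    }

    public static void main(String[] args) {
        JobSketch job = example();
        System.out.println("containers=" + job.containers.size()
            + ", tasks=" + job.taskCount());
    }
}
```

Keying containers and tasks by id in this sketch makes it easy to look up where a given task is placed, which is the kind of information a stateful job needs when deciding task-to-container assignments.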

...