...

The solution proposed in this doc keeps the task number unchanged and uses a proper partition-to-task assignment to make sure the Samza output is correct after partition expansion. An alternative solution is to allow the task number to increase after partition expansion and use a proper task-to-container assignment to make sure the Samza output is correct. The second solution, which allows task expansion, is needed in order to scale up the performance of Samza. It would also allow partition expansion for stateful jobs that don't use join operations on co-partitioned streams. However, the second solution is much more complicated to design and implement than the solution proposed in this doc, and it doesn't enable partition expansion for stateful Samza jobs that use join operations on co-partitioned streams (see the Rejected Alternative section), a case that this proposal does address. Thus, the two solutions don't replace each other and can be designed independently. We plan to use the first solution, described in this doc, to enable partition expansion as low-hanging fruit. Task expansion is out of the scope of this proposal and will be addressed in a future SEP.
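To make the idea of a fixed task number with a partition-to-task assignment more concrete, here is a minimal sketch in Python. It is not taken from this proposal: it assumes that the producer picks a partition as `hash(key) % partition_count`, that the partition count only grows to a multiple of its initial value, and that a modulo mapping is used for the assignment. The function names `partition_for_key`, `task_for_partition`, and `build_assignment` are hypothetical and are not Samza APIs.

```python
from collections import defaultdict


def partition_for_key(key_hash: int, partition_count: int) -> int:
    """Assumed producer behavior: partition = hash(key) mod partition count."""
    return key_hash % partition_count


def task_for_partition(partition: int, initial_partition_count: int) -> int:
    """Hypothetical assignment: the task number stays fixed at the initial
    partition count, and each partition maps to a task by modulo."""
    return partition % initial_partition_count


def build_assignment(partition_count: int, initial_partition_count: int) -> dict:
    """Group all current partitions by the task that would consume them."""
    assignment = defaultdict(list)
    for partition in range(partition_count):
        task = task_for_partition(partition, initial_partition_count)
        assignment[task].append(partition)
    return dict(assignment)


def key_stays_on_same_task(key_hash: int, initial_count: int, expanded_count: int) -> bool:
    """Check that a key reaches the same task before and after expansion."""
    before = task_for_partition(partition_for_key(key_hash, initial_count), initial_count)
    after = task_for_partition(partition_for_key(key_hash, expanded_count), initial_count)
    return before == after


if __name__ == "__main__":
    # Expanding from 4 to 8 partitions while keeping 4 tasks.
    print(build_assignment(8, 4))
    print(all(key_stays_on_same_task(h, 4, 8) for h in range(1000)))
```

Under these assumptions, every key keeps arriving at the task that already holds its state, because `(h % 8) % 4` equals `h % 4`. The property relies on the expanded count being a multiple of the initial count; expanding from 4 to 6 partitions, for example, would break it.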

...