...

  1. find relevant state
    1. what are the requirements? can we use any distribution, as long as step (3) (finding the correct channels) can still be satisfied?
  2. filter out records/buffers
    1. after scaling out, a state file can contain records that are irrelevant to the subtask reading it
  3. load data into the correct channels (InputChannel/SubPartition)
    1. this should resolve MBR issues
    2. solution: task IDs?
  4. ensure ordering
    1. solution: epochs?
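
The steps above can be sketched as follows. This is only an illustration of how filtering, channel assignment and ordering might fit together, not the actual design: the `SpilledBuffer` record, its `task_id`, `channel`, `epoch` and `seq` fields, and the `restore_channels` helper are all hypothetical names, and it assumes that each spilled buffer is tagged with the task ID of its consumer and with the epoch in which it was written.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SpilledBuffer:
    # All fields are assumptions made for this sketch.
    task_id: int    # subtask that should consume the buffer after rescaling
    channel: str    # identifier of the target InputChannel/SubPartition
    epoch: int      # epoch in which the buffer was spilled
    seq: int        # position of the buffer within its epoch
    payload: bytes


def restore_channels(state_file, my_task_id):
    """Return {channel: [buffers in replay order]} for one subtask."""
    per_channel = defaultdict(list)
    for buf in state_file:
        # Step 2: drop records that belong to other subtasks.
        if buf.task_id != my_task_id:
            continue
        # Step 3: route the buffer to its channel.
        per_channel[buf.channel].append(buf)
    # Step 4: within each channel, replay older epochs first.
    for buffers in per_channel.values():
        buffers.sort(key=lambda b: (b.epoch, b.seq))
    return dict(per_channel)
```

Step 1 (finding the relevant state files) is left out; the sketch assumes the caller already passes in the records of every state file that might contain data for the subtask.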

Open questions

Avoiding double processing downstream (when spilling continuously)

...