...
- i.e., if we have some channel state on upstream, and the channel state on downstream corresponds to it (as ensured by step 4), then it’s enough to sort them by channel id ([src_id : dst_id]); this will ensure the order of records
- there is a (technical?) problem with the proposed solution to match multi-buffer records: we need to alternate between loading channel state and processing network operations, which can be tricky
- another solution would be to have temporary "virtual" channels to process data from "imported" channels
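The sorting idea above can be sketched as follows. This is only an illustration under stated assumptions: `ChannelOrdering`, `ChannelState`, and `sortByChannelId` are hypothetical names that do not come from the proposal or from any existing API, and the state payload is reduced to a string. It shows only the ordering by [src_id : dst_id], not the loading of the state.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ChannelOrdering {

    // Hypothetical holder for the persisted state of one channel.
    // srcId/dstId stand for the [src_id : dst_id] pair; payload stands for its buffers.
    public static final class ChannelState {
        public final int srcId;
        public final int dstId;
        public final String payload;

        public ChannelState(int srcId, int dstId, String payload) {
            this.srcId = srcId;
            this.dstId = dstId;
            this.payload = payload;
        }
    }

    // Order by source id first, then by destination id.
    private static final Comparator<ChannelState> BY_CHANNEL_ID =
            Comparator.comparingInt((ChannelState s) -> s.srcId)
                    .thenComparingInt(s -> s.dstId);

    // Returns a sorted copy; the input list is left untouched.
    public static List<ChannelState> sortByChannelId(List<ChannelState> states) {
        List<ChannelState> sorted = new ArrayList<>(states);
        sorted.sort(BY_CHANNEL_ID);
        return sorted;
    }

    public static void main(String[] args) {
        List<ChannelState> states = new ArrayList<>();
        states.add(new ChannelState(1, 0, "c"));
        states.add(new ChannelState(0, 1, "b"));
        states.add(new ChannelState(0, 0, "a"));

        // Print the channel ids in the order in which the states would be processed.
        for (ChannelState s : sortByChannelId(states)) {
            System.out.println(s.srcId + ":" + s.dstId + " -> " + s.payload);
        }
    }
}
```

Because both sides apply the same deterministic comparator, upstream and downstream would visit corresponding channel states in the same order, assuming the correspondence between them already holds.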
...
Reading/writing using the existing mechanisms (versus custom storage):
- Don’t duplicate provider-agnostic storage code
- Don’t duplicate configuration code
- Don’t duplicate local recovery code
- Ease deployment by having single storage for operator and channel state
- Avoid inconsistencies between operator and channel state (e.g. different retention times)
- Possibility to reuse incremental checkpoints; the alternatives avoid storing buffers twice in other ways, but probably at the cost of more complex cleanup
Disadvantages:
- Less flexibility
- Risk of breaking snapshotting
- Probably more difficult to implement some parts
- Increased checkpoint size
...
The following optimizations won't necessarily be implemented in the MVP:
- not writing the same buffer multiple times (when the job is backpressured); can be addressed by implementing incremental checkpointing for FS backend
- incremental loading and processing of state
- no additional memory: ideally, existing network buffers should be reused
...