...
| Phase | Goals | Prerequisites |
| --- | --- | --- |
| Phase 1 | Explicit cache, i.e. Table.cache() | |
| Phase 1 | Explicit cache removal, i.e. Table.invalidateCache() | RPC in TaskManager to remove result partitions. |
| Phase 1 | Locality for default intermediate result storage. | Support of locality hint from ShuffleMaster |
| Phase 2 | Pluggable external intermediate result storage. | |
| Phase 3 | Implicit automatic cache support, i.e. cache at shuffle boundaries. | Old result partition eviction mechanism for ShuffleMaster / ShuffleService |
| Long Term | Locality in general for external intermediate result storage and external shuffle service. | Custom locality preference mechanism in Flink |
| Long Term | Support of sophisticated optimization (use or not use cache) | Statistics on the intermediate results. |
| Long Term | Cross-application intermediate result sharing. | External catalog service |
...