...

Pros: If we have N records and M buckets, the time complexity for each incoming record is (N/M) + M, and there are at most M writes to the underlying store. Optimizing for time complexity alone gives M = sqrt(N), for overall O(sqrt(N)) per record (although there are other considerations – see below)
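As a quick illustration of why M = sqrt(N) falls out of this cost model, a short sketch (the cost function and the value of N are illustrative, not taken from any implementation):

```python
import math

def per_record_cost(n, m):
    # Cost model from the text: scan ~n/m records within one bucket,
    # plus combine the m running bucket aggregates.
    return n / m + m

N = 10_000
best_m = min(range(1, N + 1), key=lambda m: per_record_cost(N, m))
print(best_m, math.isqrt(N))  # prints "100 100": the minimizing M is sqrt(N)
```

Minimizing f(M) = N/M + M analytically (f'(M) = -N/M^2 + 1 = 0) gives the same M = sqrt(N), with a per-record cost of 2*sqrt(N).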

Cons: Additional space complexity (N + M storage is needed). If the running aggregates are held in memory, restore time is increased; if they are held in the same RocksDB instance as the window store, we increase the size of and number of writes to that store, with performance implications; if they are held in a separate RocksDB instance, there is additional memory overhead (relative to reusing the window store's RocksDB)

Determining 'M': M should be chosen carefully, considering the relative expense of aggregations and writes to the underlying store – the appropriate value for M will depend, for example, on whether the underlying store is RocksDB or in-memory. We may also consider optimizations such as loading the first (oldest) bucket into memory for quicker reads, in which case M must be large enough that N/M values fit in memory.
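The bucketing scheme described above can be sketched as follows. This is a minimal, hypothetical illustration only – the class name, the fixed records-per-bucket sizing, and the use of a sum aggregate are all assumptions for the sketch, not the design's actual API:

```python
from collections import deque

class BucketedSum:
    """Hypothetical sketch: keep M running bucket aggregates so that
    computing the overall aggregate costs O(M), not O(N)."""

    def __init__(self, num_buckets, records_per_bucket):
        self.records_per_bucket = records_per_bucket
        # Each entry is (count, running_sum); the deque drops the oldest
        # bucket automatically once num_buckets is exceeded.
        self.buckets = deque(maxlen=num_buckets)

    def add(self, value):
        # Start a new bucket when there is none or the newest one is full.
        if not self.buckets or self.buckets[-1][0] == self.records_per_bucket:
            self.buckets.append((0, 0))
        count, running_sum = self.buckets[-1]
        self.buckets[-1] = (count + 1, running_sum + value)

    def total(self):
        # Combining the M running aggregates is O(M), independent of N.
        return sum(s for _, s in self.buckets)

agg = BucketedSum(num_buckets=4, records_per_bucket=3)
for v in range(1, 13):  # 12 records exactly fill 4 buckets of 3
    agg.add(v)
print(agg.total())  # prints 78 (= 1 + 2 + ... + 12)
```

Note that eviction here happens a whole bucket at a time: the 13th record drops the oldest bucket's aggregate (1 + 2 + 3 = 6) in one step, which is the coarse-grained expiry behaviour a bucketed store trades for its O(M) reads.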

Compatibility, Deprecation, and Migration Plan

...