...

Here is an illustration of how the index might look within a single bucket, before and after compaction.

[Images: index layout within a single bucket, before and after compaction]

Here is the same illustration for cloud stores like S3

[Images: the same index layout on cloud stores like S3]


This structure gives us several benefits. Async compaction is already battle tested, so we can reuse it with minimal changes; the only difference is that it operates on embedded HFiles instead of data files. With this layout, it is easy to reason about rollbacks and commits. We also get file system views similar to those of a hoodie partition: for instance, fetching new index entries after a given commit time, fetching the delta between two commit timestamps, point-in-time queries, and so on.
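
To make these views concrete, here is a minimal in-memory sketch of a single bucket. It is only an illustration under stated assumptions, not the proposed implementation: `IndexBucket`, `IndexFile`, the `(partition_path, file_id)` location tuple, and the zero-padded string commit times are all hypothetical names and shapes, and plain dictionaries stand in for the embedded HFiles.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical shape of an index value: (partition_path, file_id).
Location = Tuple[str, str]


@dataclass
class IndexFile:
    """Stand-in for one embedded HFile, tagged with the commit that wrote it."""
    commit_time: str  # assumed to be zero-padded so string order == time order
    entries: Dict[str, Location]


@dataclass
class IndexBucket:
    """One index bucket: a list of index files ordered by commit time."""
    files: List[IndexFile] = field(default_factory=list)

    def commit(self, commit_time: str, entries: Dict[str, Location]) -> None:
        # Each commit adds a new file; existing files are never modified.
        self.files.append(IndexFile(commit_time, dict(entries)))
        self.files.sort(key=lambda f: f.commit_time)

    def rollback(self, commit_time: str) -> None:
        # Rolling back a commit is just dropping the files it wrote.
        self.files = [f for f in self.files if f.commit_time != commit_time]

    def delta_between(self, start: str, end: Optional[str]) -> Dict[str, Location]:
        # Entries written by commits in (start, end]; later commits win.
        merged: Dict[str, Location] = {}
        for f in self.files:
            if f.commit_time > start and (end is None or f.commit_time <= end):
                merged.update(f.entries)
        return merged

    def entries_after(self, commit_time: str) -> Dict[str, Location]:
        # New index entries written after the given commit time.
        return self.delta_between(commit_time, None)

    def as_of(self, commit_time: str) -> Dict[str, Location]:
        # Point-in-time view: everything committed up to and including commit_time.
        return self.delta_between("", commit_time)

    def compact(self) -> None:
        # Merge all files into a single file stamped with the latest commit time.
        if not self.files:
            return
        merged: Dict[str, Location] = {}
        for f in self.files:
            merged.update(f.entries)
        self.files = [IndexFile(self.files[-1].commit_time, merged)]
```

In this sketch a rollback only removes files, and compaction only rewrites them, so the latest view of the bucket is the same before and after compaction. Note that this simplified `compact` discards per-commit granularity for the commits it merges, so incremental queries over already compacted commits would need additional bookkeeping.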

...