Motivation

For some datasets and applications (like Cloudberry), it is desirable that all disk components of the primary index and of all secondary indexes of a dataset align on the same filter value boundaries. The benefit is that when a tuple is found in some component di of a secondary index, we can directly search the corresponding component di' of the primary index to fetch that tuple, without checking any other disk components.

Current Workflow of Flush/Merge

Currently, the flush operation works as follows. After a transaction (insert/delete/upsert) commits, if any memory component of any index of a dataset needs to be flushed (i.e., is full), the primary index operation tracker submits a flush request for all indexes of the dataset to the LSMIOOperationScheduler. All indexes of a dataset are therefore flushed together, so the disk components newly produced by a flush are always aligned.
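The flush workflow above can be illustrated with a minimal Python sketch. It is not the actual implementation: the `Index` and `Scheduler` classes, the `after_commit` method, the capacity check, and the shared component id are all hypothetical names introduced here only to show why flushing every index together keeps the new disk components aligned.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    """Hypothetical stand-in for one LSM index of a dataset."""
    name: str
    capacity: int  # number of entries the memory component can hold
    memory: list = field(default_factory=list)
    disk_components: list = field(default_factory=list)

    def is_full(self) -> bool:
        return len(self.memory) >= self.capacity


class Scheduler:
    """Hypothetical stand-in for LSMIOOperationScheduler; runs flushes inline."""

    def flush(self, index: Index, component_id: int) -> None:
        # Turn the memory component into a new disk component.
        index.disk_components.append((component_id, tuple(index.memory)))
        index.memory.clear()


class PrimaryIndexOperationTracker:
    """Sketch of the tracker: one full index triggers a flush of all indexes."""

    def __init__(self, indexes, scheduler):
        self.indexes = indexes
        self.scheduler = scheduler
        self.next_component_id = 0  # assumed id, used only to show alignment

    def after_commit(self) -> bool:
        """Called after a transaction commits; returns True if a flush ran."""
        if not any(ix.is_full() for ix in self.indexes):
            return False
        component_id = self.next_component_id
        self.next_component_id += 1
        for ix in self.indexes:
            self.scheduler.flush(ix, component_id)
        return True


# Usage: the primary index fills up first, yet both indexes are flushed.
primary = Index("primary", capacity=2)
secondary = Index("secondary", capacity=100)
tracker = PrimaryIndexOperationTracker([primary, secondary], Scheduler())
for key in ("a", "b"):
    primary.memory.append(key)
    secondary.memory.append(key)
    tracker.after_commit()
```

In this sketch the secondary index is far from full when the flush happens, but it still receives a disk component with the same id as the primary index, which is the alignment property described above.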

For merge, whenever a new disk component is added to an index (by a flush or a merge), the corresponding merge policy is notified. The merge policy examines the existing disk components of that index and, if it decides that some of them need to be merged, submits a merge request to the LSMIOOperationScheduler. By default, merge requests are sent for each index independently. However, there is currently a CorrelatedPrefixPolicy, which checks only the disk components of the primary index and, when the primary index needs to be merged, sends corresponding merge requests for all secondary indexes together.
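The difference between the default per-index behavior and the correlated behavior can be sketched as follows. This is an illustration under stated assumptions, not the real policy code: the function names, the `Scheduler` class, and the representation of disk components as shared integer ids are hypothetical, and `pick_components` uses a placeholder threshold rule rather than the actual prefix policy logic.

```python
class Scheduler:
    """Hypothetical stand-in for LSMIOOperationScheduler; only records requests."""

    def __init__(self):
        self.merge_requests = []

    def schedule_merge(self, index_name, components):
        self.merge_requests.append((index_name, tuple(components)))


def pick_components(components, max_components):
    """Placeholder rule (an assumption): merge everything once the number of
    disk components exceeds max_components."""
    return list(components) if len(components) > max_components else []


def independent_policy(index_name, components, scheduler, max_components=3):
    """Default behavior: each index is inspected and merged on its own."""
    to_merge = pick_components(components, max_components)
    if to_merge:
        scheduler.schedule_merge(index_name, to_merge)


def correlated_policy(primary_name, dataset, scheduler, max_components=3):
    """Correlated behavior: only the primary index is inspected, and the
    corresponding components are merged in every index of the dataset."""
    to_merge = pick_components(dataset[primary_name], max_components)
    if to_merge:
        for index_name in dataset:
            scheduler.schedule_merge(index_name, to_merge)


# Usage: disk components are identified by ids shared across indexes.
dataset = {"primary": [0, 1, 2, 3], "secondary": [0, 1, 2, 3]}
correlated_scheduler = Scheduler()
correlated_policy("primary", dataset, correlated_scheduler)

independent_scheduler = Scheduler()
independent_policy("secondary", [0, 1], independent_scheduler)
```

With the correlated variant, a single decision on the primary index yields one merge request per index over the same component ids, so the merged components stay aligned. With the independent variant, each index may merge at a different time and over a different set of components.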

Correlated Merge Policy

Proposed Solution

...

Deprecated