
AsterixDB supports a number of merge policies based on full merges [1]: disk components are ordered by recency, and participating components are always merged in their entirety.


Concurrent Merge Policy (default)

The concurrent merge policy is the default merge policy on the current master branch. It has four important parameters:

  • minMergeComponentCount: the minimum number of disk components per merge
  • maxMergeComponentCount: the maximum number of disk components per merge
  • maxComponentCount: the maximum number of disk components allowed before flushes are stopped
  • sizeRatio: explained below

Suppose we have a list of disk components D1, D2, ..., DN ordered from oldest to newest. The basic idea of this merge policy is to merge as many disk components at once as possible (without exceeding maxMergeComponentCount), while using the size ratio to control the trade-offs explained below. It triggers a merge as follows (see the sketch after this list):

  • Start with the oldest disk component D1. Consider the longest sequence of younger disk components D2, D3, ..., DK such that the merge would not exceed maxMergeComponentCount components. If the length of D2, D3, ..., DK >= minMergeComponentCount and (the total size of D2, D3, ..., DK) * sizeRatio >= the size of D1, trigger a merge of D1, D2, ..., DK.
  • If the merge cannot be triggered for D1, then repeat the above step for D2 (and so on).
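
To make this decision process concrete, here is a minimal Java sketch that follows the two bullets literally. The class name, the oldest-first list of component sizes, and the findMerge method are illustrative assumptions for this page, not AsterixDB's actual code.

    import java.util.List;

    final class ConcurrentMergeDecision {
        private final int minMergeComponentCount;
        private final int maxMergeComponentCount;
        private final double sizeRatio;

        ConcurrentMergeDecision(int minMergeComponentCount, int maxMergeComponentCount, double sizeRatio) {
            this.minMergeComponentCount = minMergeComponentCount;
            this.maxMergeComponentCount = maxMergeComponentCount;
            this.sizeRatio = sizeRatio;
        }

        // sizes: disk component sizes in bytes, ordered from oldest (index 0) to newest.
        // Returns {start, end} (inclusive indexes) of the components to merge, or null.
        int[] findMerge(List<Long> sizes) {
            for (int start = 0; start < sizes.size(); start++) {
                // Longest younger sequence such that the whole merge stays within
                // maxMergeComponentCount components.
                int end = Math.min(sizes.size() - 1, start + maxMergeComponentCount - 1);
                int youngerCount = end - start; // length of D(start+1), ..., D(end)
                long youngerTotalSize = 0;
                for (int i = start + 1; i <= end; i++) {
                    youngerTotalSize += sizes.get(i);
                }
                // The trigger condition from the bullets, checked as written.
                if (youngerCount >= minMergeComponentCount
                        && youngerTotalSize * sizeRatio >= sizes.get(start)) {
                    return new int[] { start, end }; // merge D(start), ..., D(end)
                }
                // Otherwise repeat the check with the next-oldest component.
            }
            return null; // no merge can be triggered
        }
    }

For example, with sizes of 100, 40, 30, and 20 (in MB), minMergeComponentCount = 2, maxMergeComponentCount = 10, and sizeRatio = 1.2, findMerge returns {0, 3}: the younger components total 90 MB, and 90 * 1.2 = 108 >= 100.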

The size ratio controls some interesting performance trade-offs. Because the trigger condition multiplies the total size of the younger components by the size ratio, a size ratio > 1 makes merges easier to trigger: disk components are merged more frequently and the total number of disk components is reduced, approximating a leveling merge policy [1]. A size ratio < 1 delays merges, approximating a tiering merge policy [1]: there will be more disk components, but write performance is improved.
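
For instance, under the trigger condition above with sizeRatio = 1.2, an oldest component D1 of 100 MB becomes mergeable once its younger components total at least 100 / 1.2 ≈ 84 MB; with sizeRatio = 0.8 they must total at least 125 MB, so D1 waits longer and more components accumulate.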

As the name suggests, this merge policy performs concurrent merges. The merge decision process above is refined to start from the oldest component that has no older disk component currently being merged (sketched below). For example, suppose we have D1, D2, D3, D4, D5 and D2+D3 are being merged; the next invocation of this merge policy will start looking from D4 (and then D5).
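
This refinement amounts to choosing the starting point of the scan. As a sketch, the method below could be added to the ConcurrentMergeDecision class above; mergingFlags is a hypothetical oldest-first list marking which components have an in-flight merge.

    // Returns the index of the oldest component that has no older component being
    // merged; the scan in findMerge starts from this index.
    static int firstMergeCandidate(List<Boolean> mergingFlags) {
        int start = 0;
        for (int i = 0; i < mergingFlags.size(); i++) {
            if (mergingFlags.get(i)) {
                start = i + 1; // skip everything up to the youngest merging component
            }
        }
        return start;
    }

For the example above (D1, ..., D5 with D2+D3 being merged), this returns the index of D4.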

More explanation of this process can be found in my performance stability paper [2] (Section 5.3). It should be noted that this merge policy is highly dynamic and non-deterministic. Don't be surprised if different partitions have different storage layouts when using this merge policy.

Prefix Merge Policy

The prefix merge policy is similar to the concurrent merge policy but has some important differences:

  • The prefix merge policy has an additional parameter, maxMergableComponentSize, to control the maximum disk component size. All disk components larger than this parameter are excluded from future merging (see the sketch after this list).
  • Only one merge is scheduled at a time (as opposed to concurrent merges).
  • The flow control mechanism (i.e., deciding when to stop flushes) is different: if an extra merge could be scheduled while a merge is already running, flushes are stopped.
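
As a rough illustration of the first bullet, the size cutoff can be sketched as a filter over the oldest-first component sizes; only the surviving components are considered by the merge decision. The class and method names are again illustrative, not AsterixDB's actual code.

    import java.util.ArrayList;
    import java.util.List;

    final class PrefixMergeCandidates {
        // Components larger than maxMergableComponentSize are permanently excluded
        // from merging; everything else remains a merge candidate.
        static List<Long> mergeCandidates(List<Long> oldestFirstSizes, long maxMergableComponentSize) {
            List<Long> candidates = new ArrayList<>();
            for (long size : oldestFirstSizes) {
                if (size <= maxMergableComponentSize) {
                    candidates.add(size);
                }
            }
            return candidates;
        }
    }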

In general, the prefix merge policy should not be used for update-heavy workloads because we have to keep merging to clean up obsolete records. However, this policy is helpful for temporal and append-mostly workloads (with range filters).

Constant Merge Policy

The constant merge policy has recently been improved to incorporate a very interesting theoretical study [3]. This policy is designed for append-mostly workloads and only requires a parameter, num-components, to control the total number of disk components. I won't describe the full details here (because I don't completely understand the analysis), but it has been shown that this merge policy has optimal write cost when the number of disk components is fixed and the workload is append-only.


[1] Chen Luo, Michael J. Carey. LSM-based Storage Techniques: A Survey. https://arxiv.org/abs/1812.07527

[2] Chen Luo, Michael J. Carey. On Performance Stability in LSM-based Storage Systems. https://arxiv.org/abs/1906.09667

[3] Claire Mathieu, Carl Staelin, Neal E. Young, Arman Yousefi. Bigtable Merge Compaction. https://arxiv.org/abs/1407.3008
