...

Performance numbers published by other vendors suggest that we can expect a 2x-4x decrease in required disk space and a >1.5x increase in throughput (ops/sec) on typical workloads.

Competitive Analysis

This section describes general compression approaches and their pros and cons. The following compression mechanisms are implemented in practice:

  1. Data Format 
  2. Index Prefix Compression
  3. Page Compression
  4. WAL compression
  5. Column Compression
  6. Columnar Store
  7. Row Compression with Dictionary
  8. Misc approaches

Data size can be reduced with an efficient data format. Common row metadata can be skipped. Numeric values can be encoded to occupy less space. Fixed-length strings can be trimmed. NULL and zero values can be skipped with the help of small per-row bitmaps.

Next, compression can be applied to specific columns, either manually or transparently.

Data pages can be compressed on a per-page basis, which gives a 2x-4x compression ratio on typical workloads. Depending on the concrete implementation, pages may be stored in memory in compressed or uncompressed form. File-system-specific features, such as hole punching, can be applied to reduce IO.
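
As a rough illustration of the idea, here is a minimal sketch in Java of compressing a single page and computing how many trailing file-system blocks a hole punch could release. The 16 KB page size, 4 KB block size, and zero-filled page are assumptions for illustration only; a real engine would release the blocks with an OS call such as fallocate(FALLOC_FL_PUNCH_HOLE) on Linux.

  import java.util.zip.Deflater;

  public class PageCompressionSketch {
      static final int PAGE_SIZE = 16 * 1024; // assumed page size
      static final int FS_BLOCK = 4 * 1024;   // assumed file-system block size

      public static void main(String[] args) {
          byte[] page = new byte[PAGE_SIZE]; // zero-filled page compresses very well

          // Compress the page with zlib at the fastest level.
          Deflater deflater = new Deflater(Deflater.BEST_SPEED);
          deflater.setInput(page);
          deflater.finish();
          byte[] compressed = new byte[PAGE_SIZE];
          int len = deflater.deflate(compressed);
          deflater.end();

          // The page keeps its PAGE_SIZE slot in the file, but only the first
          // ceil(len / FS_BLOCK) blocks hold data; the rest can be punched out.
          int usedBlocks = (len + FS_BLOCK - 1) / FS_BLOCK;
          int punchable = PAGE_SIZE / FS_BLOCK - usedBlocks;
          System.out.println("compressed to " + len + " bytes, " + punchable
              + " of " + (PAGE_SIZE / FS_BLOCK) + " blocks can be punched");
      }
  }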

Indexes are compressed differently, because index page structure must be preserved for fast lookups. Prefix compression is a common mechanism.

Extreme compression ratios of up to 10x-15x are possible with column store formats, but they are applicable only to historical data with a minimal update rate and are not suitable for lookups.

WAL can be compressed to decrease log size. In some cases this may also improve overall system throughput.
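
For example, here is a minimal sketch of batching WAL records and compressing them with JDK zlib before they are written out; the record contents and the two-byte length prefix are illustrative assumptions, not an actual WAL format.

  import java.io.ByteArrayOutputStream;
  import java.io.IOException;
  import java.nio.charset.StandardCharsets;
  import java.util.List;
  import java.util.zip.DeflaterOutputStream;

  public class WalBatchCompressor {
      // Compresses a batch of WAL records into a single deflate stream.
      static byte[] compressBatch(List<byte[]> records) throws IOException {
          ByteArrayOutputStream buf = new ByteArrayOutputStream();
          try (DeflaterOutputStream out = new DeflaterOutputStream(buf)) {
              for (byte[] rec : records) {
                  // Two-byte big-endian length prefix (records assumed < 64 KB),
                  // so the batch can be split back into records on recovery.
                  out.write(rec.length >>> 8);
                  out.write(rec.length & 0xFF);
                  out.write(rec);
              }
          }
          return buf.toByteArray();
      }

      public static void main(String[] args) throws IOException {
          List<byte[]> batch = List.of(
              "PAGE_UPDATE cache=42 page=1001".getBytes(StandardCharsets.UTF_8),
              "PAGE_UPDATE cache=42 page=1002".getBytes(StandardCharsets.UTF_8),
              "CHECKPOINT_MARKER".getBytes(StandardCharsets.UTF_8));

          int raw = batch.stream().mapToInt(r -> r.length).sum();
          System.out.println("raw=" + raw + " compressed=" + compressBatch(batch).length);
      }
  }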

...

Efficient disk usage starts with proper data layout. Vendors strive to place data in pages in such a way that total overhead is kept as low as possible while still maintaining high read speed. Typically this is achieved as follows:

  • Common metadata is stored outside of the data page
  • Numeric types are written using varlen encoding (e.g. the int data type may take 1-5 bytes instead of 4; a sketch follows the examples below)
  • Fixed-length string data types (CHAR, NCHAR) are trimmed
  • NULL and zero values are optimized to consume no space, typically with the help of a special bitmap

Examples:

  1. SQL Server row format [1] - varlen, CHAR trimming, NULL/zero optimization
  2. MySQL row format [2] - no varlen, no CHAR trimming, NULL/zero optimization
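
To make the varlen encoding above concrete, here is a minimal sketch of a variable-length int codec: 7 payload bits per byte, with the high bit meaning "more bytes follow" (the same scheme as protocol buffers' varints; it is not any specific vendor's row format).

  import java.io.ByteArrayOutputStream;

  public class VarIntSketch {
      // Encodes an int into 1-5 bytes, least-significant group first.
      static byte[] encode(int value) {
          ByteArrayOutputStream out = new ByteArrayOutputStream();
          while ((value & ~0x7F) != 0) {
              out.write((value & 0x7F) | 0x80); // 7 bits + continuation flag
              value >>>= 7;
          }
          out.write(value);
          return out.toByteArray();
      }

      static int decode(byte[] buf) {
          int value = 0, shift = 0;
          for (byte b : buf) {
              value |= (b & 0x7F) << shift;
              if ((b & 0x80) == 0)
                  break;
              shift += 7;
          }
          return value;
      }

      public static void main(String[] args) {
          for (int v : new int[] {1, 300, 1_000_000}) {
              byte[] enc = encode(v);
              System.out.println(v + " -> " + enc.length + " byte(s), decoded=" + decode(enc));
          }
      }
  }

Small values shrink to one or two bytes; negative values would need an extra zigzag step (omitted here) to benefit.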

...

Prefix compression could be applied to the following index types:

  • Non-unique single-column secondary indexes
  • Non-unique and unique multi-column secondary indexes

Prefix compression could be implemented as follows (the static variant is sketched after this list):

  • Static - compression is applied to all index rows irrespective of whether it is beneficial. Attributes with low cardinality compress well; conversely, attributes with high cardinality may have negative compression rates. The decision whether to compress is delegated to the user (typically a DBA)
  • Dynamic - compression is applied to indexed values on a page-by-page basis, based on internal heuristics. Negative compression rates are avoided automatically, but the implementation is more complex.
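
Here is a minimal sketch of the static variant for sorted string keys; the entry layout (shared-prefix length plus suffix) is an illustrative assumption, and a real engine would apply it within each index page.

  import java.util.ArrayList;
  import java.util.List;

  public class PrefixCompressionSketch {
      // One compressed entry: how many leading chars to reuse from the
      // previous key, plus the remaining suffix.
      record Entry(int prefixLen, String suffix) {}

      static List<Entry> compress(List<String> sortedKeys) {
          List<Entry> out = new ArrayList<>();
          String prev = "";
          for (String key : sortedKeys) {
              int p = 0;
              while (p < prev.length() && p < key.length() && prev.charAt(p) == key.charAt(p))
                  p++;
              out.add(new Entry(p, key.substring(p)));
              prev = key;
          }
          return out;
      }

      public static void main(String[] args) {
          List<String> keys = List.of("application", "applications", "apply", "banana");
          String prev = "";
          for (Entry e : compress(keys)) {
              String restored = prev.substring(0, e.prefixLen()) + e.suffix();
              System.out.println(e.prefixLen() + " + \"" + e.suffix() + "\" -> " + restored);
              prev = restored;
          }
      }
  }

A dynamic implementation would additionally compare the compressed and raw sizes of each page and keep whichever is smaller.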

Examples:

  1. Oracle index compression (static) [1]
  2. Oracle advanced index compression (dynamic) [2]
  3. MongoDB index compression [3]

...

[1] http://www.oracle.com/technetwork/database/features/availability/311358-132337.pdf
[2] https://msdn.microsoft.com/en-us/library/gg492088(v=sql.120).aspx
[3] https://docs.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-overview?view=sql-server-2017
[4] https://docs.microsoft.com/en-us/sql/relational-databases/data-compression/data-compression?view=sql-server-2017#using-columnstore-and-columnstore-archive-compression

Row Compression with Dictionary

Usually data is stored in row format, and there is a lot of overlap between values in different rows: flags, enum values, dates, or strings can have the same byte sequences repeating from row to row. It is possible to harvest a set of typical rows for a table, build an external dictionary from them, and then reuse this dictionary when writing each subsequent row. This offers only limited benefits for a classical RDBMS, since its row format is low-overhead and relies on fixed field offsets for lookups, which compression defeats. However, the BinaryObjects used in Ignite are high-overhead, with field/type information repeating in every record, and offset lookups are not used, so row compression can provide high yield at low cost. In theory it is possible to share a dictionary between nodes, but separate per-node dictionaries look more practical.
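
As a rough illustration of the approach, here is a minimal sketch using JDK zlib with a preset dictionary; the dictionary bytes and the JSON-like row are made-up placeholders, not Ignite's BinaryObject format.

  import java.nio.charset.StandardCharsets;
  import java.util.zip.Deflater;
  import java.util.zip.Inflater;

  public class DictionaryRowCompression {
      public static void main(String[] args) throws Exception {
          // Hypothetical dictionary harvested from typical rows: byte sequences
          // that repeat from row to row (field names, common values).
          byte[] dict = "\"id\":\"name\":\"status\":\"ACTIVE\",\"created\":"
              .getBytes(StandardCharsets.UTF_8);

          byte[] row = "{\"id\":42,\"name\":\"John\",\"status\":\"ACTIVE\",\"created\":\"2018-07-01\"}"
              .getBytes(StandardCharsets.UTF_8);

          // Compress the row against the preset dictionary.
          Deflater def = new Deflater(Deflater.BEST_SPEED);
          def.setDictionary(dict);
          def.setInput(row);
          def.finish();
          byte[] buf = new byte[256];
          int len = def.deflate(buf);
          def.end();
          System.out.println("raw=" + row.length + " compressed=" + len);

          // Decompression must supply the same dictionary when asked for it.
          Inflater inf = new Inflater();
          inf.setInput(buf, 0, len);
          byte[] out = new byte[256];
          int n = inf.inflate(out);
          if (n == 0 && inf.needsDictionary()) {
              inf.setDictionary(dict);
              n = inf.inflate(out);
          }
          inf.end();
          System.out.println(new String(out, 0, n, StandardCharsets.UTF_8));
      }
  }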

Advantages:

  • Easy to implement - no architectural changes
  • Reasonably fast in both writing and reading
  • 2.5x compression on mock data with a naive prototype
  • More data per page means more data fits in RAM, which reduces latency even if raw throughput is lower

Disadvantages:

  • Need to keep the dictionary alongside the data
  • Need to occasionally update the dictionary as data evolves
  • Need to keep track of multiple dictionaries per node
  • Pages from different nodes use different dictionaries, which might interfere with checkpointing

Examples:

  1. IBM DB2 supports this approach on a per-table basis [1]
  2. A prototype of dictionary-based row compression for Apache Ignite [2]

[1] https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.5.0/com.ibm.db2.luw.admin.dbobj.doc/doc/c0052331.html

[2] https://github.com/apache/ignite/pull/4295

Misc Approaches

Compressing Large Values

Large values, such as LOBs and long varchars, cannot be stored in the original data block. Some vendors compress these values and then split them into pieces.
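
A minimal sketch of the compress-then-split idea; the 8 KB chunk size and GZIP are illustrative assumptions.

  import java.io.ByteArrayOutputStream;
  import java.io.IOException;
  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.List;
  import java.util.zip.GZIPOutputStream;

  public class LobCompressSplit {
      static final int CHUNK = 8 * 1024; // assumed out-of-row chunk size

      // Compresses a large value, then splits it into fixed-size chunks
      // that can be stored out of row and chained together.
      static List<byte[]> compressAndSplit(byte[] lob) throws IOException {
          ByteArrayOutputStream buf = new ByteArrayOutputStream();
          try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
              gz.write(lob);
          }
          byte[] compressed = buf.toByteArray();

          List<byte[]> chunks = new ArrayList<>();
          for (int off = 0; off < compressed.length; off += CHUNK)
              chunks.add(Arrays.copyOfRange(compressed, off,
                  Math.min(off + CHUNK, compressed.length)));
          return chunks;
      }

      public static void main(String[] args) throws IOException {
          byte[] lob = new byte[100 * 1024]; // highly compressible placeholder LOB
          System.out.println(lob.length + " bytes -> " + compressAndSplit(lob).size() + " chunk(s)");
      }
  }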

Examples:

  1. PostgreSQL TOAST [1]
  2. Oracle Advanced LOB Compression [2] [3]

[1] https://wiki.postgresql.org/wiki/TOAST
[2] https://docs.oracle.com/database/121/ADLOB/adlob_smart.htm#ADLOB45944
[3] https://docs.microfocus.com/itom/Network_Automation:10.50/Administer/DB_Compression/ConfiguringLOB_Oracle

Compression During Data Load

Oracle attempts to compress values during data load (Direct Path, CTAS, INSERT ... APPEND) [1]. Compression is applied on a per-block basis using a dictionary approach. Oracle may decide to skip compression if there is no benefit. Alternatively, it may reorder attributes in rows to get longer common prefixes and improve the compression ratio.

[1] https://www.red-gate.com/simple-talk/sql/oracle/compression-oracle-basic-table-compression/
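
As a rough sketch of the per-block dictionary (symbol table) idea, with made-up data and layout rather than Oracle's actual block format: each distinct value is stored once per block, and rows reference it by a small index.

  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.LinkedHashMap;
  import java.util.List;
  import java.util.Map;

  public class BlockSymbolTableSketch {
      public static void main(String[] args) {
          // Rows loaded into one data block (values are made up).
          List<String[]> blockRows = List.of(
              new String[] {"US", "ACTIVE", "GOLD"},
              new String[] {"US", "ACTIVE", "SILVER"},
              new String[] {"DE", "ACTIVE", "GOLD"});

          // Per-block symbol table: each distinct value stored once.
          Map<String, Integer> symbols = new LinkedHashMap<>();
          List<int[]> encoded = new ArrayList<>();
          for (String[] row : blockRows) {
              int[] enc = new int[row.length];
              for (int i = 0; i < row.length; i++) {
                  Integer idx = symbols.get(row[i]);
                  if (idx == null) {
                      idx = symbols.size(); // next free slot in the table
                      symbols.put(row[i], idx);
                  }
                  enc[i] = idx;
              }
              encoded.add(enc);
          }

          // The block stores the symbol table once plus small indexes per row.
          System.out.println("symbol table: " + symbols.keySet());
          for (int[] r : encoded)
              System.out.println(Arrays.toString(r));
      }
  }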

Best practices

...

Proposed Changes

Some approaches add more value than others; some are hard to implement, some are easy. For this reason, compression should be implemented in phases, with the most efficient and simplest techniques developed first. The proposed plan:

Phase 1: Low Hanging Fruits

  • Index Prefix Compression - efficient and relatively easy to implement
  • WAL Compression - could increase throughput and is easy to evaluate

Phase 2: The Battle

  • Page Compression - efficient, but the implementation would be complex, with lots of changes to the storage engine
  • or Row Compression with Dictionary - no changes to the storage engine, but adds management of dictionaries

Phase 3: Excellence

  • Data Format improvements - moderate value for the system, complex to implement; the benefit may be cancelled out by actual compression
  • Column Compression - depends on the new data format

Out of Scope

The following changes are unlikely to be implemented in the near term due to their complexity and/or limited impact on general use cases:

  • Columnar store - requires large changes to the storage engine

Tickets

ASF JIRA: project = Ignite AND labels IN (iep-20) ORDER BY status