
IDIEP-20
Author: Vladimir Ozerov
Sponsor: Vladimir Ozerov
Created: 25 Apr 2018
Status: DRAFT


Motivation

Compression is used extensively by all database vendors to reduce TCO and improve performance. Compression could be applied to different parts of the system to achieve the following goals:

  • Fewer writes to disk - fewer IO calls, less disk space needed
  • Fewer reads from disk - fewer IO calls, less RAM needed to accommodate the same number of records.

Performance numbers published by other vendors suggest that we can expect a 2x-4x decrease in required disk space and a >1.5x increase in throughput (ops/sec) on typical workloads.

Competitive analysis

This section describes general compression approaches and their pros and cons. The following compression mechanisms are implemented in practice:

  1. Data format 
  2. Index prefix compression
  3. Page-level compression
  4. WAL compression
  5. Per-column compression
  6. Compression on file system level
  7. Column store

Data Format

Efficient disk usage starts with proper data layout. Vendors strive to place data in pages in such a way that total overhead is kept as low as possible while still maintaining high read speed. Typically this is achieved as follows:

  1. Common metadata is stored outside of the data page
  2. Numeric types are written using varlen encoding (e.g. the int data type may take 1-5 bytes instead of 4); see the sketch after this list
  3. Fixed-length string data types (CHAR, NCHAR) are trimmed
  4. NULL and zero values are optimized to consume no space, typically with the help of a special bitmap
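
As an illustration of item 2, below is a minimal sketch of varlen int encoding in Java (a generic LEB128-style layout - 7 data bits per byte, high bit as continuation flag - chosen for illustration; it is not the exact format of any vendor discussed here):

    import java.nio.ByteBuffer;

    /** Minimal LEB128-style varint codec: small values take 1 byte, Integer.MAX_VALUE takes 5. */
    public class VarIntCodec {
        /** Writes an int using 7 data bits per byte; the high bit marks continuation. */
        public static void writeVarInt(ByteBuffer buf, int val) {
            while ((val & ~0x7F) != 0) {
                buf.put((byte)((val & 0x7F) | 0x80));
                val >>>= 7;
            }
            buf.put((byte)val);
        }

        /** Reads a value written by writeVarInt. */
        public static int readVarInt(ByteBuffer buf) {
            int val = 0;
            int shift = 0;
            byte b;
            do {
                b = buf.get();
                val |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            return val;
        }
    }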

Examples:

  1. SQL Server row format [1] - varlen, CHAR trimming, NULL/zero optimization
  2. MySQL row format [2] - no varlen, no CHAR trimming, NULL/zero optimization

[1] https://docs.microsoft.com/en-us/sql/relational-databases/data-compression/row-compression-implementation?view=sql-server-2017
[2] https://dev.mysql.com/doc/refman/5.6/en/innodb-physical-record.html

Index Prefix Compression

Secondary indexes tend to have entries with a common prefix, e.g. {'IndexedValue', link1}, {'IndexedValue', link2}. The prefix compression technique extracts common prefixes from entries on an index page and places them in a special directory within the page.
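
A toy sketch of the idea in Java is shown below: the common prefix of the keys on a page is computed once, and each entry keeps only its suffix plus the link. Class and method names are illustrative and not taken from any of the products referenced here.

    import java.util.List;

    /** Toy illustration of prefix compression for string keys on a single index page. */
    public class PrefixCompressedPage {
        /** Longest prefix shared by all keys on the page (stored once in the page directory). */
        static String commonPrefix(List<String> sortedKeys) {
            if (sortedKeys.isEmpty())
                return "";

            // For a sorted list it is enough to compare the first and the last key.
            String first = sortedKeys.get(0);
            String last = sortedKeys.get(sortedKeys.size() - 1);

            int i = 0;
            while (i < Math.min(first.length(), last.length()) && first.charAt(i) == last.charAt(i))
                i++;

            return first.substring(0, i);
        }

        public static void main(String[] args) {
            List<String> keys = List.of("IndexedValue_0001", "IndexedValue_0002", "IndexedValue_0042");

            String prefix = commonPrefix(keys);

            // The page stores the prefix once; each entry keeps only its suffix.
            for (String key : keys)
                System.out.println(prefix + " | " + key.substring(prefix.length()));
        }
    }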

Prefix compression could be applied to the following index types:

  1. Non-unique single column secondary index
  2. Non-unique and unique multi-column secondary index

Prefix compression could be implemented as follows:

  1. Static - compression is applied to all index rows irrespective of whether it is beneficial or not. Attributes with low cardinality compress well; conversely, attributes with high cardinality may have negative compression rates. The decision whether to compress is delegated to the user (typically a DBA)
  2. Dynamic - compression is applied to indexed values on a page-by-page basis according to internal heuristics. Negative compression rates are avoided automatically, but the implementation is more complex.

Examples:

  1. Oracle index compression (static) [1]
  2. Oracle advanced index compression (dynamic) [2]
  3. MongoDB index compression [3]

[1] https://blogs.oracle.com/dbstorage/compressing-your-indexes:-index-key-compression-part-1
[2] https://blogs.oracle.com/dbstorage/compressing-your-indexes:-advanced-index-compression-part-2
[3] https://docs.mongodb.com/manual/core/wiredtiger/#storage-wiredtiger-compression

Page Compression

Whole pages could be compressed, which gives a 2x-4x reduction in size on average. Two different approaches are used in practice: without in-memory compression and with in-memory compression.

Without in-memory compression

Data is stored in memory as is, in uncompressed form. When it is time to flush data to disk, compression is applied. If the data size is reduced significantly, the data is stored in compressed form; otherwise it is stored in plain form (compression failure). Big block sizes (e.g. 32Kb) are typically used in this case to achieve higher compression rates, while data is still written to disk in blocks of smaller sizes. E.g. one may have a 32Kb block in memory which is compressed to 7Kb and then written as two 4Kb blocks to disk. Vendors allow users to select the compression algorithm (Snappy, zlib, LZ4, etc.).
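
A minimal sketch of this flush path in Java, assuming zlib via java.util.zip.Deflater and a 4Kb disk block size (both are assumptions for illustration, not the behaviour of a specific vendor):

    import java.util.zip.Deflater;

    /** Compresses a page right before flush; keeps the raw form when compression does not pay off. */
    public class PageFlushCompressor {
        private static final int DISK_BLOCK_SIZE = 4 * 1024;

        /** Returns the bytes to write to disk: compressed if that saves at least one disk block, raw otherwise. */
        public static byte[] prepareForFlush(byte[] page) {
            Deflater deflater = new Deflater();
            deflater.setInput(page);
            deflater.finish();

            byte[] out = new byte[page.length];
            int compressedLen = deflater.deflate(out);
            boolean fits = deflater.finished(); // false means the compressed form is not smaller than the page
            deflater.end();

            // Compression "failure": no whole disk block is saved, so the page is stored as is.
            if (!fits || roundUpToBlock(compressedLen) >= page.length)
                return page;

            byte[] res = new byte[roundUpToBlock(compressedLen)];
            System.arraycopy(out, 0, res, 0, compressedLen);

            return res; // E.g. a 32Kb page compressed to 7Kb is flushed as two 4Kb blocks.
        }

        private static int roundUpToBlock(int len) {
            return (len + DISK_BLOCK_SIZE - 1) / DISK_BLOCK_SIZE * DISK_BLOCK_SIZE;
        }
    }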

Page compression is not applicable to index pages because it incurs a serious slowdown on reads.

Hole punching with fallocate [1] might be added if the underlying file system supports it. In this case the compressed block is written into its original slot, and the unused space is then trimmed with a separate system call. E.g. if a 32Kb block is compressed to 7Kb, the remaining 32 - 7 = 25Kb are released.

Advantages:

  • High compression rates
  • No overhead when reading data from memory
  • Ability to choose compression algorithm

Disadvantages:

  • High RAM usage
  • Need to re-compress data frequently (e.g. on block modifications)
  • Hole-punching is supported by very few file systems (XFS, ext4, Btrfs) and may lead to heavy file maintenance [2], [7]

Examples:

  1. MySQL Table Compression - uses different in-memory and disk block sizes, block data is fully re-compressed on every access [3]
  2. MySQL Page Compression - uses hole-punching instead [4]
  3. MongoDB with Snappy codec - gathers up to 32Kb of data and then tries to compress it [5]
  4. Postgres Professional attempts to implement a similar approach - data is uncompressed when read from disk [6]

[1] http://man7.org/linux/man-pages/man2/fallocate.2.html
[2] https://mariadb.org/innodb-holepunch-compression-vs-the-filesystem-in-mariadb-10-1/
[3] https://dev.mysql.com/doc/refman/5.7/en/innodb-compression-background.html
[4] https://mysqlserverteam.com/innodb-transparent-page-compression/
[5] https://www.objectrocket.com/blog/company/mongodb-3-0-wiredtiger-compression-and-performance/
[6] https://habr.com/company/postgrespro/blog/337180/
[7] https://www.percona.com/blog/2017/11/20/innodb-page-compression/

With in-memory compression

Data is stored in memory in compressed form. Rows are initially accommodated in blocks in raw form; when a certain threshold is reached the accumulated data is compressed, and more records can then be added to the block. The original row structure may be maintained to a certain extent, so that subsequent reads do not need to decompress the data.
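
A toy sketch of such a block in Java (a deliberately simplified structure, not the actual layout of any vendor above): rows are appended in raw form, and once the raw part grows beyond a threshold it is folded into a compressed region, while recently inserted rows remain readable without decompression.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.zip.Deflater;

    /** Toy in-memory block: rows are appended raw and periodically folded into a compressed region. */
    public class InMemoryCompressedBlock {
        private static final int RAW_THRESHOLD = 4 * 1024;

        private final List<byte[]> compressedChunks = new ArrayList<>(); // older rows, compressed
        private final List<byte[]> rawRows = new ArrayList<>();          // recent rows, readable without decompression
        private int rawSize;

        public void append(byte[] row) {
            rawRows.add(row);
            rawSize += row.length;

            if (rawSize >= RAW_THRESHOLD)
                compressRawRows();
        }

        /** Folds the accumulated raw rows into one compressed chunk, freeing room for new rows. */
        private void compressRawRows() {
            byte[] raw = new byte[rawSize];
            int pos = 0;
            for (byte[] row : rawRows) {
                System.arraycopy(row, 0, raw, pos, row.length);
                pos += row.length;
            }

            Deflater deflater = new Deflater();
            deflater.setInput(raw);
            deflater.finish();

            byte[] out = new byte[raw.length];
            int len = deflater.deflate(out);
            boolean fits = deflater.finished();
            deflater.end();

            // If compression did not actually shrink the data, keep the rows raw for now.
            if (!fits || len >= raw.length)
                return;

            byte[] chunk = new byte[len];
            System.arraycopy(out, 0, chunk, 0, len);
            compressedChunks.add(chunk);

            rawRows.clear();
            rawSize = 0;
        }
    }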

Advantages:

  • Less IO reads as more data fits memory
  • Decompression may be avoided on reads in some cases

Disadvantages:

  • Lower compression rates
  • More complex algorithms

Examples:

  1. Oracle Advanced Compression [1], [2]
  2. MS SQL Page compression [3] 

[1] https://blogs.oracle.com/dbstorage/updates-in-row-compressed-tables
[2] https://blogs.oracle.com/dbstorage/advanced-row-compression-improvements-with-oracle-database-12c-release-2
[3] https://docs.microsoft.com/en-us/sql/relational-databases/data-compression/page-compression-implementation?view=sql-server-2017

WAL Compression

The write-ahead log records all data changes in a journal; the data is read back only during crash recovery. Data chunks being written to the journal may be compressed. This reduces the number of disk writes and saves WAL space, increasing the likelihood of a delta-update in case of a temporary node shutdown. Compression could be applied to specific records - the larger the record, the greater the expected savings. Compression typically takes more time than decompression, so operation latency may increase. However, a smaller number of IO calls typically increases overall system throughput, because disk is usually a scarcer resource than CPU.
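
A minimal sketch of per-record WAL compression in Java, assuming zlib and a size threshold below which records are written as is (both the codec and the 512-byte threshold are illustrative assumptions, not a description of any particular vendor):

    import java.util.zip.Deflater;

    /** Compresses large WAL records before they are appended to the log. */
    public class WalRecordCompressor {
        /** Records smaller than this are written as is: compression overhead would outweigh the savings. */
        private static final int THRESHOLD = 512;

        public static byte[] prepareRecord(byte[] payload) {
            if (payload.length < THRESHOLD)
                return payload;

            Deflater deflater = new Deflater(Deflater.BEST_SPEED); // favour write latency over compression ratio
            deflater.setInput(payload);
            deflater.finish();

            byte[] out = new byte[payload.length];
            int len = deflater.deflate(out);
            boolean fits = deflater.finished();
            deflater.end();

            // Keep the raw payload if compression did not actually shrink the record.
            if (!fits || len >= payload.length)
                return payload;

            byte[] res = new byte[len];
            System.arraycopy(out, 0, res, 0, len);

            return res;
        }
    }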

Examples:

  1. PostgreSQL
  2. MariaDB (MySQL)
  3. MongoDB

 

Draft materials

Compression options:
5) Column store
7) Per-column compression
9) WAL compression
10) New algorithms (LSM, BWTree, ...)
------------------------------
Postgres:
  • Compress full pages
  • Remove holes in the page
------------------------------
MySQL: 
3) COMPRESS/UNCOMPRESS functions
------------------------------
MariaDB
1) WAL compression - compress event data once a certain threshold is reached https://mariadb.com/kb/en/library/compressing-events-to-reduce-size-of-the-binary-log/
3) Independent Column Compression - compresses and decompresses automatically; indexes cannot be created on these columns https://mariadb.com/kb/en/library/storage-engine-independent-column-compression/
4) COMPRESS function (similar as in MySQL?) https://mariadb.com/kb/en/library/compress/
------------------------------
SQL Server
------------------------------
Oracle
1) Basic compression - compression during bulk loads only (direct load, CREATE TABLE AS SELECT)
3) Advanced LOB compression and deduplication - appears to be something similar to PG TOAST (?)
?? 6) Hybrid columnar compression (query level, archive level) - tremendous compression rates (up to 50x), 5-15x typical, "query" - improves scan performance, "archive" - for data archives
------------------------------
MongoDB

Best practices

Granularity

1) Oracle: Compression at tablespace and table levels
4) MongoDB: Configurable per-collection and per-index https://www.mongodb.com/blog/post/new-compression-options-mongodb-30

Motivation

TBD

Proposed changes

TBD

 

Tickets
