Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

IDIEP-22
AuthorVladimir Ozerov Ozerov
SponsorVladimir Ozerov Ozerov
Created15 Jun 2018
Status

Status
colourGrey
titleDRAFT


Table of Contents

Motivation

...

[1] https://dev.mysql.com/doc/refman/8.0/en/insert-optimization.html
[2] https://dev.mysql.com/doc/refman/8.0/en/load-data.html
[3] https://dev.mysql.com/doc/internals/en/bulk-insert.html

Proposed Changes

...

  1. Implement distributed cache (table) lock - when it is held no other operation on cache is possible, PME infrastructure will be used for that

  2. Implement external sort - get some memory buffer, collect there values in some order, flush to disk if needed, merge in the end
  3. Implement direct data load - write data to data blocks, re-created indexes with external sort and other necessary data structures in the end
  4. Add necessary API to control direct data load mode 

Distributed Cache Lock

Index update and integrity checks will be delayed during direct data load. In order to maintain data consistency during this time it is essential to prevent any concurrent updates. This could be done with new cache lock concept. We will initiate exchange which will wait for all cache operations to finish, and then put special flag on that cache. If flag is set, exception will be thrown in an attempt to access cache. 

External Sort

We will not update PK and secondary indexes during the data load, so it is necessary to rebuild them in the end. The most efficient way to build indexes is bottom-up approach, when the lowest level of BTree is built first, and the root is build last. We will need a buffer where indexed values and respective links will be sorted in index order. If the buffer is big enough and all the data fits into it, index will be created in one hop. Otherwise it is necessary to sort indexed values in several runs using an external sort. It is necessary to let users configure sort parameters - buffer size (ideally - in bytes), and the file system path where temp files will be stored. The latter is critical - typically users would like to keep temp files on a separate disk, so that WAL and checkpoint operations are not affected.

Direct Data Load

We will have a small in-memory buffer for several consecutive data blocks. Data being injected is put into these blocks, bypassing the page memory. When the buffer is full, we could issue a multi-buffer async disk write and continue filling the buffer with new data. As data loading typically affects several partitions, multiple buffers and/or some additional synchronization may be required. 

Data will be inserted into the new blocks only. We will have to track the start and end positions of the inserted data blocks. These positions will be used to scan new data during index rebuilding.

TBD

  • Interaction with WAL
  • Rebalance
  • Crash recovery

Tickets

Jira
serverASF JIRA
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
maximumIssues20
jqlQueryproject = Ignite AND labels IN (iep-22) ORDER BY status
serverId5aa69414-a9e9-3523-82ec-879b028fb15b