...

ID

IEP-22

Author

Vladimir Ozerov Ozerov

Sponsor

Vladimir Ozerov Ozerov

Created

15 Jun 2018

Status


colour	Grey
title	DRAFT

Table of Contents

Motivation

...

[1] https://dev.mysql.com/doc/refman/8.0/en/insert-optimization.html
[2] https://dev.mysql.com/doc/refman/8.0/en/load-data.html
[3] https://dev.mysql.com/doc/internals/en/bulk-insert.html

Proposed Changes

...

Implement distributed cache (table) lock - when it is held no other operation on cache is possible, PME infrastructure will be used for that
Implement external sort - get some memory buffer, collect there values in some order, flush to disk if needed, merge in the end
Implement direct data load - write data to data blocks, re-created indexes with external sort and other necessary data structures in the end
Add necessary API to control direct data load mode

Distributed Cache Lock

Index update and integrity checks will be delayed during direct data load. In order to maintain data consistency during this time it is essential to prevent any concurrent updates. This could be done with new cache lock concept. We will initiate exchange which will wait for all cache operations to finish, and then put special flag on that cache. If flag is set, exception will be thrown in an attempt to access cache.

External Sort

We will not update PK and secondary indexes during the data load, so it is necessary to rebuild them in the end. The most efficient way to build indexes is bottom-up approach, when the lowest level of BTree is built first, and the root is build last. We will need a buffer where indexed values and respective links will be sorted in index order. If the buffer is big enough and all the data fits into it, index will be created in one hop. Otherwise it is necessary to sort indexed values in several runs using an external sort. It is necessary to let users configure sort parameters - buffer size (ideally - in bytes), and the file system path where temp files will be stored. The latter is critical - typically users would like to keep temp files on a separate disk, so that WAL and checkpoint operations are not affected.

Direct Data Load

We will have a small in-memory buffer for several consecutive data blocks. Data being injected is put into these blocks, bypassing the page memory. When the buffer is full, we could issue a multi-buffer async disk write and continue filling the buffer with new data. As data loading typically affects several partitions, multiple buffers and/or some additional synchronization may be required.

Data will be inserted into the new blocks only. We will have to track the start and end positions of the inserted data blocks. These positions will be used to scan new data during index rebuilding.

TBD

Interaction with WAL
Rebalance
Crash recovery

Tickets

Jira

server	ASF JIRA
columns	key,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
maximumIssues	20
jqlQuery	project = Ignite AND labels IN (iep-22) ORDER BY status
serverId	5aa69414-a9e9-3523-82ec-879b028fb15b

Page tree

Versions Compared

Old Version 12

New Version Current

Key

Motivation

Proposed Changes

Distributed Cache Lock

External Sort

Direct Data Load

TBD

Tickets

Page tree

Page History

Versions Compared

Old Version 12

New Version Current

Key

Motivation

Proposed Changes

Distributed Cache Lock

External Sort

Direct Data Load

TBD

Tickets