Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

IDIEP-32
Author
Sponsor

 

Created 
Status
Status
colourGrey
titleDRAFT


Table of Contents

Motivation

Currently batch updates on page memory level are not implemented in Ignite. Internal structures, such as BPlusTree and FreeList, do not support them. The performance of batch operations (putAll, datastreamer, preloader) can be improved by implementing batch updates on page memory level.

Competitive analysis

Profiling

Profiling the current rebalancing process shows that most of the time spent on working with FreeList and B+ Tree (details).

Process overview

Currently, when updating the batch of cache entries in off-heap, each of them is processed separately. Such update in PageMemory consists of the following steps:

...

To prevent a redundant secondary search in B+ Tree the invoke operation was introduced.

Invoke in B+ Tree

Let's describe the B+ Tree in more detail to understand the need for invoke operation.

...

Code Block
OffHeapManager.invoke
                |-> BPlusTree.invoke
                |               |
                |               |
                |               |
                |               |-> Invoke.invokeClosure()
                |               |           |-> UpdateClosure.call(foundRow)
                |               |           |                   |- compare entry version
                |               |           |                   |- newRow = create newRow on data page
                |               |           |                   |- treeOp = PUT/REMOVE/NOOP
                |               |           |                  
                |               |           \-> read treeOp -> create operation closure (Put(newRow)/Remove(row))
                |               |
                |               \-> Invoke.found()/notFound() -> exec operation closure
                |
                |
                \-> finishUpdate/finishRemove

Store key-value

Saving a key-value on data page consists of the following operations:

  1. Remove a page from FreeList, which has enough storage space (if there is no such page - allocate a new one)
  2. Lock page
  3. Write cache entry data
  4. Update page counters
  5. Unlock page
  6. Based on the remaining free space insert page into FreeList

Advantages

The batch will improve:

  1. Average insertion time by reducing count of FreeList actions on each data row:
    *
    • insert/remove page into FreeList
    *
    • update page counters
    *
    • pages locking
    *
    • WAL recording
  2. Average search/update time in B+ Tree

Proposed changes

Implementation should consists of two major related improvements:

  1. Batch writing to data pages
  2. Batch updates in B+ Tree

Diagram overview

draw.io Diagram
bordertrue
viewerToolbartrue
fitWindowfalse
diagramNamebatch updates
simpleViewerfalse
width600
diagramWidth1061
revision5

Batch write to data pages

Find the most free page with enough space for data row in FreeList (if there is no such page - allocate new one) and fill it up to the end.

Batch update in B+ Tree

TBD: describe the implementation.

Proposed plan

Overall changes to support batch updates in PageMemory can be divided into following phases.

Phase 1: Batch insert new keys

  • Implement insertDataRows operation in FreeList - insert several data rows at once.
  • Implement invokeAll operation in BPlusTree: support searching and inserting range of keys.
  • Enable batch insertion in preloader (enabled by a special system property).

Phase 2: Batch update existing keys

  • InvokeAll operation should support batch update of existing keys.

Phase 3: DataStreamer support

  • Add support for batch updates in DataStreamer (isolated updater).

Phase 4: putAll support

  • Add support for batch updates in IgniteCache putAll.

Phase 5: MVCC support

  • Add support for MVCC (TRANSACTIONAL_SNAPSHOT) cache mode.

Risks and Assumptions

// Describe project risks, such as API or binary compatibility issues, major protocol changes, etc.

Discussion Links

// Links to discussions on the devlist, if applicable.

Reference Links

// Links to various reference documents, if applicable.

Tickets

Jira
serverASF JIRA
columnskey,summary,type,updated,assignee,reporter,priority,status,resolution
maximumIssues20
jqlQueryproject = Ignite AND labels IN (iep-32) order by key
serverId5aa69414-a9e9-3523-82ec-879b028fb15b