...
Let's describe the B+ Tree in more detail to understand the need for the invoke operation.
The tree keys (hashes) are stored on B+ Tree pages (index pages), while the cache key-values themselves are stored on data pages. Each item on an index page includes a link to a data page item. In general, a B+ Tree supports find, put and remove operations. For put and remove, you must first find the point of insertion/update/removal. So, a cache entry update without the invoke operation can look like this:
- Search the B+ Tree for the link to the old key-value (find).
- If the old value does not differ in length - a simple in-place update of the key-value on the data page.
- If the old value differs in length - the link to it changes:
  - Store the new key-value on a data page.
  - Put the B+ Tree key (with a "secondary" find) to update the link to the data page item.
  - Remove the old key-value from the data page.
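The update path above can be sketched as a toy model (plain Java, not Ignite code; treeOps counts tree traversals to show why a length-changing update costs a second find):

```java
import java.util.*;

// Toy model: a "tree" maps keys to links into a data page array. Updating a
// value of a different length relocates the entry, which costs a second tree
// traversal ("secondary" find) to fix the link.
public class UpdateWithoutInvoke {
    static Map<Integer, Integer> tree = new TreeMap<>(); // key -> link (slot index)
    static List<byte[]> dataPages = new ArrayList<>();   // slot index -> stored value

    static int treeOps = 0; // counts tree traversals (find/put)

    static void put(int key, byte[] value) {
        treeOps++;                          // 1) find: locate link to old key-value
        Integer link = tree.get(key);
        if (link != null && dataPages.get(link).length == value.length) {
            dataPages.set(link, value);     // 2a) same length: in-place data page update
            return;
        }
        dataPages.add(value);               // 2b) length differs: store new key-value...
        int newLink = dataPages.size() - 1;
        treeOps++;                          // ...and a "secondary" find inside the tree put
        tree.put(key, newLink);             // update link to the data page item
        if (link != null)
            dataPages.set(link, null);      // remove old key-value from the data page
    }
}
```

A same-length update costs one traversal, while a length-changing update costs two, which is exactly the overhead the invoke operation removes.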
The invoke operation uses an in-place update and has the following execution scheme:
...
- Batch writing to data pages
- Batch updates in B+ Tree
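The two batch steps above can be sketched as a single pipeline. This is a toy model with hypothetical names, not Ignite code: every row is first written to data pages and assigned a link, then all (key, link) pairs are applied to the tree in one sorted pass.

```java
import java.util.*;

// Toy sketch of the batch execution scheme: batch writing to data pages,
// followed by a batch update in the tree over sorted keys.
public class BatchInvokeSketch {
    static List<String> dataPages = new ArrayList<>();
    static NavigableMap<Integer, Integer> tree = new TreeMap<>();

    static void invokeAll(Map<Integer, String> batch) {
        // Step 1: batch writing to data pages; remember the link of every row.
        Map<Integer, Integer> links = new TreeMap<>(); // sorted keys for the tree pass
        for (Map.Entry<Integer, String> e : batch.entrySet()) {
            dataPages.add(e.getValue());
            links.put(e.getKey(), dataPages.size() - 1);
        }
        // Step 2: batch update in the tree: one ascending pass over sorted keys.
        tree.putAll(links);
    }
}
```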
Diagram overview
...
...
Batch writing to data pages
Divide the input data rows into 2 lists:
- Objects whose size is equal to or greater than the size of a single data page.
- Other objects and remainders (heads) of large objects.
Sequentially write the objects and fragments that occupy a whole page. The data page is taken from the "reuse" bucket; if there is no page in the reuse bucket, allocate a new one.
For the remaining (regular) objects, including the remainders ("heads") of large objects, find the most free page with enough space in the FreeList (allocate a new one if there is no such page) and fill it up to the end.
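The split into the two lists might look as follows. This is a sketch with hypothetical names and a fixed 4k page size, not the actual FreeList code:

```java
import java.util.*;

// Sketch of the two-list split: whole-page fragments of large rows are
// written directly, and the remainders ("heads") join the regular rows,
// which are then packed into the freest pages that still fit them.
public class BatchWriteSketch {
    static final int PAGE_SIZE = 4096;

    // Returns [number of full pages written, sizes of head/regular rows to pack].
    static List<Integer> splitRows(int[] rowSizes) {
        List<Integer> regular = new ArrayList<>();
        int fullPages = 0;
        for (int size : rowSizes) {
            fullPages += size / PAGE_SIZE; // whole-page fragments of large rows
            int head = size % PAGE_SIZE;   // remainder ("head") of a large row,
            if (head > 0)                  // or the entire row if it is small
                regular.add(head);
        }
        List<Integer> res = new ArrayList<>();
        res.add(fullPages);
        res.addAll(regular);
        return res;
    }
}
```

For example, a 10000-byte row yields two full 4096-byte pages plus a 1808-byte head that competes with the regular rows for partially filled pages.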
...
Overall, the changes to support batch updates in PageMemory can be divided into the following phases.
Phase 1:
...
Batch insertion in FreeList to improve rebalancing
- Implement the insertDataRows operation in FreeList: insert several data rows at once.
- Implement the invokeAll operation in BPlusTree: support searching for and inserting a range of keys.
- Enable batch insertion in the preloader (enabled by a special system property). The preloader should insert a batch of data rows before initializing cache entries. If a cache entry is initialized incorrectly, the preloader should roll back the changes and remove the pre-created data row.
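The preloader's insert-then-rollback flow could be sketched like this. The class and its collaborators are toy stand-ins; insertDataRows mirrors the operation name proposed above, but nothing here is the actual Ignite API:

```java
import java.util.*;
import java.util.function.*;

// Toy sketch of the Phase 1 preloader flow: data rows are inserted as one
// batch first, and any row whose cache entry fails to initialize is rolled
// back by removing the pre-created data row.
public class PreloaderSketch {
    static List<String> freeList = new ArrayList<>(); // stands in for FreeList storage

    static void insertDataRows(List<String> rows) { freeList.addAll(rows); }

    static void removeDataRow(String row) { freeList.remove(row); }

    // Insert the batch, then initialize entries; returns the number of
    // successfully initialized entries.
    static int preload(List<String> rows, Predicate<String> initEntry) {
        insertDataRows(rows);                // batch insertion before entry init
        int ok = 0;
        for (String row : rows) {
            if (initEntry.test(row))
                ok++;
            else
                removeDataRow(row);          // rollback: remove the pre-created row
        }
        return ok;
    }
}
```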
Phase 2: DataStreamer support
...
- The invokeAll operation should support batch updates of existing keys.
- Add support for batch updates in DataStreamer: insertions into the FreeList in the isolated updater (similar to the preloader).
Phase 3: putAll support
- Add support for batch updates in IgniteCache putAll.
...
- Implement batch operations in B+ tree (findAll/putAll/removeAll/invokeAll).
Examine the performance difference between the following approaches and select the best:
A. single updates (current approach)
B. sort + BPlusTree.invokeAll() + FreeList.insertDataRow
C. sort + BPlusTree.findAll + FreeList.insertDataRows + BPlusTree.putAll
Phase 4: MVCC support
- Add support for MVCC (TRANSACTIONAL_SNAPSHOT) cache mode.
Risks and Assumptions
- Concurrent updates of a cache entry are blocked until the batch operation with the same key is completed; therefore, the batch size must be strictly limited.
- Duplicate key processing (WAL rebalancing) is based on the assumption that duplicate keys are ordered according to the cache entry version (a newer version comes last). BPlusTree batch operations require ordered keys; moreover, an attempt to simultaneously lock the same keys in a different order leads to a deadlock, so batch insertion into page memory must be performed on unlocked entries. Alternatively, keys passed in batches from different components (preloader, DataStreamer, putAll) should be locked in the same order.
- Heap usage/GC pressure.
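The lock-ordering alternative mentioned above amounts to deriving one canonical lock order per batch, so that any two components locking the same keys always acquire them in the same sequence. A minimal sketch, assuming integer keys (a toy helper, not Ignite code):

```java
import java.util.*;

// If every component locks a batch's entries in ascending key order, two
// concurrent batches over overlapping key sets cannot deadlock: the thread
// holding the smaller common key always makes progress.
public class OrderedLocking {
    // Returns the canonical order in which a batch's keys should be locked:
    // deduplicated (a key is locked once) and sorted ascending.
    static List<Integer> lockOrder(Collection<Integer> batchKeys) {
        return new ArrayList<>(new TreeSet<>(batchKeys));
    }
}
```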
Prototype testing results
...
Parameters: 1 node, 1 cache, 1 partition, 500 100 objects, message size is not limited, 4k page.
[Table: rebalancing time improvement (%) by entry size (bytes), from 44-byte entries up to fragmented entries of 100-32000 bytes]
Testing on dedicated servers
...
Jira: ASF JIRA, query: project = Ignite AND labels IN (iep-32) order by key
The improvement in rebalancing time when using batch insertion is most noticeable when writing small objects and decreases as object size grows. The improvement in total rebalancing time is also reduced when the demander sits idle waiting for the next messages from the supplier.