ID | IEP-32 |
Author | |
Sponsor |
|
Created | |
Status | DRAFT |
Currently batch updates on page memory level are not implemented in Ignite. Internal structures, such as BPlusTree and FreeList, do not support them. The performance of batch operations (putAll, datastreamer, preloader) can be improved by implementing batch updates on page memory level.
Profiling the current rebalancing process shows that most of the time spent on working with FreeList and BPlusTree (details).
Currently, when updating the batch of cache entries in off heap, each of them is processed separately.
Such update in PageMemory can be divided into the following steps:
Search and update of key in tree implemented through the invoke operation, to understand the needs for this operation, it is necessary to describe B+ tree in more detail.
The keys of the tree (hashes) are stored on the B+ tree pages (index pages), the cache key-value (data row) itself is stored on data pages. Each item on the index page includes a link to the data page item.
In general, a B+ tree supports find, put and remove operations. For put and remove, you must first find the point of insertion/update/removal.
So, cache entry update without invoke operation should look like this:
To prevent a redundant secondary search in the case described above the invoke operation was introduced. This operation has the following scheme of execution:
OffHeapManager.invoke |-> BPlusTree.invoke | | | | | | | |-> Invoke.invokeClosure() | | |-> UpdateClosure.call(foundRow) | | | |- compare entry version | | | |- newRow = create newRow on data page | | | |- treeOp = PUT/REMOVE/NOOP | | | | | \-> read treeOp -> create operation closure (Put(newRow)/Remove(row)) | | | \-> Invoke.found()/notFound() -> exec operation closure | | \-> finishUpdate/finishRemove
Saving a data row on data page consists of the following operations:
Implementation should consists of two major related improvements:
Find the most free page with enough space for data row in FreeList (if there is no such page - allocate new one) and fill it up to the end.
Improves average insertion time by reducing count of FreeList actions on each data row (insert/remove pages, locking and counters updates).
Improves the time of searching and inserting a range of keys.
// Describe project risks, such as API or binary compatibility issues, major protocol changes, etc.
// Links to discussions on the devlist, if applicable.
// Links to various reference documents, if applicable.