...

Initial data load is one of the most frequent and important use cases for Apache Ignite. At the moment the fastest way to load data is the data streamer, which relies on asynchronous messaging and a fast-path entry update method [1]. All other internal mechanics of the cache and page memory, however, work as usual during data load.

The general approach employed by major database vendors is to disable or skip as many internals as possible during data load, and to use alternative sorting methods for indexes. With this idea, we have the potential to improve data loading speed several-fold as follows:

  • Add an exclusive cache access mode in which only the data loading process is able to interact with the cache
  • Load data directly into data pages, skipping the page buffer and free lists
  • Then rebuild indexes bottom-up using an external sort algorithm

The speedup is expected to come from fewer IO operations and less page/entry locking overhead.
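
The bottom-up index rebuild from the list above can be illustrated with a minimal Java sketch. It is not Ignite code: the class BottomUpIndexSketch, the Node type, and the runSize and fanout parameters are all hypothetical, and the sorted runs are kept in memory as a stand-in for the on-disk runs a real external sort would use. The sketch sorts the keys in bounded-size runs, merges the runs with a min-heap, packs the sorted keys into leaves, and then builds each parent level from the level below, so no node is ever split or rebalanced.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class BottomUpIndexSketch {
    /** Hypothetical tree node: a leaf holds keys; an inner node holds the minimum key of each child. */
    public static final class Node {
        public final List<Long> keys = new ArrayList<>();
        public final List<Node> children = new ArrayList<>(); // empty for leaves

        long minKey() { return keys.get(0); }
    }

    /** Phase 1: cut the input into runs of at most runSize keys and sort each run on its own. */
    public static List<List<Long>> sortedRuns(List<Long> input, int runSize) {
        List<List<Long>> runs = new ArrayList<>();
        for (int i = 0; i < input.size(); i += runSize) {
            List<Long> run = new ArrayList<>(input.subList(i, Math.min(i + runSize, input.size())));
            Collections.sort(run);
            runs.add(run);
        }
        return runs;
    }

    /** Phase 2: k-way merge of the sorted runs; heap entries are {key, run index, offset in run}. */
    public static List<Long> merge(List<List<Long>> runs) {
        PriorityQueue<long[]> heap = new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        for (int r = 0; r < runs.size(); r++)
            if (!runs.get(r).isEmpty())
                heap.add(new long[] {runs.get(r).get(0), r, 0});

        List<Long> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            long[] top = heap.poll();
            out.add(top[0]);
            List<Long> run = runs.get((int) top[1]);
            int next = (int) top[2] + 1;
            if (next < run.size())
                heap.add(new long[] {run.get(next), top[1], next});
        }
        return out;
    }

    /** Phase 3: pack sorted keys into leaves, then build parent levels until one root remains. */
    public static Node buildBottomUp(List<Long> sortedKeys, int fanout) {
        List<Node> level = new ArrayList<>();
        for (int i = 0; i < sortedKeys.size(); i += fanout) {
            Node leaf = new Node();
            leaf.keys.addAll(sortedKeys.subList(i, Math.min(i + fanout, sortedKeys.size())));
            level.add(leaf);
        }
        if (level.isEmpty())
            level.add(new Node()); // no keys: the tree is a single empty leaf

        while (level.size() > 1) {
            List<Node> parents = new ArrayList<>();
            for (int i = 0; i < level.size(); i += fanout) {
                Node parent = new Node();
                for (Node child : level.subList(i, Math.min(i + fanout, level.size()))) {
                    parent.children.add(child);
                    parent.keys.add(child.minKey());
                }
                parents.add(parent);
            }
            level = parents;
        }
        return level.get(0);
    }

    /** Lookup: at each inner node descend into the last child whose minimum key is <= key. */
    public static boolean contains(Node node, long key) {
        while (!node.children.isEmpty()) {
            int idx = -1;
            for (int i = 0; i < node.keys.size() && node.keys.get(i) <= key; i++)
                idx = i;
            if (idx < 0)
                return false; // key is smaller than every key in the tree
            node = node.children.get(idx);
        }
        return node.keys.contains(key);
    }
}
```

Because the keys arrive already sorted, each node is written once and filled completely, which is where the reduction in IO and locking is expected to come from compared with inserting entries one by one into a live tree.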

[1] https://github.com/apache/ignite/blob/ignite-2.5/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/GridCacheMapEntry.java#L2691

...