...

Write Ahead Log (WAL) segments - constant size file files (WAL work directory 0...9.wal;, WAL archive 0.wal…)
CP markers - small files for events of starting and finishing checkpoints (UUID-Begin.bin, UUID-End.bin)
Page store - constains cache parameters, page store for partitions and SQL indexes data (uses file per partition: cache-(cache_name)\part1,2,3.bin, and index.bin)

...

Page store is the storage for all pages related to particual cache (cache and its partition. 's partitions and SQL indexes).

Using page identifier it is possible to map from page ID to file and to position in particular file:

No Format
pageId = ... \|\| partition ID \|\| page index and (idx) //pageId can be easily converted to file + offset in this file offset = idx * pageSize

...

Cache page storage contains following files:

part-0.bin, part-1.bin, ..., part-21023.bin (shown as P1,P2,PN at picture) - Cache partition pages.
index.bin - index partition data, special partition with number 65535 is used for SQL indexes and saved to index.bin
cache_data.dat - special file with stored cache data (configuration) . StoredCacheData includes more data than CacheConfiguration, e.g. Query entities

Persistence

Checkpointing

Checkpointing can be defined as process of storing dirty pages from RAM on a disk, with results of consistent memory state is saved to disk. At the point of process end, page state is saved as it was for the time the process begins.

There are two approaches to implementation of checkpointing:Can be of two types

Sharp Checkpointing - if checkpoint is completed all data structures on disk are consistent, data is consistent in terms of references and transactions.
Fuzzy Checkpointing - means state on disk may require recovery itself

Implemented Approach implemented in Ignite - Sharp Checkpoint; F.C. - to be done in future releases.

To achieve consistency Checkpoint checkpoint read-write lock is used (see GridCacheDatabaseSharedManager#checkpointLock)

Cache Updates - holds read lock
Checkpointer - holds write lock for short time. Holding write lock means all state is consistent, updates are not possible. Usage of CP Lock checkpoint lock allows to do sharp checkpoint

Under CP checkpoint write lock held we do the following:

1. 1. WAL marker is added: CP checkpoint (begin) record is added - CheckpointRecord - marks consistent state in WAL
  2. Collect pages were changed since last checkpoint

And then CP then checkpoint write lock is released, updates and transactions can run.

...

Copy on write technique is used. If there is modification in page which is under CP checkpoint now we will create temporary copy of page.

...

it is updated directly in memory bypassing CP checkpoint pool.

If page was already flushed to disk, dirty flag is cleared. Every future write to such page (which was initially involved into CP checkpoint, but was flushed) does not require CP checkpoint pool usage, it is written dirrectly in segment.

...

Possible future optimisation: for full update we may send page store file over network.

Order of nodes join is not relevant, there is possible situation that oldest node has older partition state, but joining node has higher partition counter. In this case rebalancing will be triggered by coordinator. Rebalancing will be performed from the newly joined node to existing one (note this behaviour may be changed under IEP-4 Baseline topology for caches)

WAL history size

In corner case we need to store WAL only for 1 checkpoint in past for successful recovery (PersistentStoreConfiguration#walHistSize )

...

Page tree

Versions Compared

Old Version 17

New Version 18

Key

Persistence

Checkpointing

WAL history size

Page tree

Page History

Versions Compared

Old Version 17

New Version 18

Key

Persistence

Checkpointing

WAL history size