...

A copy-on-write technique is used. If a page that is currently under checkpoint (CP) is modified, a temporary copy of the page is created.


If a page

  • was not involved in the checkpoint,
  • but is updated concurrently with the checkpointing process,

it is updated directly in memory, bypassing the CP pool.


Once a page has been flushed to disk, its dirty flag is cleared. Any later write to such a page (one that was initially involved in the CP but has already been flushed) does not require the CP pool; it is written directly to the segment.
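
The rules above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the actual implementation: the class `PageWriteSketch` and its fields are hypothetical names, and the sketch assumes that the temporary copy keeps the checkpoint-time contents while the write itself goes to the in-memory page.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the copy-on-write rules; not the real page memory classes. */
class PageWriteSketch {
    /** In-memory pages (the "segment"), keyed by page id. */
    final Map<Long, byte[]> segment = new HashMap<>();

    /** Pages that belong to the running checkpoint and are not flushed yet. */
    final Set<Long> underCheckpoint = new HashSet<>();

    /** CP pool: temporary copies holding checkpoint-time contents (assumption). */
    final Map<Long, byte[]> cpPool = new HashMap<>();

    /** Applies a one-byte write; returns true if this call created a temporary copy. */
    boolean write(long pageId, int offset, byte value) {
        byte[] page = segment.get(pageId);
        boolean copied = false;

        // Only a page that is under checkpoint and not yet copied needs the CP pool.
        if (underCheckpoint.contains(pageId) && !cpPool.containsKey(pageId)) {
            cpPool.put(pageId, page.clone());
            copied = true;
        }

        // Pages outside the checkpoint, or already flushed, are updated directly.
        page[offset] = value;
        return copied;
    }

    /** Simulates the checkpointer flushing a page; returns the bytes written to disk. */
    byte[] flush(long pageId) {
        underCheckpoint.remove(pageId);
        byte[] copy = cpPool.remove(pageId);
        return copy != null ? copy : segment.get(pageId).clone();
    }
}
```

After `flush` the page is no longer tracked as under checkpoint, so the next `write` takes the direct path and leaves the CP pool untouched.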


Triggers

  • The percentage of dirty pages is a trigger for checkpointing (e.g. 75%).
  • A timeout is also a trigger: a checkpoint is performed every N seconds.
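
As a rough illustration, the two triggers can be combined into a single check. The class `CheckpointTriggerSketch`, its fields, and the use of `>=` for both comparisons are assumptions made for this sketch; they are not taken from the actual code.

```java
/** Hypothetical sketch of the two checkpoint triggers described above. */
class CheckpointTriggerSketch {
    /** Fraction of dirty pages that triggers a checkpoint, e.g. 0.75. */
    final double dirtyRatioThreshold;

    /** Maximum time between checkpoints (the "N seconds"), in milliseconds. */
    final long intervalMillis;

    /** Time at which the last checkpoint finished. */
    long lastCheckpointMillis;

    CheckpointTriggerSketch(double dirtyRatioThreshold, long intervalMillis, long nowMillis) {
        this.dirtyRatioThreshold = dirtyRatioThreshold;
        this.intervalMillis = intervalMillis;
        this.lastCheckpointMillis = nowMillis;
    }

    /** Returns true if either trigger fires. */
    boolean shouldCheckpoint(long dirtyPages, long totalPages, long nowMillis) {
        boolean tooManyDirty = totalPages > 0
            && (double) dirtyPages / totalPages >= dirtyRatioThreshold;
        boolean timedOut = nowMillis - lastCheckpointMillis >= intervalMillis;
        return tooManyDirty || timedOut;
    }

    /** Resets the timeout after a checkpoint completes. */
    void onCheckpointFinished(long nowMillis) {
        lastCheckpointMillis = nowMillis;
    }
}
```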

...

  1. Logical records
    1. Operation description - which operation we want to do. Contains the operation type (put, remove) and (Key, Value, Version) (DataRecord)
    2. Transactional record - a marker of the begin, prepare, commit, and rollback of transactions (TxRecord)
    3. Checkpoint record - a marker of the beginning of checkpointing (CheckpointRecord)
  2. Physical records
    1. Full page snapshot - issued for the first update of a page after a successful checkpoint, i.e. when the page state changes from 'clean' to 'dirty' (PageSnapshot)
    2. Delta record - describes a change to a memory region within a page. Subclass of PageDeltaRecord. Contains the bytes changed in the page, e.g. bytes 5-10 were changed to [...,]. Relatively small records for B+tree records.
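
To make the delta record idea concrete, here is a minimal sketch of a record that stores only the changed bytes and their position. `DeltaSketch` is a hypothetical class written for this illustration; it does not reproduce the real PageDeltaRecord hierarchy.

```java
/** Hypothetical sketch of a physical delta record: an offset plus the new bytes. */
class DeltaSketch {
    /** Offset of the first changed byte within the page. */
    final int offset;

    /** New contents of the changed region. */
    final byte[] newBytes;

    DeltaSketch(int offset, byte[] newBytes) {
        this.offset = offset;
        this.newBytes = newBytes.clone();
    }

    /** Replays the change on a page buffer, e.g. during recovery. */
    void applyTo(byte[] page) {
        System.arraycopy(newBytes, 0, page, offset, newBytes.length);
    }
}
```

A record covering bytes 5-10 carries six bytes of payload, which is much smaller than a full page snapshot.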

For a particular cache entry update, log records are written in the following order:

  1. logical record with the planned change - DataRecord with several DataEntry(ies)
  2. page record:
    1. option: if the page changed by this update was initially clean, the full page is logged (PageSnapshot),
    2. option: if the page was already modified, a delta record is issued (PageDeltaRecord).
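
The ordering above can be sketched as follows. This is a simplified model, not the real WAL code: `WalOrderSketch` is a hypothetical class, records are represented as plain strings, and each update is assumed to touch exactly one page.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the record order written for one cache entry update. */
class WalOrderSketch {
    /** Records in the order they were appended to the log. */
    final List<String> wal = new ArrayList<>();

    /** Pages modified since the last checkpoint. */
    final Set<Long> dirtyPages = new HashSet<>();

    /** Logs one update of the given key that changes the given page. */
    void logUpdate(String key, long pageId) {
        // 1. The logical record with the planned change goes first.
        wal.add("DataRecord(" + key + ")");

        // 2. Then the page record: a snapshot if the page was clean, a delta otherwise.
        if (dirtyPages.add(pageId))
            wal.add("PageSnapshot(" + pageId + ")");
        else
            wal.add("PageDeltaRecord(" + pageId + ")");
    }

    /** After a successful checkpoint all pages are clean again. */
    void onCheckpointFinished() {
        dirtyPages.clear();
    }
}
```

Two updates of the same page between checkpoints therefore produce one PageSnapshot followed by one PageDeltaRecord, each preceded by its DataRecord.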

Possible future optimisation - let PageDeltaRecord refer to the logical record for the modified data. This would avoid storing the byte updates twice. There is a WAL file pointer, a pointer to a record from the beginning of time. This reference may be used.

...