...

Using a separate process to capture data changes from WAL archives makes the lag between a CDC event occurring and the consumer being notified about it relatively large. It is proposed to make it possible to capture data and notify consumers directly from the Ignite process. This minimizes the lag at the cost of additional memory usage.

Issues to solve

...

  1. Entrypoint for WALRecords to be captured by CDC. Options are:
    1. SegmentedRingByteBuffer (represents a WAL segment), a multi-producer/single-consumer data structure.
      1. + Relying on the consumer workflow, we can guarantee the order of events.
      2. + The consumer is a background thread, so capturing records doesn't affect the performance of transactional threads.
      3. - Physical records can't be filtered at the entrypoint (which might waste buffer space). They must be deserialized and filtered later, before actually being sent to a CdcConsumer.
      4. - The consumer is triggered on a schedule, every 500 ms by default.
      5. - The logic differs somewhat depending on the WAL settings (mmap true/false, FULL_SYNC).
    2. Capturing in FileWriteAheadLogManager#log(WALRecord).
      1. + Captures logical records only.
      2. + Common logic for all WAL settings.
      3. - Captures records into the buffer in transactional threads, which might affect performance.
      4. - The CDC process must sort events by WALPointer itself: maintain a concurrent ordering data structure and wait for WAL gaps to close before sending.
      5. - Events are sent before they are actually flushed on the local Ignite node, which can lead to inconsistency between the main and stand-by clusters.
  2. Behavior after the CDC buffer is full. Options are:
    1. Stop online CDC and delegate capturing to the ignite-cdc.sh process (CdcMain, based on WAL archives)
    2. Temporarily switch to CdcMain, and switch back to online CDC after closing the gap.
  3. The point from which online CDC starts capturing:
    1. Check the locally persisted OnlineCdcConsumerState to find the last captured WALPointer.
    2. What if the pointer is less than any pointer recovered during Ignite node startup?
    3. What if ignite-cdc.sh streamed events before the node stopped?
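
To illustrate the ordering cost of option 1.2 and the buffer-full question from item 2, here is a minimal sketch of a bounded reorder buffer. It is not Ignite code: the class WalReorderBuffer, the nested Rec type, and the single-segment (offset, length) model of a WALPointer are all hypothetical simplifications made for this example. Producer threads add records in any order; a single drain thread releases only the contiguous prefix, so a gap left by a slower producer holds back later records until it closes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical bounded reorder buffer; an illustration only, not part of Ignite. */
class WalReorderBuffer {
    /** Simplified stand-in for a WAL record: offset within one segment plus length. */
    static final class Rec {
        final long offset;
        final int len;
        final String payload;

        Rec(long offset, int len, String payload) {
            this.offset = offset;
            this.len = len;
            this.payload = payload;
        }
    }

    /** Captured records sorted by offset; safe for concurrent producers. */
    private final ConcurrentSkipListMap<Long, Rec> pending = new ConcurrentSkipListMap<>();

    /** Bytes currently held in the buffer. */
    private final AtomicLong usedBytes = new AtomicLong();

    private final long maxBytes;

    /** Offset of the next record allowed to leave the buffer. */
    private long nextOffset;

    WalReorderBuffer(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /**
     * Called from producer threads.
     * @return false if the buffer is full; the caller would then fall back
     *         to WAL-archive-based capturing (an assumption of this sketch).
     */
    boolean capture(Rec rec) {
        if (usedBytes.addAndGet(rec.len) > maxBytes) {
            usedBytes.addAndGet(-rec.len); // Roll back the reservation.
            return false;
        }
        pending.put(rec.offset, rec);
        return true;
    }

    /** Called from the single consumer thread: releases the contiguous prefix only. */
    synchronized List<Rec> drain() {
        List<Rec> ready = new ArrayList<>();
        Rec rec;
        while ((rec = pending.remove(nextOffset)) != null) {
            ready.add(rec);
            nextOffset += rec.len;
            usedBytes.addAndGet(-rec.len);
        }
        return ready;
    }
}
```

With this model, a record at offset 10 captured before the record at offset 0 stays in the buffer until the earlier one arrives, after which both are released in order. A real implementation would also have to handle segment rollover and records that are never logged.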

User interface

...

  1. IgniteConfiguration#cdcConsumer - implementation of the CdcConsumer interface.
  2. IgniteConfiguration#cdcBufSize - size of the buffer used by CDC to store captured changes. The default is (walSegCount * walSegSize), which is 640 MB for the default values.
  3. Logs: 
    1. Initialization info.
    2. Switch between working modes.
  4. Metrics:
    1. Ordinary CDC metrics (count of captured WAL segments and entries).
    2. Current working mode.
    3. Used buffer space.
    4. Lag between buffer and WAL archive (segments).
    5. Lag between writing to WAL and capturing by CDC (milliseconds).
    6. Last captured WALPointer.
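
The default buffer size above can be checked with a short calculation. The sketch below is standalone Java, not the proposed implementation: the class CdcBufferDefaults and the method defaultCdcBufSize are hypothetical names, and the two constants are copied from Ignite's default WAL settings (10 segments of 64 MB each) rather than read from DataStorageConfiguration.

```java
/** Hypothetical helper showing how the default cdcBufSize would be derived. */
class CdcBufferDefaults {
    /** Default number of WAL segments in the work directory (copied from Ignite defaults). */
    static final int DFLT_WAL_SEGMENTS = 10;

    /** Default WAL segment size in bytes: 64 MB (copied from Ignite defaults). */
    static final int DFLT_WAL_SEGMENT_SIZE = 64 * 1024 * 1024;

    /** Buffer size in bytes; computed as long to avoid int overflow for large settings. */
    static long defaultCdcBufSize(int walSegCount, int walSegSize) {
        return (long) walSegCount * walSegSize;
    }

    public static void main(String[] args) {
        long bytes = defaultCdcBufSize(DFLT_WAL_SEGMENTS, DFLT_WAL_SEGMENT_SIZE);
        System.out.println(bytes / (1024 * 1024) + " MB");
    }
}
```

The cast to long matters once the product exceeds Integer.MAX_VALUE, for example 64 segments of 64 MB.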

Segments

...

Note that the word “segment” is used in two different senses:

  1. WAL segments are represented as numbered files. The size of a WAL segment is configured with DataStorageConfiguration#walSegmentSize.
  2. ReadSegment is a slice of the mmap'ed WAL segment. It contains WAL records to be synced with the actual file. The size of this segment varies over time, and its maximum can be configured with DataStorageConfiguration#walBuffSize.

CdcWorker

...

CdcWorker is a thread responsible for collecting WAL records, transforming them into CDC events, and submitting them to the CdcConsumer. The worker has 2 modes:

...