...
ID | IEP-104
Author |
Sponsor |
Created | 26/05/2023
Status |
IEP-59 Change Data Capture defines CDC that runs in near real time. The background process ignite-cdc waits for WAL segments to be archived before it captures data. This waiting introduces a lag between the moment an event happens and the moment the consumer is notified about it. The lag can be relatively large (from seconds to tens of seconds). It is proposed to provide the ability to capture data and notify consumers directly from the Ignite node process. This minimizes the lag at the cost of additional memory usage.
Enable realtime CDC on cluster:
ignite-cdc (it automatically switches to the PASSIVE mode)
Ignite node restart after failure:
Run CDC with the ignite-cdc.sh process only:
./control.sh --cdc realtime off
The command stops the internal CDC process of the Ignite node, so CDC relies on ignite-cdc only (it automatically switches to the ACTIVE state).
Try to restart realtime CDC after working with ignite-cdc.sh only:
...
Ignite
IgniteConfiguration#CdcConfiguration - CdcConsumer, keepBinary.
DataStorageConfiguration#cdcBufSize - by default (walSegments * walSegmentSize), which is currently 640 MB.
ignite-cdc:
control.sh
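The default buffer size above is simple arithmetic, and a minimal sketch may make it concrete. The class and method names below are hypothetical and exist only for illustration. The two constants assume the current Ignite defaults of 10 WAL segments and a 64 MB segment size, which is what yields 640 MB.

```java
// Hypothetical sketch: CdcBufferDefaults is NOT part of Ignite, it only illustrates the arithmetic.
final class CdcBufferDefaults {
    /** Assumed default number of WAL segments (Ignite currently uses 10). */
    static final int DFLT_WAL_SEGMENTS = 10;

    /** Assumed default WAL segment size in bytes (Ignite currently uses 64 MB). */
    static final long DFLT_WAL_SEGMENT_SIZE = 64L * 1024 * 1024;

    private CdcBufferDefaults() {
        // Utility class, no instances.
    }

    /** Returns walSegments * walSegmentSize, failing loudly on overflow. */
    static long defaultCdcBufSize(int walSegments, long walSegmentSize) {
        return Math.multiplyExact((long) walSegments, walSegmentSize);
    }
}
```

With the assumed defaults, `defaultCdcBufSize(DFLT_WAL_SEGMENTS, DFLT_WAL_SEGMENT_SIZE)` evaluates to 671088640 bytes, that is 640 MB.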
Note that the word “segment” is used ambiguously:
DataStorageConfiguration#walSegmentSize.
ReadSegment is a slice of the mmap WAL segment. It contains WAL records to sync with the actual file. The size of this slice varies over time, and its maximum can be configured with DataStorageConfiguration#walBuffSize.
On Ignite start during memory restore (in the main thread):
If CdcConfiguration#cdcConsumer is not null, then create CdcProcessor.
CdcProcessor reads the last persisted CdcConsumerState from the Metastorage.
If CdcState#enabled is false, then skip initialization.
If CdcState == null, then initialize.
GridCacheDatabaseSharedManager#performBinaryMemoryRestore.
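The start-up checks above can be read as a small decision function. The sketch below is only an illustration: `CdcStartupSketch`, its nested `CdcState` and the `Action` enum are hypothetical stand-ins, not the proposed Ignite classes. The `RESUME` branch (state present and enabled) is an assumption, since the steps above do not spell that case out.

```java
// Hypothetical sketch of the start-up decision; none of these types is Ignite API.
final class CdcStartupSketch {
    /** Stand-in for the state persisted in the Metastorage; only the 'enabled' flag is modelled. */
    static final class CdcState {
        final boolean enabled;

        CdcState(boolean enabled) {
            this.enabled = enabled;
        }
    }

    /** What the node does with realtime CDC on start. */
    enum Action { NO_PROCESSOR, INITIALIZE, SKIP_INITIALIZATION, RESUME }

    private CdcStartupSketch() {
        // Utility class, no instances.
    }

    static Action decide(boolean consumerConfigured, CdcState persisted) {
        if (!consumerConfigured)
            return Action.NO_PROCESSOR;        // cdcConsumer is null: CdcProcessor is not created

        if (persisted == null)
            return Action.INITIALIZE;          // nothing persisted yet: initialize

        if (!persisted.enabled)
            return Action.SKIP_INITIALIZATION; // realtime CDC was switched off earlier

        return Action.RESUME;                  // assumption: continue from the persisted state
    }
}
```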
Entry point for the WALRecords to be captured by CDC. The options are:
FileWriteAheadLogManager#log(WALRecord).
It is proposed to use the first option.
CdcWorker is a thread responsible for collecting WAL records and submitting them to a CdcConsumer. The worker collects the records in a queue.
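To illustrate the queue-based worker described above, here is a minimal sketch under several assumptions: records are modelled as raw `byte[]`, the consumer is a plain `java.util.function.Consumer`, and shutdown uses a sentinel element. The name `CdcWorkerSketch` and all of its methods are hypothetical and do not reflect the actual proposed classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch of a worker that drains a queue of WAL records into a consumer.
final class CdcWorkerSketch implements Runnable {
    /** Sentinel that tells the worker to stop; compared by identity. */
    private static final byte[] STOP = new byte[0];

    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
    private final Consumer<List<byte[]>> consumer;

    CdcWorkerSketch(Consumer<List<byte[]>> consumer) {
        this.consumer = consumer;
    }

    /** Called by the capturing side to hand a record over to the worker. */
    void offer(byte[] rec) {
        queue.add(rec);
    }

    /** Asks the worker to finish after the records queued so far. */
    void stop() {
        queue.add(STOP);
    }

    @Override public void run() {
        try {
            while (true) {
                List<byte[]> batch = new ArrayList<>();
                byte[] rec = queue.take(); // Block until at least one element is available.
                boolean stopped = false;

                // Drain everything that is already queued into one batch.
                while (rec != null) {
                    if (rec == STOP) {
                        stopped = true;
                        break;
                    }
                    batch.add(rec);
                    rec = queue.poll();
                }

                if (!batch.isEmpty())
                    consumer.accept(batch);

                if (stopped)
                    return;
            }
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Preserve the interrupt flag and exit.
        }
    }
}
```

In a real node the worker would run in its own thread, for example `new Thread(worker, "cdc-worker-thread").start()`. Batching the drained records is a design choice of this sketch, not something the proposal prescribes.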
Capturing from the buffer (wal-sync-thread):
...
Otherwise, stop realtime CDC:
...
Body loop (cdc-worker-thread):
RealtimeCdcRecord record to WAL with the WALPointer.
Try to switch to the realtime mode:
...
FileWriteAheadLogManager logs records into mmap files, each of which is represented as a byte buffer, FileWriteHandleImpl#SegmentedRingByteBuffer. The buffer is designed for multiple writers and a single reader.
The reader is a thread responsible for fsync'ing the file content to disk. This role is performed by one of the following threads: wal-segment-syncer, db-checkpoint-thread, or a user thread in case of a WAL segment rollover.
It is guaranteed that the reader reads the buffer sequentially, from the first byte until the buffer is full. Therefore it is safe to notify CDC about new events from the reader.
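The idea of notifying CDC from the single sequential reader can be sketched as follows. This is a simplified model, not the SegmentedRingByteBuffer implementation: it uses a plain non-wrapping `ByteBuffer`, coarse `synchronized` methods instead of the real concurrency scheme, and a hypothetical `cdcListener` callback. The fsync itself is only marked by a comment.

```java
import java.nio.ByteBuffer;
import java.util.function.Consumer;

// Hypothetical sketch: a single reader hands each newly written range to CDC exactly once, in order.
final class SequentialReaderSketch {
    private final ByteBuffer buf;
    private final Consumer<ByteBuffer> cdcListener;

    /** Number of bytes appended by writers so far. */
    private int written;

    /** Number of bytes already passed to the reader (and therefore to CDC). */
    private int synced;

    SequentialReaderSketch(int capacity, Consumer<ByteBuffer> cdcListener) {
        this.buf = ByteBuffer.allocate(capacity);
        this.cdcListener = cdcListener;
    }

    /** Writer side: appends a serialized record (no wrap-around in this sketch). */
    synchronized void append(byte[] rec) {
        buf.position(written);
        buf.put(rec);
        written += rec.length;
    }

    /** Reader side: processes the range [synced, written) and returns its length. */
    synchronized int sync() {
        int len = written - synced;

        if (len == 0)
            return 0;

        ByteBuffer range = buf.duplicate();
        range.limit(written);
        range.position(synced);

        // A real reader would fsync the underlying file here.
        cdcListener.accept(range.slice().asReadOnlyBuffer());

        synced = written;

        return len;
    }
}
```

Because only `sync()` advances `synced`, the listener sees every byte once and in write order, which is the property the paragraph above relies on.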
Performance suggestions:
IGNITE_WAL_SEGMENT_SYNC_TIMEOUT.
During start, the node performs memory restore based on the WAL: it restores the physical state and replays logical updates. Here CDC should collect events from the WAL, starting from CdcConsumerState#walState and up to the restored pointer.
The state must be restored before any new events happen.
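A minimal sketch of selecting the records to collect during restore is given below. The `Ptr` class is a hypothetical stand-in for WALPointer that is ordered by segment index and then by offset. Whether each boundary is inclusive or exclusive is an assumption of this sketch: the consumer state is treated as already processed, and the restored pointer as still to be collected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: pick the WAL positions that CDC has to collect during memory restore.
final class CdcRestoreSketch {
    /** Stand-in for WALPointer: ordered by segment index, then by offset inside the segment. */
    static final class Ptr implements Comparable<Ptr> {
        final long idx;
        final int off;

        Ptr(long idx, int off) {
            this.idx = idx;
            this.off = off;
        }

        @Override public int compareTo(Ptr o) {
            int cmp = Long.compare(idx, o.idx);

            return cmp != 0 ? cmp : Integer.compare(off, o.off);
        }
    }

    private CdcRestoreSketch() {
        // Utility class, no instances.
    }

    /** Returns the pointers in (consumerState, restored], keeping the WAL order. */
    static List<Ptr> toCollect(List<Ptr> wal, Ptr consumerState, Ptr restored) {
        List<Ptr> res = new ArrayList<>();

        for (Ptr p : wal) {
            if (p.compareTo(consumerState) > 0 && p.compareTo(restored) <= 0)
                res.add(p);
        }

        return res;
    }
}
```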
Enable realtime CDC on cluster:
ignite-cdc (it starts by default in the PASSIVE mode)
Ignite node restart after failure:
Ignite
CdcManager interface that provides ...
    class RealtimeCdcRecord extends WALRecord {
        private WALPointer last;
    }

    class StopRealtimeCdcRecord extends WALRecord {
        private WALPointer last;
    }

    class TryStartRealtimeCdcRecord extends WALRecord {
    }
RealtimeCdcRecord and StopRealtimeCdcRecord
CdcRecord - clears obsolete links from the CDC directory.
StopRealtimeCdcRecord - switch to ACTIVE mode, start capturing from the last WALPointer (from the previous RealtimeCdcRecord).
TryStartRealtimeCdcRecord - after reaching it, persist CdcConsumerState locally, switch to PASSIVE mode.
...
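The mode switching described above can be pictured as a small state machine on the ignite-cdc side. The sketch below is illustrative only: the `Rec` enum replaces the real WAL record classes, counters replace the actual consumer and state persistence, and neither the clean-up of obsolete links nor the repositioning to the last WALPointer is modelled. All names are hypothetical.

```java
// Hypothetical sketch of how ignite-cdc could react to the control records while reading WAL.
final class CdcModeSketch {
    enum Mode { ACTIVE, PASSIVE }

    /** Simplified record kinds; DATA stands for any record that carries user data changes. */
    enum Rec { REALTIME, STOP_REALTIME, TRY_START_REALTIME, DATA }

    private Mode mode;

    /** How many DATA records were passed to the consumer. */
    private int consumed;

    /** How many times the consumer state was persisted locally. */
    private int persisted;

    CdcModeSketch(Mode initial) {
        this.mode = initial;
    }

    void onRecord(Rec rec) {
        switch (rec) {
            case STOP_REALTIME:
                mode = Mode.ACTIVE;   // The node stopped realtime CDC: ignite-cdc takes over.
                break;

            case TRY_START_REALTIME:
                persisted++;          // Persist the consumer state, then step back.
                mode = Mode.PASSIVE;
                break;

            case DATA:
                if (mode == Mode.ACTIVE)
                    consumed++;       // Only the ACTIVE process notifies the consumer.
                break;

            default:
                break;                // REALTIME: clean-up is not modelled in this sketch.
        }
    }

    Mode mode() {
        return mode;
    }

    int consumed() {
        return consumed;
    }

    int persisted() {
        return persisted;
    }
}
```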
CdcWorker (Java):
...
...
...