...
The existing .checkpoint
files will be retained for any StateStore
that does not set managesOffsets()
to true
, and the existing offsets will be automatically migrated into StateStores
that manage their own offsets, iff there is no offset returned by StateStore#getCommittedOffset
.
Offsets for Consumer Rebalances
During Consumer rebalance, Streams directly checks all on-disk .checkpoint
files, including those for Tasks
/StateStores
that are not currently "open". This is done to ensure that Tasks
are assigned to the instance with the most complete local state. With Atomic Checkpointing, any StateStore
that manages its own offsets (i.e. managesOffsets()
is true
) will first need to be opened in order to read its offsets, and then closed again.
The additional latency this would introduce to the rebalance is unacceptable. Instead, we will continue to write the offsets of all StateStores
, including those that manage their own offsets, to the .checkpoint
file. For StateStores
that manage their own offsets, this file will only be read if that store is closed (i.e. not yet initialized). Since the store is not open, the above race condition does not apply, making this safe. If a StateStore
returns true for both managesOffsets
and isOpen
, then the store will be queried for its offsets directly, via getCommittedOffset
.
Interactive Queries
Interactive queries currently see every record, as soon as they are written to a StateStore
. This can cause some consistency issues, as interactive queries can read records before they're committed to the Kafka changelog, which may be rolled-back. To address this, interactive queries will query the underlying StateStore
, and will not be routed through a Transaction
. This ensures that interactive queries see a consistent view of the store, as they will not be able to read any uncommitted records.
...