...
When operating under IsolationLevel.READ_UNCOMMITTED
, (i.e. ALOS), all records will be immediately visible to interactive queries, so the high default commit.interval.ms
of 30s
will have no impact on interactive query latency.
Error Handling
Kafka Streams currently generates a TaskCorruptedException when a Task
needs to have its state wiped (under EOS) and be re-initialized. There are currently several different situations that generate this exception:
- No offsets for the store can be found when opening it under EOS.
OutOfRangeException
during restoration, usually caused by the changelog being wiped on application reset.TimeoutException
under EOS, when writing to or committing a Kafka transaction.
The first two of these are extremely rare, and make sense to keep. However, timeouts are much more frequent. They currently require the store to be wiped under EOS because when a timeout occurs, the data in the local StateStore
will have been written, but the data in the Kafka changelog will have failed to be written, causing a mismatch in consistency.
With Transactional StateStores and Atomic Checkpointing, we can guarantee that the local state is consistent with the changelog, therefore, it will no longer be necessary to reset the local state on a TimeoutException
.
Compatibility, Deprecation, and Migration Plan
...