
[Diagram: OutOfOrderResolution-groupby]

Status

Current state: Under Discussion

...

This race condition is especially visible when multiple threads process the Kafka topic partitions. A given thread on a given node may process its records much sooner or much later than the other threads because of load, network latency, polling cycles, and a variety of other causes. There is no guarantee on the order in which the messages arrive. All else being equal, the final result is as likely to be the "null" message as the correct updated message. The issue is compounded if the foreign key is changed several times in quick succession, involving multiple additional partitions. The only remedy I can currently find is to propagate the offset of the original update along with the payload of the message. This leads to the next section.
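The idea of carrying the originating offset with each result can be sketched in plain Java, without the Kafka Streams API. This is a minimal illustration under assumptions, not the proposed implementation: the class OffsetOrderedResolver and its methods are hypothetical names, offsets are assumed to come from a single source partition per key (so they are comparable), and a result with an equal offset is simply accepted in arrival order.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: per key, keep only the result derived from the highest source offset. */
final class OffsetOrderedResolver<K, V> {

    /** A result value together with the offset of the update that produced it. */
    private static final class Tagged<T> {
        final long sourceOffset;
        final T value; // null represents a delete

        Tagged(long sourceOffset, T value) {
            this.sourceOffset = sourceOffset;
            this.value = value;
        }
    }

    private final Map<K, Tagged<V>> latest = new HashMap<>();

    /**
     * Applies a result. Returns true if it was accepted, or false if it was
     * dropped because a result from a newer offset has already been seen.
     */
    boolean apply(K key, long sourceOffset, V value) {
        Tagged<V> current = latest.get(key);
        if (current != null && current.sourceOffset > sourceOffset) {
            return false; // stale: derived from an older update
        }
        latest.put(key, new Tagged<>(sourceOffset, value));
        return true;
    }

    /** Returns the current value for the key, or null if absent or deleted. */
    V get(K key) {
        Tagged<V> tagged = latest.get(key);
        return tagged == null ? null : tagged.value;
    }

    public static void main(String[] args) {
        OffsetOrderedResolver<String, String> resolver = new OffsetOrderedResolver<>();
        // The result of the newer update (offset 5) happens to arrive first.
        resolver.apply("a1", 5L, "a1 joined with b2");
        // The "null" produced by the older update (offset 4) arrives late and is dropped.
        resolver.apply("a1", 4L, null);
        System.out.println(resolver.get("a1"));
    }
}
```

With this check in place the late "null" no longer overwrites the newer result, regardless of which thread delivers its record first.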

Solution - Hold Ordering Metadata in Record Headers

[Diagram: OutOfOrderResolution-RecordHeaders]


Final Steps - Materializing

A KTable<CombinedKey<A,B>,JoinedResult> is not a good return type. It breaks the KTable invariant that a table is partitioned by its key, which this table would not be, and the CombinedKey is not particularly useful, as it is merely a Kafka artifact.

User Managed Group by

With a follow-up group-by, we can remove the repartitioning artifact by grouping into a map. Out-of-order events can be held in the map and dealt with however one likes: either wait for some final state and propagate no "intermediate" changes that show artifacts, or propagate directly. Eventual correctness is guaranteed either way. A further major advantage is that the group-by can be on any key, resulting in a table keyed by that key.
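The grouping step can be sketched in plain Java as follows. This is an illustration under assumptions rather than the proposed API: GroupedJoinResults, its nested CombinedKey stand-in, and the field names foreignKey and primaryKey are all hypothetical, and values are plain strings. Each group holds a map from combined key to joined value, and a null value removes the entry.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical sketch: group CombinedKey-keyed join results into one map per group key. */
final class GroupedJoinResults {

    /** Illustrative stand-in for CombinedKey<A,B>; the field names are assumptions. */
    static final class CombinedKey {
        final String foreignKey;
        final String primaryKey;

        CombinedKey(String foreignKey, String primaryKey) {
            this.foreignKey = foreignKey;
            this.primaryKey = primaryKey;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof CombinedKey)) return false;
            CombinedKey other = (CombinedKey) o;
            return foreignKey.equals(other.foreignKey) && primaryKey.equals(other.primaryKey);
        }

        @Override
        public int hashCode() {
            return Objects.hash(foreignKey, primaryKey);
        }
    }

    private final Function<CombinedKey, String> groupBy;
    private final Map<String, Map<CombinedKey, String>> groups = new HashMap<>();

    /** The group key can be derived from the combined key in any way the user chooses. */
    GroupedJoinResults(Function<CombinedKey, String> groupBy) {
        this.groupBy = groupBy;
    }

    /** Adds or replaces an entry in its group; a null value removes the entry. */
    void apply(CombinedKey key, String joinedValue) {
        String groupKey = groupBy.apply(key);
        if (joinedValue == null) {
            Map<CombinedKey, String> group = groups.get(groupKey);
            if (group != null) {
                group.remove(key);
                if (group.isEmpty()) groups.remove(groupKey);
            }
        } else {
            groups.computeIfAbsent(groupKey, k -> new HashMap<>()).put(key, joinedValue);
        }
    }

    /** Returns the current map for a group key (empty if the group does not exist). */
    Map<CombinedKey, String> get(String groupKey) {
        return groups.getOrDefault(groupKey, Collections.emptyMap());
    }

    public static void main(String[] args) {
        CombinedKey oldKey = new CombinedKey("b1", "a1");
        CombinedKey newKey = new CombinedKey("b2", "a1");

        // Delete for the old foreign key arrives before the new join result.
        GroupedJoinResults inOrder = new GroupedJoinResults(k -> k.primaryKey);
        inOrder.apply(oldKey, "a1+b1");
        inOrder.apply(oldKey, null);
        inOrder.apply(newKey, "a1+b2");

        // Same events, but the delete arrives last.
        GroupedJoinResults outOfOrder = new GroupedJoinResults(k -> k.primaryKey);
        outOfOrder.apply(oldKey, "a1+b1");
        outOfOrder.apply(newKey, "a1+b2");
        outOfOrder.apply(oldKey, null);

        // Both orderings converge on the same final map for "a1".
        System.out.println(inOrder.get("a1").equals(outOfOrder.get("a1")));
    }
}
```

In the out-of-order run the map for "a1" briefly holds both entries. Whether to emit that intermediate state or wait for the final one is the choice left to the user.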

[Diagram: OutOfOrderResolution-RecordHeaders]



Solution - Hold Ordering Metadata in Record Headers

Automatically rekey for the user. The user gets back a clean KTable keyed by the original key. Repartitioning is required.

[Diagram: OutOfOrderResolution-RecordHeaders]


Since the final out-of-order data is sourced from a topic, the only way to ensure that downstream KTables have the means to query their parent's ValueGetter is to materialize the final state store. There is no way to get specific values directly from a topic source; a Materialized store is required when providing statefulness to data stored in a topic (see KTableSource). In this case, it would mean that a user-provided Materialized store is mandatory. The workflow would look like this:

[Diagram: FinalStageToUserStateStore]

Unified Finalizing

Compatibility, Deprecation, and Migration Plan

...