...

Implementation details: because we keep no window state for the KStream, we cannot trigger a lookup (and thus cannot call ValueJoiner) on KTable updates, as no left input value is available.
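The asymmetry described above can be illustrated with a small plain-Java simulation. This is only a sketch under stated assumptions: the class `StreamTableJoinSketch`, its methods, and the `leftJoin` flag are hypothetical names and not part of the Kafka Streams API, and skipping <key:null> stream records is an assumption made here for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.BiFunction;

// Plain-Java simulation only; none of these names come from the Kafka Streams API.
final class StreamTableJoinSketch {
    // The only state kept: the latest value per key of the KTable side.
    private final Map<String, String> tableState = new HashMap<>();
    private final BiFunction<String, String, String> joiner;
    private final boolean leftJoin;

    StreamTableJoinSketch(BiFunction<String, String, String> joiner, boolean leftJoin) {
        this.joiner = joiner;
        this.leftJoin = leftJoin;
    }

    // A KTable update only changes state. No stream value is stored anywhere,
    // so there is nothing to look up and the joiner cannot be called.
    void onTableRecord(String key, String value) {
        if (value == null) {
            tableState.remove(key);   // <key:null> deletes the key
        } else {
            tableState.put(key, value);
        }
    }

    // A KStream record triggers a lookup into the table state.
    // Returns the joined value, or empty if nothing is emitted.
    Optional<String> onStreamRecord(String key, String value) {
        if (value == null) {
            return Optional.empty();  // assumption: <key:null> stream records are skipped
        }
        String tableValue = tableState.get(key);
        if (tableValue == null && !leftJoin) {
            return Optional.empty();  // inner join: no match, no output
        }
        return Optional.of(joiner.apply(value, tableValue));
    }
}
```

Because `onTableRecord` returns nothing, a table update can never produce a join result in this sketch; only a later stream record for the same key observes the new table value.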

...

 

 

lookup

  • innerJoin: each regular (value != null) input record (of each input KTable) triggers a lookup; <key:null> does not trigger a lookup
  • leftJoin: each regular (value != null) input record of the left stream, or each input record (including <key:null>) of the right stream, triggers a lookup
  • outerJoin: each input record (including <key:null>) of each input stream triggers a lookup

call ValueJoiner

  • innerJoin: as long as both joining records are not null (i.e., the received record is not null AND there is a matching record in the other store), call ValueJoiner; otherwise send a tombstone
  • leftJoin, left input: if the record is not null, call ValueJoiner (even if no matching record was found in the right state); otherwise send a tombstone
  • leftJoin, right input: if there is a matching record in the left store, call ValueJoiner; otherwise send a tombstone
  • outerJoin: as long as one of the two joining records is not null (i.e., either the received record is not null, or there is a matching record in the other store), call ValueJoiner; otherwise send a tombstone

state

  • innerJoin, leftJoin, outerJoin: each KTable state has changelog semantics, i.e., regular <key:value> records insert/update the KTable and <key:null> records delete the key in the KTable
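The rules above for when ValueJoiner is called and when a tombstone is sent can be written down as a small plain-Java simulation. This is a sketch, not the actual implementation: `TableTableJoinSketch`, `JoinType`, `onLeft`, and `onRight` are hypothetical names, values are plain strings, and a `null` return value stands for a tombstone.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Plain-Java simulation only; none of these names come from the Kafka Streams API.
final class TableTableJoinSketch {
    enum JoinType { INNER, LEFT, OUTER }

    private final Map<String, String> leftState = new HashMap<>();
    private final Map<String, String> rightState = new HashMap<>();
    private final JoinType type;
    private final BiFunction<String, String, String> joiner;

    TableTableJoinSketch(JoinType type, BiFunction<String, String, String> joiner) {
        this.type = type;
        this.joiner = joiner;
    }

    // Changelog semantics: <key:value> inserts/updates, <key:null> deletes.
    private static void update(Map<String, String> state, String key, String value) {
        if (value == null) {
            state.remove(key);
        } else {
            state.put(key, value);
        }
    }

    // Returns the joined value, or null to represent a tombstone.
    String onLeft(String key, String value) {
        update(leftState, key, value);
        return emit(value, rightState.get(key));
    }

    String onRight(String key, String value) {
        update(rightState, key, value);
        return emit(leftState.get(key), value);
    }

    // For simplicity this sketch always reads the other store; in the cases
    // where the rules above skip the lookup, the result is a tombstone anyway.
    private String emit(String left, String right) {
        switch (type) {
            case INNER:
                return (left != null && right != null) ? joiner.apply(left, right) : null;
            case LEFT:
                return (left != null) ? joiner.apply(left, right) : null;
            default: // OUTER
                return (left != null || right != null) ? joiner.apply(left, right) : null;
        }
    }
}
```

With a joiner such as `(l, r) -> l + ";" + r`, a left join that receives a left record before any right record calls the joiner with a null right value, whereas an inner join in the same situation yields a tombstone.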

 

Example (using a single key):

We show only values; the output shows the <value1;value2> pairs that are handed to ValueJoiner, or null for a tombstone message.

...

  1. Handle <key:null> records for KStreams input as regular <key:value> records, rejected for the following reasons:

    • in contrast to the relational model, <key:null> records do have special semantics (there is actually nothing to be joined); this also relates to the KTable tombstone (i.e., delete) semantics of <key:null> records
    • for inner joins, users would not expect ValueJoiner to be called with one parameter being null
    • for left/outer joins, users cannot distinguish whether ValueJoiner was called because no key was found or because a <key:null> record was found

...

    • alternatively, we would need to introduce different ValueJoiner classes with different methods, i.e., #join(left, right), #joinLeft(left), #joinRight(right), but we want to keep the API simple
      • InnerValueJoiner offering only #join()
      • LeftValueJoiner offering #join() and #joinLeft()

...

      • OuterValueJoiner offering all three methods
  1. add outer KStream-KTable join, rejected for the following reasons:

    • we can only trigger lookup for KStream records

    • thus, an outer KStream-KTable join is essentially the same as a left KStream-KTable join

...

    • if we want an outer KStream-KTable join, we must use a windowed KStream to get a state for lookups, thus introducing a completely new join operator (which is beyond the scope of this KIP)
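For reference, the rejected alternative of separate joiner classes listed above might have looked roughly like the following. This is a hypothetical sketch: these interfaces do not exist in Kafka Streams, the method names follow the #join, #joinLeft, and #joinRight variants mentioned in that alternative, and the inheritance between the interfaces as well as the example class `ConcatJoiner` are assumptions made here for illustration.

```java
// Hypothetical interfaces illustrating the rejected alternative; not part of Kafka Streams.
interface InnerValueJoiner<V1, V2, R> {
    // Called when both sides have a value.
    R join(V1 left, V2 right);
}

interface LeftValueJoiner<V1, V2, R> extends InnerValueJoiner<V1, V2, R> {
    // Called when only the left side has a value.
    R joinLeft(V1 left);
}

interface OuterValueJoiner<V1, V2, R> extends LeftValueJoiner<V1, V2, R> {
    // Called when only the right side has a value.
    R joinRight(V2 right);
}

// Example implementation: an outer joiner must provide all three methods.
final class ConcatJoiner implements OuterValueJoiner<String, String, String> {
    @Override
    public String join(String left, String right) {
        return left + ";" + right;
    }

    @Override
    public String joinLeft(String left) {
        return left + ";<none>";
    }

    @Override
    public String joinRight(String right) {
        return "<none>;" + right;
    }
}
```

The sketch makes the cost of this design visible: every join flavor needs its own interface, and an outer-join user has to implement three methods instead of one, which is the API complexity the proposal wants to avoid.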