...
...
innerJoin | leftJoin | outerJoin | |
lookup | each regular (value != null) input record (of each input KTable) triggers a lookup | each regular (value != null) input record of the left stream or each input record (including <key:null>) of the right stream triggers a lookup | each (including <key:null>) input record (of each input stream) triggers a lookup |
call ValueJoiner | as long as both of the two joining record are not null (i.e. the received record is not null, AND there is a matching record in the other store), trigger join; otherwise send tombstone. | left input: if record is not null call ValueJoiner (even if no matching record in right state was found); otherwise, send tombstone right input: if there is matching record in left store, call ValueJoiner; otherwise send tombstone | As long as one of the two joining record is not null (i.e. either the received record is not null, or there is a matching record in the other store), call ValueJoiner; otherwise send tombstone. |
state | each KTable state: changelog semantics, ie, regular <key:value> records insert/update KTable -- <key:null> records delete key in KTable | each KTable state: changelog semantics, ie, regular <key:value> records insert/update KTable -- <key:null> records delete key in KTable | each KTable state: changelog semantics, ie, regular <key:value> records insert/update KTable -- <key:null> records delete key in KTable |
Example (using single keywkey):
we only show values -- output shows the <value1;value2> pairs that are handed to ValueJoiner or null tombstone message
...
Handle <key:null> records for KStreams input as regular <key:value> records, rejected for the following reasons:
- in contrast to relational model, <key:null> records do have special semantics (there is actually nothing to be joined); this also relates to KTable tombstone (ie, delete) semantics of <key:null> records
- for inner joins, user would not expect that ValueJoiner is called with one parameter being null
- for left/outer join, user cannot distinguish if the call to ValueJoiner is done because no key was found or because <key:null> record was found
...
- alternatively, we would need to introduce different ValueJoiner classes with different method, ie, #join(left, right), #joinLeft(left), #joinRight(right), but we want to keep API simple
- InnerValueJoin only offering #join()
- LeftValueJoiner offering #join() and #leftJoin()
- alternatively, we would need to introduce different ValueJoiner classes with different method, ie, #join(left, right), #joinLeft(left), #joinRight(right), but we want to keep API simple
...
- OuterValueJoiner offering all three methods
add outer KStream-KTable join, rejected for the following reasons:
we can only trigger lookup for KStream records
- thus, outer KStream-KTable join is essentially same as left KStream-KTable join
...
- if we want outer KStream-KTable join, we must use a windowed KStream to get a state for lookups, thus introducing a completely new join operator (which is beyond of the KIP)