This is a symmetric non-windowed join. The basic semantics are a KTable lookup in the "other" stream for each KTable update. The result is a continuously updating KTable, i.e., a changelog stream that can contain tombstone messages of the form <key:null>. Tombstones are shown as null in the result columns, in contrast to results such as "X - null", which indicate a valid join result with only one join partner. Note that the KTable lookup is done on the current KTable state, so out-of-order records can yield non-deterministic results. Furthermore, Kafka Streams does not guarantee that all records are processed in timestamp order: timestamp-ordered processing is the goal, but it is only best effort.

Warning: KTable Cache

To observe the behavior described below, you will most likely need to disable the KTable deduplication cache (for Kafka 0.10.1.x) by setting cache.max.bytes.buffering=0 in StreamsConfig. Otherwise, the deduplication cache will "swallow" many of the produced results, making it hard to reason about the actual join behavior.
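As a minimal sketch of the setting mentioned above, the following builds a java.util.Properties object with the cache size set to zero. The class and method names (CacheConfigSketch, withCacheDisabled) are hypothetical and only for illustration. The sketch uses the plain string key so that it compiles without the Kafka Streams dependency; with the dependency on the classpath, the constant StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG can be used for the same key.

```java
import java.util.Properties;

final class CacheConfigSketch {

    // Returns properties with the KTable deduplication cache disabled.
    // Other required settings (e.g. application.id, bootstrap.servers)
    // are intentionally omitted from this sketch.
    static Properties withCacheDisabled() {
        Properties props = new Properties();
        // A size of 0 bytes turns off record caching, so every update is forwarded.
        props.setProperty("cache.max.bytes.buffering", "0");
        return props;
    }
}
```

The resulting properties would then be merged with the application's remaining configuration before being passed to StreamsConfig.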

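| ts | STREAM_1 (left) | STREAM_2 (right) | innerJoin | leftJoin | outerJoin |
|----|-----------------|------------------|-----------|----------|-----------|
| 1  | null            |                  | null      | null     | null      |
| 2  |                 | null             | null      | null     | null      |
| 3  | A               |                  | null      | A - null | A - null  |
| 4  |                 | a                | A - a     | A - a    | A - a     |
| 5  | B               |                  | B - a     | B - a    | B - a     |
| 6  |                 | b                | B - b     | B - b    | B - b     |
| 7  | null            |                  | null      | null     | null - b  |
| 8  |                 | null             | null      | null     | null      |
| 9  | C               |                  | null      | C - null | C - null  |
| 10 |                 | c                | C - c     | C - c    | C - c     |
| 11 |                 | null             | null      | C - null | C - null  |
| 12 | null            |                  | null      | null     | null      |
| 13 |                 | null             | null      | null     | null      |
| 14 |                 | d                | null      | null     | null - d  |
| 15 | D               |                  | D - d     | D - d    | D - d     |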
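
The rows of the table can be reproduced with a small, self-contained simulation. This is a simplified model for a single key and not the Kafka Streams implementation: it keeps the latest value of each side, and on every update it looks up the other side's current state and emits one result per join type, including repeated tombstones, as the table does. All names (KTableJoinSketch, replay, and the "L"/"R" side markers) are hypothetical and chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class KTableJoinSketch {

    // Inner join: a result exists only while both sides hold a value.
    static String innerJoin(String left, String right) {
        return (left != null && right != null) ? left + " - " + right : null;
    }

    // Left join: a result exists while the left side holds a value.
    static String leftJoin(String left, String right) {
        return (left != null) ? left + " - " + right : null;
    }

    // Outer join: a result exists while at least one side holds a value.
    static String outerJoin(String left, String right) {
        return (left != null || right != null) ? left + " - " + right : null;
    }

    // Replays updates for a single key. Each update is {side, value}, where
    // side is "L" or "R" and a null value is a tombstone. Returns one row per
    // update in the order {innerJoin, leftJoin, outerJoin}; a null entry
    // stands for a tombstone in the result.
    static List<String[]> replay(String[][] updates) {
        String leftState = null;   // current left KTable value for the key
        String rightState = null;  // current right KTable value for the key
        List<String[]> rows = new ArrayList<>();
        for (String[] update : updates) {
            if ("L".equals(update[0])) {
                leftState = update[1];
            } else {
                rightState = update[1];
            }
            rows.add(new String[] {
                innerJoin(leftState, rightState),
                leftJoin(leftState, rightState),
                outerJoin(leftState, rightState)
            });
        }
        return rows;
    }

    public static void main(String[] args) {
        // The 15 input records from the table, in timestamp order.
        String[][] updates = {
            {"L", null}, {"R", null}, {"L", "A"}, {"R", "a"}, {"L", "B"},
            {"R", "b"}, {"L", null}, {"R", null}, {"L", "C"}, {"R", "c"},
            {"R", null}, {"L", null}, {"R", null}, {"R", "d"}, {"L", "D"}
        };
        List<String[]> rows = replay(updates);
        for (int i = 0; i < rows.size(); i++) {
            String[] r = rows.get(i);
            System.out.println((i + 1) + " | " + r[0] + " | " + r[1] + " | " + r[2]);
        }
    }
}
```

Because the sketch processes records strictly in the order given, it does not model out-of-order arrival. Reordering the entries of the updates array is a simple way to see how the results depend on processing order.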