Currently, when a new KafkaStreams applications is initialized, the user would have a config available to them which defines the number of threads KafkaStreams will use (num-stream-threads). What will happen is that N StreamThread instances would be created where N = num-stream-threads. However, if N is greater than the number of tasks, then some threads will be held in reserve and will idle unless some other thread fails, in which case, they will be brought online. By current structure, each task will have at maximum one thread processing it.

Processor API Structure

draw.io Diagram

border	true
viewerToolbar	true

fitWindow	false
diagramName	Current Flow of Processor API
simpleViewer	false
width
diagramWidth	781
revision	2

Above, we could see a simplified diagram of how KafkaStreams implements the Processor API. It should be noted that one StreamThread can process records from multiple StreamTasks at once. But this is not applicable in reverse. A StreamTask could not be sending records to multiple StreamThreads. This is a major bottleneck and we would need to work to fix this.

Public Interfaces

We have a couple of choices available to us as mentioned in the ticket discussion (KAFKA-6989). Recall that in asynchronous methods, we do not need inputs from a previous function to start it. All we need is the signal. However, when offsets have been committed or processed in an asynchronized manner, we will need to determine the behavior that occurs after the conclusion of the process. It would also be wise to remember that asynchronous processing does not necessarily require an extra thread. When offsets are committed or received, we should consider the following:

...

Space shortcuts

Child pages

Versions Compared

Old Version 27

New Version 28

Key

Processor API Structure

Public Interfaces

Space shortcuts

Child pages

Page History

Versions Compared

Old Version 27

New Version 28

Key

Processor API Structure

Public Interfaces