Status
Current state: Withdrawn, because the committers do not seem to be convinced that, with an async runtime, you cannot control on which thread code runs.
Discussion thread: discussion thread, though the discussion was mostly on the vote thread
JIRA: KAFKA-14972
Proposed implementation: pull request 13914
...
Examples of affected async runtimes are Kotlin coroutines (see KAFKA-7143) and ZIO.
Here follows a condensed example of how we'd like to use ZIO in the rebalance listener callback from the zio-kafka library.
```scala
def onRevoked(revokedTps: Set[TopicPartition], consumer: KafkaConsumer) = {
  for {
    _ <- ZIO.logDebug(s"${revokedTps.size} partitions are revoked")
    state <- currentStateRef.get
    streamsToEnd = state.assignedStreams.filter(control => revokedTps.contains(control.tp)) // Note, we run 1 stream per partition.
    _ <- ZIO.foreachDiscard(streamsToEnd)(_.end(consumer)) // <== Streams will commit not yet committed offsets
    _ <- awaitCommitsCompleted(consumer).timeout(15.seconds)
    _ <- ZIO.logTrace("onRevoked done")
  } yield ()
}
```
This code is run using the ZIO runtime as follows from the `ConsumerRebalanceListener::onPartitionsRevoked` method:
```scala
def onPartitionsRevoked(partitions: java.util.Collection[TopicPartition]): Unit = {
  Unsafe.unsafe { implicit u =>
    runtime.unsafe
      .run(onRevoked(partitions.asScala.toSet, consumer))
      .getOrThrowFiberFailure()
    ()
  }
}
```
(Note that this code is complex on purpose; starting a ZIO workflow from scratch is not something you would normally do.)
Look at line 6 of the first code block. In method `end` the stream will try to call `consumer::commitAsync(offsets, callback)`. In `awaitCommitsCompleted()` we call `consumer::commitSync(Collections.emptyMap)` to wait until all callbacks are invoked.
Since this code is running in the rebalance listener callback, KafkaConsumer enforces that the commit methods must be invoked from the same thread as the thread that invoked `onPartitionsRevoked`. Unfortunately, the ZIO runtime is inherently multi-threaded; tasks can be executed from any thread. There is no way ZIO could support this limitation without a major rewrite.
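The conflict described above can be reproduced without Kafka or ZIO. The following is a minimal sketch using only the Scala and Java standard libraries. `ThreadBoundClient`, `withCallback` and `commit` are hypothetical names invented for this illustration; they mimic the kind of same-thread check described above and are not the actual KafkaConsumer code.

```scala
import java.util.ConcurrentModificationException
import java.util.concurrent.{Callable, Executors}
import scala.util.Try

// Hypothetical stand-in for a client that may only be used from the thread
// that is running its callback. This is NOT the real KafkaConsumer code.
final class ThreadBoundClient {
  @volatile private var owner: Thread = null

  // Runs the callback and remembers which thread is allowed to call commit().
  def withCallback[A](callback: => A): A = {
    owner = Thread.currentThread()
    try callback
    finally owner = null
  }

  // Rejects calls from any thread other than the one running the callback.
  def commit(): String = {
    if (Thread.currentThread() ne owner)
      throw new ConcurrentModificationException("client is not safe for multi-threaded access")
    "committed"
  }
}

object ThreadGuardDemo {
  // Calls commit() directly on the callback thread: the check passes.
  def commitOnCallbackThread(): Try[String] = {
    val client = new ThreadBoundClient
    Try(client.withCallback(client.commit()))
  }

  // Calls commit() from a pool thread while the callback thread waits, which
  // is roughly what happens when an async runtime executes the work.
  def commitOnPoolThread(): Try[String] = {
    val client = new ThreadBoundClient
    val pool = Executors.newFixedThreadPool(2)
    val task = new Callable[Try[String]] {
      def call(): Try[String] = Try(client.commit())
    }
    try {
      Try(client.withCallback(pool.submit(task).get())).flatten
    } finally {
      pool.shutdown()
    }
  }

  def main(args: Array[String]): Unit = {
    println(commitOnCallbackThread())
    println(commitOnPoolThread())
  }
}
```

In this sketch the first call succeeds and the second one fails, even though the callback thread is blocked and waiting for the result the whole time, so there is no actual concurrent use of the client.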
Why can this code not run on a single thread?
We want to use the ZIO runtime. ZIO cannot support this (the same argument applies to Cats Effect, a similar and also popular Scala library). To understand why, you first need to know how these libraries work.
In both libraries one creates effects (aka workflows), which are descriptions of a computation. For example, executing the Scala code `val effect = ZIO.attempt(println("Hello world!"))` creates only a description; it does not print anything yet. The language to describe these effects is rich enough to describe entire applications: concurrency, resource management, timeouts, retries, etc. can all be expressed in an effect. To execute the effect, one gives it to the runtime, which schedules the work on one of the threads in its thread pool. Neither ZIO nor Cats Effect supports running an effect on the thread that manages the thread pool. Nor would that be possible; for example, how would one implement a timeout?
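To make the description-versus-execution split and the timeout argument concrete, here is a toy model using only the standard library. `Effect`, `ToyRuntime` and `runWithTimeout` are names made up for this sketch; it is a strong simplification and not how ZIO or Cats Effect are implemented.

```scala
import java.util.concurrent.{Callable, Executors, TimeUnit}
import scala.util.Try

// A toy "effect": only a description of work. Nothing runs on construction.
final case class Effect[A](thunk: () => A)

// A toy runtime that executes effects on its own thread pool.
final class ToyRuntime {
  private val pool = Executors.newFixedThreadPool(2)

  // Runs the effect on a pool thread. The calling thread only waits, which is
  // what allows it to give up once the timeout expires.
  def runWithTimeout[A](effect: Effect[A], millis: Long): Try[A] = {
    val future = pool.submit(new Callable[A] { def call(): A = effect.thunk() })
    val result = Try(future.get(millis, TimeUnit.MILLISECONDS))
    if (result.isFailure) future.cancel(true) // interrupt work that is still running
    result
  }

  def shutdown(): Unit = pool.shutdownNow()
}

object EffectDemo {
  def main(args: Array[String]): Unit = {
    val runtime = new ToyRuntime
    // Creating these values does not execute anything yet.
    val whoAmI = Effect(() => Thread.currentThread().getName)
    val slow = Effect { () => Thread.sleep(5000); "done" }
    try {
      println(s"caller thread: ${Thread.currentThread().getName}")
      println(s"effect ran on: ${runtime.runWithTimeout(whoAmI, 1000)}")
      println(s"slow effect:   ${runtime.runWithTimeout(slow, 100)}")
    } finally {
      runtime.shutdown()
    }
  }
}
```

If the slow effect ran on the calling thread instead, that thread would be busy inside `Thread.sleep` and nothing would be left to notice that the deadline had passed. This is the reason for separating the thread that waits from the thread that works.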
Another reason can be read in the referenced Jira issue.
Public Interfaces
Two new methods will be added to `org.apache.kafka.clients.consumer.KafkaConsumer`: `getThreadAccessKey` and `setThreadAccessKey`.
...