[Gliffy diagram: StreamTask Lifecycle]

 

Exception Handling

A Kafka Streams client needs to handle several different types of exceptions. This section summarizes which kinds of exceptions exist and how Kafka Streams should handle them. In general, Kafka Streams should be resilient to exceptions and keep processing even if some internal exceptions occur.

Types of Exceptions:

First, we need to distinguish between retryable and non-retryable (i.e., fatal) exceptions. For non-retryable/fatal exceptions, Kafka Streams is doomed to fail and cannot start or continue to process data.
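The retryable/fatal split can be sketched as a bounded retry loop. This is only an illustration, not how Kafka Streams implements it: RetriableError, FatalError, RetrySketch, and callWithRetry are hypothetical names invented for this example (in the Kafka clients, the retriable marker class is org.apache.kafka.common.errors.RetriableException).

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a transient error; the Kafka clients use
// org.apache.kafka.common.errors.RetriableException for this role.
class RetriableError extends RuntimeException {
    RetriableError(String message) { super(message); }
}

// Hypothetical wrapper thrown once the retry budget is exhausted.
class FatalError extends RuntimeException {
    FatalError(String message, Throwable cause) { super(message, cause); }
}

final class RetrySketch {
    // Runs the action and retries only on RetriableError, at most maxAttempts times.
    // Any other exception is treated as fatal and propagates on the first occurrence.
    static <T> T callWithRetry(Supplier<T> action, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RetriableError last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RetriableError e) {
                last = e; // transient: try again if attempts remain
            }
        }
        throw new FatalError("giving up after " + maxAttempts + " attempts", last);
    }
}
```

A caller would wrap a single client call, for example RetrySketch.callWithRetry(() -> fetchOnce(), 3), where fetchOnce is again a placeholder. A real implementation would likely add a backoff between attempts; the sketch omits it to stay short.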

 

 

The Javadoc of the producer Callback describes its exception parameter as follows:

The exception thrown during processing of this record. Null if no error occurred. Possible thrown exceptions include:

  • Non-Retriable exceptions (fatal, the message will never be sent):
    • InvalidTopicException
    • OffsetMetadataTooLargeException
    • RecordBatchTooLargeException
    • RecordTooLargeException
    • UnknownServerException
  • Retriable exceptions (transient, may be covered by increasing #.retries):
    • CorruptRecordException
    • InvalidMetadataException
    • NotEnoughReplicasAfterAppendException
    • NotEnoughReplicasException
    • OffsetOutOfRangeException
    • TimeoutException
    • UnknownTopicOrPartitionException

Not every exception that could potentially occur is one we expect to ever happen. If an unexpected exception occurs, it indicates a bug in the Streams API code base. Thus, we should fail fast to get a proper bug report from the field.

  • Consumer exceptions:
    • expected:
      • ConfigException (fatal)
      • InvalidOffsetException (handled by StreamThread)
        1. OffsetOutOfRangeException
        2. NoOffsetForPartitionException
      • CommitFailedException [non-EOS only] (handled by StreamThread: swallow and retry on next commit)
      • QuotaViolationException (fatal ?)
      • AuthorizationException (fatal)
      • SecurityDisabledException (fatal)
      • InvalidTopicException (fatal)
      • all RetriableException
    • should never happen (all fatal):
      • ConcurrentModificationException
      • WakeupException
      • InterruptedException
      • IllegalArgumentException
      • IllegalStateException
      • All ApiException that are not mentioned somewhere else
  • Producer exceptions:
    • expected:
      • BufferExhaustedException (fatal)
      • SerializationException (fatal)
      • ProducerFencedException
      • SecurityDisabledException (fatal)
      • all RetriableException
    • should never happen:
      • All ApiException that are not mentioned somewhere else
  • AdminClient exceptions:
  • State store exceptions:
  • Serialization exceptions:
  • StreamsException
  • User-code exceptions:
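The consumer part of the list above can be read as a lookup from exception type to handling decision, with anything unlisted treated as a bug. The following is a minimal sketch under that reading, not the Streams implementation: ConsumerExceptionPolicy, Action, and classify are hypothetical names, and the table is keyed by simple class name only so that the example compiles without kafka-clients on the classpath. QuotaViolationException is left out because the list marks it as undecided ("fatal ?").

```java
import java.util.Map;

final class ConsumerExceptionPolicy {
    // Hypothetical handling decisions derived from the annotations in the list above.
    enum Action { FATAL, HANDLED_BY_STREAM_THREAD, SWALLOW_AND_RETRY_COMMIT, RETRY, FAIL_FAST_BUG }

    // "Expected" consumer exceptions, keyed by simple class name (an assumption
    // made only to keep this sketch dependency-free).
    private static final Map<String, Action> EXPECTED = Map.of(
        "ConfigException", Action.FATAL,
        "InvalidOffsetException", Action.HANDLED_BY_STREAM_THREAD,
        "CommitFailedException", Action.SWALLOW_AND_RETRY_COMMIT, // non-EOS only
        "AuthorizationException", Action.FATAL,
        "SecurityDisabledException", Action.FATAL,
        "InvalidTopicException", Action.FATAL,
        "RetriableException", Action.RETRY
    );

    // Walks the class hierarchy from the most specific type upwards, so a subclass
    // inherits its parent's entry unless it has one of its own. Anything that
    // matches no entry is unexpected and therefore treated as a bug (fail fast).
    static Action classify(Throwable error) {
        for (Class<?> c = error.getClass(); c != null; c = c.getSuperclass()) {
            Action action = EXPECTED.get(c.getSimpleName());
            if (action != null) {
                return action;
            }
        }
        return Action.FAIL_FAST_BUG;
    }
}
```

Because the lookup walks superclasses, an exception nested under InvalidOffsetException in the list resolves through its parent's entry, while the "should never happen" cases such as IllegalStateException fall through to FAIL_FAST_BUG. Production code would more likely key on the Class objects themselves than on names.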

 

What about

  • TopicExistsException (consumer group leader "split brain"; might be self-healing)

 

[Gliffy diagram: ExceptionHandling]

 

Exception propagation / chain reactions

For non-retryable/fatal exceptions there are two different subtypes. For the first type, we expect that it will eventually