Status

Current state: Accepted

Discussion thread: here

JIRA: here, here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Currently, the ERROR state in Kafka Streams means that no stream threads are running. However, KIP-663 allows stream threads to be added dynamically, so this definition is no longer useful. That KIP also removes the automatic transition to the ERROR state upon the death of all threads. In addition, ERROR should be a terminal state. This KIP clarifies the use of the ERROR state and brings it into alignment with the other states.

Public Interfaces

A new state, PENDING_ERROR, will be added to KafkaStreams.State.

The following transitions will be added:

The following transitions will be removed:

The SHUTDOWN_CLIENT option in the `StreamsUncaughtExceptionHandler` should leave the client in the ERROR state instead of NOT_RUNNING.

Proposed Changes

Because ERROR will now be a terminal state, a PENDING_ERROR state will be added. The only way to reach the ERROR state is through the PENDING_ERROR state. This mirrors PENDING_SHUTDOWN and means that resources are being closed before the client transitions to ERROR. Currently, the client goes to ERROR before it closes its resources and does not signal when it is done.
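The rule above can be illustrated with a small, self-contained model. This is only a sketch, not the `KafkaStreams.State` source: the `ClientState` enum and its `canTransitionTo` method are hypothetical names, the set of states is reduced to the ones discussed here, and the RUNNING to PENDING_ERROR edge is an assumption made for illustration.

```java
// Simplified, hypothetical model of the client states discussed above.
// It encodes only two ideas: ERROR is terminal, and ERROR is reachable
// solely through PENDING_ERROR (mirroring PENDING_SHUTDOWN -> NOT_RUNNING).
enum ClientState {
    RUNNING, PENDING_SHUTDOWN, NOT_RUNNING, PENDING_ERROR, ERROR;

    // Returns true if this model allows moving from this state to next.
    boolean canTransitionTo(ClientState next) {
        switch (this) {
            case RUNNING:
                // Assumption: a running client may begin either a clean
                // shutdown or an error-driven shutdown.
                return next == PENDING_SHUTDOWN || next == PENDING_ERROR;
            case PENDING_SHUTDOWN:
                // Resources are closing; the only exit is NOT_RUNNING.
                return next == NOT_RUNNING;
            case PENDING_ERROR:
                // Resources are closing; the only exit is ERROR.
                return next == ERROR;
            default:
                // NOT_RUNNING and ERROR are terminal in this model.
                return false;
        }
    }
}

class ClientStateDemo {
    public static void main(String[] args) {
        System.out.println(ClientState.RUNNING.canTransitionTo(ClientState.ERROR));       // false
        System.out.println(ClientState.PENDING_ERROR.canTransitionTo(ClientState.ERROR)); // true
        System.out.println(ClientState.ERROR.canTransitionTo(ClientState.RUNNING));       // false
    }
}
```

In this model a direct RUNNING to ERROR move is rejected, so an observer that sees ERROR can rely on the resource-closing phase having already completed.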

ERROR will be redefined to mean that the streams client is in an unrecoverable state and should not be restarted until the problem has been investigated. Streams will only reach the ERROR state after an exceptional failure in which the `StreamsUncaughtExceptionHandler` chose to shut down either the application or the client. Once either kind of shutdown begins, the state of the streams client will be PENDING_ERROR until it has finished closing all resources, after which it will transition to ERROR. Automatically restarting from the ERROR state is not recommended.

In order to be consistent, SHUTDOWN_CLIENT will leave the client state in ERROR instead of NOT_RUNNING. ERROR should be the state that exceptional failures leave the application in, not NOT_RUNNING.
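As a rough illustration of the sequence described above, the following sketch records the states a client passes through once the handler has chosen a shutdown option. The `Response` enum and `ModelClient` class are hypothetical stand-ins written for this example and are not the Kafka Streams types; resource cleanup is reduced to an empty placeholder.

```java
// Hypothetical stand-in for the two shutdown options the handler can return.
enum Response { SHUTDOWN_CLIENT, SHUTDOWN_APPLICATION }

// Hypothetical model of a client that records every state it enters.
class ModelClient {
    private final StringBuilder history = new StringBuilder("RUNNING");
    private boolean failed = false;

    // States visited so far, oldest first, separated by " -> ".
    String history() {
        return history.toString();
    }

    // Models what happens after the handler returns a shutdown response.
    void onUncaughtException(Response response) {
        if (failed) {
            return; // already in PENDING_ERROR or ERROR; nothing more to do
        }
        switch (response) {
            case SHUTDOWN_CLIENT:
            case SHUTDOWN_APPLICATION:
                // Both options take the same path through the client states.
                failed = true;
                history.append(" -> PENDING_ERROR");
                closeResources();
                history.append(" -> ERROR");
                break;
        }
    }

    private void closeResources() {
        // Placeholder: a real client would stop threads and release resources here.
    }
}

class ModelClientDemo {
    public static void main(String[] args) {
        ModelClient client = new ModelClient();
        client.onUncaughtException(Response.SHUTDOWN_CLIENT);
        System.out.println(client.history()); // RUNNING -> PENDING_ERROR -> ERROR
    }
}
```

The point of the sketch is that SHUTDOWN_CLIENT ends in ERROR, never NOT_RUNNING, so the final state alone tells an operator that the shutdown was caused by a failure.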

Calling close() while the client is in the ERROR or PENDING_ERROR state will be a no-op: it will not throw an exception, but a warning will be logged.
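The intended close() behaviour can be sketched as follows. This is a minimal model under stated assumptions, not the `KafkaStreams` implementation: `CloseModel`, its string-valued state, and the `lastWarning` field (standing in for a logger) are all hypothetical.

```java
// Hypothetical model of close() semantics in the error states.
class CloseModel {
    private String state;
    private String lastWarning = null; // stands in for a logged warning

    CloseModel(String initialState) {
        this.state = initialState;
    }

    String state() {
        return state;
    }

    String lastWarning() {
        return lastWarning;
    }

    // In ERROR or PENDING_ERROR: leave the state untouched, record a warning,
    // and do not throw. Otherwise: begin a normal shutdown.
    void close() {
        if (state.equals("ERROR") || state.equals("PENDING_ERROR")) {
            lastWarning = "close() called while in state " + state + "; ignoring";
            return;
        }
        state = "PENDING_SHUTDOWN";
    }
}

class CloseModelDemo {
    public static void main(String[] args) {
        CloseModel failedClient = new CloseModel("ERROR");
        failedClient.close();
        System.out.println(failedClient.state());       // ERROR
        System.out.println(failedClient.lastWarning()); // close() called while in state ERROR; ignoring
    }
}
```

Keeping close() as a no-op here means shutdown hooks and try-with-resources style cleanup can call it unconditionally without special-casing a failed client.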

Compatibility, Deprecation, and Migration Plan

Rejected Alternatives