Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

Status

Current state"Under Discussion"

...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Streams API can fail with a TimeoutException if a global state (or global KTable) is used. This is an issue if the brokers are temporarily not available, as no retry strategy is provided, and thus the StreamsThread (and consequently the whole KafkaStreams instance) dies. To make Streams API robust against such issues, it would be required to implement a configurable retry strategy. 

Public Interfaces

We suggest to add a new configuration parameters to StreamsConfig namely "retries" similar to KafkaConsumer and KafkaProducer. To keep the current "fail-fast" strategy for default config, default value for "retries" could be 0. Furthermore, we would use already existing parameter "retry.backoff.ms" for this retry strategy.

Proposed Changes

We would apply both parameters in GlobalStateManagerImpl to guard against TimeoutException for KafkaCosumer#partitionsFor and KafkaConsumer#endOffsets.

...

To align with current design, we would also pass the new parameter to producer and consumer if there is no "producer." or "consumer." prefix. Thus, users can still configure different values for consumer/producer if needed.

Compatibility, Deprecation, and Migration Plan

  • This change is backwards compatible as we only add new parameters and also keep the current behavior with default parameter settings.
  • State locking retry would be disabled by default (currently hard coded of 5), but this should be ok.

Test Plan

We add tests with mocking the global consumer client to throw TimeoutException to check if retry parameters are respected.

Rejected Alternatives

None.