You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 2 Next »

Status

Current state"Under Discussion"

Discussion thread: [DISCUSS] KIP-224: Add configuration parameters `retries` and `retry.backoff.ms` to Streams API

JIRA: KAFKA-6122

Released: targets 1.1.0

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Streams API can fail with a TimeoutException if a global state (or global KTable) is used. This is an issue if the brokers are temporarily not available, as no retry strategy is provided, and thus the StreamsThread (and consequently the whole KafkaStreams instance) dies. To make Streams API robust against such issues, it would be required to implement a configurable retry strategy. 

Public Interfaces

We suggest to add a new configuration parameters to StreamsConfig namely "retries" similar to KafkaConsumer and KafkaProducer. To keep the current "fail-fast" strategy for default config, default value for "retries" could be 0. Furthermore, we would use already existing parameter "retry.backoff.ms" for this retry strategy.

Proposed Changes

We would apply both parameters in GlobalStateManagerImpl to guard against TimeoutException for KafkaCosumer#partitionsFor and KafkaConsumer#endOffsets.

Furthermore, we can apply both parameters for retrying getting state locks (cf. StateDirectory) that is hard coded with five retires at the moment.

To align with current design, we would also pass the new parameter to producer and consumer if there is no "producer." or "consumer." prefix. Thus, users can still configure different values for consumer/producer if needed.

Compatibility, Deprecation, and Migration Plan

  • This change is backwards compatible as we only add new parameters and also keep the current behavior with default parameter settings.
  • State locking retry would be disabled by default (currently hard coded of 5), but this should be ok.

Test Plan

We add tests with mocking the global consumer client to throw TimeoutException to check if retry parameters are respected.

Rejected Alternatives

None.

  • No labels