Sometimes end user wants to reach a sweet spot between ongoing task transfer and streaming resource free-up. So we want to take a similar approach as KIP-415, where we shall introduce a client config to make sure the scale down is time-bounded. If the time takes to migrate tasks outperforms this config, the leaving member will shut down itself immediately instead of waiting for the final confirmation. And we could simply transfer learner tasks to active because they are now the best shot to own new tasks.

Task Tagging

...

Note that to make sure the above resource shuffling could happen as expected, we need to have the following task status indicators to be provided:
...
scale.down.timeout.ms
Default: infinity
Timeout in milliseconds to force terminate the stream worker when informed to be scaled down.

stream.worker.balancing.factor
Default: 2
The tolerance of task imbalance factor between hosts to trigger rebalance.

stream.rebalancing.mode
Default: incremental
The setting to help ensure no downtime upgrade of online application.
Options : upgrading, incremental
to help user define their customized strategy.
...
Only client side upgrade is required . just need to change the encoded metadata so that the server could as long as the Kafka broker version >= 0.9, where broker will react with a rebalance based on:
...
when a normal consumer changes the encoded metadata.

Switching Protocol Type

As we have mentioned above, a new protocol type shall be created. To ensure smooth transition, we need to make sure the existing job doesn't fail. So we need to explicitly set the `stream.rebalancing.mode` to `upgrading`, where we

shall apply "consumer" as protocolType for existing stream jobs. Otherwise the job upgrade would fail due to incompatible protocol type.

Stream Instance Replacement

...

The right approach for a global application instance replacement is
Increase the capacity of the current stream job to 2
Mark existing stream instances as leaving
Learner tasks finished on new hosts, shutting down old ones.

FAQ

Why do we call stream workers?

...

Space shortcuts

Child pages

Versions Compared

Old Version 41

New Version 42

Key

Task Tagging

Switching Protocol Type

Stream Instance Replacement

The right approach for a global application instance replacement is
Increase the capacity of the current stream job to 2
Mark existing stream instances as leaving
Learner tasks finished on new hosts, shutting down old ones.

FAQ

Space shortcuts

Child pages

Page History

Versions Compared

Old Version 41

New Version 42

Key

Task Tagging

Switching Protocol Type

Stream Instance Replacement

The right approach for a global application instance replacement isIncrease the capacity of the current stream job to 2Mark existing stream instances as leavingLearner tasks finished on new hosts, shutting down old ones.

FAQ

The right approach for a global application instance replacement is
Increase the capacity of the current stream job to 2
Mark existing stream instances as leaving
Learner tasks finished on new hosts, shutting down old ones.