Status
Current state: Adopted.
Discussion thread: here
Vote thread: here
JIRA: KAFKA-5505
Released: AK 2.3.0
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
...
scheduled.rebalance.max.delay.ms
Type: Int32
Default: 300000 (5min)
This is a delay that the leader may set to tolerate departures of workers from the group by allowing a transient imbalance in connector and task assignments. During this delay a worker has the opportunity to return to the group and get reassigned the same or a similar amount of work as before. This property corresponds to the maximum delay that the leader may set in a single assignment. The actual delay used by the leader to hold off redistribution of connectors and tasks and maintain the imbalance may be less than or equal to this value.
connect.protocol
Type: Enum
Values: eager, compatible, cooperative
Default: compatible
This property defines which Connect protocol is enabled.
- eager corresponds to the initial non-cooperative protocol that resolves imbalance with an immediate redistribution of connectors and tasks (version 0).
- compatible corresponds to both the eager (protocol version 0) and incremental cooperative (protocol version 1 or higher) protocols being enabled, with the incremental cooperative protocol being preferred if both are supported (version 1 or version 0).
- cooperative means that only an incremental cooperative protocol is enabled, which tolerates imbalances in connectors and tasks up to a certain maximum delay (version 1 or higher).
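As an illustration of how the two properties above fit together, here is a minimal Java sketch. It is not the Kafka Connect implementation: the class `RebalanceConfigSketch`, the enum `ProtocolMode`, and the methods `boundedDelayMs` and `groupVersion` are hypothetical names invented for this example. It also assumes that version 1 is the only incremental cooperative version, and that a group settles on the highest protocol version every worker supports.

```java
import java.util.List;
import java.util.Locale;
import java.util.Properties;

// Hypothetical model of the two worker properties; not the Kafka Connect code.
final class RebalanceConfigSketch {
    static final int DEFAULT_MAX_DELAY_MS = 300_000; // documented default (5 min)

    final ProtocolMode mode;
    final int maxDelayMs;

    RebalanceConfigSketch(Properties props) {
        // Fall back to the documented defaults when a property is absent.
        this.mode = ProtocolMode.parse(props.getProperty("connect.protocol", "compatible"));
        this.maxDelayMs = Integer.parseInt(props.getProperty(
                "scheduled.rebalance.max.delay.ms", String.valueOf(DEFAULT_MAX_DELAY_MS)));
    }

    // The delay set in a single assignment never exceeds the configured maximum.
    int boundedDelayMs(int requestedDelayMs) {
        return Math.max(0, Math.min(requestedDelayMs, maxDelayMs));
    }

    // Highest protocol version supported by every worker, or -1 if none exists
    // (assumption of this sketch: e.g. an eager-only and a cooperative-only worker).
    static int groupVersion(List<ProtocolMode> workers) {
        int lowestMax = Integer.MAX_VALUE;
        int highestMin = Integer.MIN_VALUE;
        for (ProtocolMode m : workers) {
            lowestMax = Math.min(lowestMax, m.maxVersion);
            highestMin = Math.max(highestMin, m.minVersion);
        }
        return (workers.isEmpty() || lowestMax < highestMin) ? -1 : lowestMax;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("connect.protocol", "compatible");
        props.setProperty("scheduled.rebalance.max.delay.ms", "60000");
        RebalanceConfigSketch config = new RebalanceConfigSketch(props);

        System.out.println(config.mode);                   // COMPATIBLE
        System.out.println(config.boundedDelayMs(90_000)); // 60000
        System.out.println(groupVersion(
                List.of(ProtocolMode.COMPATIBLE, ProtocolMode.EAGER)));       // 0
        System.out.println(groupVersion(
                List.of(ProtocolMode.COMPATIBLE, ProtocolMode.COOPERATIVE))); // 1
    }
}

// Version range each mode enables in this sketch (version 1 stands in for "1 or higher").
enum ProtocolMode {
    EAGER(0, 0),        // only protocol version 0
    COMPATIBLE(0, 1),   // versions 0 and 1, preferring 1
    COOPERATIVE(1, 1);  // only the incremental cooperative protocol

    final int minVersion;
    final int maxVersion;

    ProtocolMode(int minVersion, int maxVersion) {
        this.minVersion = minVersion;
        this.maxVersion = maxVersion;
    }

    static ProtocolMode parse(String value) {
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```

In this model a single eager worker pulls the group back to version 0, which is why compatible is a safe intermediate setting while workers are being upgraded.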
Compatibility, Deprecation, and Migration Plan
...
Migration of Connect Workers to the new version of the Connect protocol is supported without downtime. To perform a live migration, a rolling bounce process is preferred, as follows:
Bounce each Worker one-by-one after setting:
    connect.protocol = compatible
To downgrade your cluster from protocol version 1 or higher to protocol version 0 with the eager rebalancing policy, switch one of the workers back to eager mode:
    connect.protocol = eager
Once this worker joins, the group will downgrade to protocol version 0 and the eager rebalancing policy, with immediate release of resources upon joining the group. This process requires a one-time double rebalance: the leader detects the downgrade and first sends a downgraded assignment with empty assigned connectors and tasks, and from then on sends regular downgraded assignments.
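The upgrade and downgrade behaviour described in this section can be sketched as a small Java simulation. This is only an illustrative model under stated assumptions, not Kafka Connect code: `RollingBounceSketch`, `Mode`, `groupVersion`, and `rollingBounce` are hypothetical names, the group is assumed to use version 1 only when no member is in eager mode, and the one-time double rebalance on downgrade is not modelled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical simulation of a rolling bounce; not the Kafka Connect implementation.
final class RollingBounceSketch {
    // Only the two modes used in the migration and downgrade steps.
    enum Mode { EAGER, COMPATIBLE }

    // Assumption: the group can use version 1 only when no member is still eager.
    static int groupVersion(List<Mode> workers) {
        return workers.contains(Mode.EAGER) ? 0 : 1;
    }

    // Bounce workers one by one into the target mode and record the
    // group protocol version observed after each bounce.
    static List<Integer> rollingBounce(List<Mode> workers, Mode target) {
        List<Integer> versions = new ArrayList<>();
        for (int i = 0; i < workers.size(); i++) {
            workers.set(i, target);
            versions.add(groupVersion(workers));
        }
        return versions;
    }

    public static void main(String[] args) {
        // Three workers that start on the eager protocol.
        List<Mode> workers = new ArrayList<>(Collections.nCopies(3, Mode.EAGER));

        // Upgrade: the group stays on version 0 until the last worker is bounced.
        System.out.println(rollingBounce(workers, Mode.COMPATIBLE)); // [0, 0, 1]

        // Downgrade: switching a single worker back to eager is enough.
        workers.set(0, Mode.EAGER);
        System.out.println(groupVersion(workers)); // 0
    }
}
```

The sketch shows why the migration needs no downtime: at every step of the bounce all workers still share a protocol version they can agree on.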
Test Plan
- Parameterize existing unit tests to test all Connect protocols and compatibility modes.
- Add additional unit tests covering the new Connect protocol for Incremental Cooperative Rebalancing.
- Write the first integration tests for Connect protocols using the integration test framework for Kafka Connect: https://issues.apache.org/jira/browse/KAFKA-7503.
- Write system tests that exercise different scenarios.
...