...

Requests with expired keys

The leader will only accept requests signed with the most current key. This should not cause any major problems: workers already retry indefinitely, with a short backoff between attempts, when a request to forward tasks to the leader fails. If a follower makes a request with an expired key, this retry behavior can be reused almost as-is to keep attempting the request until the follower has read the updated session key.

Follower workers may routinely experience small delays when reading the new key from the config topic. Rather than always logging these task configuration failures and retry attempts as errors (the current behavior), Connect's distributed herder will be modified slightly: for up to one minute, a request rejected with an HTTP 403 response will be retried quietly with the latest key, and only a debug-level log message will be emitted. This grace period should leave sufficient room for the follower to read the new session key. If failures persist for more than one minute, the usual error-level log messages will be emitted by the worker.
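The log-level decision described above can be sketched roughly as follows. This is an illustration only, not the herder's actual code: the class `ForwardRetryLogPolicy`, the `Level` enum, and the `levelFor` method are hypothetical names, and the sketch assumes the grace period is measured from the first failed attempt and that only failures lasting more than one minute are escalated.

```java
import java.time.Duration;

/** Hypothetical helper illustrating the proposed logging rule; not part of Kafka Connect. */
final class ForwardRetryLogPolicy {

    enum Level { DEBUG, ERROR }

    // Grace period from the proposal: one minute.
    static final Duration GRACE_PERIOD = Duration.ofMinutes(1);

    /**
     * Chooses the log level for a failed request to the leader.
     *
     * @param httpStatus        status code of the failed response
     * @param sinceFirstFailure time elapsed since the first failed attempt
     *                          (assumed starting point of the grace period)
     */
    static Level levelFor(int httpStatus, Duration sinceFirstFailure) {
        // Failures are escalated only once they persist for MORE than one minute.
        boolean withinGrace = sinceFirstFailure.compareTo(GRACE_PERIOD) <= 0;
        // Only 403 rejections (likely a stale session key) are logged quietly.
        return (httpStatus == 403 && withinGrace) ? Level.DEBUG : Level.ERROR;
    }
}
```

Under these assumptions, a 403 received ten seconds after the first failure would be logged at debug level, while the same response after the grace period, or any other failure at any time, would be logged as an error. The retry loop itself is unchanged in either case.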

New JMX worker metric

Finally, a new worker JMX metric will be exposed that can be used to determine whether the new behavior proposed by this KIP is enabled:

...