...

Note that a higher or lower configured timeout doesn't change how long requests actually run in the herder. Currently, if a request exceeds the default timeout of 90 seconds, we simply return a 500 response, but the request isn't interrupted or cancelled and is allowed to run to completion. Furthermore, each connector config validation is done on its own thread via a cached thread pool executor in the herder (connector create / update requests are processed asynchronously by simply writing a record to the Connect cluster's config topic, so config validations are the only relevant operation here).
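The behaviour described above (the caller gets an error once the timeout elapses while the underlying work keeps running) can be illustrated with a minimal, self-contained sketch. This is not the herder's actual code: the class `TimeoutWithoutCancel`, the `handle` method, and the `Result` type are hypothetical names, and the sleep stands in for a slow config validation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names exist in Kafka Connect.
class TimeoutWithoutCancel {

    /** What the REST caller sees, plus a handle to the work that may still be running. */
    static final class Result {
        final int status;
        final CompletableFuture<String> validation;

        Result(int status, CompletableFuture<String> validation) {
            this.status = status;
            this.validation = validation;
        }
    }

    /** Starts a simulated validation on the pool and waits at most timeoutMillis for it. */
    static Result handle(ExecutorService pool, long workMillis, long timeoutMillis) {
        CompletableFuture<String> validation = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(workMillis); // stand-in for a slow config validation
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return "validated";
        }, pool);

        try {
            validation.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return new Result(200, validation);
        } catch (TimeoutException | ExecutionException e) {
            // The wait is abandoned, but the task is deliberately NOT cancelled:
            // it continues on its pool thread until it completes.
            return new Result(500, validation);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new Result(500, validation);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newCachedThreadPool();
        Result r = handle(pool, 300, 50);
        System.out.println("status=" + r.status + ", finished=" + r.validation.isDone());
        // Waiting on the same future later shows the work ran to completion.
        System.out.println("eventual result: " + r.validation.join());
        pool.shutdown();
    }
}
```

In this sketch, raising or lowering `timeoutMillis` only changes how long the caller waits; the simulated validation takes `workMillis` either way.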

...

Another small improvement will be made to avoid double connector config validations when Connect is running in distributed mode. Currently, if a request to POST /connectors or PUT /connectors/{connector}/config is made on a worker that isn't the leader of the group, a config validation is done first, and the request is forwarded to the leader only if that validation succeeds (only the leader is allowed to write to the config topic, which is what a connector create / update entails). The forwarded request results in another config validation on the leader before the write to the config topic can finally be done. The only benefit of this approach is that it avoids forwarding requests with invalid connector configs to the leader. However, it can be argued that it's cheaper overall to forward the request to the leader at the outset and let the leader do a single config validation before writing to the config topic. Since config validations are done on their own thread and are typically short-lived operations, allowing the leader to do all config validations arising from connector create / update requests should not be an issue even with large clusters. The only situation where we're adding to the leader's load is for requests with invalid configs, since the leader today already has to do a config validation for forwarded requests with valid configs. Note that the PUT /connector-plugins/{pluginName}/config/validate endpoint doesn't do any request forwarding and can be used if frequent validations are taking place (i.e. they can be made on any worker in the cluster to avoid overloading the leader).
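The validation counts in the two flows can be compared with a small toy model. This is only a sketch under stated assumptions: `ForwardingSketch`, `Worker`, `putCurrent`, and `putProposed` are hypothetical names, the validity check is a placeholder, and the status codes are illustrative rather than taken from the Connect REST layer.

```java
import java.util.Map;

// Toy model only; none of these names exist in Kafka Connect.
class ForwardingSketch {

    static final class Worker {
        final boolean leader;
        int validations = 0;       // how many config validations this worker ran
        int configTopicWrites = 0; // only ever incremented on the leader

        Worker(boolean leader) {
            this.leader = leader;
        }

        // Placeholder check standing in for a real connector config validation.
        boolean validate(Map<String, String> config) {
            validations++;
            return config.containsKey("connector.class");
        }
    }

    /** Current flow: a non-leader validates first, then forwards; the leader validates again. */
    static int putCurrent(Worker receiver, Worker leader, Map<String, String> config) {
        if (!receiver.leader && !receiver.validate(config)) {
            return 400; // rejected locally, never forwarded
        }
        if (!leader.validate(config)) {
            return 400;
        }
        leader.configTopicWrites++;
        return 201;
    }

    /** Proposed flow: a non-leader forwards immediately; only the leader validates. */
    static int putProposed(Worker receiver, Worker leader, Map<String, String> config) {
        if (!leader.validate(config)) {
            return 400;
        }
        leader.configTopicWrites++;
        return 201;
    }

    public static void main(String[] args) {
        Map<String, String> valid = Map.of("connector.class", "ExampleConnector");

        Worker follower = new Worker(false);
        Worker leader = new Worker(true);
        putCurrent(follower, leader, valid);
        System.out.println("current:  total validations = " + (follower.validations + leader.validations));

        follower = new Worker(false);
        leader = new Worker(true);
        putProposed(follower, leader, valid);
        System.out.println("proposed: total validations = " + (follower.validations + leader.validations));
    }
}
```

In this model a valid config costs two validations under the current flow and one under the proposed flow. An invalid config costs one validation in both, but the proposed flow moves it from the receiving worker to the leader, which is the extra leader load mentioned above.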

...

A simple integration test will be added to ensure that a connector config validation REST API request that takes longer than the default REST API request timeout (90 seconds) doesn't fail if the timeout query parameter is set to a higher value. Unit tests will be added wherever applicable.


Rejected Alternatives

Introduce a new internal endpoint to persist a connector configuration without doing a config validation

Summary: Instead of forwarding all create / update requests to the leader directly, we could do a config validation on the non-leader worker first and, if the validation passes, forward the request to a new internal-only endpoint on the leader that only writes to the config topic without doing a config validation first.

Rejected because: It introduces additional complexity with very little benefit compared to simply delegating all config validations from create / update requests to the leader.

Configure the timeout via a worker configuration

...