Status

Current State: Discussion
Discussion Thread: here

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Currently, very little checking is done on connector names when a new connector is created in Kafka Connect. The only check performed is that the name contains no /. Beyond that, it is even possible to create a connector with an empty name, which is then impossible to delete.
Additionally, a large number of special characters can be used in connector names but cause problems afterwards, when the name has to appear in the URL of the REST call that changes or deletes the connector.
A pull request is ready for review that deals with a few of the problematic cases, but it is more of a band-aid than a proper fix, as I am sure there are many more characters that were missed and also cause problems.

A number of related Jira issues are currently open that could probably all be fixed by implementing something like the checks proposed here.

Proposed Change

Change the validation of connector names to trim leading and trailing whitespace and reject zero-length strings after trimming. This would still allow whitespace inside connector names but remove the potential confusion caused by accidentally padding the name with whitespace, which is easy to do because the create request carries the name as a JSON value, not in the URL.

Apart from that no characters will be rejected.
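
As a minimal sketch of the proposed check (not the actual Kafka Connect code), the validation could look like the following. The class and method names are hypothetical, and rejecting null names is an additional assumption that the proposal itself does not spell out.

```java
// Hypothetical helper illustrating the proposed validation; names are made up
// for this example and do not come from the Kafka Connect code base.
class ConnectorNameValidation {

    /**
     * Trims leading and trailing whitespace and rejects names that are empty
     * afterwards. Every other character is accepted unchanged.
     */
    static String normalizeConnectorName(String rawName) {
        if (rawName == null) {
            // Assumption: a missing name is treated like an empty one.
            throw new IllegalArgumentException("Connector name must not be null");
        }
        // String.trim() strips leading and trailing characters <= U+0020,
        // which covers spaces, tabs and line breaks.
        String trimmed = rawName.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("Connector name must not be empty");
        }
        return trimmed;
    }

    public static void main(String[] args) {
        // Inner whitespace is kept, padding is removed.
        System.out.println("[" + normalizeConnectorName("  my connector ") + "]"); // prints "[my connector]"
    }
}
```

With this approach a name such as "test;test" passes through untouched, which matches the statement that no characters are rejected.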

New or changed public interfaces

There is no new public interface. However, the behavior of the create connector API will change insofar as some connector names that were previously accepted will no longer be accepted when creating new connectors. This may break deployment scripts for a few people, so we should announce it as a breaking change. I am not sure whether a full deprecation cycle makes sense, but since this will definitely land after 1.0 it might be a good idea.


Migration plan and compatibility

Backward compatibility should be preserved, subject to testing.

Since no characters are restricted, nothing that previously worked should break because of this KIP. Connectors with an empty name were already broken and could not be deleted anyway.

The only problematic scenario I can come up with is an automated deployment that creates connectors whose names contain leading or trailing whitespace and later expects to query the configuration and status of these connectors under that exact name. I think adding a note to this effect to the release notes should suffice (if that; one might argue that this is a fringe case).

Rejected alternatives

Initially the proposal was to reject a number of characters as illegal in connector names based on a whitelist or blacklist. However, following discussions on the mailing list, it was agreed that we can be very generous in allowing characters in connector names as long as all REST requests are properly URL-encoded.

Original KIP:

Based on this research, and since I don't really see the benefit of supporting a large number of exotic special characters, I propose to limit the allowed characters for connector names to the unreserved characters:

 

a-z   A-Z   0-9   .   -   _   ~
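
For illustration only, the rejected whitelist approach could have been expressed as a single regular expression over the unreserved characters listed above. The class and method names below are hypothetical, and this is a sketch of the original idea, not something that was implemented.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the rejected whitelist check.
class UnreservedNameCheck {

    // One or more of: a-z A-Z 0-9 . - _ ~
    // The hyphen is placed last in the character class so it is taken literally.
    private static final Pattern UNRESERVED_ONLY = Pattern.compile("[A-Za-z0-9._~-]+");

    /** Returns true only if the whole name consists of unreserved characters. */
    static boolean isUnreservedOnly(String name) {
        return UNRESERVED_ONLY.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isUnreservedOnly("my-connector_1.0~")); // prints "true"
        System.out.println(isUnreservedOnly("test;test"));         // prints "false"
    }
}
```

Under such a check, names containing whitespace would also have been rejected, which is one way in which it is stricter than the proposal that was finally agreed on.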

 

Appendix A - Research

The set of allowed characters should be determined by the fact that the connector name is part of the URL in a REST call to update the config or delete the connector, so we should take care to allow only characters that can be "legally" used within URLs. However, it turns out that this is not an entirely easy distinction. After looking at a few Stack Overflow threads (here, here and here) as well as RFCs 1738, 2396 and 3986, my understanding is the following.

There are some characters that are definitely legal and allowed without any restriction:

 
a-z   A-Z   0-9   .   -   _   ~

 

Then there are reserved characters. These are legal but can have special meaning depending on which section of the URL they appear in:

; / ? : @ & = + $ ,

 

And last but not least there are a few "unwise" characters:

{   }   |   \   ^   [   ]   `

 

During my work on KAFKA-4930 I found that at least ? from the list of reserved characters also creates issues and leads to connectors that are no longer accessible after creation, so I would be hesitant to simply include these in the list of allowed characters without further research into what causes these issues. A good example of a character causing such issues is ;. Connect uses Jetty internally to serve the REST endpoints. Jetty considers ; to be a special character that delimits URL parameters and stops parsing that portion of the URL at this character. This means you can create a connector with the name "test;test", since during creation the name is specified in the body of the request, but when you call the /connectors/test;test/status endpoint, Jetty cuts at ;, looks for a connector named "test", and finds nothing.
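
To make the "test;test" case concrete, here is a minimal client-side sketch of percent-encoding a connector name before placing it in the request path. The class and method names are hypothetical. It encodes every byte outside the unreserved set, which is stricter than the RFCs require, and it assumes the server decodes percent-escapes only after it has split the path, which would need to be verified against the Jetty version in use.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical client-side helper; not part of Kafka Connect.
class ConnectorPathEncoder {

    private static final String UNRESERVED =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.-_~";

    /** Percent-encodes every UTF-8 byte that is not an unreserved character. */
    static String encodePathSegment(String name) {
        StringBuilder sb = new StringBuilder();
        for (byte b : name.getBytes(StandardCharsets.UTF_8)) {
            int value = b & 0xFF; // treat the byte as unsigned
            if (UNRESERVED.indexOf((char) value) >= 0) {
                sb.append((char) value);
            } else {
                sb.append(String.format("%%%02X", value));
            }
        }
        return sb.toString();
    }

    /** Builds the status path for a connector name taken from the source text. */
    static String statusPath(String name) {
        return "/connectors/" + encodePathSegment(name) + "/status";
    }

    public static void main(String[] args) {
        System.out.println(statusPath("test;test"));     // prints "/connectors/test%3Btest/status"
        System.out.println(statusPath("my connector?")); // prints "/connectors/my%20connector%3F/status"
    }
}
```

A hand-written encoder is used here because java.net.URLEncoder targets form data and turns spaces into +, which is not what a path segment needs.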

 
