Status

Current State: Accepted
Discussion Thread: here
Jira: KAFKA-4930

Released: 1.1.0

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

KAFKA-4827

 

Proposed Change

All changes suggested in this KIP will only be applied when connectors are created. No changes are made for connectors that are already deployed at the time this is implemented; they can be updated and deleted like before.

 

For new connectors, the validation of connector names will be changed to trim leading and trailing whitespaces and to reject zero-length strings after trimming. This would allow for whitespaces in connector names but remove potential confusion caused by accidentally padding the name with whitespaces, which is easily possible because the create request carries the name as a JSON value, not in the URL.

Additionally, connector names containing control characters, or one of their more common escape sequence representations, will be rejected. Apart from that, no characters will be rejected.

Additionally, a section will be added to the documentation explaining which characters need to be URL-encoded for REST calls to work properly. This, together with the work done on KAFKA-4827, should enable Connect to handle a very broad range of characters for connector names.
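To illustrate why URL encoding matters for names with special characters, here is a minimal sketch of how a client might build the path for a connector-specific REST call. The class and method names are hypothetical and not part of Connect. Note that `URLEncoder` targets form encoding and emits `+` for a space, so the sketch rewrites that to `%20` for use in a path segment.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Illustrative client-side helper; not part of Kafka Connect. */
final class ConnectorUrlSketch {

    /** Builds the path for a connector, percent-encoding the name. */
    static String connectorPath(String connectorName) {
        try {
            // URLEncoder produces form encoding: a space becomes "+" and a
            // literal "+" becomes "%2B". Inside a path segment a space must
            // be "%20", so the remaining "+" signs are rewritten.
            String encoded = URLEncoder
                    .encode(connectorName, StandardCharsets.UTF_8.name())
                    .replace("+", "%20");
            return "/connectors/" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(connectorPath("my connector"));
    }
}
```

With this helper, a name such as `a/b` ends up in the path as `a%2Fb` instead of being read as two path segments.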

New or changed public interfaces

There is no new public interface; however, the behavior of the create connector API will change a little bit: some connector names that previously worked will no longer be accepted when creating new connectors. This may break deployment scripts for a few people, so it should be announced as a breaking change. It is not clear whether a full deprecation cycle makes sense, but since this will definitely land post 1.0 it might be a good idea. Existing connectors are not affected by this change and can be updated and queried just like today.

Whitespace handling

Following implementation of this KIP, leading and trailing whitespaces in connector names will be trimmed before the connector is created. The response to the create request will reflect the updated name. See the following table for a few examples:

| Original name | Trimmed name | Comment |
|---------------|--------------|---------|
| " test"       | "test"       |         |
| "test "       | "test"       |         |
| " test "      | "test"       |         |
| " "           | ""           | Will be rejected as empty connector name! |

This is a change to current behavior. Today, when a create connector request is sent, the connector is either created with the specified name or the request fails; names are never changed by Connect. However, requests containing whitespaces would fail in the current version (fixed in KAFKA-4827), so it can be assumed that not many people rely on connector names containing whitespaces at all.
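The trimming rule from the table can be sketched as follows. This is a minimal illustration, not the actual Connect implementation: the class name, method name, and exception type are assumptions. It uses `String.trim()`, which strips every character up to and including U+0020 from both ends.

```java
/** Illustrative sketch only; not the actual Kafka Connect implementation. */
final class ConnectorNameTrimSketch {

    /**
     * Trims leading and trailing whitespaces and rejects names that are
     * empty afterwards. Method name and exception type are assumptions.
     */
    static String normalizeName(String rawName) {
        if (rawName == null) {
            // Assumption: a missing name is handled like an empty one.
            throw new IllegalArgumentException("Connector name must not be null");
        }
        String trimmed = rawName.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("Connector name must not be empty");
        }
        return trimmed;
    }

    public static void main(String[] args) {
        // Brackets make the absence of padding visible.
        System.out.println("[" + normalizeName(" test ") + "]");
    }
}
```

Whitespaces inside the name are left untouched, so `"my connector"` passes through unchanged.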

 

Control characters

The ASCII control characters 0 to 31 and 127 will be considered illegal, and connector names containing any of them will be rejected. To prevent these characters from being injected via escape sequences, the connector name will be unescaped before it is tested for control characters.
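A rough sketch of the unescape-then-check order described above is shown here. The KIP does not list which escape sequences are covered, so the set handled below (`\n`, `\r`, `\t`, `\b`, `\f`, `\0`, `\\`) is an assumption, as are all class and method names.

```java
/** Illustrative sketch only; the handled escape sequences are an assumption. */
final class ConnectorNameControlCharSketch {

    /** Replaces a few common escape sequences with the characters they denote. */
    static String unescape(String name) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c != '\\' || i + 1 >= name.length()) {
                out.append(c);
                continue;
            }
            char next = name.charAt(i + 1);
            switch (next) {
                case 'n': out.append('\n'); i++; break;
                case 'r': out.append('\r'); i++; break;
                case 't': out.append('\t'); i++; break;
                case 'b': out.append('\b'); i++; break;
                case 'f': out.append('\f'); i++; break;
                case '0': out.append('\0'); i++; break;
                case '\\': out.append('\\'); i++; break;
                default: out.append(c); // unknown escape: keep the backslash
            }
        }
        return out.toString();
    }

    /** Returns true if the unescaped name contains ASCII 0 to 31 or 127. */
    static boolean containsControlCharacter(String name) {
        String unescaped = unescape(name);
        for (int i = 0; i < unescaped.length(); i++) {
            char c = unescaped.charAt(i);
            if (c <= 31 || c == 127) {
                return true;
            }
        }
        return false;
    }
}
```

Checking after unescaping means that a name containing the two characters backslash and `n` is treated the same as a name containing a real line feed.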

 

 

See Migration plan and compatibility for scenarios that might be influenced by this change in behavior.

Migration plan and compatibility

Backward compatibility for existing connectors is given, as all changes suggested here only affect connectors when they are created. The ability to update and delete existing connectors is not impacted.

 

For new connectors created after this change is implemented, limitations are restricted to the few scenarios described below. Apart from those, nothing that previously worked should break based on this KIP. Connectors with an empty name were broken before and could not be deleted anyway, so there is no reason to half-support these any further. It is, however, a change in behavior to reject them.

As mentioned, there are two scenarios that are affected by this change:

 

Changing of connectors with leading or trailing whitespaces

If an external system is used to manage connectors, the change in connector names could potentially cause this system not to find deployed connectors anymore - if their names contain trailing or leading whitespaces!

  1. Create connector: "test    "    -> Connector gets created (request fails though, due to KAFKA-4827)
  2. Query status of connector: "test    "    -> Fails as the connector was created with the name "test"

There are two points worth noting in this scenario though: 

  1. The system managing the connectors would have to completely ignore the response from Connect to the first request. In the current version that response is an error, so the system should not have assumed that the connector was created anyway.
  2. Once the fix for KAFKA-4827 is merged, the request would succeed, but the response would contain the updated name for the created connector, so there would be an easy check to update the name used.

Based on the above points I would consider this a fringe scenario that should not impact a significant number of people (to be honest, I'd be surprised if anyone is impacted at all). Additionally I can't come up with a valid reason to use trailing or leading whitespaces in connector names.

 

Creating connectors that only differ in the number of trailing or leading whitespaces

As shown in the table above, multiple input names might be matched to the same output name after this change: "   test" and "test   " would be considered the same after stripping whitespaces. The second create request would be rejected as a connector with the name "test" already exists.
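The collision can be illustrated with a small in-memory sketch. The registry class below is purely hypothetical and only mimics the described behavior: names are trimmed on creation, the stored name is returned to the caller, and a second request that trims to the same name is rejected.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in for connector creation; not Connect code. */
final class ConnectorRegistrySketch {

    private final Set<String> names = new HashSet<>();

    /** Returns the name the connector was stored under; throws if it is taken. */
    String create(String rawName) {
        String name = rawName.trim();
        if (name.isEmpty()) {
            throw new IllegalArgumentException("Connector name must not be empty");
        }
        // Set.add returns false when the trimmed name is already present.
        if (!names.add(name)) {
            throw new IllegalStateException("Connector " + name + " already exists");
        }
        return name;
    }

    public static void main(String[] args) {
        ConnectorRegistrySketch registry = new ConnectorRegistrySketch();
        System.out.println(registry.create("   test"));
        try {
            registry.create("test   ");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A client that keeps the returned name rather than the name it originally sent also avoids the lookup failure described in the first scenario.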

While this is, strictly speaking, a limitation, my personal opinion is that using connector names that differ only in the number of whitespaces at the beginning or end is a recipe for disaster and should not be encouraged. The only scenario that I can come up with is an automated deployment of connectors containing leading or trailing whitespaces that later expects to be able to query the configuration and status of these connectors under that connector name. I think adding a note to this effect to the release notes should suffice (if that; one might argue that this is a fringe case).

Rejected alternatives

Initially, the proposal was to reject a number of characters as illegal in connector names based on a whitelist or blacklist. However, following discussions on the mailing list, it was agreed that we can be very generous in allowing characters in connector names as long as all REST requests are properly URL-encoded.

...