...

Page properties

Discussion thread: TODO
Vote thread: TODO
JIRA: FLINK-29918

Release: 1.18


Motivation

The current delegation token framework mainly supports Kerberos authentication and Hadoop-based tokens. This satisfies the use cases described in FLIP-211; however, there are many frameworks that are not Hadoop compliant and where the authentication protocol is not Kerberos. The main motivation is to generalize the delegation token framework to make it authentication protocol agnostic. This change would open the door to implementing providers for S3, among many other services.

...

It’s planned to implement automated integration tests and end-to-end tests with dockerized service containers.

Rejected Alternatives

  • Have a single DelegationTokenProvider API: the implication would be that all existing token providers must include authentication logic, which is the same for all Hadoop-based providers. Reasons for rejection (a contrasting interface sketch follows this list):
    • It must be marked somewhere whether a provider is Hadoop based or not (practically, whether its tokens go into the UGI or not). Storing this information at runtime would make the framework a horror to debug and would add extra code complexity (practically, the UGI would have to be rebuilt from scratch on each and every TM).
    • On the task manager side, each and every Hadoop connector would have to check whether the singleton TokenContainer has a token that can be used and copy its content to the UGI. Since connectors are used from multiple threads, this would mean that UGI writes must be synchronized. This would add quite some code complexity and would slow down data processing.
    • Having a common API would make it harder to drop the Hadoop part when it reaches end of life.
    • The same authentication logic would need to be used in all Hadoop providers (practically, all Hadoop providers would contain the same authentication code in some form). This would be error prone and would not add any value, only additional complexity.
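
To illustrate why keeping the two provider kinds separate was preferred over a single API, below is a minimal, hypothetical Java sketch; the interface and method names are illustrative only and are not the exact Flink API. A protocol-agnostic provider returns tokens as opaque serialized bytes, while a Hadoop-specific provider writes tokens into Hadoop Credentials that are later merged into the UGI. Collapsing the two into one API would force every provider to carry the Hadoop/UGI handling described in the list above.

// Hypothetical sketch: names are illustrative, not the exact Flink interfaces.
import java.util.Optional;

import org.apache.hadoop.security.Credentials;

/** Protocol-agnostic provider: no Hadoop/UGI types appear in the API. */
interface GenericDelegationTokenProvider {

    /** Unique name of the service the tokens are obtained for (e.g. "s3"). */
    String serviceName();

    /** Whether tokens need to be obtained at all with the current configuration. */
    boolean delegationTokensRequired() throws Exception;

    /** Obtains tokens in a serialized, framework-opaque form plus an optional renewal deadline. */
    ObtainedTokens obtainDelegationTokens() throws Exception;

    /** Container for the serialized tokens and the time (epoch millis) until which they are valid. */
    final class ObtainedTokens {
        final byte[] tokens;
        final Optional<Long> validUntil;

        ObtainedTokens(byte[] tokens, Optional<Long> validUntil) {
            this.tokens = tokens;
            this.validUntil = validUntil;
        }
    }
}

/** Hadoop/Kerberos-specific provider: obtained tokens go into Hadoop Credentials, later merged into the UGI. */
interface HadoopDelegationTokenProviderSketch {

    String serviceName();

    boolean delegationTokensRequired() throws Exception;

    /** Adds the obtained tokens to the given Credentials and returns an optional renewal deadline. */
    Optional<Long> obtainDelegationTokens(Credentials credentials) throws Exception;
}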

...