Knox Picketlink Federation Provider
The Picketlink federation provider allows for the federation of an authentication event that is represented by a SAML assertion cookie/token.
It enables the flow illustrated in the sequence diagram above for SAML-based authentication for Hadoop Web UIs and is based on Apache Picketlink.
It has currently been tested with Shibboleth as the SAML IdP.
The following table details the configuration elements of the provider:
Param | Description | Default |
---|---|---|
identity.url | The IdP URL to which to redirect incoming requests that do not contain the expected cookie, in order to facilitate an authentication challenge. | none |
service.url | The URL back to the KnoxSSO endpoint for the IdP to redirect the browser after authentication. | none |
keystore.url | The location of the keystore with the public cert of the IdP for token validation. BUG: this is currently hardcoded to gateway.jks | gateway.jks |
validating.alias.key | The IdP domain used as the key when looking up the alias of the cert with which to validate incoming tokens - e.g. idp.example.com | none |
validating.alias.value | The alias of the actual cert to use for the IdP domain - e.g. server.crt | none |
clock.skew.milis | The clock skew, in milliseconds, to allow during the validation of tokens | none |
Sample Topology file: "idp.xml"
<topology>
  <gateway>
    <provider>
      <role>federation</role>
      <name>Picketlink</name>
      <enabled>true</enabled>
      <param>
        <name>identity.url</name>
        <value>https://localhost:9443/idp/profile/SAML2/POST/SSO</value>
      </param>
      <param>
        <name>service.url</name>
        <value>http://c6401.ambari.apache.org:8888/gateway/idp/knoxsso/</value>
      </param>
      <param>
        <name>keystore.url</name>
        <value>/usr/hdp/current/knox-server/data/security/keystores/gateway.jks</value>
      </param>
      <param>
        <name>validating.alias.key</name>
        <value>c6401.ambari.apache.org</value>
      </param>
      <param>
        <name>validating.alias.value</name>
        <value>gateway-identity</value>
      </param>
      <param>
        <name>clock.skew.milis</name>
        <value>2000</value>
      </param>
    </provider>
    <provider>
      <role>identity-assertion</role>
      <name>Default</name>
      <enabled>true</enabled>
    </provider>
    <provider>
      <role>authorization</role>
      <name>AclsAuthz</name>
      <enabled>true</enabled>
    </provider>
  </gateway>
  <service>
    <role>KNOXSSO</role>
    <param>
      <name>sso.cookie.secure.only</name>
      <value>false</value>
    </param>
  </service>
</topology>
Proof of Concept Status
The following notes reflect the CURRENT POC state for the above flow inside an Ambari-managed ambari-vagrant 3-node cluster:
- On first request to a given UI (e.g. the NameNode UI at http://c6401.ambari.apache.org:50070) with the redirecting authentication handler, the hadoop auth filter sees that there is no hadoop auth cookie and delegates to the configured handler. The redirecting authentication handler looks for a simple cookie that represents a knoxsso token (this may be changed to a JWT bearer token or cookie). In the absence of this cookie, the handler redirects the browser to the configured knoxsso endpoint and passes the original UI URL as a request parameter named "originalUrl". Example: http://c6401.ambari.apache.org:8888/knoxsso?originalUrl=http://c6401.ambari.apache.org:50070
- The knoxsso endpoint has a number of filters. The first captures the original url parameter and creates a cookie* called original-url to be used later to redirect the user to the UI once authentication has been successfully accomplished. Example: original-url http://c6401.ambari.apache.org:50070
- The next filter is the JBoss picketlink SPFilter for SAML service providers. It redirects the user to the IdP (Shibboleth running in Jetty on a CentOS VM) to challenge for credentials; the user is currently authenticated against the Knox demo ApacheDS LDAP server, but this could be any LDAP server or AD. Once the user is successfully authenticated, the IdP redirects the user back to the knoxsso endpoint. The capture filter ignores the incoming POST, since it doesn't have the originalUrl parameter, and allows processing to return to the picketlink filter, where the assertion is accepted and the userid is extracted and made available to the servlet programming model through HttpServletRequest.getUserPrincipal.
- The next filter redirects back to the UI with a token that can be consumed by the UI authentication handler. This redirecting filter extracts the userid from getUserPrincipal and creates a cookie** that simply has the username as the value. Example: hadoop-auth: guest. It then extracts the original-url from the cookie that was added by the capture filter and redirects the user, with the token cookie, to the original URL. Example: original-url http://c6401.ambari.apache.org:50070
- The hadoop auth filter on the UI endpoint accepts the requests but still finds no hadoop auth cookie and delegates once again to the redirect authentication handler. The auth handler finds the expected cookie/token and extracts the userid, creates a hadoop authentication token and returns it to the filter. The filter creates a hadoop auth cookie for this token and uses this for authentication until it expires and is no longer presented by the browser and we start back at #1.
* ISSUE: the original-url cookie is currently problematic. We need to utilize the Relay-State of the SAML form elements instead. We then need to intercept it before the redirect filter is invoked and make it available in the request context. This will normalize the redirect mechanism across authentication server providers. For example: SAML, Form, HTTP Basic, etc. The redirect filter can be reused across all of those as long as the original-url can be found the same way.
** CAVEAT: the simple hadoop-auth cookie, and any subsequent JWT solution, will dictate that the knoxsso endpoint be in the same domain as all of the UIs. In other words, all nodes in the cluster that host UIs, or that otherwise need the cookie to be available, must be in the same domain as the knoxsso endpoint.
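The redirect in step 1 above amounts to building the knoxsso URL with the original UI URL carried along as a query parameter. A minimal sketch; the class and method names here are hypothetical, not from the POC code:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class RedirectSketch {
    // Hypothetical helper: build the KnoxSSO redirect location, carrying the
    // original UI URL in the "originalUrl" request parameter.
    public static String buildRedirect(String ssoEndpoint, String originalUrl) {
        return ssoEndpoint + "?originalUrl="
                + URLEncoder.encode(originalUrl, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(buildRedirect(
                "http://c6401.ambari.apache.org:8888/knoxsso",
                "http://c6401.ambari.apache.org:50070"));
    }
}
```

Note that the POC example shows the parameter unencoded; a real handler should URL-encode it as above so that any query string on the original URL survives the round trip.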
Required Configuration for Hadoop Consoles
OBSOLETE but in the proper spirit of HADOOP-11717:
<property>
<name>hadoop.http.authentication.simple.anonymous.allowed</name>
<value>false</value>
</property>
<property>
<name>hadoop.http.authentication.type</name>
<value>org.apache.hadoop.sso.poc.RedirectAuthenticationHandler</value>
</property>
<property>
<name>hadoop.http.authentication.authentication.provider.url</name>
<value>http://c6401.ambari.apache.org:8888/knoxsso</value>
</property>
<property>
<name>hadoop.http.filter.initializers</name>
<value>org.apache.hadoop.security.AuthenticationFilterInitializer</value>
</property>
<property>
<name>hadoop.http.authentication.public.key.pem</name>
<value>MIICVjCCAb+gAwIBAgIJAPPvOtuTxFeiMA0GCSqGSIb3DQEBBQUAMG0xCzAJBgNV
BAYTAlVTMQ0wCwYDVQQIEwRUZXN0MQ0wCwYDVQQHEwRUZXN0MQ8wDQYDVQQKEwZI
YWRvb3AxDTALBgNVBAsTBFRlc3QxIDAeBgNVBAMTF2M2NDAxLmFtYmFyaS5hcGFj
aGUub3JnMB4XDTE1MDcxNjE4NDcyM1oXDTE2MDcxNTE4NDcyM1owbTELMAkGA1UE
BhMCVVMxDTALBgNVBAgTBFRlc3QxDTALBgNVBAcTBFRlc3QxDzANBgNVBAoTBkhh
ZG9vcDENMAsGA1UECxMEVGVzdDEgMB4GA1UEAxMXYzY0MDEuYW1iYXJpLmFwYWNo
ZS5vcmcwgZ8wDQYJKoZIhvcNAQEBBQADgY0AMIGJAoGBAMFs/rymbiNvg8lDhsdA
qvh5uHP6iMtfv9IYpDleShjkS1C+IqId6bwGIEO8yhIS5BnfUR/fcnHi2ZNrXX7x
QUtQe7M9tDIKu48w//InnZ6VpAqjGShWxcSzR6UB/YoGe5ytHS6MrXaormfBg3VW
tDoy2MS83W8pweS6p5JnK7S5AgMBAAEwDQYJKoZIhvcNAQEFBQADgYEANyVg6EzE
2q84gq7wQfLt9t047nYFkxcRfzhNVL3LB8p6IkM4RUrzWq4kLA+z+bpY2OdpkTOe
wUpEdVKzOQd4V7vRxpdANxtbG/XXrJAAcY/S+eMy1eDK73cmaVPnxPUGWmMnQXUi
TLab+w8tBQhNbq6BOQ42aOrLxA8k/M4cV1A=</value>
</property>
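The hadoop.http.authentication.public.key.pem value above is the base64 DER encoding of an X.509 certificate with the PEM header/footer lines stripped. A sketch of how an auth handler could turn that configured value into a PublicKey for signature verification; the class name is hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.security.PublicKey;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.util.Base64;

public class PemKeyLoader {
    // Decode the configured base64 DER certificate (header/footer already
    // stripped) and extract its public key for verifying token signatures.
    public static PublicKey load(String base64Der) {
        try {
            byte[] der = Base64.getMimeDecoder().decode(base64Der);
            CertificateFactory cf = CertificateFactory.getInstance("X.509");
            X509Certificate cert =
                    (X509Certificate) cf.generateCertificate(new ByteArrayInputStream(der));
            return cert.getPublicKey();
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid certificate value", e);
        }
    }
}
```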
Picketlink POC Server
The knoxsso endpoint at this point is represented by a simple embedded jetty server hosting a webapp with a number of filters.
It checks for the incoming original-url and sets it in appropriate state for later retrieval, does the SAML dance with the picketlink SPFilter and finally redirects back to original-url.
Built with "mvn clean install".
Run with "mvn exec:java -Dexec.mainClass=org.apache.hadoop.sso.poc.PocServer".
This needs to be migrated to Knox with a picketlink provider and KnoxSSO service (jersey based?).
Additional Notes
- We have to ascertain the impact that changing the auth handler on the UI endpoints has on any REST APIs served from the same endpoints
- Cookie domains may not need to be the same across all UIs using this approach
- In order to do a more complicated/secure token between knoxsso and the UI - we will need to verify signature using a common key. This will likely require the use of the KeyProvider API or CredentialProvider API. This will also require either:
- a central KMS provider that will allow constrained access to the same key materials by knoxsso and the UI auth handler
- separate keystores that will need the key provisioned independently and to stay in sync
- Normalizing on JWT as the token that is consumed by the UI auth handler will require some JWT parsing and verification code to be available in hadoop. Not sure if it can be put into hadoop auth module or whether it needs to go into common/security.
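The parsing and verification involved is modest if RS256-signed compact JWTs are assumed. A minimal illustration using only the JDK; this is not a complete JWT implementation (no header inspection, expiry checking, or algorithm whitelisting), and all names here are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

public class JwtSketch {
    // What the UI auth handler would need: split header.payload.signature and
    // verify an RS256 signature over "header.payload" with the knoxsso public key.
    public static boolean verifyRs256(String jwt, PublicKey key) {
        try {
            String[] parts = jwt.split("\\.");
            if (parts.length != 3) return false;
            byte[] signedBytes = (parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII);
            byte[] sig = Base64.getUrlDecoder().decode(parts[2]);
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(key);
            verifier.update(signedBytes);
            return verifier.verify(sig);
        } catch (Exception e) {
            return false; // malformed token or undecodable signature
        }
    }

    // Issue a token the same way, so the sketch can be exercised end to end.
    public static String signRs256(String headerJson, String payloadJson, PrivateKey key) {
        try {
            Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
            String signingInput = enc.encodeToString(headerJson.getBytes(StandardCharsets.UTF_8))
                    + "." + enc.encodeToString(payloadJson.getBytes(StandardCharsets.UTF_8));
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(key);
            signer.update(signingInput.getBytes(StandardCharsets.US_ASCII));
            return signingInput + "." + enc.encodeToString(signer.sign());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static KeyPair newRsaKeyPair() {
        try {
            KeyPairGenerator g = KeyPairGenerator.getInstance("RSA");
            g.initialize(2048);
            return g.generateKeyPair();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In practice a vetted library such as nimbus-jose-jwt (referenced below) would be preferable to hand-rolled verification.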
- This same architecture can be used with other implementations on the knoxsso side in place of the SAML/Shibboleth integration. We will have to make this configurable. The first filter will also capture the original url and the last will always redirect back to the original url. The processing that goes on in between can be pluggable to accommodate various integrations with SSO providers, simple hosted mechanisms (FORM, HTTP Basic), etc.
Other Considerations
- Introduce new SSO cookie as first class citizen rather than hadoop auth cookie
- Create new filter for new cookie instead of a new handler
- Refactor existing AuthenticationFilter into a delegating filter rather than returning 403
- add check to see if the user is established already
- continue the filter chain
- add new filter that checks for established user and in its absence looks for new cookie
- does the redirect
- verifies the signature based on PKI public key of the configured knoxsso endpoint
- add new filter that terminates the chain, returning 403 if the user has not been established at the end
- The new knoxsso cookie provides better security due to PKI, rather than a shared secret across the cluster that can/must be acquired by each server
- Can always fall back to the hadoop auth cookies under pressure and revisit it later
- Add groups to the knoxsso token
- We may be able to get away with using existing hadoop auth cookie with a strengthened signer based on PKI
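The refactoring proposed above can be modeled as a chain in which each filter either establishes a user or delegates onward, with a terminating filter that returns 403 only when no user was established. A toy model with hypothetical names; the real implementation would use the servlet Filter API:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class ChainSketch {
    // Minimal request context standing in for HttpServletRequest/Response.
    public static class Ctx { public String user; public int status = 200; }

    public interface Filter { void doFilter(Ctx ctx, Runnable next); }

    // Run the filters in order; each filter decides whether to call next.
    public static void run(List<Filter> filters, Ctx ctx) {
        Deque<Filter> rest = new ArrayDeque<>(filters);
        Runnable[] next = new Runnable[1];
        next[0] = () -> {
            Filter f = rest.poll();
            if (f != null) f.doFilter(ctx, next[0]);
        };
        next[0].run();
    }

    public static void main(String[] args) {
        // 1. Refactored AuthenticationFilter: no hadoop auth cookie, so delegate
        //    down the chain instead of returning 403.
        Filter hadoopAuth = (ctx, next) -> next.run();
        // 2. New SSO filter: pretend the knoxsso cookie verified and establishes
        //    the user (signature checking against the PKI public key omitted).
        Filter ssoCookie = (ctx, next) -> { ctx.user = "guest"; next.run(); };
        // 3. Terminating filter: 403 only if nothing established a user.
        Filter terminator = (ctx, next) -> { if (ctx.user == null) ctx.status = 403; };

        Ctx ctx = new Ctx();
        run(List.of(hadoopAuth, ssoCookie, terminator), ctx);
        System.out.println(ctx.user + " " + ctx.status); // guest 200
    }
}
```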
References
- Configuration for authentication on the component UIs: http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/HttpAuthentication.html
- Forgerock openam documentation for CDSSO and protection against cookie hijacking: http://docs.forgerock.org/en/openam/10.1.0/admin-guide/index/chap-cdsso.html
- core-default for hadoop config defaults - filter initializers: https://hadoop.apache.org/docs/r2.0.5-alpha/hadoop-project-dist/hadoop-common/core-default.xml
- nimbus-jose-jwt library - Apache 2 License: http://connect2id.com/products/nimbus-jose-jwt
OBSOLETE
The previous POC effort was proving the above flow and trying to minimize the work required on the UI.
The following notes reflect the INITIAL (AND OBSOLETE) POC state for the above flow:
- On first request to a given UI with the redirecting authentication handler, the hadoop auth filter sees that there is no hadoop auth cookie and delegates to the configured handler. The redirecting authentication handler looks for a simple cookie that represents a knoxsso token (this may be changed to a JWT bearer token or cookie). In the absence of this cookie, the handler redirects the browser to the configured endpoint for knoxsso and passes the original UI url as a request parameter "originalUrl". Example: http://localhost:8888/knoxsso?originalUrl=http://localhost:8888/app/
- The knoxsso endpoint has a number of filters. The first captures the original url parameter and creates a cookie called original-url to be used later to redirect the user to the UI once authentication has been successfully accomplished. Example: original-url http://localhost:8888/app/
- The next filter is the JBoss picketlink SPFilter for SAML service providers. It redirects the user to the IdP (Shibboleth running in Jetty on a CentOS VM) to challenge for credentials; the user is currently authenticated against the Knox demo ApacheDS LDAP server, but this could be any LDAP server or AD. Once the user is successfully authenticated, the IdP redirects the user back to the knoxsso endpoint. The capture filter ignores the incoming POST, since it doesn't have the originalUrl parameter, and allows processing to return to the picketlink filter, where the assertion is accepted and the userid is extracted and made available to the servlet programming model through HttpServletRequest.getUserPrincipal.
- The next filter redirects back to the UI with a token that can be consumed by the UI authentication handler. This redirecting filter extracts the userid from getUserPrincipal and creates a cookie that simply has the username as the value. Example: hadoop-auth guest. It then extracts the original-url from the cookie that was added by the capture filter and redirects the user, with the token cookie, to the original URL. Example: original-url http://localhost:8888/app/
- The hadoop auth filter on the UI endpoint accepts the requests but still finds no hadoop auth cookie and delegates once again to the redirect authentication handler. The auth handler finds the expected cookie/token and extracts the userid, creates a hadoop authentication token and returns it to the filter. The filter creates a hadoop auth cookie for this token and uses this for authentication until it expires and is no longer presented by the browser and we start back at #1.