Status

Current state: Work In Progress

Discussion thread: here

JIRA

Motivation

With the movement of ETL and analysis workloads to the cloud, there are a number of ways in which Apache Knox can aid in federating on-prem activity, identities, and authentication events to the cloud based operating environment. The tasks of submitting, monitoring, and acting on the results of ETL and data engineering jobs can be accomplished from the desktop rather than a gateway machine with the use of Knox and its capabilities as a cluster trusted proxy, its client libraries and DSL, and KnoxToken sessions. This KIP needs to articulate end-to-end scenarios that can be targeted for cloud usecases.

UC-1: On-prem -> Cloud Knox Federation

This usecase allows for topology-based federation from one Knox instance to another.

It will allow clients to interact with a single on-prem Knox instance (or cluster of instances) but have the interaction federated transparently to a corresponding cloud based instance.

By deploying a topology that is specifically designed to federate to another Knox instance, the clients will be unaware of the fact that the calls are dispatched with a normalized security context to instances that are hosted elsewhere. Obviously, there will be some performance impact related to the extra, remote network hop, but the actual destination and details of the interaction will not need to be known by the on-prem Knox clients.

The key improvement needed for this usecase is the ability to override the dispatch mechanism for each service configured within a topology such that a single normalized dispatch is used across all services exposed by the topology. This single dispatch would probably just need to implement an outgoing version of our Header based Preauth SSO Provider with Client Cert over TLS.

The cloud Knox instance/s will need a topology that is configured with the Header based Preauth Provider and will need the public key of the on-prem instance in its truststore.
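
As a rough illustration of the wire-level behavior of such a normalized dispatch, the sketch below re-dispatches a request over mutual TLS and asserts the already-authenticated principal via preauth headers. This is not Knox dispatch code; the keystore path, passwords, hostnames, and the SM_USER/SM_GROUPS header names are assumptions (the header names follow the HeaderPreAuth provider defaults).

import java.io.FileInputStream;
import java.net.URL;
import java.security.KeyStore;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;

public class FederatedDispatchSketch {
  public static void main(String[] args) throws Exception {
    // Mutual TLS: present the on-prem gateway's identity certificate to the cloud instance.
    KeyStore keyStore = KeyStore.getInstance("JKS");
    try (FileInputStream in = new FileInputStream("/path/to/gateway-identity.jks")) { // hypothetical path
      keyStore.load(in, "keystorePassword".toCharArray());
    }
    KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
    kmf.init(keyStore, "keyPassword".toCharArray());
    SSLContext sslContext = SSLContext.getInstance("TLS");
    sslContext.init(kmf.getKeyManagers(), null, null);

    // Re-dispatch the already-authenticated request to the cloud Knox instance.
    URL url = new URL("https://cloud-knox.example.com:8443/gateway/cloud/webhdfs/v1/tmp?op=LISTSTATUS");
    HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
    conn.setSSLSocketFactory(sslContext.getSocketFactory());

    // Assert the normalized security context via preauth headers for the
    // HeaderPreAuth provider configured in the cloud topology.
    conn.setRequestProperty("SM_USER", "guest");
    conn.setRequestProperty("SM_GROUPS", "analyst");

    System.out.println("HTTP " + conn.getResponseCode());
  }
}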

This will enable things like:

  • WebHDFS calls to an on-prem Knox instance are actually redispatched to the cloud instance/s, resulting in files being written to or read from HDFS in the cloud.
  • Spark jobs submitted to Livy through on-prem instances actually redispatch and are submitted as cloud workloads
  • MapReduce jobs submitted to YARN RM through Knox will be submitted as workloads to the cloud

UC-2: Identity Broker API

This usecase allows for the exchange of one type of credential for another in a way that provides the caller with credentials to gain access to cloud specific services that are unaware of the typical Hadoop security context, delegation tokens and even kerberos identities. This will require a number of enhancements to Knox but is perfectly in line with the capabilities of KnoxToken usecases that allow for token/credential exchange for a KnoxToken.

We will need to allow callers to provide a set of credentials (username/password, SPNEGO authn, delegation tokens, etc.) and receive credentials for access to cloud provider specific services.

Key improvements for this usecase include an API that can be called to acquire cloud service credentials/tokens for the caller itself and also on behalf of another user. It will require Knox to ensure that an impersonation attempt succeeds only for trusted proxy users that are explicitly allowed to impersonate the users in question. It will require a pluggable backend so that the credentials may be cloud provider specific, or so that Knox can delegate to a 3rd party identity broker. This will also require some form of persistence for the mappings between Hadoop users/groups and permissions in the cloud environment.

NOTE: particular attention needs to be paid to locking down the entities allowed to request credentials while still providing access to cloud storage to all appropriate workloads for a given cloud deployment.
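
A minimal sketch of what the pluggable backend SPI could look like follows; all of the names here are hypothetical and exist only to illustrate the shape of the contract.

import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the pluggable credential backend SPI described above;
// none of these names exist in Knox today.
public interface CloudCredentialsClientProvider {

  // Provider name referenced from topology configuration, e.g. "AWS" or "Azure".
  String getName();

  // Acquire short-lived cloud credentials for the effective user. When doAsUser
  // differs from the authenticated caller, Knox must first verify that the caller
  // is a trusted proxy explicitly allowed to impersonate doAsUser.
  Map<String, String> getCredentials(String authenticatedUser, String doAsUser, Set<String> groups);
}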

This will enable things like:

  • Workload jobs to acquire credentials for access to S3 buckets in AWS
  • Workload jobs to acquire credentials for access to ADLS in Azure
  • In combination with UC-1, this would enable an on-prem ETL job submitted to a local Knox instance to ultimately access data in cloud storage rather than, or in addition to, HDFS

The following example requests credentials from the AWS STS service with the policy specified in the policy variable.

The trick will be to determine the policy based on the user and their group memberships, using some configured mapping of the same.

import java.util.List;

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicSessionCredentials;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.Bucket;
import com.amazonaws.services.securitytoken.AWSSecurityTokenService;
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder;
import com.amazonaws.services.securitytoken.model.Credentials;
import com.amazonaws.services.securitytoken.model.GetFederationTokenRequest;
import com.amazonaws.services.securitytoken.model.GetFederationTokenResult;

public class Sts {
  public static void main(String[] args) {
    AWSSecurityTokenService sts_client =
        AWSSecurityTokenServiceClientBuilder.standard().withRegion(Regions.US_EAST_1).build();

    // Inline session policy limiting the federated credentials to read-only S3 access.
    String policy = "{\n" +
        "  \"Version\": \"2012-10-17\",\n" +
        "  \"Statement\": [\n" +
        "    {\n" +
        "      \"Effect\": \"Allow\",\n" +
        "      \"Action\": [\n" +
        "        \"s3:Get*\",\n" +
        "        \"s3:List*\"\n" +
        // "        \"s3:Delete*\"\n" +
        "      ],\n" +
        "      \"Resource\": \"*\"\n" +
        "    }\n" +
        "  ]\n" +
        "}";

    // Exchange the Knox-authenticated identity for temporary federated credentials.
    GetFederationTokenRequest request =
        new GetFederationTokenRequest("larry.mccay@gmail.com").withPolicy(policy);
    GetFederationTokenResult result = sts_client.getFederationToken(request);
    System.out.println(result.getCredentials());

    // Use the temporary credentials to access S3 on behalf of the federated user.
    Credentials session_creds = result.getCredentials();
    BasicSessionCredentials sessionCredentials = new BasicSessionCredentials(
        session_creds.getAccessKeyId(),
        session_creds.getSecretAccessKey(),
        session_creds.getSessionToken());
    AmazonS3 s3 = AmazonS3ClientBuilder.standard().withRegion(Regions.US_EAST_1)
        .withCredentials(new AWSStaticCredentialsProvider(sessionCredentials)).build();

    List<Bucket> buckets = s3.listBuckets();
    for (Bucket bucket : buckets) {
      System.out.println(bucket.getName());
      // s3.deleteBucket(bucket.getName());
    }
  }
}

Policy in Topology

<?xml version="1.0" encoding="UTF-8"?>
<topology>
  <name>sandbox</name>
  <gateway>
    <provider>
      <role>authentication</role>
      <name>ShiroProvider</name>
      <enabled>true</enabled>
      <param>
        <name>sessionTimeout</name>
        <value>30</value>
      </param>
      <param>
        <name>main.ldapRealm</name>
        <value>org.apache.knox.gateway.shirorealm.KnoxLdapRealm</value>
      </param>
      <param>
        <name>main.ldapContextFactory</name>
        <value>org.apache.knox.gateway.shirorealm.KnoxLdapContextFactory</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory</name>
        <value>$ldapContextFactory</value>
      </param>
      <param>
        <name>main.ldapRealm.userDnTemplate</name>
        <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory.url</name>
        <value>ldap://localhost:33389</value>
      </param>
      <param>
        <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
        <value>simple</value>
      </param>
      <param>
        <name>urls./**</name>
        <value>authcBasic</value>
      </param>
    </provider>
    <provider>
      <role>identity-assertion</role>
      <name>Default</name>
      <enabled>true</enabled>
      <!-- new for extracting identity from request URL -->
      <param>
        <name>path.segment.impersonated.principal</name>
        <value>IDBROKER;credentials</value>
      </param>
      <param>
        <name>group.principal.mapping</name>
        <value>admin=admin</value>
      </param>
    </provider>
  </gateway>
  <service>
    <role>IDBROKER</role>
    <!-- pluggable policy config providers: default|ranger -->
    <param>
      <name>cloud.policy.config.provider</name>
      <value>default</value>
    </param>
    <!-- pluggable cloud credentials client providers: default|ranger -->
    <param>
      <name>cloud.client.provider</name>
      <value>AWS</value>
    </param>
    <!-- mapping a particular user to a set of s3 Actions -->
    <param>
      <name>s3.user.policy.action.guest</name>
      <value>s3:Get*,s3:List*</value>
    </param>
    <!-- mapping a particular user to a set of s3 resources -->
    <param>
      <name>s3.user.policy.resource.guest</name>
      <value>*</value>
    </param>
    <!-- mapping a user's group to a set of s3 Actions.
         Since a user may be a member of multiple groups, these
         mappings must be combined into a single policy as separate
         Statements within the policy JSON. They must also be combined
         with any user-specific mappings as separate Statements. -->
    <param>
      <name>s3.group.policy.action.admin</name>
      <value>*</value>
    </param>
    <!-- mapping a user's group to a set of s3 resources.
         Since a user may be a member of multiple groups, these
         mappings must be combined into a single policy as separate
         Statements within the policy JSON. They must also be combined
         with any user-specific mappings as separate Statements. -->
    <param>
      <name>s3.group.policy.resource.admin</name>
      <value>*</value>
    </param>

    <!-- mapping an authenticated user to an IAM role.
         This results in the role's attached policy being pulled and
         used in the getFederationToken call. It also supersedes
         other group and user mappings. Only one IAM Role mapping is
         supported per user. Since user mappings are more specific than
         group mappings, the user mapping supersedes the group-to-role
         mapping as well. -->
    <param>
      <name>s3.user.role.mapping.lmccay</name>
      <value>awsAdminRole</value>
    </param>
    <!-- mapping an authenticated user's group to an IAM role.
         This results in the role's attached policy being pulled and
         used in the getFederationToken call. It also supersedes
         other group and user mappings. Only one IAM Role mapping is
         supported per LDAP group. -->
    <param>
      <name>s3.group.role.mapping.admin</name>
      <value>awsAdminRole</value>
    </param>
  </service>
</topology>
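
To illustrate how the user and group mappings above could be combined into a single session policy with separate Statements (the "trick" mentioned earlier), here is a minimal sketch. The resolved action/resource values are hypothetical examples of what the user and group params might yield; the resulting JSON would be handed to GetFederationTokenRequest.withPolicy() as in the STS example above.

import java.util.ArrayList;
import java.util.List;

public class PolicyBuilderSketch {

  // Simple holder for one Statement derived from a user or group mapping.
  static class Stmt {
    final String[] actions;
    final String resource;
    Stmt(String[] actions, String resource) { this.actions = actions; this.resource = resource; }
  }

  public static void main(String[] args) {
    // Hypothetical values resolved from the topology params above for user "guest":
    //   s3.user.policy.action.guest / s3.user.policy.resource.guest
    // plus one Statement per group the user belongs to:
    //   s3.group.policy.action.<group> / s3.group.policy.resource.<group>
    List<Stmt> statements = new ArrayList<>();
    statements.add(new Stmt(new String[] { "s3:Get*", "s3:List*" }, "*")); // user mapping
    statements.add(new Stmt(new String[] { "*" }, "*"));                   // group mapping (e.g. admin)

    StringBuilder policy = new StringBuilder("{\n  \"Version\": \"2012-10-17\",\n  \"Statement\": [\n");
    for (int s = 0; s < statements.size(); s++) {
      Stmt stmt = statements.get(s);
      policy.append("    {\n      \"Effect\": \"Allow\",\n      \"Action\": [");
      for (int i = 0; i < stmt.actions.length; i++) {
        policy.append(i > 0 ? ", " : "").append('"').append(stmt.actions[i]).append('"');
      }
      policy.append("],\n      \"Resource\": \"").append(stmt.resource).append("\"\n    }");
      policy.append(s < statements.size() - 1 ? ",\n" : "\n");
    }
    policy.append("  ]\n}");

    // The combined policy, with one Statement per user/group mapping, would be
    // passed to the getFederationToken call shown earlier.
    System.out.println(policy);
  }
}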

UC-3: Azure AD Integration

This IdP has come up on the community email lists already and will obviously be valuable for Azure based deployments.

The latest pac4j provider should have some support for this, but we have run into difficulties, either in the pac4j library itself or in the provider adapter in Knox, that cause the exchange to fail.

It actually seems that there is a limitation to the Azure AD OAuth support in that it doesn't support query params in the callback URL.

We can try to work around this in pac4j or consider a separate AAD Provider based on https://github.com/AzureAD/azure-activedirectory-library-for-java.

This may actually be needed as part of the pluggable backend for Azure in UC-2 above as well.
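
For reference, acquiring a token directly with the azure-activedirectory-library-for-java (ADAL4J) looks roughly like the sketch below. The tenant, client id/secret, and resource values are placeholders, and this is only meant to show the shape of the library calls that a dedicated AAD provider or the UC-2 Azure backend could build on.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import com.microsoft.aad.adal4j.AuthenticationContext;
import com.microsoft.aad.adal4j.AuthenticationResult;
import com.microsoft.aad.adal4j.ClientCredential;

public class AadTokenSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical tenant/app values; the authority is the AAD tenant endpoint.
    String authority = "https://login.microsoftonline.com/<tenant-id>/";
    String clientId = "<app-client-id>";
    String clientSecret = "<app-client-secret>";
    String resource = "https://graph.microsoft.com/";

    ExecutorService service = Executors.newSingleThreadExecutor();
    try {
      AuthenticationContext context = new AuthenticationContext(authority, true, service);
      Future<AuthenticationResult> future =
          context.acquireToken(resource, new ClientCredential(clientId, clientSecret), null);
      AuthenticationResult result = future.get();
      System.out.println("access token acquired, expires on: " + result.getExpiresOnDate());
    } finally {
      service.shutdown();
    }
  }
}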

TODO: PROVIDE ENABLED USECASES LIKE ABOVE

UC-4: On-prem Cipher Proxy

As a variation of UC-1 above, this usecase would allow for data to be moved to the cloud but never have it leave the on-prem deployment in clear text.

Before federating the call to the remote Knox instance, the new dispatch would have the ability to encrypt the data on the way out of the enterprise.

This would require integration with on-prem Hadoop and/or Ranger KMS systems via the Hadoop Key Provider API. It would also require a new provider type, applied before dispatch, that encrypts the outgoing payload. It is possible that this can be implemented as a rewrite filter rather than a new provider type. It would, however, require us to be able to inject this rewrite filter into a service definition for a topology that leverages the cipher filter. We would also need a corresponding decrypt rewrite filter for incoming payloads.
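
A minimal sketch of the encrypt side, assuming the key is resolved from Ranger/Hadoop KMS via the Hadoop Key Provider API, is shown below. The KMS URI and key name are hypothetical, and the real implementation would live in the dispatch/rewrite filter rather than a main method.

import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.List;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.crypto.key.KeyProvider;
import org.apache.hadoop.crypto.key.KeyProviderFactory;

public class CipherDispatchSketch {
  public static void main(String[] args) throws Exception {
    // Resolve the enterprise key from Ranger/Hadoop KMS via the Key Provider API.
    Configuration conf = new Configuration();
    conf.set("hadoop.security.key.provider.path", "kms://http@kms-host:9292/kms"); // hypothetical KMS URI
    List<KeyProvider> providers = KeyProviderFactory.getProviders(conf);
    KeyProvider.KeyVersion keyVersion = providers.get(0).getCurrentKey("knox-cipher-key"); // hypothetical key name

    // Encrypt the outgoing payload before it leaves the enterprise (AES/GCM).
    byte[] payload = "sensitive on-prem data".getBytes(StandardCharsets.UTF_8);
    byte[] iv = new byte[12];
    new SecureRandom().nextBytes(iv);
    Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
    cipher.init(Cipher.ENCRYPT_MODE,
        new SecretKeySpec(keyVersion.getMaterial(), "AES"),
        new GCMParameterSpec(128, iv));
    byte[] encrypted = cipher.doFinal(payload);

    // The encrypted bytes (plus the IV) are what the dispatch/rewrite filter would
    // actually send to the cloud Knox instance; decryption is the mirror image on the way back.
    System.out.println("encrypted " + payload.length + " bytes into " + encrypted.length + " bytes");
  }
}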

This would allow for things like:

  • Move data to the cloud for storage without it ever being in clear text outside of the enterprise
  • Move it back on-prem for use and allow it to be stored in clear text, in HDFS TDE encryption zones, or with local volume encryption

UC-5: S3 Integration

While the Knox WebHDFS integration may still work for many cloud deployments, it does seem like a gap that there is no way to move files in and out of S3 or other cloud storage mechanisms through Knox.

We can actually combine this with UC-2 above to acquire temporary credentials on behalf of the authenticated users. We would request the IAM role and permissions that are appropriate for the user and their group memberships in order to access buckets protected with IAM roles. We could also combine it with UC-4 above to have encrypted files put into S3 that can only be decrypted on-prem.

It would require Knox to be granted permission in a given cloud deployment to make STS calls, and may require the AWS credentials for the Knox user to be tied to an IAM role. We may also be able to use assumeRole to acquire the role needed for STS access.

It will also require a Jersey service hosted in Knox to put files into S3 (Knox3?), or we can create a pluggable backend and make it a more generic object store API.
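
A hypothetical sketch of the shape such a Jersey resource might take is below; the path layout and the use of the default credentials chain are assumptions, and the real service would build the S3 client from the temporary credentials brokered via UC-2.

import java.io.InputStream;
import java.net.URI;
import javax.ws.rs.Consumes;
import javax.ws.rs.PUT;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;

// Hypothetical shape of a Jersey resource hosted in Knox for putting objects into S3.
@Path("/s3")
public class S3ProxyResourceSketch {

  @PUT
  @Path("/{bucket}/{key: .+}")
  @Consumes(MediaType.APPLICATION_OCTET_STREAM)
  public Response putObject(@PathParam("bucket") String bucket,
                            @PathParam("key") String key,
                            InputStream body) {
    // In the real service the client would be built from the temporary credentials
    // brokered for the authenticated user via UC-2; the default chain is used here for brevity.
    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
    ObjectMetadata metadata = new ObjectMetadata();
    s3.putObject(bucket, key, body, metadata);
    return Response.created(URI.create("/s3/" + bucket + "/" + key)).build();
  }
}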

 

Improvements

Testing

Compatibility, Deprecation, and Migration Plan

Rejected Alternatives
