...

Motivation

There are a number of usability issues related to the current need to edit possibly large XML topology files within the Apache Knox Admin UI and within

...

We will need to make sure that we continue to meet the needs that this existing mechanism provides.

Anchor
simpledesc
simpledesc
2. Simplified Topology Descriptor

Addressing item #3 from the Motivation section, and in order to simplify what management UIs need to manage, we should support a much simpler descriptor that contains only what is needed to generate an actual topology file at deployment time (see item #1 from the Motivation section).

At a minimum, a simplified topology descriptor must allow the following details to be specified:

  • Service discovery type
    An identifier indicating which type of discovery to apply (e.g., Ambari, etc...)
  • Service discovery address
    The associated service registry address
  • Credentials for interacting with the discovery source
  • A provider configuration reference (a unique name, filename, etc...)
    A unique name mapped to a set of provider configurations (see item #3 from the Motivation section)
  • A list of services to be exposed through Knox (with optional service parameters and URL values)
  • A list of UIs to be proxied by Knox (per KIP-9)

...

YAML offers more structure than a properties file, but the representation is not important so long as the necessary contents can be expressed clearly, and it is suitable for administrators/developers hand-editing simple descriptors.

Proposed YAML:

# Discovery info source
discovery-type : AMBARI
discovery-address : http://sandbox.hortonworks.com:8080
discovery-user: maria_dev
discovery-pwd-alias: ambari.discovery.password

# Provider config reference, the contents of which will be
# included in (or referenced from) the resulting topology descriptor.
# The contents of this reference has a <gateway/> root, and
# contains <provider/> configurations.
provider-config-ref : sandbox-providers.xml

# The cluster for which the service details should be discovered
cluster: Sandbox

# The services to declare in the resulting topology descriptor,
# whose URLs will be discovered (unless a value is specified)
services:
    - name: NAMENODE
    - name: JOBTRACKER
    - name: WEBHDFS
    - name: WEBHCAT
    - name: OOZIE
    - name: WEBHBASE
    - name: HIVE
    - name: RESOURCEMANAGER
    - name: KNOXSSO
      params:
          knoxsso.cookie.secure.only: true
          knoxsso.token.ttl: 100000
    - name: AMBARI
      urls:
        - http://sandbox.hortonworks.com:8080
    - name: AMBARIUI
      urls:
        - http://sandbox.hortonworks.com:8080

# UIs to be proxied through the resulting Knox topology (see KIP-9)
#uis:
#   - name: AMBARIUI
#     url: http://sandbox.hortonworks.com:8080


While JSON is not really a format for configuration, it is certainly appropriate as a wire format, and will be used for API interactions.

Proposed JSON:
{
  "discovery-type":"AMBARI",
  "discovery-address":"http://sandbox.hortonworks.com:8080",
  "discovery-user":"maria_dev",
  "discovery-pwd-alias":"ambari.discovery.password",
  "provider-config-ref":"sandbox-providers.xml",
  "cluster":"Sandbox",
  "services":[
     {"name":"NAMENODE"},
     {"name":"JOBTRACKER"},
     {"name":"WEBHDFS"},
     {"name":"WEBHCAT"},
     {"name":"OOZIE"},
     {"name":"WEBHBASE"},
     {"name":"HIVE"},
     {"name":"RESOURCEMANAGER"},
     {"name":"KNOXSSO",
      "params":{
          "knoxsso.cookie.secure.only":"true",
          "knoxsso.token.ttl":"100000"
      }
     },
     {"name":"AMBARI", "urls":["http://sandbox.hortonworks.com:8080"]}
  ],
  "uis":[
     {"name":"AMBARIUI", "urls":["http://sandbox.hortonworks.com:8080"]}
  ]
} 
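
For illustration, the referenced provider configuration (sandbox-providers.xml in these examples) might look like the following sketch. Per the descriptor comments above, it has a <gateway/> root containing <provider/> configurations; the provider content here is abridged from the sample topology in the Topology Generation section.

<gateway>
    <provider>
        <role>authentication</role>
        <name>ShiroProvider</name>
        <enabled>true</enabled>
        <!-- ... ShiroProvider params as in the sample topology ... -->
    </provider>
    <provider>
        <role>identity-assertion</role>
        <name>Default</name>
        <enabled>true</enabled>
    </provider>
    <provider>
        <role>hostmap</role>
        <name>static</name>
        <enabled>true</enabled>
    </provider>
</gateway>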

 

3. Topology Generation

Given that we will have a Service Discovery service that can integrate with Ambari as well as other sources of needed metadata, we should be able to start with a simplified topology descriptor.
Once the deployment machinery notices this descriptor, it can pull in the referenced provider configuration, iterate over each of the services, UIs, and applications, and look up the details for each.
With the provider configuration and service details we can then generate a fully baked topology.

 

From the example descriptors in the Simplified Topology Descriptor section, the resulting topology should include the provider configuration from the reference, and each service and UI (per KIP-9) with its respective URLs:

Sample Topology File:

<?xml version="1.0" encoding="UTF-8"?>
<topology>
    <gateway>
        <provider>
            <role>authentication</role>
            <name>ShiroProvider</name>
            <enabled>true</enabled>
            <param>
                <!--
                session timeout in minutes, this is really idle timeout,
                defaults to 30mins, if the property value is not defined,
                current client authentication would expire if client idles continuously for more than this value
                -->
                <name>sessionTimeout</name>
                <value>30</value>
            </param>
            <param>
                <name>main.ldapRealm</name>
                <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
            </param>
            <param>
                <name>main.ldapContextFactory</name>
                <value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory</value>
            </param>
            <param>
                <name>main.ldapRealm.contextFactory</name>
                <value>$ldapContextFactory</value>
            </param>
            <param>
                <name>main.ldapRealm.userDnTemplate</name>
                <value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
            </param>
            <param>
                <name>main.ldapRealm.contextFactory.url</name>
                <value>ldap://localhost:33389</value>
            </param>
            <param>
                <name>main.ldapRealm.contextFactory.authenticationMechanism</name>
                <value>simple</value>
            </param>
            <param>
                <name>urls./**</name>
                <value>authcBasic</value>
            </param>
        </provider>
        <provider>
            <role>identity-assertion</role>
            <name>Default</name>
            <enabled>true</enabled>
        </provider>
        <!--
        Defines rules for mapping host names internal to a Hadoop cluster to externally accessible host names.
        For example, a hadoop service running in AWS may return a response that includes URLs containing
        some AWS internal host name. If the client needs to make a subsequent request to the host identified
        in those URLs, they need to be mapped to external host names that the client can use to connect.
        If the external hostname and internal host names are the same, turn off this provider by setting the
        value of the enabled parameter to false.
        The name parameter specifies the external host names in a comma separated list.
        The value parameter specifies corresponding internal host names in a comma separated list.
        Note that when you are using Sandbox, the external hostname needs to be localhost, as seen in the out
        of box sandbox.xml. This is because Sandbox uses port mapping to allow clients to connect to the
        Hadoop services using localhost. In real clusters, external host names would almost never be localhost.
        -->
        <provider>
            <role>hostmap</role>
            <name>static</name>
            <enabled>true</enabled>
            <param><name>localhost</name><value>sandbox,sandbox.hortonworks.com</value></param>
        </provider>
    </gateway>

    <service>
        <role>AMBARIUI</role>
        <url>http://c6401.ambari.apache.org:8080</url>
    </service>
    <service>
        <role>HIVE</role>
        <url>http://c6402.ambari.apache.org:10001/cliservice</url>
    </service>
    <service>
        <role>WEBHCAT</role>
        <url>http://c6402.ambari.apache.org:50111/templeton</url>
    </service>
    <service>
        <role>AMBARI</role>
        <url>http://c6401.ambari.apache.org:8080</url>
    </service>
    <service>
        <role>OOZIE</role>
        <url>http://c6402.ambari.apache.org:11000/oozie</url>
    </service>
    <service>
        <role>JOBTRACKER</role>
        <url>rpc://c6402.ambari.apache.org:8050</url>
    </service>
    <service>
        <role>NAMENODE</role>
        <url>hdfs://c6401.ambari.apache.org:8020</url>
    </service>
    <service>
        <role>WEBHBASE</role>
        <url>http://c6401.ambari.apache.org:60080</url>
    </service>
    <service>
        <role>WEBHDFS</role>
        <url>http://c6401.ambari.apache.org:50070/webhdfs</url>
    </service>
    <service>
        <role>RESOURCEMANAGER</role>
        <url>http://c6402.ambari.apache.org:8088/ws</url>
    </service>
    <service>
        <role>KNOXSSO</role>
        <param>
            <name>knoxsso.cookie.secure.only</name>
            <value>true</value>
        </param>
        <param>
            <name>knoxsso.token.ttl</name>
            <value>100000</value>
        </param>
    </service>
</topology>



3.1 Simple Descriptor Discovery

We should also consider how simple descriptors will be discovered; I think we may want to support multiple mechanisms.

3.1.1 Local

Just as is currently done for topology files, Knox can monitor a local directory for new or changed descriptors, and trigger topology generation and deployment upon such events.
This is great for development and small cluster deployments.

The Knox Topology Service will monitor two additional directories (see the monitoring sketch after the directory descriptions below):

  • conf/shared-providers
    • Referenced provider configurations will go in this directory; these configurations are the <gateway/> elements found in topology files.
    • When a file is modified (create/update) in this directory, any descriptors that reference it are updated to trigger topology regeneration to reflect any provider configuration changes.
    • Attempts to delete a file from this directory via the admin API will be prevented if it is referenced by any descriptors in conf/descriptors.

 

  • conf/descriptors
    • Simple descriptors will go in this directory.
    • When a file is modified (create/update) in this directory, a topology file is (re)generated in the conf/topologies directory.
    • When a file is deleted from this directory, the associated topology file in conf/topologies is also deleted, and that topology is undeployed.
    • When a file is deleted from the conf/topologies directory, the associated descriptor in conf/descriptors is also deleted (if it exists), to prevent unintentional regeneration/redeployment of the topology.
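
The following is a minimal sketch of how this local monitoring could work, using java.nio.file.WatchService. The directory and behaviors follow the descriptions above; the generateTopology/deleteTopology handlers are hypothetical stand-ins for the actual topology generation and undeployment machinery.

import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import static java.nio.file.StandardWatchEventKinds.*;

public class DescriptorMonitor {

    public static void main(String[] args) throws Exception {
        Path descriptors = Paths.get("conf/descriptors");
        WatchService watcher = FileSystems.getDefault().newWatchService();
        descriptors.register(watcher, ENTRY_CREATE, ENTRY_MODIFY, ENTRY_DELETE);

        while (true) {
            WatchKey key = watcher.take(); // block until file system events occur
            for (WatchEvent<?> event : key.pollEvents()) {
                if (event.kind() == OVERFLOW) {
                    continue; // events may have been lost; context is not a Path here
                }
                Path changed = descriptors.resolve((Path) event.context());
                if (event.kind() == ENTRY_DELETE) {
                    deleteTopology(changed);   // delete and undeploy the associated topology
                } else {
                    generateTopology(changed); // (re)generate the topology in conf/topologies
                }
            }
            key.reset(); // re-arm the key for subsequent events
        }
    }

    // Hypothetical stand-ins for the deployment machinery
    private static void generateTopology(Path descriptor) { }
    private static void deleteTopology(Path descriptor) { }
}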

 

3.1.2 Remote

For production and larger deployments, we need to better accommodate multiple Knox instances. One proposal for such cases is a ZooKeeper-based discovery mechanism.
All Knox instances will pick up the changes from ZK as the central source of truth, and perform the necessary generation and deployment of the corresponding topology.

The location of these descriptors and their dependencies (e.g., referenced provider config) in ZK must be defined.

It would also be helpful to provide a means (e.g., Ambari, Knox admin UI, CLI, etc...) by which these descriptors can be easily published to the correct location in a znode.
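
As a sketch of the remote mechanism, assuming Apache Curator for the ZooKeeper interaction and a hypothetical /knox/config/descriptors znode (as noted above, the actual znode layout is still to be defined):

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.PathChildrenCache;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class RemoteDescriptorMonitor {

    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk-host:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // Watch the (hypothetical) descriptor znode; cacheData=true pulls descriptor contents
        PathChildrenCache cache = new PathChildrenCache(client, "/knox/config/descriptors", true);
        cache.getListenable().addListener((c, event) -> {
            switch (event.getType()) {
                case CHILD_ADDED:
                case CHILD_UPDATED:
                    // (Re)generate and deploy the topology from event.getData().getData()
                    break;
                case CHILD_REMOVED:
                    // Undeploy the associated topology
                    break;
                default:
                    break;
            }
        });
        cache.start();
    }
}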

4. Service Discovery

Item #2 from the Motivation section addresses the usability issues associated with having to manually edit a topology descriptor to specify the various service URLs in a cluster.
Beyond the obvious tedium associated with this task, there is real potential for human error, especially when multiple Knox instances are involved. The proposed solution to this
is automated service discovery, by which the URLs for the services to be exposed are discovered and added to a topology.

In order to transform a Simplified Topology Descriptor into a full topology, we need some source for the relevant details and metadata. Knox is often deployed in clusters
with Ambari and ZooKeeper also deployed, and both contain some level of the needed metadata about the service URLs within the cluster.

Ambari provides everything we need to flesh out a full topology from a minimal description of the resources we want exposed and how we want access to those resources protected.

The ZooKeeper service registry has promise, but we need to investigate the level of use from a registered services perspective and details available for each service.

 

4.1 Apache Ambari API

These are some excerpts of responses from the Ambari REST API which can be employed for topology discovery:

(CLUSTER_NAME is a placeholder for an actual cluster name)

Identifying Clusters - /api/v1/clusters
{
  "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/",
  "items" : [
    {
      "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME",
      "Clusters" : {
          "typecluster_name" : "yarn-siteCLUSTER_NAME",
          "propertiesversion" : {"HDP-2.6"
      }
	}
  ]
}
Service Component To Host Mapping - /api/v1/clusters/CLUSTER_NAME/services?fields=components/host_components/HostRoles
"items" : [
  {
    "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/services/HIVE",
    "components" : [
      {
        "ServiceComponentInfo" : { },
        "host_components" : [
          {
            "HostRoles" : {
              "cluster_name" : "CLUSTER_NAME",
              "component_name" : "HCAT",
              "host_name" : "c6402.ambari.apache.org"
            }
          }
        ]
      },
      {
        "ServiceComponentInfo" : { },
        "host_components" : [
          {
            "HostRoles" : {
              "cluster_name" : "CLUSTER_NAME",
              "component_name" : "HIVE_SERVER",
              "host_name" : "c6402.ambari.apache.org"
            }
          }
        ]
      }
    ]
  },
  {
    "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/services/HDFS",
    "ServiceInfo" : {},
    "components" : [
      {
        "ServiceComponentInfo" : { },
        "host_components" : [
          {
            "HostRoles" : {
              "cluster_name" : "CLUSTER_NAME",
              "component_name" : "NAMENODE",
              "host_name" : "c6401.ambari.apache.org"
            }
          }
        ]
      }
    ]
  }
]

 

Service Configuration Details (Active Versions) - /api/v1/clusters/ CLUSTER_NAME /configurations/service_config_versions?is_current=true
"items" : [
  {
      "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/configurations/service_config_versions?service_name=HDFS&service_config_version=2",
      "cluster_name" : "CLUSTER_NAME",
      "configurations" : [
        {
          "Config" : {
            "cluster_name" : "CLUSTER_NAME",
            "stack_id" : "HDP-2.6"
          },
          "type" : "ssl-server",
          "properties" : {
            "ssl.server.keystore.location" : "/etc/security/serverKeys/keystore.jks",
            "ssl.server.keystore.password" : "SECRET:ssl-server:1:ssl.server.keystore.password",
            "ssl.server.keystore.type" : "jks",
            "ssl.server.truststore.location" : "/etc/security/serverKeys/all.jks",
            "ssl.server.truststore.password" : "SECRET:ssl-server:1:ssl.server.truststore.password"
          },
          "properties_attributes" : { }
        },
        {
          "Config" : {
            "cluster_name" : "CLUSTER_NAME",
            "stack_id" : "HDP-2.6"
          },
          "type" : "hdfs-site",
          "tag" : "version1",
          "version" : 1,
          "properties" : {
            "dfs.cluster.administrators" : " hdfs",
            "dfs.encrypt.data.transfer.cipher.suites" : "AES/CTR/NoPadding",
            "dfs.hosts.exclude" : "/etc/hadoop/conf/dfs.exclude",
            "dfs.http.policy" : "HTTP_ONLY",
            "dfs.https.port" : "50470",
            "dfs.journalnode.http-address" : "0.0.0.0:8480",
            "dfs.journalnode.https-address" : "0.0.0.0:8481",
            "dfs.namenode.http-address" : "c6401.ambari.apache.org:50070",
            "dfs.namenode.https-address" : "c6401.ambari.apache.org:50470",
            "dfs.namenode.rpc-address" : "c6401.ambari.apache.org:8020",
            "dfs.namenode.secondary.http-address" : "c6402.ambari.apache.org:50090",
            "dfs.webhdfs.enabled" : "true"
          },
          "properties_attributes" : {
            "final" : {
              "dfs.webhdfs.enabled" : "true",
              "dfs.namenode.http-address" : "true",
              "dfs.support.append" : "true",
              "dfs.namenode.name.dir" : "true",
              "dfs.datanode.failed.volumes.tolerated" : "true",
              "dfs.datanode.data.dir" : "true"
            }
          }
        }
	  ]
	},
	{
      "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/configurations/service_config_versions?service_name=YARN&service_config_version=1",
      "cluster_name" : "CLUSTER_NAME",
      "configurations" : [
        {
          "Config" : {
            "cluster_name" : "CLUSTER_NAME",
            "stack_id" : "HDP-2.6"
          },
          "type" : "yarn-site",
          "properties" : {
            "yarn.http.policy" : "HTTP_ONLY",
            "yarn.log.server.url" : "http://c6402.ambari.apache.org:19888/jobhistory/logs",
            "yarn.log.server.web-service.url" : "http://c6402.ambari.apache.org:8188/ws/v1/applicationhistory",
            "yarn.nodemanager.address" : "0.0.0.0:45454",
            "yarn.resourcemanager.address" : "c6402.ambari.apache.org:8050",
            "yarn.resourcemanager.admin.address" : "c6402.ambari.apache.org:8141",
            "yarn.resourcemanager.ha.enabled" : "false",
            "yarn.resourcemanager.hostname" : "c6402.ambari.apache.org",
            "yarn.resourcemanager.webapp.address" : "c6402.ambari.apache.org:8088",
            "yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled" : "false",
            "yarn.resourcemanager.webapp.https.address" : "c6402.ambari.apache.org:8090",
		  },
          "properties_attributes" : { }
        },
	  ]
	},
]
Cluster Service Components - /api/v1/clusters/CLUSTER_NAME/components
"items" : [
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/HCAT",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "HCAT",
      "service_name" : "HIVE"
    }
  },
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/HIVE_SERVER",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "HIVE_SERVER",
      "service_name" : "HIVE"
    }
  },
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/NAMENODE",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "NAMENODE",
      "service_name" : "HDFS"
    }
  },
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/OOZIE_SERVER",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "OOZIE_SERVER",
      "service_name" : "OOZIE"
    }
  }
]
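
To illustrate how the discovery machinery might combine these responses, the following sketch derives a WEBHDFS URL from the hdfs-site properties shown above. The property names are taken from the excerpt; the helper itself is hypothetical.

import java.util.Map;

public class WebHdfsUrlBuilder {

    // Derive the WEBHDFS URL from hdfs-site, honoring dfs.http.policy
    static String webHdfsUrl(Map<String, String> hdfsSite) {
        boolean httpsOnly = "HTTPS_ONLY".equals(hdfsSite.get("dfs.http.policy"));
        String address = httpsOnly
                ? hdfsSite.get("dfs.namenode.https-address")   // e.g., c6401.ambari.apache.org:50470
                : hdfsSite.get("dfs.namenode.http-address");   // e.g., c6401.ambari.apache.org:50070
        return (httpsOnly ? "https://" : "http://") + address + "/webhdfs";
    }

    public static void main(String[] args) {
        Map<String, String> hdfsSite = Map.of(
                "dfs.http.policy", "HTTP_ONLY",
                "dfs.namenode.http-address", "c6401.ambari.apache.org:50070",
                "dfs.namenode.https-address", "c6401.ambari.apache.org:50470");
        System.out.println(webHdfsUrl(hdfsSite)); // http://c6401.ambari.apache.org:50070/webhdfs
    }
}

The derived value matches the WEBHDFS service URL in the sample topology above.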

 

4.2 ZooKeeper Service Registry

  •  TODO: provide some sense of completeness, APIs and examples

4.3 Topology Change Discovery

Since the service URLs for a cluster will be discovered, Knox has the opportunity to respond dynamically to subsequent topology changes. For a Knox topology that has been generated and deployed, it's possible that the URL for a given service could change at some point afterward.
The host name could change. The scheme and/or port could change (e.g., http --> https). The potential and frequency of such changes certainly varies among deployments.
We should consider providing the option for Knox to detect topology changes for a cluster, and respond by updating its corresponding topology.

For example, Ambari provides the ability to request the active configuration versions for all the service components in a cluster. There could be a thread that checks this set, notices one or more version changes, and initiates the re-generation/deployment of that topology.
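
A sketch of such a monitor, assuming a hypothetical helper that fetches the current service_config_versions (per service) from the Ambari API shown earlier:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ClusterConfigMonitor {

    // Last-seen service_config_version per service (e.g., "HDFS" -> 2)
    private final Map<String, Integer> lastSeen = new HashMap<>();

    public void start(String clusterName) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            // Hypothetical helper: GET .../configurations/service_config_versions?is_current=true
            Map<String, Integer> current = fetchCurrentConfigVersions(clusterName);
            if (!current.equals(lastSeen)) {
                lastSeen.clear();
                lastSeen.putAll(current);
                regenerateTopology(clusterName); // re-generate and redeploy the topology
            }
        }, 0, 60, TimeUnit.SECONDS);
    }

    // Hypothetical stand-ins for the Ambari client and deployment machinery
    private Map<String, Integer> fetchCurrentConfigVersions(String cluster) { return new HashMap<>(); }
    private void regenerateTopology(String cluster) { }
}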

Another associated benefit is the capability for Knox to interoperate with Ambari instances that are unaware of the Knox instance. Knox no longer MUST be managed by Ambari.

5. Provider Configurations

KIP-1 touched on an improvement for Centralized Gateway Configuration for LDAP/AD and the ability to import topology fragments into other topologies by referencing them by name within the topology.

While we haven't actually implemented anything along these lines yet, we are touching on something very similar in this proposal. We need to be able to author a set of Provider Configurations that can be referenced by name in a simple descriptor and enumerated for inclusion in a dropdown for UIs that need to select a Provider Configuration, and we need a set of APIs added to the Admin API for managing these configurations with CRUD operations.
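
For example, publishing a provider configuration via the Admin API might look like the following; the providerconfig resource path is purely illustrative, since the actual Admin API resources are yet to be defined:

    curl -iku admin:admin-password -X PUT \
         -H "Content-Type: application/xml" -d @sandbox-providers.xml \
         'https://localhost:8443/gateway/admin/api/v1/providerconfig/sandbox-providers'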

Whether we import the common Provider Configuration at runtime or pull its contents into a generated topology is certainly a design discussion. I think that importing at runtime would be best, since we wouldn't have to regenerate all of the topologies that include a Provider Configuration when it changes. There would, however, need to be some way to reimport the Provider Config at runtime if something changes.

6. Discovery Service Authentication

The discovery service must authenticate to at least some of the discovery sources (e.g., Ambari). We need to define the means by which credentials are configured for this service in Knox.

6.1 Ambari

HTTP Basic authentication is supported by Ambari, so the Ambari service discovery implementation can leverage the Knox Alias service to obtain the necessary credentials at discovery time.

The username can be specified in a descriptor, using the discovery-user property. The default user name can also be mapped to the alias ambari.discovery.user.

Similarly, the password alias can be specified in a descriptor, using the discovery-pwd-alias property. By default, the Ambari service discovery will look up the ambari.discovery.password alias to get the password.

So, there are two ways to specify the username for Ambari interaction:

  1. Provision the alias mapping using the knoxcli.sh script

    bin/knoxcli.sh create-alias ambari.discovery.user --value ambariuser

  2. Specify the discovery-user property in a descriptor (This can be useful if a Knox instance will proxy services in clusters managed by multiple Ambari instances)

    "discovery-user":"ambariuser"

 

And, two ways to specify the associated password:

  1. Provision the password mapped to the default alias, ambari.discovery.password

    bin/knoxcli.sh create-alias ambari.discovery.password --value ambaripasswd

  2. Provision a different alias, and specify it in the descriptor (This can be useful if a Knox instance will proxy services in clusters managed by multiple Ambari instances)

    "discovery-pwd-alias":"my.ambari.discovery.password.alias"

 

Related Links

 

...