
...

Code Block
title: Proposed YAML
# Discovery info source
discovery-type : AMBARI
discovery-address : http://c6401.ambari.apache.org:8080

# Provider config reference, the contents of which will be
# included in (or referenced from) the resulting topology descriptor.
# The referenced content has a <gateway/> root and contains
# <provider/> configurations.
provider-config-ref : ambari-cluster-policy.xml

# The cluster for which the service details should be discovered
cluster: mycluster

# The services to declare in the resulting topology descriptor,
# whose URLs will be discovered (unless a value is specified)
services:
    - NAMENODE
    - JOBTRACKER
    - WEBHDFS
    - WEBHCAT
    - OOZIE
    - WEBHBASE
    - HIVE
    - RESOURCEMANAGER
    - AMBARI : http://c6401.ambari.apache.org:8080
    - AMBARIUI : http://c6401.ambari.apache.org:8080

# UIs to be proxied through the resulting Knox topology (see KIP-9)
#uis:
#    - AMBARIUI : http://c6401.ambari.apache.org:8080

 

3. Topology Generation

Given that we will have a Service Discovery service that can integrate with Ambari as well as other sources of the needed metadata, we should be able to start with a simplified topology descriptor.
Once the deployment machinery notices this descriptor, it can pull in the referenced provider configuration, iterate over the declared services, UIs, and applications, and look up the details for each.
With the provider configuration and service details in hand, we can then generate a fully baked topology; a sketch of this generation flow follows the example below.

...

    <ui>
        <role>AMBARIUI</role>
        <url>http://c6401.ambari.apache.org:8080</url>
    </ui>
</topology>
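
To make the generation flow concrete, here is a minimal sketch of how the deployment machinery might combine a descriptor with discovered URLs. The class and method names are illustrative assumptions for this proposal, not existing Knox code; the declared services are passed as a role-to-URL map in which a null value means "discover the URL".

Code Block
language: java
title: Topology generation sketch (illustrative only)
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Illustrative sketch only; none of these names are existing Knox classes.
public class TopologyGenerator {

    /**
     * @param providerConfigRef path to the referenced provider configuration
     * @param declaredServices  role -> explicit URL from the descriptor (null to discover)
     * @param discoveredUrls    role -> URL obtained from service discovery
     */
    public String generate(Path providerConfigRef,
                           Map<String, String> declaredServices,
                           Map<String, String> discoveredUrls) throws Exception {

        StringBuilder topology = new StringBuilder("<topology>\n");

        // 1. Inline the referenced provider configuration (its <gateway/> content).
        topology.append(Files.readString(providerConfigRef)).append('\n');

        // 2. Emit a <service/> entry per declared service, preferring an explicit
        //    URL from the descriptor and falling back to the discovered one.
        for (Map.Entry<String, String> service : declaredServices.entrySet()) {
            String role = service.getKey();
            String url = (service.getValue() != null) ? service.getValue()
                                                       : discoveredUrls.get(role);
            topology.append("    <service>\n")
                    .append("        <role>").append(role).append("</role>\n")
                    .append("        <url>").append(url).append("</url>\n")
                    .append("    </service>\n");
        }

        return topology.append("</topology>\n").toString();
    }
}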

3.1 Simple Descriptor Discovery

We should also consider how we will discover simple descriptors; I think we may want multiple mechanisms.

Just as we currently do for topology files, we can monitor a directory for new or changed descriptors, and trigger topology generation and deployment upon such events.
This works well for development and small cluster deployments, but for production and larger deployments we need to better accommodate multiple Knox instances.
My proposal for such cases is a ZooKeeper-based discovery mechanism: all Knox instances would pick up changes from ZooKeeper, acting as the central source of truth, and
perform the necessary generation and deployment of the corresponding topology.
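
As a rough sketch of the ZooKeeper option, each Knox instance could watch a well-known znode for descriptor changes. The znode path, the connection string, and the use of Apache Curator here are illustrative assumptions, not a settled design.

Code Block
language: java
title: ZooKeeper descriptor watcher sketch (illustrative only)
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.PathChildrenCache;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class DescriptorWatcher {

    public static void main(String[] args) throws Exception {
        // Placeholder znode path; the real layout is yet to be decided.
        final String descriptorsPath = "/knox/descriptors";

        CuratorFramework zk = CuratorFrameworkFactory.newClient(
                "c6401.ambari.apache.org:2181", new ExponentialBackoffRetry(1000, 3));
        zk.start();

        // Cache the children of the descriptors znode and react to changes.
        PathChildrenCache cache = new PathChildrenCache(zk, descriptorsPath, true);
        cache.getListenable().addListener((client, event) -> {
            switch (event.getType()) {
                case CHILD_ADDED:
                case CHILD_UPDATED:
                    // Regenerate and redeploy the topology for this descriptor.
                    System.out.println("Descriptor changed: " + event.getData().getPath());
                    break;
                case CHILD_REMOVED:
                    // Undeploy the corresponding topology.
                    System.out.println("Descriptor removed: " + event.getData().getPath());
                    break;
                default:
                    break;
            }
        });
        cache.start();

        // Keep the watcher alive for this sketch.
        Thread.currentThread().join();
    }
}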

4. Service Discovery

In order to transform the Simplified Topology Descriptor described in #2 into a full topology, we need some source for the relevant details and metadata. Knox is often deployed in clusters
managed by Ambari, with ZooKeeper also present, and both contain some level of the needed metadata about the service URLs within the cluster.

The ZooKeeper-based service registry has promise, but we need to investigate the level of use from a registered-services perspective and the details available for each service.
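
As part of that investigation, something like the small utility below could be used to survey what each deployment actually registers. The connection string is a placeholder, and the dump starts from the root because the registry's znode layout is exactly what needs to be determined.

Code Block
language: java
title: ZooKeeper registry survey sketch (illustrative only)
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class RegistryDump {

    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // Placeholder connection string for the sketch.
        ZooKeeper zk = new ZooKeeper("c6401.ambari.apache.org:2181", 15000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();
        dump(zk, "/", 0);
        zk.close();
    }

    // Recursively print each znode and the size of any data it carries.
    private static void dump(ZooKeeper zk, String path, int depth) throws Exception {
        byte[] data = zk.getData(path, false, null);
        System.out.printf("%s%s (%d bytes)%n",
                "  ".repeat(depth), path, (data == null) ? 0 : data.length);
        for (String child : zk.getChildren(path, false)) {
            dump(zk, path.endsWith("/") ? path + child : path + "/" + child, depth + 1);
        }
    }
}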

...

Code Block
title: /api/v1/clusters/CLUSTER_NAME/components
"items" : [
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/HCAT",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "HCAT",
      "service_name" : "HIVE"
    }
  },
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/HIVE_SERVER",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "HIVE_SERVER",
      "service_name" : "HIVE"
    }
  },
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/NAMENODE",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "NAMENODE",
      "service_name" : "HDFS"
    }
  },
  {
    "href" : "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME/components/OOZIE_SERVER",
    "ServiceComponentInfo" : {
      "cluster_name" : "CLUSTER_NAME",
      "component_name" : "OOZIE_SERVER",
      "service_name" : "OOZIE"
    }
  }
]
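
As a rough illustration of consuming this response, the sketch below fetches the component inventory and builds a component-to-service map. The HTTP client and Jackson usage, the credentials, and the class name are illustrative choices; a real implementation would also need host and configuration lookups (not shown) to assemble the actual service URLs.

Code Block
language: java
title: Ambari component inventory sketch (illustrative only)
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class AmbariComponentLookup {

    public static void main(String[] args) throws Exception {
        String ambari = "http://c6401.ambari.apache.org:8080";
        String cluster = "CLUSTER_NAME";

        // Placeholder credentials for the sketch.
        String auth = "Basic " + Base64.getEncoder().encodeToString("admin:admin".getBytes());

        // Fetch the component inventory shown above.
        HttpRequest request = HttpRequest.newBuilder(
                URI.create(ambari + "/api/v1/clusters/" + cluster + "/components"))
                .header("Authorization", auth)
                .GET()
                .build();
        String body = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();

        // Build component -> owning service from the ServiceComponentInfo entries.
        Map<String, String> componentToService = new HashMap<>();
        for (JsonNode item : new ObjectMapper().readTree(body).path("items")) {
            JsonNode info = item.path("ServiceComponentInfo");
            componentToService.put(info.path("component_name").asText(),
                                   info.path("service_name").asText());
        }

        componentToService.forEach((component, service) ->
                System.out.println(component + " -> " + service));
    }
}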

 

...

4.2 ZooKeeper Service Registry

  •  TODO: provide some sense of completeness, APIs and examples

4.3 Topology Change Discovery

Since the service URLs for a cluster are being discovered, Knox has the opportunity to respond dynamically to topology changes. For a Knox topology that has been generated and deployed, it's possible that the URL for a given service could subsequently change.

The host name could change. The scheme and/or port could change (e.g., http --> https). The potential and frequency of such changes certainly vary among deployments.

We should consider providing the option for Knox to detect topology changes for a cluster, and respond by updating its corresponding topology.

For example, Ambari provides the ability to request the active configuration versions for all the service components in a cluster. There could be a thread that checks this set, notices one or more version changes, and re-generates/deploys the topology.
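
As a sketch of such a checker, the loop below polls Ambari's current service configuration versions and flags any change. The endpoint, field names, credentials, and polling interval are assumptions that would need to be verified against the Ambari version in use, and the actual regeneration/redeployment hook is left as a comment.

Code Block
language: java
title: Cluster configuration change monitor sketch (illustrative only)
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class ClusterChangeMonitor {

    // Assumed endpoint for the current configuration versions of each service.
    private static final String VERSIONS_URL =
            "http://c6401.ambari.apache.org:8080/api/v1/clusters/CLUSTER_NAME"
            + "/configurations/service_config_versions?is_current=true";

    private final Map<String, Long> lastSeen = new HashMap<>();
    private final HttpClient http = HttpClient.newHttpClient();
    private final ObjectMapper json = new ObjectMapper();

    public void start() {
        // Poll once a minute; the interval is an arbitrary choice for the sketch.
        Executors.newSingleThreadScheduledExecutor()
                 .scheduleAtFixedRate(this::checkForChanges, 0, 60, TimeUnit.SECONDS);
    }

    private void checkForChanges() {
        try {
            HttpRequest request = HttpRequest.newBuilder(URI.create(VERSIONS_URL))
                    .header("Authorization", "Basic " + Base64.getEncoder()
                            .encodeToString("admin:admin".getBytes()))
                    .GET()
                    .build();
            JsonNode root = json.readTree(
                    http.send(request, HttpResponse.BodyHandlers.ofString()).body());

            boolean changed = false;
            for (JsonNode item : root.path("items")) {
                String service = item.path("service_name").asText();
                long version = item.path("service_config_version").asLong();
                Long previous = lastSeen.put(service, version);
                changed |= (previous != null && previous != version);
            }
            if (changed) {
                // Re-generate and redeploy the corresponding topology here.
                System.out.println("Cluster configuration changed; topology regeneration needed");
            }
        } catch (Exception e) {
            // Keep the previous snapshot and try again on the next tick.
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new ClusterChangeMonitor().start();
    }
}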

5. Provider Configurations

...