This document is a work in progress.

Introduction

Before Ambari 2.0.0, configuring an Ambari cluster to use Kerberos involved setting up the Kerberos client infrastructure on each host, creating the required identities, generating and distributing the needed keytab files, and updating the necessary configuration properties. On a small cluster this may not seem like much effort; however, as the size of the cluster increases, so does the amount of work involved.

This is where Ambari’s Automated Kerberization facility can help. It performs all of these steps and also helps to maintain the cluster as new services and hosts are added.

The Automated Kerberization can be invoked using Ambari’s REST API as well as the Enable Kerberos Wizard in the Ambari UI.

How it works

Stacks and services that can utilize Kerberos credentials for authentication must have a Kerberos Descriptor declaring the required Kerberos identities and how to update configurations. The Ambari infrastructure uses this data, and any updates applied by an administrator, to perform Kerberos-related operations such as initially enabling Kerberos, enabling Kerberos for added hosts and components, regenerating credentials, and disabling Kerberos.

Note that Ambari is required to be installed on a registered host. Also, the Kerberos service must be installed on all hosts of the cluster before any automated tasks can be performed. If using the Ambari UI, this should happen as part of the relevant wizard workflow.

Enabling Kerberos

When enabling Kerberos, all of the services in the cluster are expected to be stopped. The main reason for this is to avoid state issues as the services are stopped and then started when the cluster is transitioning to be Kerberized. The following steps are taken to enable Kerberos on the cluster en masse:

  1. Create or update accounts in the configured KDC (or Active Directory)
  2. Generate keytab files and distribute them to the appropriate hosts
  3. Update relevant configurations

Adding Components

If Automated Kerberization is enabled for the Ambari cluster, new components are automatically configured for Kerberos whenever they are added, and any necessary principals and keytabs are generated and distributed. For each new component, the following steps occur before the component is installed and started:

  1. Update relevant configurations
  2. Create or update accounts in the configured KDC (or Active Directory)
  3. Generate keytab files and distribute them to the appropriate hosts

Adding Hosts

When adding a new host, the Kerberos client must be installed on it. This does not happen automatically; however, the Add Host Wizard in the Ambari UI will perform this step if Automated Kerberization is enabled for the Ambari cluster. Once a host is added, generally one or more components are installed on it (see Adding Components).

Regenerating Keytabs

Once a cluster has Automated Kerberization enabled, it may be necessary to regenerate keytabs. There are two options related to regenerating keytabs: all or missing.

In any case, the affected services should be restarted after the following regeneration process is complete:

  1. Create missing or update existing accounts in the configured KDC (or Active Directory)
  2. Generate keytab files and distribute them to the appropriate hosts

Disabling Kerberos

In the event Kerberos needs to be removed from the Ambari cluster, Ambari will remove the managed Kerberos identities, keytab files, and configuration. The Ambari UI will perform the steps of stopping and starting the services as well as removing the Kerberos service; when the UI is not used, these steps must be performed manually.

The Kerberos Descriptor

The Kerberos Descriptor is a JSON-formatted text file containing information needed by Ambari to enable or disable Kerberos for a stack and its services. This file must be named kerberos.json and should be in the root directory of the relevant stack or service. Kerberos Descriptors are meant to be hierarchical such that details in the stack-level descriptor can be overwritten (or updated) by details in the service-level descriptors.

For the services in a stack to be Kerberized, there must be a stack-level Kerberos Descriptor. This ensures that even if a common service has a Kerberos Descriptor, it will not be Kerberized unless the relevant stack indicates that it supports Kerberos by having a stack-level Kerberos Descriptor.

For a component of a service to be Kerberized, there must be an entry for it in its containing service's service-level descriptor. This allows some of a service's components to be managed by the automated Kerberos facility while other components of that service are ignored.

Kerberos Descriptors are inherited from the base stack or service, but may be overridden as a full descriptor - partial descriptors are not allowed.

A complete descriptor (which is built using the stack-level descriptor, the service-level descriptors, and any updates from user input) has the following structure:

  • Stack-level Properties
  • Stack-level Identities
  • Stack-level Configurations
  • Stack-level Auth-to-local-properties
  • Services
    • Service-level Identities
    • Service-level Configurations
    • Service-level Auth-to-local-properties
    • Components
      • Component-level Identities
      • Component-level Configurations
      • Component-level Auth-to-local-properties

Each level of the descriptor inherits the data from its parent. This data, however, may be overridden if necessary. For example, a component will inherit the configurations and identities of its container service, which in turn inherits the configurations and identities from the stack.
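
As a rough illustration of this inheritance, the following Python sketch merges configurations blocks from the stack, service, and component levels so that lower levels override higher ones. The function name merge_configurations and the sample property values are hypothetical, and this is not meant to reflect Ambari's actual implementation.

```python
def merge_configurations(*levels):
    """Merge configurations blocks ordered from stack level down to component level.

    Each level is a list of {config_type: {property: value}} blocks, shaped like
    the configurations block of a Kerberos Descriptor. Later (lower) levels
    override values declared by earlier (higher) levels.
    """
    merged = {}
    for level in levels:
        for block in level:
            for config_type, properties in block.items():
                merged.setdefault(config_type, {}).update(properties)
    return merged


# Hypothetical sample data for the three levels.
stack = [{"core-site": {"hadoop.security.authentication": "kerberos"}}]
service = [{"service-site": {"service1.security.authentication": "kerberos"}}]
component = [{"service-site": {"comp1.security.type": "kerberos",
                               "service1.security.authentication": "simple"}}]

effective = merge_configurations(stack, service, component)
```

In this sketch the component-level value for service1.security.authentication wins, while the stack-level core-site property is inherited unchanged.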

Components of a Kerberos Descriptor

Stack-level Properties

Stack-level properties is an optional set of name/value pairs that can be used in variable replacements. For example, if a property named "property1" exists with the value of "value1", then any instance of "${property1}" within a configuration property name or configuration property value will be replaced with "value1".

This property is only relevant in the stack-level Kerberos Descriptor and may not be overridden by lower-level descriptors.

Stack-level Identities

Stack-level identities is an optional identities block containing a list of zero or more identity descriptors that are common among all services in the stack. An example of such an identity is the Ambari smoke test user, which is used by all services to perform service check operations. Service- and component-level identities may reference (and specialize) stack-level identities using the identity’s name with a forward slash (/) preceding it. For example, if there was a stack-level identity with the name "smokeuser", then a service or a component may create an identity block that references and specializes it by setting its name to "/smokeuser" and overriding its properties as necessary. This does not override the stack-level identity; it essentially creates a copy of it and updates the copy's properties.

Stack-level Auth-to-local-properties

Stack-level auth-to-local-properties is an optional list of zero or more configuration property specifications (config-type/property_name[|concatenation_scheme]) indicating which properties contain auth-to-local rule sets and how to concatenate the rules to meet the property specifications. These sets are dynamically updated using the details from the identities used when Kerberizing the cluster and concatenated as indicated.  The concatenation scheme value is optional.

If specified, the concatenation scheme must be one of the following:

  • new_lines - rules in the rule set are separated by a new line (\n)
  • new_lines_escaped - rules in the rule set are separated by a \ and a new line (\ \n)
  • spaces - rules in the rule set are separated by a whitespace character (effectively placing all rules in a single line)

If not specified, the default concatenation scheme is new_lines.

Examples

Service site property using the default concatenation scheme

core-site/hadoop.security.auth_to_local

Service site property explicitly using the escaped new lines concatenation scheme

startup.properties/http.authentication.kerberos.name.rules|new_lines_escaped
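
To make the specification format and the three schemes concrete, here is a minimal Python sketch that parses a config-type/property_name[|concatenation_scheme] string and joins a rule list accordingly. The helper names parse_spec and concatenate_rules are hypothetical and the sample rules are illustrative only; this is not Ambari's own code.

```python
# Separator used by each concatenation scheme described above.
SEPARATORS = {
    "new_lines": "\n",
    "new_lines_escaped": "\\\n",  # a backslash followed by a new line
    "spaces": " ",
}


def parse_spec(spec):
    """Split 'config-type/property_name[|scheme]' into its three parts."""
    target, _, scheme = spec.partition("|")
    config_type, _, property_name = target.partition("/")
    scheme = scheme or "new_lines"  # default scheme when none is given
    if scheme not in SEPARATORS:
        raise ValueError("unknown concatenation scheme: " + scheme)
    return config_type, property_name, scheme


def concatenate_rules(rules, scheme="new_lines"):
    """Join the rules of a rule set using the given scheme."""
    return SEPARATORS[scheme].join(rules)


# Illustrative rule set (the rule contents are sample data).
rules = ["RULE:[1:$1@$0](ambari-qa@EXAMPLE.COM)s/.*/ambari-qa/", "DEFAULT"]
spec = "startup.properties/http.authentication.kerberos.name.rules|new_lines_escaped"
config_type, property_name, scheme = parse_spec(spec)
value = concatenate_rules(rules, scheme)
```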

Stack-level Configurations

Stack-level configurations is an optional configurations block containing a list of zero or more configuration descriptors that are common among all services in the stack. Configuration descriptors are overridable due to the structure of the data. However, overriding configuration properties may create undesired behavior since it is not known until after the Kerberization process is complete what value a property will have.

Services

Services is a list of zero or more service descriptors. A stack-level Kerberos Descriptor should not list any services; however a service-level Kerberos Descriptor should contain at least one.

Service-level Identities

Service-level identities is an optional identities block containing a list of zero or more identity descriptors that are common among all components of the service. Component-level identities may reference (and specialize) service-level identities using a relative path to the identity (the identity’s name with two dots and a forward slash (../) preceding it) or an absolute path to it (/service_name/identity_name). For example, if there was a service-level identity with the name "service_identity", then a child component may create an identity block that references and specializes it by setting its name to "../service_identity" or "/service_name/service_identity" and overriding any values as necessary. This does not override the service-level identity; it essentially creates a copy of it and updates the copy's properties.

Note: By using the absolute path to an identity, any service-level identity may be referenced by any other service or component.

Service-level Configurations

Service-level configurations is an optional configurations block listing zero or more configuration descriptors that are common among all components within a service. Configuration descriptors are overridable due to the structure of the data. However, overriding configuration properties may create undesired behavior since it is not known until after the Kerberization process is complete what value a property will have.

Service-level Auth-to-local-properties

Service-level auth-to-local-properties is an optional list of zero or more configuration property specifications (config-type/property_name[|concatenation_scheme]) indicating which properties contain auth-to-local rule sets and how to concatenate the rules to meet the property specifications. These sets are dynamically updated using the details from the identities used when Kerberizing the cluster and concatenated as indicated. The concatenation scheme value is optional.

If specified, the concatenation scheme must be one of the following:

  • new_lines - rules in the rule set are separated by a new line (\n)
  • new_lines_escaped - rules in the rule set are separated by a \ and a new line (\ \n)
  • spaces - rules in the rule set are separated by a whitespace character (effectively placing all rules in a single line)

If not specified, the default concatenation scheme is new_lines.

Examples

Service site property using the default concatenation scheme

core-site/hadoop.security.auth_to_local

Service site property explicitly using the escaped new lines concatenation scheme

startup.properties/http.authentication.kerberos.name.rules|new_lines_escaped

Components

Components is a list of zero or more component descriptor blocks. 

Component-level Identities

Component-level identities is an optional identities block containing a list of zero or more identity descriptors that are specific to the component. A component-level identity may be referenced (and specialized) by using the absolute path to it (/service_name/component_name/identity_name). This does not override the component-level identity; it essentially creates a copy of it and updates the copy's properties.

Component-level Configurations

Component-level configurations is an optional configurations block listing zero or more configuration descriptors that are specific to the component.

Component-level Auth-to-local-properties

Component-level auth-to-local-properties is an optional list of zero or more configuration property specifications (config-type/property_name[|concatenation_scheme]) indicating which properties contain auth-to-local rule sets and how to concatenate the rules to meet the property specifications. These sets are dynamically updated using the details from the identities used when Kerberizing the cluster and concatenated as indicated.  The concatenation scheme value is optional.

If specified, the concatenation scheme must be one of the following:

  • new_lines - rules in the rule set are separated by a new line (\n)
  • new_lines_escaped - rules in the rule set are separated by a \ and a new line (\ \n)
  • spaces - rules in the rule set are separated by a whitespace character (effectively placing all rules in a single line)

If not specified, the default concatenation scheme is new_lines.

Examples

Service site property using the default concatenation scheme

core-site/hadoop.security.auth_to_local

Service site property explicitly using the escaped new lines concatenation scheme

startup.properties/http.authentication.kerberos.name.rules|new_lines_escaped

Descriptor Specifications

properties

The properties descriptor is only valid in the stack-level Kerberos Descriptor file. This block is a set of name/value pairs as follows:

"properties" : {
  "property_1" : "value_1",
  "property_2" : "value_2",
  ...
}

auth-to-local-properties

The auth-to-local-properties descriptor is valid in the stack-, service-, and component-level descriptors. This block is a list of configuration specifications (config-type/property_name[|concatenation_scheme]) indicating which properties contain auth-to-local rules that should be dynamically updated based on the identities used within the Kerberized cluster. Each specification may optionally declare the concatenation scheme to use when appending the rules into a rule set value.

"auth_to_local_properties" : [
  "core-site/hadoop.security.auth_to_local",
  "service.properties/http.authentication.kerberos.name.rules|new_lines_escaped",
  ...
]

configurations

A configurations descriptor may exist in stack-, service-, and component-level descriptors. This block is a list of one or more configuration descriptors, such that each descriptor is a block containing a single structure named using the configuration type and containing values for each relevant property.

Each property name and value may be a concrete value or contain variables to be replaced using values from the properties descriptor or any available configuration. Properties from the properties descriptor are referenced by name (${property_name}) and configuration properties are referenced by configuration type followed by a forward slash (/) and then the property name (${config-type/property_name}).

"configurations" : [
  {
    "config-type-1" : {
      "${cluster-env/smokeuser}_property" : "value1",
      "some_realm_property" : "${realm}",
      ...
    }
  },
  {
    "config-type-2" : {
      "property-2" : "${cluster-env/smokeuser}",
      ...
    }
  },
  ...
]

If "cluster-env/smokeuser" was "ambari-qa" and realm was "EXAMPLE.COM", the above block would effectively be translated to:

"configurations" : [
  {
    "config-type-1" : {
      "ambari-qa_property" : "value1",
      "some_realm_property" : "EXAMPLE.COM",
      ...
    }
  },
  {
    "config-type-2" : {
      "property-2" : "ambari-qa",
      ...
    }
  },
  ...
]
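
The translation above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Ambari's implementation: the names replace_variables and resolve_block, the max_depth guard, and the choice to leave unknown variables untouched are all hypothetical. Replacement is repeated until the text stops changing, so a property such as realm may itself refer to another variable.

```python
import re

# Matches ${name} and ${config-type/property_name}.
VARIABLE = re.compile(r"\$\{([^}]+)\}")


def replace_variables(text, properties, configurations, max_depth=10):
    """Replace variables in text, repeating so nested references resolve."""
    def lookup(match):
        name = match.group(1)
        if "/" in name:
            config_type, _, prop = name.partition("/")
            value = configurations.get(config_type, {}).get(prop)
        else:
            value = properties.get(name)
        # Unknown variables are left as-is (an assumption of this sketch).
        return match.group(0) if value is None else value

    for _ in range(max_depth):
        replaced = VARIABLE.sub(lookup, text)
        if replaced == text:
            break
        text = replaced
    return text


def resolve_block(block, properties, configurations):
    """Apply replacement to the property names and values of one block."""
    return {
        config_type: {
            replace_variables(name, properties, configurations):
                replace_variables(value, properties, configurations)
            for name, value in props.items()
        }
        for config_type, props in block.items()
    }


properties = {"realm": "${kerberos-env/realm}"}
configurations = {
    "cluster-env": {"smokeuser": "ambari-qa"},
    "kerberos-env": {"realm": "EXAMPLE.COM"},
}
block = {
    "config-type-1": {
        "${cluster-env/smokeuser}_property": "value1",
        "some_realm_property": "${realm}",
    }
}
resolved = resolve_block(block, properties, configurations)
```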

identities

An identities descriptor may exist in stack-, service-, and component-level descriptors. This block is a list of zero or more identity descriptors. Each identity descriptor is a block containing a name, an optional principal descriptor, and an optional keytab descriptor.

The name property of an identity descriptor may be a concrete name or a reference to some other identity in the composite Kerberos Descriptor. If the name represents a reference (starts with a / or a ../) the reference will be “copied” to the local scope and any values in the local principal or keytab descriptor will be used to override the base values.

"identities" : [
  {
    "name" : "local_identity",
    "principal" : {
      ...
    },
    "keytab" : {
      ...
    }
  },
  {
    "name" : "/smokeuser",
    "principal" : {
      ...
    },
    "keytab" : {
      ...
    }
  },
  ...
]
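
The "copy and override" behavior of a reference can be illustrated with a small Python sketch. It handles only stack-level references of the form /identity_name; the relative (../) and longer absolute forms are omitted for brevity. The function names, the decision to keep the reference's own name on the copy, and the sample data are assumptions made for illustration.

```python
import copy


def deep_merge(base, overrides):
    """Return a copy of base with values from overrides applied recursively."""
    result = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def resolve_identity(identity, stack_identities):
    """Resolve an identity descriptor against stack-level identities.

    stack_identities maps an identity name to its descriptor. A name that
    starts with "/" is treated as a reference: the base identity is copied
    and the local principal and keytab values override the copy.
    """
    name = identity["name"]
    if not name.startswith("/"):
        return copy.deepcopy(identity)
    base = stack_identities[name[1:]]
    # The copy keeps the reference's own name (an arbitrary choice here).
    return deep_merge(base, identity)


stack_identities = {
    "spnego": {
        "name": "spnego",
        "principal": {"value": "HTTP/_HOST@${realm}", "type": "service"},
        "keytab": {"file": "${keytab_dir}/spnego.service.keytab"},
    }
}
reference = {
    "name": "/spnego",
    "principal": {"configuration": "service1-site/service1.web.principal"},
}
resolved_identity = resolve_identity(reference, stack_identities)
```

Because the base identity is deep-copied before merging, the stack-level identity itself is left unmodified, matching the statement that a reference does not override the identity it refers to.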

principal

The principal descriptor is an optional block inside an identity descriptor. It declares the details about the identity’s principal including the principal’s value, the type (user or service), the relevant configuration property, and a local username mapping. All properties are optional; however if no base or default value is available for all properties, the principal may be ignored.

The value property of the principal is expected to be the normalized principal name, including the principal’s components and realm. In most cases, the realm should be specified using the realm variable (${realm} or ${kerberos-env/realm}). Also, in the case of a service principal, "_HOST" should be used to represent the relevant hostname; however, the built-in hostname variable (${hostname}) may be used if "_HOST" replacement on the agent-side is not available for the service. For example: smokeuser@${realm}, service/_HOST@${realm}.

The type property of the principal may be either “user” or “service”. If not specified, the type is assumed to be “user”. This value dictates how the identity is to be created in the KDC or Active Directory. It is especially important in the Active Directory case due to how accounts are created.

The configuration property is an optional configuration specification (config-type/property_name) that is to be set to this principal's value property after its variables have been replaced.

The local_username property, if supplied, indicates which local user account to use when generating auth-to-local rules for this identity. If not specified, no explicit rule will be generated.

"principal" : {
  "value": "${cluster-env/smokeuser}@${realm}",
  "type" : "user",
  "configuration": "cluster-env/smokeuser_principal_name",
  "local_username" : "${cluster-env/smokeuser}"
}

"principal" : {
  "value": "component1/_HOST@${realm}",
  "type" : "service",
  "configuration": "service-site/component1.principal"
}
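
To show how the value and type properties relate to concrete principal names, the following Python sketch expands a principal descriptor for a set of hosts. It assumes, purely for illustration, that a service principal containing "_HOST" yields one principal per host and that a missing type means "user"; the function name principals_for and the host names are hypothetical, and the actual "_HOST" replacement in a cluster may happen elsewhere (for example, on the agent side).

```python
def principals_for(principal, hostnames, realm):
    """Expand one principal descriptor into concrete principal names."""
    value = principal["value"].replace("${realm}", realm)
    principal_type = principal.get("type", "user")  # default type is "user"
    if principal_type == "service" and "_HOST" in value:
        # One principal per host (an assumption of this sketch).
        return [value.replace("_HOST", hostname) for hostname in hostnames]
    return [value]


hosts = ["host1.example.com", "host2.example.com"]
service_principal = {"value": "component1/_HOST@${realm}", "type": "service"}
user_principal = {"value": "ambari-qa@${realm}"}

service_names = principals_for(service_principal, hosts, "EXAMPLE.COM")
user_names = principals_for(user_principal, hosts, "EXAMPLE.COM")
```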

keytab

The keytab descriptor is an optional block inside an identity descriptor. It describes how to create and store the relevant keytab file.  This block declares the keytab file's path in the local filesystem, the permissions to assign to that file, and the relevant configuration property.

The file property declares an absolute path to use to store the keytab file when distributing to relevant hosts. If this is not supplied, the keytab file will not be created.

The owner property is an optional block indicating the local user account to assign as the owner of the file and what access (“rw” indicates read/write access; “r” indicates read-only access) should be granted to that user. By default, the owner will be given read-only access.

The group property is an optional block indicating which local group to assign as the group owner of the file and what access (“rw” indicates read/write access; “r” indicates read-only access; “” indicates no access) should be granted to local user accounts in that group. By default, the group will be given no access.

The configuration property is an optional configuration specification (config-type/property_name) that is to be set to the path of this keytab file, after any variables have been replaced.

"keytab" : {
  "file": "${keytab_dir}/smokeuser.headless.keytab",
  "owner": {
    "name": "${cluster-env/smokeuser}",
    "access": "r"
  },
  "group": {
    "name": "${cluster-env/user_group}",
    "access": "r"
  },
  "configuration": "cluster-env/smokeuser_keytab"
}
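
As an illustration of the owner and group access values, this Python sketch maps a keytab descriptor to a POSIX-style permission mode. The mapping to mode bits and the function name keytab_mode are assumptions made for this example; the defaults follow the text above (read-only for the owner, no access for the group).

```python
# Permission bits for each access value described above.
ACCESS_BITS = {"": 0, "r": 4, "rw": 6}


def keytab_mode(keytab):
    """Return a file mode derived from a keytab descriptor.

    Missing owner access defaults to read-only and missing group access
    defaults to no access. Other users never get access in this sketch.
    """
    owner_access = keytab.get("owner", {}).get("access", "r")
    group_access = keytab.get("group", {}).get("access", "")
    return (ACCESS_BITS[owner_access] << 6) | (ACCESS_BITS[group_access] << 3)


smokeuser_keytab = {
    "file": "${keytab_dir}/smokeuser.headless.keytab",
    "owner": {"name": "${cluster-env/smokeuser}", "access": "r"},
    "group": {"name": "${cluster-env/user_group}", "access": "r"},
}
mode = keytab_mode(smokeuser_keytab)
```

For the smokeuser keytab shown above, the owner and group both get read-only access and all other users get none.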

services

A service descriptor may exist in the stack-level and the service-level Kerberos Descriptor file. This block is a list of zero or more service descriptors to add to the composite Kerberos Descriptor. Each service descriptor is a block containing a service name, an optional identities block, an optional auth_to_local_properties block, an optional configurations block, and an optional components block.

"services": [
  {
    "name": "SERVICE",
    "identities": [
      ...
    ],
    "auth_to_local_properties" : [
      ...
    ],
    "configurations": [
      ...
    ],
    "components": [
      ...
    ]
  },
  ...
]

components

A component descriptor may exist in the service-level Kerberos Descriptor file. This block is a list of zero or more component descriptors belonging to the containing service descriptor. Each component descriptor is a block containing a component name, an optional identities block, an optional auth_to_local_properties block, and an optional configurations block.

"components": [
  {
    "name": "COMPONENT_NAME",
    "identities": [
      ...
    ],
    "auth_to_local_properties" : [
      ...
    ],
    "configurations": [
      ...
    ]
  },
  ...
]
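
Given the nesting of services and components, the absolute paths by which identities can be referenced (/identity_name, /service_name/identity_name, and /service_name/component_name/identity_name) can be listed with a short traversal. The following Python sketch is illustrative only; the function name identity_paths and the sample descriptor are hypothetical, and identities that are themselves references are skipped.

```python
def is_reference(name):
    """True if an identity name refers to another identity."""
    return name.startswith("/") or name.startswith("../")


def identity_paths(descriptor):
    """List absolute paths of identities declared in a composite descriptor."""
    paths = []

    def collect(identities, prefix):
        for identity in identities:
            if not is_reference(identity["name"]):
                paths.append(prefix + "/" + identity["name"])

    collect(descriptor.get("identities", []), "")
    for service in descriptor.get("services", []):
        service_prefix = "/" + service["name"]
        collect(service.get("identities", []), service_prefix)
        for component in service.get("components", []):
            component_prefix = service_prefix + "/" + component["name"]
            collect(component.get("identities", []), component_prefix)
    return paths


sample = {
    "identities": [{"name": "spnego"}, {"name": "smokeuser"}],
    "services": [{
        "name": "SERVICE_1",
        "identities": [{"name": "service1_identity"}, {"name": "/spnego"}],
        "components": [{
            "name": "COMPONENT_1",
            "identities": [{"name": "component1_service_identity"}],
        }],
    }],
}
paths = identity_paths(sample)
```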

Examples

Example Stack-level Kerberos Descriptor

The following example is annotated for descriptive purposes. The annotations are not valid in a real JSON-formatted file.

{
  // Properties that can be used in variable replacement operations. 
  // For example, ${keytab_dir} will resolve to "/etc/security/keytabs".
  // Since variable replacement is recursive, ${realm} will resolve 
  // to ${kerberos-env/realm}, which in-turn will resolve to the 
  // declared default realm for the cluster
  "properties": {
    "realm": "${kerberos-env/realm}",
    "keytab_dir": "/etc/security/keytabs"
  },
  // A list of global Kerberos identities. These may be referenced 
  // using /identity_name. For example the “spnego” identity may be 
  // referenced using “/spnego”
  "identities": [
    {
      "name": "spnego",
      // Details about this identity's principal. This instance does not
      // declare any value for configuration or local username. That is
      // left up to the services and components that wish to reference
      // this principal and set overrides for those values.
      "principal": {
        "value": "HTTP/_HOST@${realm}",
        "type" : "service"
      },

      // Details about this identity’s keytab file. This keytab file 
      // will be created in the configured keytab file directory with 
      // read-only access granted to root and users in the cluster’s 
      // default user group (typically, hadoop). To ensure that only 
      // a single copy exists on the file system, references to this 
      // identity should not override the keytab file details; 
      // however if it is desired that multiple keytab files are 
      // created, these values may be overridden in a reference 
      // within a service or component. Since no configuration
      // specification is set, the keytab file location will not
      // be set in any configuration file by default. Services and
      // components need to reference this identity to update this
      // value as needed.
      "keytab": {
        "file": "${keytab_dir}/spnego.service.keytab",
        "owner": {
          "name": "root",
          "access": "r"
        },
        "group": {
          "name": "${cluster-env/user_group}",
          "access": "r"
        }
      }
    },
    {
      "name": "smokeuser",
      // Details about this identity's principal. This instance declares
      // a configuration and local username mapping. Services and
      // components can override this to set additional configurations
      // that should be set to this principal value. Overriding the
      // local username may create undesired behavior since there may be
      // conflicting entries in relevant auth-to-local rule sets.
      "principal": {
        "value": "${cluster-env/smokeuser}@${realm}",
        "type" : "user",
        "configuration": "cluster-env/smokeuser_principal_name",
        "local_username" : "${cluster-env/smokeuser}"
      },
      // Details about this identity’s keytab file. This keytab file 
      // will be created in the configured keytab file directory with 
      // read-only access granted to the configured smoke user 
      // (typically ambari-qa) and users in the cluster’s default 
      // user group (typically hadoop). To ensure that only a single 
      // copy exists on the file system, references to this identity 
      // should not override the keytab file details; however if it 
      // is desired that multiple keytab files are created, these 
      // values may be overridden in a reference within a service or 
      // component.
      "keytab": {
        "file": "${keytab_dir}/smokeuser.headless.keytab",
        "owner": {
          "name": "${cluster-env/smokeuser}",
          "access": "r"
        },
        "group": {
          "name": "${cluster-env/user_group}",
          "access": "r"
        },
        "configuration": "cluster-env/smokeuser_keytab"
      }
    }
  ]
}

Example Service-level Kerberos Descriptor

The following example is annotated for descriptive purposes. The annotations are not valid in a real JSON-formatted file.

{
  // One or more services may be listed in a service-level Kerberos
  // Descriptor file
  "services": [
    {
      "name": "SERVICE_1",
      // Service-level identities to be created if this service is installed.
      // Any relevant keytab files will be distributed to hosts with at least
      // one of the components on it.
      "identities": [
        // Service-specific identity declaration, declaring all properties
        // needed to initiate the creation of the principal and keytab files,
        // as well as setting the service-specific configurations. This may
        // be referenced by contained components using ../service1_identity.
        {
          "name": "service1_identity",
          "principal": {
            "value": "service1/_HOST@${realm}",
            "type" : "service",
            "configuration": "service1-site/service1.principal"
          },
          "keytab": {
            "file": "${keytab_dir}/service1.service.keytab",
            "owner": {
              "name": "${service1-env/service_user}",
              "access": "r"
            },
            "group": {
              "name": "${cluster-env/user_group}",
              "access": "r"
            },
            "configuration": "service1-site/service1.keytab.file"
          }
        },
        // Service-level identity referencing the stack-level spnego
        // identity and overriding the principal and keytab configuration
        // specifications.
        {
          "name": "/spnego",
          "principal": {
            "configuration": "service1-site/service1.web.principal"
          },
          "keytab": {
            "configuration": "service1-site/service1.web.keytab.file"
          }
        },
        // Service-level identity referencing the stack-level smokeuser
        // identity. No properties are being overridden.
        {
          "name": "/smokeuser"
        }
      ],
      // Properties related to this service that require the auth-to-local
      // rules to be dynamically generated based on identities created for
      // the cluster.
      "auth_to_local_properties" : [
        "service1-site/security.auth_to_local"
      ],
      // Configuration properties to be set when this service is installed,
      // no matter which components are installed
      "configurations": [
        {
          "service-site": {
            "service1.security.authentication": "kerberos",
            "service1.security.auth_to_local": ""
          }
        }
      ],
      // A list of components related to this service
      "components": [
        {
          "name": "COMPONENT_1",
          // Component-specific identities to be created when this component
          // is installed.  Any keytab files specified will be distributed
          // only to the hosts where this component is installed.
          "identities": [
            // An identity "local" to this component
            {
              "name": "component1_service_identity",
              "principal": {
                "value": "component1/_HOST@${realm}",
                "type" : "service",
                "configuration": "service1-site/comp1.principal",
                "local_username" : "${service1-env/service_user}"
              },
              "keytab": {
                "file": "${keytab_dir}/s1c1.service.keytab",
                "owner": {
                  "name": "${service1-env/service_user}",
                  "access": "r"
                },
                "group": {
                  "name": "${cluster-env/user_group}",
                  "access": ""
                },
                "configuration": "service1-site/comp1.keytab.file"
              }
            },
            // The stack-level spnego identity overridden to set component-specific
            // configurations
            {
              "name": "/spnego",
              "principal": {
                "configuration": "service1-site/comp1.spnego.principal"
              },
              "keytab": {
                "configuration": "service1-site/comp1.spnego.keytab.file"
              }
            }
          ],
          // Component-specific configurations to set if this component is installed
          "configurations": [
            {
              "service-site": {
                "comp1.security.type": "kerberos"
              }
            }
          ]
        },
        {
          "name": "COMPONENT_2",
          "identities": [
            {
              "name": "component2_service_identity",
              "principal": {
                "value": "component2/_HOST@${realm}",
                "type" : "service",
                "configuration": "service1-site/comp2.principal",
                "local_username" : "${service1-env/service_user}"
              },
              "keytab": {
                "file": "${keytab_dir}/s1c2.service.keytab",
                "owner": {
                  "name": "${service1-env/service_user}",
                  "access": "r"
                },
                "group": {
                  "name": "${cluster-env/user_group}",
                  "access": ""
                },
                "configuration": "service1-site/comp2.keytab.file"
              }
            },
            // The service-level service1_identity identity overridden to
            // set component-specific configurations
            {
              "name": "../service1_identity",
              "principal": {
                "configuration": "service1-site/comp2.service.principal"
              },
              "keytab": {
                "configuration": "service1-site/comp2.service.keytab.file"
              }
            }
          ],
          "configurations" : [
            {
              "service-site" : {
                "comp2.security.type": "kerberos"
              }
            }
          ]
        }
      ]
    }
  ]
}
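
Since the annotations in the examples above are not valid JSON, an annotated descriptor has to be cleaned before it can be parsed. The following Python sketch drops whole-line // annotations and then parses the result. It assumes every annotation occupies its own line, as in the examples; the function name load_annotated_descriptor is hypothetical, and trailing comments on the same line as data are not handled.

```python
import json


def load_annotated_descriptor(text):
    """Parse a descriptor after removing lines that contain only a // annotation."""
    kept = [line for line in text.splitlines()
            if not line.lstrip().startswith("//")]
    return json.loads("\n".join(kept))


annotated = """
{
  // Properties that can be used in variable replacement operations.
  "properties": {
    "realm": "${kerberos-env/realm}",
    "keytab_dir": "/etc/security/keytabs"
  }
}
"""
descriptor = load_annotated_descriptor(annotated)
```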

(more to come)