...

  • Now let’s download and install Hadoop, following the excellent instructions available on the Hadoop site itself. Follow the steps given for pseudo-distributed mode.

  • These instructions were written for version 2.7.0, so grab that tarball (hadoop-2.7.0.tar.gz) and its checksum file (hadoop-2.7.0.tar.gz.mds).

  • The instructions on that page require Java to be installed. If Java is not present, install the JDK first:

    sudo yum install java-1.7.0-openjdk-devel

  • Make note of the location where you installed Hadoop. Here I assume that you have installed it in

    /usr/local/hadoop
  • Create a user under which we can install and ultimately run the various Hadoop processes, and log in as that user.

    Code Block
    sudo useradd --home-dir /var/hadoop --create-home --shell /bin/bash --user-group hadoop
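With the hadoop user in place, it helps to put the install locations into the environment before running any of the later commands. A minimal sketch, assuming the /usr/local/hadoop and OpenJDK 7 paths used elsewhere in this guide:

```shell
# Paths assumed from this guide; adjust both to your environment.
export HADOOP_HOME=/usr/local/hadoop
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64
# Put the hadoop binaries and daemon scripts on the PATH.
export PATH="$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH"
# Confirm the hadoop user from the useradd step exists, if it was created.
id hadoop 2>/dev/null || echo "hadoop user not created yet"
```

Adding these exports to the hadoop user's ~/.bashrc saves repeating the env prefix in later su commands.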

 

...

 

...

 

...

         If the command above fails with an error message, then try the next command:

Code Block
sudo useradd --home-dir /var/hadoop --create-home --shell /bin/bash -g hadoop hadoop

 

  • sudo tar -zxvf ~/dev/hadoop-2.7.0.tar.gz -C /usr/local
  • cd /usr/local
  • sudo ln -s hadoop-2.7.0 hadoop
  • sudo chown -R hadoop hadoop hadoop-2.7.0
  • sudo chgrp -R hadoop hadoop hadoop-2.7.0

...

To start the Hive metastore:

 

Code Block
su -l hive -c "env HADOOP_HOME=/usr/local/hadoop JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64 nohup hive --service metastore > /var/log/hive/hive.out 2> /var/log/hive/hive.log &"

 

To start HiveServer2:

 

Code Block
su -l hive -c "env HADOOP_HOME=/usr/local/hadoop JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64 nohup /usr/local/hive/bin/hiveserver2 --hiveconf hive.metastore.uris=\" \" > /var/log/hive/hiveServer2.out 2>/var/log/hive/hiveServer2.log &"



 

To stop the Hive processes:

 


Code Block
ps aux | awk '{print $1,$2}' | grep hive | awk '{print $2}' | xargs kill >/dev/null 2>&1



 

To log in to the Hive shell:

 


Code Block
/usr/local/hive/bin/beeline -u "jdbc:hive2://localhost:10000" -n rituser -p rituser
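Once HiveServer2 is up, a quick non-interactive smoke test can confirm it is answering queries. A sketch using the same JDBC URL and sample rituser credentials as above (beeline's -e flag runs a single statement and exits):

```shell
# Paths and credentials taken from this guide; adjust to your environment.
BEELINE=/usr/local/hive/bin/beeline
JDBC_URL="jdbc:hive2://localhost:10000"
if [ -x "$BEELINE" ]; then
  # Run a single statement non-interactively; success means HiveServer2
  # is reachable and accepting the sample credentials.
  "$BEELINE" -u "$JDBC_URL" -n rituser -p rituser -e "show databases;"
else
  echo "beeline not found at $BEELINE"
fi
```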



 

If the Hive metastore or HiveServer2 does not start, update the key-value pairs below for your environment in the following files.

  • hiveserver2-site.xml

    <configuration>
    <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
    </property>
    <property>
    <name>hive.security.authorization.manager</name>
    <value>org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizerFactory</value>
    </property>
    <property>
    <name>hive.security.authenticator.manager</name>
    <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
    </property>
    <property>
    <name>hive.conf.restricted.list</name>
    <value>hive.security.authorization.enabled,hive.security.authorization.manager,hive.security.authenticator.manager</value>
    </property>
    </configuration>


    hive-site.xml

    <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    </property>
    <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp/hive</value>
    </property>
    <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/tmp/hive_resources</value>
    </property>
    <property>
    <name>hive.scratch.dir.permission</name>
    <value>733</value>
    </property>
    <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
    </property>
    <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    </property>
    <property>
    <name>javax.jdo.option.ConnectionURL</name>
    </property>
    <property>
    <name>hive.hwi.listen.host</name>
    <value>localhost</value>
    </property>
    <property>
 

...

  • You can run a MapReduce job on YARN in pseudo-distributed mode by setting a few parameters and additionally running the ResourceManager and NodeManager daemons.

  • The following instructions assume that the Hadoop installation steps described in Installing Apache Hadoop have already been executed.

17. ENABLING RANGER YARN PLUGIN:

  1. We’ll start by extracting our build at the appropriate place (/usr/local).
  • cd /usr/local
  • sudo tar zxvf ~/dev/incubator-ranger/target/ranger-0.5.0-yarn-plugin.tar.gz
  • sudo ln -s ranger-0.5.0-yarn-plugin ranger-yarn-plugin
  • cd ranger-yarn-plugin

    2.  Now let’s edit the install.properties file. Here are the relevant lines that you should edit:

  • Change the install.properties file

    PROPERTY                    VALUE
    POLICY_MGR_URL              http://localhost:6080
    REPOSITORY_NAME             yarndev
    XAAUDIT.DB.IS_ENABLED       true
    XAAUDIT.DB.FLAVOUR          MYSQL
    XAAUDIT.DB.HOSTNAME         localhost
    XAAUDIT.DB.DATABASE_NAME    ranger_audit
    XAAUDIT.DB.USER_NAME        rangerlogger
    XAAUDIT.DB.PASSWORD         rangerlogger

   3.  Now enable the yarn-plugin by running the enable-yarn-plugin.sh script.

  • cd /usr/local/ranger-yarn-plugin
  • ./enable-yarn-plugin.sh

   4.  One more change we need to make is to copy all the jar files from the plugin's lib directory into Hadoop's YARN lib directory:

  • cp /usr/local/ranger-yarn-plugin/lib/*.jar /usr/local/hadoop/share/hadoop/yarn/lib/

   5.  If you get a permission-denied error during YARN startup, grant the required privileges to the yarn user in the local and HDFS file systems. For example:

  • mkdir /var/log/yarn
  • chown -R yarn:yarn /var/log/yarn

 6.  Once these changes are done, start the ResourceManager and NodeManager daemons.

  • Start the ResourceManager on ResourceManager hosts.
    1. su yarn -c "/usr/local/hadoop/sbin/yarn-daemon.sh start resourcemanager"
    2. ps -ef | grep -i resourcemanager
  • Start the NodeManager on NodeManager hosts.
    1. su yarn -c "/usr/local/hadoop/sbin/yarn-daemon.sh start nodemanager"
    2. ps -ef | grep -i nodemanager
  • Stop the ResourceManager on ResourceManager hosts.
    1. su yarn -c "/usr/local/hadoop/sbin/yarn-daemon.sh stop resourcemanager"
    2. ps -ef | grep -i resourcemanager
  • Stop the NodeManager on NodeManager hosts.
    1. su yarn -c "/usr/local/hadoop/sbin/yarn-daemon.sh stop nodemanager"
    2. ps -ef | grep -i nodemanager

  7.  This should start the association of the ranger-yarn-plugin with Hadoop.

  • You can verify by logging into the Ranger Admin web interface > Audit > Agents.
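To exercise the plugin end to end, you can submit one of Hadoop's bundled example jobs and then look for the corresponding entries in the Ranger audit view. A sketch, assuming the 2.7.0 layout under /usr/local/hadoop used throughout this guide:

```shell
# Path assumes the Hadoop 2.7.0 install location used in this guide.
EXAMPLES_JAR=/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar
if [ -f "$EXAMPLES_JAR" ]; then
  # Submit the pi estimator (2 maps, 10 samples) as the yarn user, the
  # same way this guide runs the YARN daemons.
  su yarn -c "/usr/local/hadoop/bin/yarn jar $EXAMPLES_JAR pi 2 10"
else
  echo "examples jar not found; is Hadoop installed at /usr/local/hadoop?"
fi
```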

  18. INSTALLING RANGER KMS (0.5.0)

       Prerequisites (to be done on every host on which Ranger KMS will be installed):

  1. Download the “Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files” zip (the download location depends on the Java version used).

     2.  Unzip the downloaded file into the JDK's security folder (depending on the Java version used):

Code Block
unzip UnlimitedJCEPolicyJDK7.zip -d $JDK_HOME/jre/lib/security   # JDK 7
unzip jce_policy-8.zip -d $JDK_HOME/jre/lib/security             # JDK 8
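To confirm the unlimited-strength policy actually took effect, you can ask the JRE for its AES key-length cap. A sketch using jrunscript, which ships with JDK 7 and 8:

```shell
# With the unlimited policy installed, the JRE reports Integer.MAX_VALUE
# as the AES cap instead of the default 128.
AES_UNLIMITED=2147483647
if command -v jrunscript >/dev/null 2>&1; then
  jrunscript -e 'print(javax.crypto.Cipher.getMaxAllowedKeyLength("AES"))'
else
  echo "jrunscript not on PATH; is the JDK installed?"
fi
```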

3.  STEPS FOR RANGER KMS:

  • We’ll start by extracting our build at the appropriate place (/usr/local).
    1. cd /usr/local
    2. sudo tar -zxvf ~/dev/incubator-ranger/target/ranger-0.5.0-kms.tar.gz
    3. sudo ln -s ranger-0.5.0-kms ranger-kms
    4. cd ranger-kms

       

  • Please note that the Ranger KMS plugin is integrated with Ranger KMS and is installed automatically when KMS is installed.

     

  • Now let’s edit the install.properties file. Here are the relevant lines that you should edit:

  1. DB_FLAVOR
  2. SQL_CONNECTOR_JAR
  3. db_root_user
  4. db_root_password
  5. db_host
  6. db_name
  7. db_user
  8. db_password

    PROPERTY                    VALUE
    POLICY_MGR_URL              http://localhost:6080
    REPOSITORY_NAME             kmsdev
    KMS_MASTER_KEY_PASSWD       enter master key password
    XAAUDIT.DB.IS_ENABLED       true
    XAAUDIT.DB.FLAVOUR          MYSQL
    XAAUDIT.DB.HOSTNAME         localhost
    XAAUDIT.DB.DATABASE_NAME    ranger_audit
    XAAUDIT.DB.USER_NAME        rangerlogger
    XAAUDIT.DB.PASSWORD         rangerlogger
  • Edit “hdfs-site.xml” (the key provider must be set, or hadoop commands will not be supported)

    • Replace localhost with <internal host name>
  1. cd /usr/local/hadoop/conf/
  2. vim hdfs-site.xml
  3. For the property “dfs.encryption.key.provider.uri”, enter the value “kms://http@<internal host name>:9292/kms”
  4. Save and quit

     

  • Edit “core-site.xml” (the key provider must be set, or hadoop commands will not be supported)

    • Replace localhost with <internal host name>
  1. cd /usr/local/hadoop/conf/
  2. vim core-site.xml
  3. For the property “hadoop.security.key.provider.path”, enter the value “kms://http@<internal host name>:9292/kms”

  • Once these changes are done, restart Hadoop.

    • Stop and restart the NameNode daemon (do the same for the SecondaryNameNode and DataNode daemons):
    • su -l hdfs -c "/usr/local/hadoop/sbin/hadoop-daemon.sh stop namenode"
    • su -l hdfs -c "/usr/local/hadoop/sbin/hadoop-daemon.sh start namenode"

  • Run setup  
    • ./setup.sh

       

  • Start the KMS server
    • ranger-kms start
  • You can verify that the plugin is communicating with Ranger Admin in the Audit > Plugins tab.

     
  • If the kmsdev service is not created in Ranger Admin, the kms-plugin will not be able to connect to Ranger Admin.

  • To create the KMS service:

    • PROPERTY          VALUE
      REPOSITORY_NAME   name specified in install.properties (e.g. kmsdev)
      KMS URL           kms://http@<internal host name>:9292/kms
      Username          <username> (e.g. keyadmin)
      Password          <password>
    • Check Test Connection.
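Once the service exists and KMS is running, a quick way to confirm the provider is reachable from the Hadoop side is the hadoop key command. A sketch, assuming the localhost KMS URL used above:

```shell
# KMS provider URI as configured earlier in this guide; swap in your
# <internal host name> if you replaced localhost.
KMS_URI="kms://http@localhost:9292/kms"
if command -v hadoop >/dev/null 2>&1; then
  # Lists the keys known to the KMS; an empty list still proves the
  # provider answered.
  hadoop key list -provider "$KMS_URI"
else
  echo "hadoop not on PATH"
fi
```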

   ENABLING AUDIT LOGGING TO HDFS:

  • To enable audit to HDFS for a plugin, do the following:
    1. Set XAAUDIT.HDFS.ENABLE = true for the respective component plugin in the install.properties file, which may be found in the /usr/local/ranger-<component>-plugin/ directory.
    2. Configure the NameNode host in XAAUDIT.HDFS.HDFS_DIR.
    3. Create a policy in the HDFS service from Ranger Admin for the individual component users (hive/hbase/knox/storm/yarn/kafka/kms), giving READ + WRITE permission on the particular audit folder. For example, to enable the Hive component to log audits to HDFS, create a policy for hiveuser with READ + WRITE permissions on the respective audit directory.
    4. Audit to HDFS caches logs in a local directory, which can be specified in XAAUDIT.HDFS.LOCAL_BUFFER_DIRECTORY (this can be like ‘/var/log/<component>/**’); this is the path where audit records are stored temporarily. Likewise, for archived logs, update the XAAUDIT.HDFS.LOCAL_ARCHIVE_DIRECTORY value (this can be like ‘/var/log/<component>/**’) before enabling the plugin for the component.

  • Note that HDFS audit logging is for archival purposes. For seeing audit reports in the Ranger Admin UI, the recommended option is Solr.
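To confirm audits are actually landing in HDFS, you can list the configured audit directory once some activity has occurred. A sketch, where /ranger/audit is only a placeholder for whatever XAAUDIT.HDFS.HDFS_DIR points at in your environment:

```shell
# Placeholder path; replace with the directory configured in
# XAAUDIT.HDFS.HDFS_DIR in your install.properties.
AUDIT_DIR="/ranger/audit"
if command -v hdfs >/dev/null 2>&1; then
  # Recursively list per-component audit files as the hdfs user.
  su -l hdfs -c "hdfs dfs -ls -R $AUDIT_DIR"
else
  echo "hdfs not on PATH"
fi
```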

    ENABLING AUDIT LOGGING TO SOLR:

  • Set the following properties in the install.properties of the Ranger service to make audit-to-Solr work in Ranger:

    PROPERTY               VALUE
    audit_store            solr
    audit_solr_urls        http://solr_host:6083/solr/ranger_audits
    audit_solr_user        ranger_solr
    audit_solr_password    NONE

    Restart Ranger.

     

  • To enable audit to Solr for a plugin, do the following:

    • Set the following properties in the plugin's install.properties to start logging audit to Solr (e.g. for HBase):

    • PROPERTY                      VALUE
      XAAUDIT.SOLR.IS_ENABLED       true
      XAAUDIT.SOLR.ENABLE           true
      XAAUDIT.SOLR.URL              http://solr_host:6083/solr/ranger_audits
      XAAUDIT.SOLR.USER             ranger_solr
      XAAUDIT.SOLR.PASSWORD         NONE
      XAAUDIT.SOLR.FILE_SPOOL_DIR   /var/log/hadoop/hdfs/audit/solr/spool
    • Enable the Ranger plugin for HBase.
    • Restart the HBase component.