...
Now let’s download and install Hadoop, following the excellent instructions available on the Hadoop site itself; use the steps given for pseudo-distributed mode.
These instructions were written for version 2.7.0, so grab that tarball (hadoop-2.7.0.tar.gz) and its checksum file (hadoop-2.7.0.tar.gz.mds).
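It is worth verifying the download before untarring it. The .mds file published by Apache lists several digests; the sketch below shows the same idea with sha256sum, using a locally created stand-in file so the commands are runnable as written (the real tarball and checksum come from the Apache mirror):

```shell
# Sketch: verify a tarball against a checksum file before installing.
# "hadoop-2.7.0.tar.gz" here is a stand-in created on the fly, not the real release.
cd "$(mktemp -d)"
echo "example payload" > hadoop-2.7.0.tar.gz
# Normally the checksum file is downloaded from the mirror alongside the tarball.
sha256sum hadoop-2.7.0.tar.gz > hadoop-2.7.0.tar.gz.sha256
# -c re-computes the digest and compares; prints "hadoop-2.7.0.tar.gz: OK" on success.
sha256sum -c hadoop-2.7.0.tar.gz.sha256
```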
The instructions on that page require Java. If Java is not installed, install the JDK first:
sudo yum install java-1.7.0-openjdk-devel
Make note of the location where you installed Hadoop; here I assume you have installed it in /usr/local/hadoop. Create a user under which we can install and ultimately run the various Hadoop processes, and log in as that user.
sudo useradd --home-dir /var/hadoop --create-home --shell /bin/bash --user-group hadoop
...
...
...
If that command fails with an error message, try this variant instead:
sudo useradd --home-dir /var/hadoop --create-home --shell /bin/bash hadoop -g hadoop
sudo tar zxvf ~/dev/hadoop-2.7.0.tar.gz -C /usr/local
cd /usr/local
sudo ln -s hadoop-2.7.0 hadoop
sudo chown -R hadoop:hadoop hadoop hadoop-2.7.0
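The version-suffixed directory plus a hadoop symlink is a common upgrade pattern: a newer release can later be unpacked alongside and the symlink repointed, without touching anything that references /usr/local/hadoop. A tiny sketch of the layout in a throwaway directory:

```shell
# Demonstrates the versioned-directory-plus-symlink layout in a scratch location.
tmp=$(mktemp -d)
mkdir "$tmp/hadoop-2.7.0"
ln -s hadoop-2.7.0 "$tmp/hadoop"   # relative link, same shape as in /usr/local
readlink "$tmp/hadoop"             # prints: hadoop-2.7.0
```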
...
To start the Hive metastore:
su -l hive -c "env HADOOP_HOME=/usr/local/hadoop JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64 nohup hive --service metastore > /var/log/hive/hive.out 2> /var/log/hive/hive.log &"
To start HiveServer2:
su -l hive -c "env HADOOP_HOME=/usr/local/hadoop JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk.x86_64 nohup /usr/local/hive/bin/hiveserver2 --hiveconf hive.metastore.uris=\" \" > /var/log/hive/hiveServer2.out 2>/var/log/hive/hiveServer2.log &"
To stop:
ps aux | awk '{print $1,$2}' | grep hive | awk '{print $2}' | xargs kill >/dev/null 2>&1
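The ps-grep-kill pipeline above can match unrelated processes whose output happens to contain "hive". An alternative sketch using pgrep, which matches on the full command line and can be restricted to the hive user; the process-name pattern here is an assumption about how the services were started, so adjust it to your setup:

```shell
# Hypothetical safer stop: match only the hive user's metastore/HiveServer2 processes.
pids=$(pgrep -u hive -f 'hive --service metastore|hiveserver2' 2>/dev/null || true)
if [ -n "$pids" ]; then
  kill $pids            # word-splitting is intentional: one PID per word
else
  echo "no hive processes found"
fi
```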
To log in to the Hive shell via Beeline:
/usr/local/hive/bin/beeline -u "jdbc:hive2://localhost:10000" -n rituser -p rituser
If the Hive metastore and HiveServer2 do not start, update the key-value pairs given below in the following files to match your environment.
hiveserver2-site.xml
<configuration>
<property>
<name>hive.security.authorization.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.security.authorization.manager</name>
<value>org.apache.ranger.authorization.hive.authorizer.RangerHiveAuthorizerFactory</value>
</property>
<property>
<name>hive.security.authenticator.manager</name>
<value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
</property>
<property>
<name>hive.conf.restricted.list</name>
<value>hive.security.authorization.enabled,hive.security.authorization.manager,hive.security.authenticator.manager</value>
</property>
</configuration>
hive-site.xml
<property>
...
You can run a MapReduce job on YARN in pseudo-distributed mode by setting a few parameters and additionally running the ResourceManager and NodeManager daemons.
The following instructions assume that the Hadoop installation steps described in Installing Apache Hadoop have already been executed.
17. ENABLING RANGER YARN PLUGIN:
1. We’ll start by extracting our build at the appropriate place (/usr/local).
cd /usr/local
sudo tar zxvf ~/dev/incubator-ranger/target/ranger-0.5.0-yarn-plugin.tar.gz
sudo ln -s ranger-0.5.0-yarn-plugin ranger-yarn-plugin
cd ranger-yarn-plugin
2. Now let’s edit the install.properties file. Here are the relevant lines that you should edit:

PROPERTY | VALUE
---|---
POLICY_MGR_URL |
REPOSITORY_NAME | yarndev
XAAUDIT.DB.IS_ENABLED | true
XAAUDIT.DB.FLAVOUR | MYSQL
XAAUDIT.DB.HOSTNAME | localhost
XAAUDIT.DB.DATABASE_NAME | ranger_audit
XAAUDIT.DB.USER_NAME | rangerlogger
XAAUDIT.DB.PASSWORD | rangerlogger
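Rendered as they would appear in the file, the edits above look roughly like the fragment below. The POLICY_MGR_URL value shown is only a hypothetical illustration, since it depends on where your Ranger Admin instance runs:

```
# Illustrative fragment of the YARN plugin's install.properties.
# POLICY_MGR_URL below is a hypothetical value; use your own Ranger Admin URL.
POLICY_MGR_URL=http://localhost:6080
REPOSITORY_NAME=yarndev
XAAUDIT.DB.IS_ENABLED=true
XAAUDIT.DB.FLAVOUR=MYSQL
XAAUDIT.DB.HOSTNAME=localhost
XAAUDIT.DB.DATABASE_NAME=ranger_audit
XAAUDIT.DB.USER_NAME=rangerlogger
XAAUDIT.DB.PASSWORD=rangerlogger
```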
3. Now enable the yarn-plugin by running the enable-yarn-plugin.sh command.
cd /usr/local/ranger-yarn-plugin
./enable-yarn-plugin.sh
4. One more change we need is to copy all the jar files from the plugin’s lib directory into ${hadoop_home}/share/hadoop/yarn/lib:
cp /usr/local/ranger-yarn-plugin/lib/*.jar /usr/local/hadoop/share/hadoop/yarn/lib/
5. If you get a permission-denied error during YARN startup, grant the required privileges to the yarn user in the local and HDFS file systems. For example:
mkdir /var/log/yarn
chown -R yarn:yarn /var/log/yarn
6. Once these changes are done, start the ResourceManager and NodeManager daemons.
- Start the ResourceManager on ResourceManager hosts.
su yarn -c "/usr/local/hadoop/sbin/yarn-daemon.sh start resourcemanager"
ps -ef | grep -i resourcemanager
- Start the NodeManager on NodeManager hosts.
su yarn -c "/usr/local/hadoop/sbin/yarn-daemon.sh start nodemanager"
ps -ef | grep -i nodemanager
- Stop the ResourceManager on ResourceManager hosts.
su yarn -c "/usr/local/hadoop/sbin/yarn-daemon.sh stop resourcemanager"
ps -ef | grep -i resourcemanager
- Stop the NodeManager on NodeManager hosts.
su yarn -c "/usr/local/hadoop/sbin/yarn-daemon.sh stop nodemanager"
ps -ef | grep -i nodemanager
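The four invocations above differ only in the action and daemon name, so they can be folded into a small helper. This is a convenience sketch, not part of the Hadoop distribution; DRY_RUN is a made-up switch used here so the function can be exercised without su or a running cluster:

```shell
# Hypothetical wrapper around the yarn-daemon.sh start/stop pattern above.
yarn_daemon() {
  cmd="/usr/local/hadoop/sbin/yarn-daemon.sh $1 $2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo su yarn -c "\"$cmd\""     # just show what would run
  else
    su yarn -c "$cmd"
  fi
}

# Examples (dry-run, so nothing is actually started or stopped):
DRY_RUN=1 yarn_daemon start resourcemanager
DRY_RUN=1 yarn_daemon stop nodemanager
```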
7. This completes the association of the ranger-yarn-plugin with Hadoop.
You can verify it by logging into the Ranger Admin web interface: Audit > Agents.
18. INSTALLING RANGER KMS (0.5.0)
Prerequisites (to be done on every host on which Ranger KMS will be installed):
1. Download the “Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files” zip from the link matching the Java version in use:
- http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html
- http://www.oracle.com/technetwork/java/javase/downloads/jce8-download-2133166.html
2. Unzip the downloaded file into Java’s security folder (depending on the Java version used):
unzip UnlimitedJCEPolicyJDK7.zip -d $JDK_HOME/jre/lib/security
unzip jce_policy-8.zip -d $JDK_HOME/jre/lib/security
3. STEPS FOR RANGER KMS:
- We’ll start by extracting our build at the appropriate place (/usr/local).
cd /usr/local
sudo tar -zxvf ~/dev/incubator-ranger/target/ranger-0.5.0-kms.tar.gz
sudo ln -s ranger-0.5.0-kms ranger-kms
cd ranger-kms
Please note that Ranger KMS plugin is integrated with Ranger KMS and will be installed automatically when KMS is installed.
Now let’s edit the install.properties file. Here are the relevant lines that you should edit:
- Change the install.properties file
- DB_FLAVOR
- SQL_CONNECTOR_JAR
- db_root_user
- db_root_password
- db_host
- db_name
- db_user
- db_password
PROPERTY | VALUE
---|---
POLICY_MGR_URL |
REPOSITORY_NAME | kmsdev
KMS_MASTER_KEY_PASSWD | enter master key password
XAAUDIT.DB.IS_ENABLED | true
XAAUDIT.DB.FLAVOUR | MYSQL
XAAUDIT.DB.HOSTNAME | localhost
XAAUDIT.DB.DATABASE_NAME | ranger_audit
XAAUDIT.DB.USER_NAME | rangerlogger
XAAUDIT.DB.PASSWORD | rangerlogger

Edit “hdfs-site.xml” (you need to specify the provider, else Hadoop commands will not be supported):
- Replace localhost with <internal host name>
- Go to the path: cd /usr/local/hadoop/conf/
- vim hdfs-site.xml
- For the property “dfs.encryption.key.provider.uri”, enter the value “kms://http@<internal host name>:9292/kms”
- Save and quit.
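In hdfs-site.xml the entry ends up looking like the fragment below; the host placeholder is kept as in the text and must be substituted with your own host name:

```
<property>
  <name>dfs.encryption.key.provider.uri</name>
  <value>kms://http@<internal host name>:9292/kms</value>
</property>
```

The core-site.xml property described next follows the same shape, with hadoop.security.key.provider.path as the name.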
Edit “core-site.xml” (you need to specify the provider, else Hadoop commands will not be supported):
- Replace localhost with <internal host name>
- Go to the path: cd /usr/local/hadoop/conf/
- vim core-site.xml
- For the property “hadoop.security.key.provider.path”, enter the value “kms://http@<internal host name>:9292/kms”
Once these changes are done, restart Hadoop.
- Stop and start the NameNode daemon (repeat for the SecondaryNameNode and DataNode daemons):
su -l hdfs -c "/usr/local/hadoop/sbin/hadoop-daemon.sh stop namenode"
su -l hdfs -c "/usr/local/hadoop/sbin/hadoop-daemon.sh start namenode"
- Run setup
./setup.sh
- Start the kms server
ranger-kms start
You can verify that the plugin is communicating with Ranger Admin in the Audit > Plugins tab.
If the kmsdev service is not created in Ranger Admin, the KMS plugin will not be able to connect to Ranger Admin.
- To create the KMS service:

PROPERTY | VALUE
---|---
REPOSITORY_NAME | name specified in install.properties (e.g. kmsdev)
KMS URL | kms://http@<internal host name>:9292/kms
Username | <username> (e.g. keyadmin)
Password | <password>

- Check Test Connection.
ENABLING AUDIT LOGGING TO HDFS:
- To enable audit to HDFS for a plugin, do the following:
- Set XAAUDIT.HDFS.ENABLE = true for the respective component plugin in the install.properties file, found in the /usr/local/ranger-<component>-plugin/ directory.
- Configure the NameNode host in XAAUDIT.HDFS.HDFS_DIR.
- Create a policy in the HDFS service from Ranger Admin giving the individual component users (hive/hbase/knox/storm/yarn/kafka/kms) READ and WRITE permission on the particular audit folder. For example, to enable the Hive component to log audits to HDFS, create a policy for the hive user with READ and WRITE permissions on the respective audit directory.
- Audit to HDFS caches logs in a local directory, which can be specified in XAAUDIT.HDFS.LOCAL_BUFFER_DIRECTORY (a path like /var/log/<component>/); this is where audit records are stored temporarily. Likewise, for archived logs, update the XAAUDIT.HDFS.LOCAL_ARCHIVE_DIRECTORY value (again a path like /var/log/<component>/) before enabling the plugin for the component.
Note that HDFS audit logging is for archival purposes. For viewing audit reports in the Ranger Admin UI, the recommended option is Solr.
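Putting the pieces above together, the relevant install.properties lines for, say, the Hive plugin might look like this; the NameNode host and directory paths are illustrative assumptions, not defaults:

```
# Illustrative fragment: audit-to-HDFS settings for a component plugin (Hive here).
XAAUDIT.HDFS.ENABLE=true
# Hypothetical NameNode host; point this at your own.
XAAUDIT.HDFS.HDFS_DIR=hdfs://namenode.example.com:8020/ranger/audit
XAAUDIT.HDFS.LOCAL_BUFFER_DIRECTORY=/var/log/hive/audit
XAAUDIT.HDFS.LOCAL_ARCHIVE_DIRECTORY=/var/log/hive/audit/archive
```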
ENABLING AUDIT LOGGING TO SOLR:
Set the following properties in the install.properties of the Ranger service to enable audit to Solr in Ranger:

PROPERTY | VALUE
---|---
audit_store | solr
audit_solr_urls |
audit_solr_user | ranger_solr
audit_solr_password | NONE

Restart Ranger. To enable audit to Solr for a plugin, do the following:
Set the following properties in the plugin’s install.properties to start logging audit to Solr (e.g. for HBase):

PROPERTY | VALUE
---|---
XAAUDIT.SOLR.IS_ENABLED | true
XAAUDIT.SOLR.ENABLE | true
XAAUDIT.SOLR.URL |
XAAUDIT.SOLR.USER | ranger_solr
XAAUDIT.SOLR.PASSWORD | NONE
XAAUDIT.SOLR.FILE_SPOOL_DIR | var/log/hadoop/hdfs/audit/solr/spool

- Enable the Ranger plugin for HBase.
- Restart the HBase component.