
Verified against releases 1.4.1 and 1.2.5.

If you are not starting with a set of clean machines, you can follow the guidelines below to clean them. The clean-up essentially removes any prior installation of a Hadoop stack and/or Ambari.

Cleaning up remnants of old stack

  • Ensure no Java processes are running. Note that jps may not always show all the processes.
  • Remove rpm packages
    • Ensure all packages corresponding to HDP.repo and HDP-UTILS-*.repo are removed (if any other stack was deployed, you can apply similar logic based on the corresponding repo file name)
    • Typically, installed packages are: sqoop.noarch,lzo-devel.x86_64,hadoop-libhdfs.x86_64,rrdtool.x86_64,hbase.noarch,pig.noarch,lzo.x86_64,ambari-log4j.noarch,oozie.noarch,oozie-client.noarch,gweb.noarch,snappy-devel.x86_64,hcatalog.noarch,python-rrdtool.x86_64,nagios.x86_64,webhcat-tar-pig.noarch,snappy.x86_64,libconfuse.x86_64,mysql.x86_64,webhcat-tar-hive.noarch,ganglia-gmetad.x86_64,extjs.noarch,hive.noarch,hadoop-lzo.x86_64,hadoop-lzo-native.x86_64,hadoop-native.x86_64,hadoop-pipes.x86_64,nagios-plugins.x86_64,hadoop.x86_64,zookeeper.noarch,mysql-libs.x86_64,mysql-connector-java.noarch,mysql-server.x86_64,hadoop-sbin.x86_64,ganglia-gmond.x86_64,libganglia.x86_64,perl-rrdtool.x86_64
    • Essentially, if you run "yum list installed | grep HDP" and "yum list installed | grep HDP-UTILS" after removal, the results should be empty
  • Remove the repo files at /etc/yum.repos.d/
    • HDP.repo and HDP-epel.repo
  • Remove alternatives at /etc/alternatives, if they exist (use "alternatives --display {alt-name}" to inspect and "alternatives --remove {alt-name} {folder-name}" to remove):
    hadoop-etc,zookeeper-conf,hbase-conf,hadoop-log,hadoop-lib,hadoop-default,oozie-conf,hcatalog-conf,hive-conf,hadoop-man,sqoop-conf,hadoop-conf
  • Remove users – only if local users were created during HDP installation. If LDAP users are used, do not remove them.
    Default user names are: nagios,hive,ambari-qa,hbase,oozie,hcat,mapred,hdfs,rrdcached,zookeeper,mysql,sqoop
  • Remove the following folders if they exist
    /etc/hadoop,/etc/hbase,/etc/hcatalog,/etc/hive,/etc/ganglia,/etc/nagios,/etc/oozie,/etc/sqoop,/etc/zookeeper,/var/run/hadoop,/var/run/hbase,/var/run/hive,/var/run/ganglia,/var/run/nagios,/var/run/oozie,/var/log/hadoop,/var/log/hbase,/var/log/hive,/var/log/nagios,/var/log/oozie,/var/log/zookeeper,/usr/lib/hbase,/usr/lib/hcatalog,/usr/lib/hive,/usr/lib/oozie,/usr/lib/sqoop,/usr/lib/zookeeper,/var/lib/hive,/var/lib/ganglia,/var/lib/oozie,/var/lib/zookeeper,/var/tmp/oozie,/tmp/hive,/tmp/nagios,/tmp/ambari-qa,/tmp/sqoop-ambari-qa,/var/nagios,/hadoop/oozie,/hadoop/zookeeper,/hadoop/mapred,/hadoop/hdfs,/tmp/hadoop-hive,/tmp/hadoop-nagios,/tmp/hadoop-hcat,/tmp/hadoop-ambari-qa,/tmp/hsperfdata_hbase,/tmp/hsperfdata_hive,/tmp/hsperfdata_nagios,/tmp/hsperfdata_oozie,/tmp/hsperfdata_zookeeper,/tmp/hsperfdata_mapred,/tmp/hsperfdata_hdfs,/tmp/hsperfdata_hcat,/tmp/hsperfdata_ambari-qa
  • Perform an "ls -al" at /tmp to ensure that no folder is owned by a numeric (integer) owner/group ID. Such folders were typically associated with a user/group that has since been deleted. Remove these folders.
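The stack clean-up steps above can be sketched as a shell script. This is a minimal sketch, not an official procedure: by default it runs in dry-run mode and only echoes each command (set DRYRUN to the empty string and run as root to execute for real), the package, user, and directory lists are abbreviated subsets of the full lists above, and the alternatives target path is illustrative.

```shell
#!/bin/sh
# Dry-run sketch of the stack clean-up steps above.
# By default every command is echoed instead of executed;
# run with DRYRUN= (empty) as root to execute for real.
DRYRUN="${DRYRUN-echo}"

# 1. Erase stack packages (subset shown; see the full package list above),
#    then check that nothing from HDP.repo / HDP-UTILS remains installed.
$DRYRUN yum -y erase hadoop hbase hive pig sqoop oozie zookeeper hcatalog nagios
$DRYRUN sh -c 'yum list installed | grep -e HDP -e HDP-UTILS'

# 2. Remove the repo files.
$DRYRUN rm -f /etc/yum.repos.d/HDP.repo /etc/yum.repos.d/HDP-epel.repo

# 3. Remove alternatives entries; the target path here is illustrative --
#    look up the real one first with: alternatives --display <alt-name>
for alt in hadoop-conf hbase-conf hive-conf zookeeper-conf oozie-conf sqoop-conf; do
  $DRYRUN alternatives --remove "$alt" "/etc/${alt%-conf}/conf"
done

# 4. Remove local users created by the install (do NOT do this for LDAP users).
for u in nagios hive ambari-qa hbase oozie hcat mapred hdfs zookeeper sqoop; do
  $DRYRUN userdel "$u"
done

# 5. Remove leftover directories (subset shown; see the full list above).
$DRYRUN rm -rf /etc/hadoop /etc/hbase /var/run/hadoop /var/log/hadoop \
  /usr/lib/hive /var/lib/zookeeper /hadoop/hdfs /hadoop/mapred
```

The dry-run default makes it safe to inspect the exact commands on one host before running the script for real across the cluster.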

Cleaning up remnants of Ambari

  • Ensure no ambari-server or ambari-agent processes are running
    • ambari-server is a java process
    • ambari-agent is a python daemon
  • Execute "ambari-server reset" to reset the database
    • An ambari-server re-install does not overwrite the existing DB, so it is a good idea to explicitly call "ambari-server reset"
  • Erase ambari packages
    • ambari-server, ambari-agent, ambari-log4j, hdp_mon_ganglia_addon, hdp_mon_nagios_addon
    • Essentially, if you do a "yum list installed | grep ambari" after removal the result should be empty
  • Remove ambari.repo file at /etc/yum.repos.d/
  • While most of the following folders will either be deleted, be empty, or contain only log files, you can choose to delete them explicitly
    • /usr/sbin/ambari-server /usr/lib/ambari-server /var/run/ambari-server /var/log/ambari-server /var/lib/ambari-server /etc/rc.d/init.d/ambari-server /etc/ambari-server
    • /usr/sbin/ambari-agent /usr/lib/ambari-agent /var/run/ambari-agent /var/log/ambari-agent /var/lib/ambari-agent /etc/rc.d/init.d/ambari-agent /etc/ambari-agent
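Likewise, the Ambari clean-up steps above can be sketched as a dry-run shell script: commands are echoed by default, and you set DRYRUN to the empty string and run as root to execute for real.

```shell
#!/bin/sh
# Dry-run sketch of the Ambari clean-up steps above.
# Commands are echoed by default; run with DRYRUN= (empty) as root to execute.
DRYRUN="${DRYRUN-echo}"

# Stop the server and agent, then reset the Ambari database.
$DRYRUN ambari-server stop
$DRYRUN ambari-agent stop
$DRYRUN ambari-server reset

# Erase the Ambari packages; afterwards
# "yum list installed | grep ambari" should return nothing.
$DRYRUN yum -y erase ambari-server ambari-agent ambari-log4j \
  hdp_mon_ganglia_addon hdp_mon_nagios_addon

# Remove the repo file.
$DRYRUN rm -f /etc/yum.repos.d/ambari.repo

# Remove the leftover server/agent locations listed above.
for d in /usr/sbin /usr/lib /var/run /var/log /var/lib /etc/rc.d/init.d /etc; do
  $DRYRUN rm -rf "$d/ambari-server" "$d/ambari-agent"
done
```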

Host Clean-up support in Ambari

Ambari currently supports performing a host check during host registration and producing a report based on the check. This report can also be used to perform clean-up.

When hosts are registered, the Ambari UI reports any issues found by the host check.

The /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py script can be invoked to clean up individual hosts. Running "python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py" is interactive and asks whether "users" should be deleted, since many setups may not want to remove users. Use --help to see the other options.

Code Block
python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py
You have elected to remove all users as well. If it is not intended then use option --skip "users". Do you want to continue [y/n] (n)

A newer version of the clean-up script, available on trunk as HostCleanup.py, supports a silent option (-s). This is especially useful when cleaning up across multiple hosts, for example using pdsh.

Code Block
python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py --help
Usage: HostCleanup.py [options]

Options:
  -h, --help            show this help message and exit
  -v, --verbose         output verbosity.
  -f FILE, --file=FILE  host check result file to read.
  -o FILE, --out=FILE   log file to store results.
  -k SKIP, --skip=SKIP  (packages|users|directories|repositories|processes|alt
                        ernatives). Use , as separator.
  -s, --silent          Silently accepts default prompt values
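For a multi-host clean-up, the silent option can be combined with pdsh. The command below is a sketch: the host range is a placeholder, and it assumes pdsh is installed and the -s capable script is present on every host. It is built as a string and echoed so it can be inspected first, then run with eval when ready.

```shell
# Build the pdsh command for a silent, multi-host clean-up.
# host[01-10].example.com is a placeholder host range;
# -s accepts the default prompt values, --skip users preserves local users.
CMD='pdsh -w host[01-10].example.com "python /usr/lib/python2.6/site-packages/ambari_agent/HostCleanup.py -s --skip users"'
echo "$CMD"    # inspect the command, then run it with: eval "$CMD"
```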