Info

See Bootstrapping an Impala Development Environment From Scratch for up-to-date, regularly tested steps to set up your development environment.

The information on this page is stale, but may be useful for adventurous people who want to set up a dev environment manually from scratch.

These instructions install the prerequisite packages and configuration for Impala. Currently we have guides for building on Ubuntu 14.04 and CentOS 6.5.

Java

Download and install a Java 7 or Java 8 JDK. Either the Oracle JDK or OpenJDK should work for development.

On Ubuntu 14.04 this can be done with the following command:

Code Block
sudo apt-get install openjdk-7-jdk


On Ubuntu 16.04:

Code Block
sudo apt-get install openjdk-8-jdk

The OpenJDK website has tips for other distributions too: http://openjdk.java.net/install/
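
To confirm that the JDK on your PATH is one of the versions mentioned above, you can parse the first line of the `java -version` output. The following is a minimal sketch; the helper name `java_major_version` is made up for illustration, and it assumes the usual `version "1.8.0_292"` style of output.

```shell
# Extract the major Java version (7, 8, ...) from one line of `java -version` output.
# java_major_version is a hypothetical helper, not part of any Impala tooling.
java_major_version() {
    # $1: a line such as: openjdk version "1.8.0_292"
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$version" in
        1.*) printf '%s\n' "$version" | cut -d. -f2 ;;  # legacy scheme: 1.7.x, 1.8.x
        *)   printf '%s\n' "$version" | cut -d. -f1 ;;  # newer scheme: 11.0.x
    esac
}

# Example usage (java prints its version on stderr):
#   java_major_version "$(java -version 2>&1 | head -n 1)"
```

If the result is neither 7 nor 8, the rest of this page may not apply to your JDK as written.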

Required packages

On Ubuntu 14.04

Code Block
sudo apt-get install git build-essential cmake bison flex pkg-config libsasl2-dev autoconf automake libtool maven subversion doxygen libbz2-dev zlib1g-dev python-pip python-setuptools python-dev libssl-dev libboost-all-dev postgresql liblzo2-dev lzop -y
sudo pip install allpairs pytest pytest-xdist paramiko texttable prettytable sqlparse psutil==0.7.1 pywebhdfs gitpython jenkinsapi boto3


On CentOS 6.5

Code Block
sudo yum groupinstall "Development Tools"
sudo yum -y install git ant libevent-devel automake libtool flex bison gcc-c++ openssl-devel make cmake doxygen.x86_64 glib-devel python-devel bzip2-devel svn libevent-devel krb5-workstation openldap-devel db4-devel python-setuptools python-pip cyrus-sasl* postgresql postgresql-server ant-nodeps lzo-devel lzop
sudo pip-python install allpairs pytest pytest-xdist paramiko texttable prettytable sqlparse psutil==0.7.1 pywebhdfs gitpython jenkinsapi boto3
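
After installing the packages, a quick sanity check is to see whether a few of the expected tools are on your PATH. This is only a sketch: `missing_commands` is a hypothetical helper, and the tool names below are a small sample taken from the package lists above, not a complete list.

```shell
# Print each named command that is not found on PATH, one per line.
# missing_commands is a hypothetical helper name used for illustration.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# A sample of tools from the package lists; adjust for your distribution.
missing_commands git cmake bison flex automake doxygen lzop
```

No output means every listed tool was found. Library packages such as the `-dev` and `-devel` ones are not covered by this check.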

Configuring

PostgreSQL

If you are installing Impala on a fresh machine, you'll need to initialize PostgreSQL. On CentOS 6.5 this can be done by running:

Code Block
sudo service postgresql initdb

You need to make a configuration change to allow HBase and the Hive metastore to function correctly. Edit the following file as root.

On Ubuntu 14.04 and 16.04

    /etc/postgresql/*/main/pg_hba.conf

On CentOS 6.5

    /var/lib/pgsql/data/pg_hba.conf

In the following lines at the end of the file, change `peer` or `ident` to `trust`.

Code Block
# Database administrative login by UNIX sockets
local   all         all                               ident
# TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
# "local" is for Unix domain socket connections only
local   all         all                               ident
# IPv4 local connections:
host    all         all         127.0.0.1/32          md5
# IPv6 local connections:
host    all         all         ::1/128               md5
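
If you prefer not to edit the file by hand, the substitution described above can be sketched with `sed`. This is an illustrative helper, not part of the Impala tooling: the name `pg_hba_trust` is made up, and it assumes a `sed` that accepts `-E` and `-i.bak` (GNU and BSD sed both do). It only touches lines that start with `local` or `host` and end in `peer` or `ident`.

```shell
# Rewrite a trailing "peer" or "ident" to "trust" on local/host lines of a
# pg_hba.conf-style file. The original is kept next to it with a .bak suffix.
# pg_hba_trust is a hypothetical helper name.
pg_hba_trust() {
    sed -E -i.bak \
        's/^(local|host)([[:space:]].*[[:space:]])(peer|ident)[[:space:]]*$/\1\2trust/' \
        "$1"
}

# Example usage on a copy, so the result can be reviewed before installing it as root:
#   cp /var/lib/pgsql/data/pg_hba.conf /tmp/pg_hba.conf
#   pg_hba_trust /tmp/pg_hba.conf
#   diff /tmp/pg_hba.conf.bak /tmp/pg_hba.conf
```

Comment lines and `md5` entries are left unchanged.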


To make PostgreSQL aware of these changes, either restart the service or run `pg_ctl reload`.

If the script fails to start PostgreSQL due to a missing snakeoil SSL cert, run:

Code Block
sudo make-ssl-cert generate-default-snakeoil



##### Creating the Hive metastore user   

Code Block
sudo -u postgres psql postgres

Then, at the `postgres` command prompt:

Code Block
CREATE ROLE hiveuser LOGIN PASSWORD 'password';
ALTER ROLE hiveuser WITH CREATEDB;

Maven 3

On some older systems you may need to download Maven 3 from https://maven.apache.org/ and install it manually:

Code Block
tar xvf apache-maven-3.0.5-bin.tar.gz && sudo mv apache-maven-3.0.5 /usr/local

Environment variables

Put these in your `.bashrc` or elsewhere:

On Ubuntu 14.04

Code Block
export JAVA_HOME=/usr/lib/jvm/<your java version>
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu
export LC_ALL="en_US.UTF-8"

# If you installed maven manually:
export M2_HOME=/usr/local/apache-maven-3.0.5
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
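
If you script this step, it helps to avoid appending the same export line to `.bashrc` every time the script runs. A minimal sketch, assuming a hypothetical helper named `append_once`:

```shell
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper name used for illustration.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -x matches whole lines, -F treats the text as a fixed string, -q is quiet.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example usage:
#   append_once 'export LC_ALL="en_US.UTF-8"' ~/.bashrc
#   append_once 'export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu' ~/.bashrc
```

Open a new shell or run `source ~/.bashrc` afterwards so the variables take effect.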

Add a path for HDFS domain sockets

Code Block
sudo mkdir /var/lib/hadoop-hdfs/
sudo chown <user> /var/lib/hadoop-hdfs/

Start local ssh server

Code Block
sudo service ssh start

Enable password-less SSH for HBase

Code Block
ssh-keygen -t dsa
# Do not type in any passphrase. Just press enter.
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
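
If key-based login still prompts for a password, one common cause is file permissions: with its default `StrictModes` setting, sshd can ignore `authorized_keys` when the file or its directory is writable by other users. The sketch below tightens them; `tighten_ssh_perms` is a hypothetical helper name, and this step is an addition to the original instructions rather than part of them.

```shell
# Restrict an SSH directory and its authorized_keys file to the owner only.
# tighten_ssh_perms is a hypothetical helper; it defaults to ~/.ssh.
tighten_ssh_perms() {
    dir=${1:-"$HOME/.ssh"}
    chmod 700 "$dir"
    chmod 600 "$dir/authorized_keys"
}

# Example usage, followed by a non-interactive login check:
#   tighten_ssh_perms
#   ssh -o BatchMode=yes localhost true && echo "password-less SSH works"
```

`BatchMode=yes` makes `ssh` fail instead of prompting, so the check is safe to run from a script.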

Set up FQDN to point to loopback

You may also want to modify your /etc/hosts file so that your fully qualified domain name points to the loopback device. Modify /etc/hosts so that it includes these lines (with your host name substituted as appropriate).

127.0.0.1       localhost
127.0.0.1       <Your-host-name> <Your-host-name>.ca.cloudera.com
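
To check the result, you can scan the hosts file for a `127.0.0.1` entry that names your host. A minimal sketch, assuming a hypothetical helper named `maps_to_loopback` that takes a hosts-format file and a host name:

```shell
# Succeed if the hosts-format file $1 has a 127.0.0.1 line that lists name $2.
# maps_to_loopback is a hypothetical helper name used for illustration.
maps_to_loopback() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }    # skip comment lines
        $1 == "127.0.0.1" {
            for (i = 2; i <= NF; i++) if ($i == name) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example usage:
#   maps_to_loopback /etc/hosts "$(hostname -f)" && echo "FQDN points to loopback"
```

The check reads the file only; it does not consult DNS or other name services.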

Set up NTP for Kudu

On Ubuntu

Code Block
sudo apt-get install ntp ntpdate
sudo service ntp start


On CentOS 7

Code Block
sudo yum install ntp
sudo systemctl start ntpd

Make sure your clock is set correctly. If it is far off, NTP will only adjust it little by little; see `man ntpdate` for a possible way to fix that.
---