If you are running Ubuntu 14.04, you can bootstrap a development environment using the script bin/bootstrap_development.sh. It will alter your environment, including ~/.ssh/config and /etc/hosts, so consider running it in a VM or container.
Loading all of the test data and running all of the tests takes 6-7 hours in total. See the comments in the script for more information.
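Because the bootstrap alters files such as ~/.ssh/config and /etc/hosts, it may be worth copying them aside before running it outside a VM or container. The following is a minimal sketch, not part of Impala: the helper name `backup_files` and the backup directory are assumptions chosen for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of Impala): copy each existing file into a
# backup directory, silently skipping files that do not exist yet.
backup_files() {
  local dest="$1"
  shift
  mkdir -p "$dest"
  local f
  for f in "$@"; do
    if [ -e "$f" ]; then
      # -p preserves mode and timestamps so the copy can be restored as-is.
      cp -p "$f" "$dest/$(basename "$f")"
    fi
  done
}

# Example call; the directory name is an arbitrary choice:
# backup_files "$HOME/bootstrap-backup" "$HOME/.ssh/config" /etc/hosts
```

Restoring is then a matter of copying the saved files back, using sudo for /etc/hosts.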
If you are running Ubuntu 16.04, you can try this:
#!/bin/bash
# Bootstrap a development environment in Impala on Ubuntu 16.04. Takes 3-5 hours.

# tmux and mosh: keep the tests running if you get disconnected
# emacs: for any changes you need to make
# ccache and ninja: for rebuilding, but see http://gerrit.cloudera.org:8080/6942
sudo apt-get --yes install tmux mosh emacs-nox ccache ninja-build

# TODO: config ccache
# TODO: check that there is enough space on disk to do a data load

# Some things I use in my tmux setup.
cat >~/.tmux.conf <<EOF
set-window-option -g xterm-keys on
unbind-key -n C-left
unbind-key -n C-right
bind -n M-up new-window
bind -n M-right next-window
bind -n M-left previous-window
bind-key -n C-S-Left swap-window -t -1
bind-key -n C-S-Right swap-window -t +1
EOF

# Stop here and run the rest in tmux.
exit 1

git clone http://gerrit.cloudera.org:8080/Impala-ASF Impala
cd Impala

# Install Oracle Java 7. Untested: OpenJDK 7. Oracle Java 8 fails, IMPALA-5344
sudo add-apt-repository --yes ppa:webupd8team/java
sudo apt-get update
# Allow scripted installation; this agrees to a EULA. Or not, I don't know; I'm a script
# not an attorney.
echo "oracle-java7-installer shared/accepted-oracle-license-v1-1 select true" | sudo debconf-set-selections
sudo apt-get --yes install oracle-java7-installer
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
echo 'export JAVA_HOME=/usr/lib/jvm/java-7-oracle' >> ~/.bashrc

# Some other requirements from bootstrap_build.sh
sudo apt-get --yes install g++ gcc git libsasl2-dev libssl-dev make maven python-dev python-setuptools

# IMPALA-3932, IMPALA-3926
export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH
echo 'export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/:$LD_LIBRARY_PATH' >> ~/.bashrc

# Set up PostgreSQL for HMS
sudo apt-get --yes install postgresql
sudo -u postgres psql -c "CREATE ROLE hiveuser LOGIN PASSWORD 'password';" postgres
sudo -u postgres psql -c "ALTER ROLE hiveuser WITH CREATEDB;" postgres
# TODO: What are the security implications of this?
# The fields in pg_hba.conf are separated by runs of spaces, so match one or more.
sudo sed -ri 's/local +all +all +peer/local all all trust/g' /etc/postgresql/9.5/main/pg_hba.conf
sudo service postgresql restart
sudo /etc/init.d/postgresql reload
sudo service postgresql restart

# Set up ssh so that we can ssh to localhost
ssh-keygen -t rsa -N '' -q -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
ssh-keyscan -H github.com >> ~/.ssh/known_hosts
echo "NoHostAuthenticationForLocalhost yes" >> ~/.ssh/config

# Workarounds for HDFS networking issues
echo "127.0.0.1 $(hostname -s) $(hostname)" | sudo tee -a /etc/hosts
sudo sed -i 's/127.0.1.1/127.0.0.1/g' /etc/hosts

sudo mkdir /var/lib/hadoop-hdfs
sudo chown $(whoami) /var/lib/hadoop-hdfs/

echo "* hard nofile 1048576" | sudo tee -a /etc/security/limits.conf
echo "* soft nofile 1048576" | sudo tee -a /etc/security/limits.conf

export IMPALA_HOME="$(pwd)"

# LZO is not needed to compile or run Impala, but it is needed for the data load
sudo apt-get --yes install liblzo2-dev
cd ~
git clone https://github.com/cloudera/impala-lzo.git
ln -s impala-lzo Impala-lzo
git clone https://github.com/cloudera/hadoop-lzo.git
cd hadoop-lzo/
time -p ant package
cd "$IMPALA_HOME"

export MAX_PYTEST_FAILURES=0
source bin/impala-config.sh
export NUM_CONCURRENT_TESTS=$(nproc)
# Run the build, data load and tests in the background. The last line written to
# test-result.txt is the exit status.
(time -p ./buildall.sh -noclean -format -testdata -build_shared_libs ; echo $?) &>> test-result.txt&
tail -F test-result.txt
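
The script leaves a TODO for checking that there is enough disk space before the data load. One possible way to do that is sketched below. The helper name `has_free_space` is hypothetical, and the threshold in the example call is a placeholder, since the source does not say how much space the data load needs.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the filesystem holding the given path has
# at least the given number of gigabytes available.
has_free_space() {
  local path="$1" min_gb="$2"
  local avail_kb
  # df -P prints POSIX-format output and -k reports sizes in 1024-byte units;
  # the fourth column of the second line is the available space.
  avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example call. The 100 GB figure is a placeholder, not a measured requirement:
# has_free_space "$HOME" 100 || { echo "Not enough free disk space" >&2; exit 1; }
```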
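
The final command of the script appends the output of `echo $?` to test-result.txt after the build and tests finish, so the last line of that file is the exit status of buildall.sh. A small sketch for checking it afterwards follows. The helper names are hypothetical, and the result is only meaningful once the background job has completed, because until then the last line is ordinary build output.

```shell
#!/bin/bash
# Hypothetical helper: print the last line of the log, which is the exit
# status written by `echo $?` once the build and tests have finished.
build_exit_status() {
  tail -n 1 "$1"
}

# Hypothetical helper: succeed only if that final line is exactly "0".
build_succeeded() {
  [ "$(build_exit_status "$1")" = "0" ]
}

# Example call:
# build_succeeded test-result.txt && echo "build and tests passed"
```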