...

  • The process of cutting a release requires a number of tools to be installed locally (Maven, Jekyll, etc.). Ubuntu users can install those tools via apt-get. However, it may be most convenient to use an EC2 instance based on the AMI ami-8e98edbe (available in US-West; has Scala 2.10.3 and SBT 0.13.1 installed), which has all the necessary tools installed. Mac users in particular are encouraged to use an EC2 instance rather than attempting to install all the necessary tools locally. If you want to prepare your own EC2 instance (different version of Scala, SBT, etc.), follow the steps given in the Miscellaneous section at the end of this document.
  • Consider using CPU-optimized instances, which may provide better bang for the buck.
  • Transfer your GPG keys from your home machine to the EC2 instance.

    Code Block
    languagebash
    # == On home machine ==
    gpg --list-keys  # Identify the KEY_ID of the key you generated
    gpg --output pubkey.gpg --export <KEY_ID>
    gpg --output - --export-secret-key <KEY_ID> | cat pubkey.gpg - | gpg --armor --output keys.asc --symmetric --cipher-algo AES256
    # Copy keys.asc to EC2 instance
     
    # == On EC2 machine ==
    gpg --no-use-agent --output - keys.asc | gpg --import
    rm keys.asc
    gpg --list-keys  # Confirm your key is present
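As a quick sanity check of the symmetric encryption step above, you can round-trip a throwaway file with the same cipher. This sketch is not part of the official process; the file name and passphrase are placeholders, and the `--pinentry-mode loopback` flag assumes GnuPG 2.1 or later.

```shell
# Round-trip a dummy file through the same AES256 symmetric encryption
# used for the key bundle above (all names here are throwaway placeholders)
echo "hello" > demo.txt
gpg --batch --yes --passphrase demo --pinentry-mode loopback \
    --symmetric --cipher-algo AES256 --armor --output demo.asc demo.txt
gpg --batch --yes --passphrase demo --pinentry-mode loopback \
    --decrypt demo.asc    # should print the original "hello"
rm demo.txt demo.asc
```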
  • Install the private key that gives you password-less access to the Apache webspace.
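One convenient way to set this up is an entry in `~/.ssh/config` on the EC2 instance. This is a sketch, not part of the official process; the host alias, user name, and key path below are assumptions to adapt to your setup.

```
# ~/.ssh/config -- illustrative entry for password-less access to Apache webspace
# (alias, user name, and key path are placeholders)
Host apache
    HostName people.apache.org
    User <your-apache-username>
    IdentityFile ~/.ssh/apache_id_rsa
```

With this in place, `ssh apache` should log in without prompting for a password (run `ssh-add` first if your key has a passphrase).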

  • Set git user name and email (these are going to appear as the committer in the release commits).

    Code Block
    languagebash
    $ git config --global user.name "Tathagata Das"
    $ git config --global user.email tathagata.das1565@gmail.com
  • Check out the version of Spark that has the right release-related scripts. For instance, to check out the master branch, run "git clone https://git-wip-us.apache.org/repos/asf/spark.git".

  • If you want to run the Spark tests, you will also have to install a Fortran library to run the MLlib tests. Check out the error while running MLlib tests for instructions on installing the necessary libraries.

    • TODO: The AMI should be updated with this installed.
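On Ubuntu the missing piece is typically the Fortran runtime used by MLlib's native BLAS/LAPACK path. A sketch, assuming the package name for this era of Ubuntu (it may differ on newer releases):

```shell
# Install the Fortran runtime needed by the MLlib tests
# (the package name libgfortran3 is an assumption for this Ubuntu release)
sudo apt-get update
sudo apt-get install -y libgfortran3
```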

...

Code Block
languagebash
# Install necessary tools
$ sudo apt-get update --fix-missing
$ sudo apt-get install -y git openjdk-6-jdk maven rubygems python-epydoc gnupg-agent linkchecker libgfortran3
 
# Install Scala of the same version as that used by Spark
$ cd
$ wget http://www.scala-lang.org/files/archive/scala-2.10.3.tgz  
$ tar xvzf scala*.tgz
$ ln -s scala-2.10.3 scala

# Install SBT of a version compatible with the SBT of Spark (at least 0.13.1)
$ cd && mkdir sbt
$ cd sbt 
$ wget http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/0.13.1/sbt-launch.jar
# Create /home/ubuntu/sbt/sbt with the following code
$ cat > /home/ubuntu/sbt/sbt <<'EOF'
#!/usr/bin/env bash
SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=256M"
java $SBT_OPTS -jar `dirname $0`/sbt-launch.jar "$@"
EOF
$ chmod u+x /home/ubuntu/sbt/sbt
 
# Add stuff to ~/.bashrc
$ echo "export SCALA_HOME=/home/ubuntu/scala/" >> ~/.bashrc 
$ echo "export SBT_HOME=/home/ubuntu/sbt/" >> ~/.bashrc 
$ echo "export MAVEN_OPTS='-Xmx3g -XX:MaxPermSize=1g -XX:ReservedCodeCacheSize=1g'" >> ~/.bashrc
$ echo 'export PATH=$SCALA_HOME/bin/:$SBT_HOME:$PATH' >> ~/.bashrc
$ source ~/.bashrc
 
# Verify Scala and SBT
sbt sbt-version  # Forces the download of SBT dependencies and prints the SBT version; verify that it is >= 0.13.1
scala -version   # Verify that the Scala version matches the one used by Spark