...

It is highly recommended that you understand the contents of the script before proceeding. This script uses the Maven release plugin and can be broken down into four steps. In the likely event that one of the steps fails, you may restart from the step that failed instead of running the whole script again.

 

 

  1. Run mvn release:prepare. This updates all pom.xml versions and cuts a new tag (e.g. v1.1.1-rc1). If this step succeeds, you will find the remote tag here, along with the following commit pushed in your name to the release branch: [maven-release-plugin] prepare release v1.1.1-rc1 (see this example commit).
  2. Run mvn release:perform. This builds Spark from the tag cut in the previous step using the spark/release.properties produced. If this step succeeds, you will find the following commit pushed in your name to the release branch, but NOT in the release tag: [maven-release-plugin] prepare for the next development iteration (see this example commit). You will also find that the release.properties file has been removed. (Both Maven invocations are sketched after this list.)
  3. Package binary distributions. This runs the make-distribution.sh script for each distribution in parallel. If this step succeeds, you will find the archive, signing key, and checksum information for each distribution in the directory in which the create-release.sh script was run. You should NOT find a sub-directory named after any of the distributions, as these should have been removed. In case of failure, use the binary-release-*.log files generated to determine the cause. On a re-run, you may skip the previous steps and re-make only the distributions that failed by commenting out parts of the script.
  4. Compile documentation. This step generates the documentation with jekyll and copies it to your public_html folder in your Apache account. If this step succeeds, you should be able to browse the docs under http://people.apache.org/~<USER> (see this example link).
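For reference, a minimal sketch of the Maven invocations behind steps 1 and 2. This is only an outline; the actual create-release.sh may pass additional flags and properties not shown here.

Code Block
languagebash
# Step 1: bump pom.xml versions and cut the release tag (e.g. v1.1.1-rc1)
$ mvn release:prepare

# Step 2: build Spark from the tag, using the release.properties produced
$ mvn release:perform

# If release:prepare fails partway, re-running it resumes from where it
# stopped. To start over from scratch instead, clean up its state first:
$ mvn release:clean   # removes release.properties and backup poms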

...

On a c3.4xlarge machine in us-west-2, this process is expected to take 2-4 hours. After the script has completed, you must find the open staging repository in Apache Nexus to which the artifacts were uploaded, and close it. Wait a few minutes for the closing to succeed. Now all staged artifacts are public!

 (Optional) In the event that you need to roll back the entire process and start again, you will need to run the following steps. This is necessary if, for instance, you used a faulty GPG key, new blockers arise, or the vote failed.

...

Code Block
languagebash
# The script must be run from the audit-release directory
$ cd release-spark/dev/audit-release
$ vim audit-release.py
$ ./audit-release.py

 The release auditor will test example builds against the staged artifacts, verify signatures, and check for common mistakes made when cutting a release. This is expected to finish in less than an hour.

Note that it is entirely possible for the dependency requirements of the applications to be outdated. It is reasonable to continue with the current release candidate if small changes to the applications (such as adding a repository) are sufficient to fix the test failures (see this example commit for changes in build.sbt files; an illustrative resolver change is sketched below). Also, there is a known issue with the "Maven application" test in which the build fails but the test actually succeeds. This has been failing since 1.1.0.
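As an illustration, such a fix usually amounts to pointing the application's build at the staging repository. A hedged sketch follows; the repository number orgapachespark-XXXX is a placeholder, so substitute the URL of your actual staging repository.

Code Block
languagebash
# Append a resolver to an application's build.sbt so it can fetch the
# staged artifacts (placeholder URL; use your real staging repository)
$ echo 'resolvers += "Apache Spark Staging" at "https://repository.apache.org/content/repositories/orgapachespark-XXXX/"' >> build.sbt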

 

 

...

The process of auditing a release has been automated via this release audit script.

  • Find the staging repository in Apache Nexus to which the artifacts were uploaded.
  • Configure the script by specifying the version number to audit, the key ID of the signing key, and the URL of the staging repository.
  • The script must be run from the directory that contains it (dev/audit-release).
  • Make sure sbt is installed and is at least version 0.13.5. apt-get will likely give you an outdated version, so it is best to download the Debian package and install it manually (see the sketch after this list).
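A hedged sketch of one way to do this. The package URL reflects where sbt 0.13.x Debian packages were hosted at the time and is an assumption; check the sbt website for the current location.

Code Block
languagebash
# Check the installed version first
$ sbt sbt-version

# If it is older than 0.13.5, install the Debian package directly
# (URL is an assumption; adjust to the current sbt download location)
$ wget http://dl.bintray.com/sbt/debian/sbt-0.13.5.deb
$ sudo dpkg -i sbt-0.13.5.deb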

Call a Vote on the Release Candidate

The release voting takes place on the Apache Spark developers list (the PMC is voting). Look at past vote threads to see how this goes. They should look like the draft below.

  • Make a shortened link to the full list of JIRAs using http://s.apache.org/
  • If possible, attach a draft of the release notes with the e-mail.
  • Make sure the voting closing time is in UTC format; use this script to generate it (a stand-in one-liner is sketched below).
  • Make sure the email is in text format and the links are correct.
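The wiki links a dedicated script for generating the closing time. As a stand-in, a one-liner along these lines produces a UTC close time 72 hours out (the customary Apache vote window):

Code Block
languagebash
# Print a vote close time 72 hours from now, in UTC
$ python -c "import datetime; print(datetime.datetime.utcnow() + datetime.timedelta(hours=72))"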

...

Panel
borderColorblack
borderStylesolid
title\[VOTE\] Release Apache Spark 0.9.1 (rc1)


Once the vote is done, you should also send out a summary email with the totals, with a subject that looks something like "[RESULT] [VOTE]...".

Finalize the Release

Warning
titleBe Careful!

THIS STEP IS IRREVERSIBLE so make sure you selected the correct staging repository. Once you move the artifacts into the release folder, they cannot be removed.

 After the vote passes, find the staging repository in Apache Nexus, click Release, and confirm. To publish the binaries, you must first upload them to the dev directory in the Apache Distribution repo, and then move them from the dev directory to the release directory. This "move" is the only way to add artifacts to the actual release directory.

 

Code Block
languagebash
# Check out the Spark directory in the Apache distribution SVN "dev" repo
$ svn co https://dist.apache.org/repos/dist/dev/spark/
 
# Make a directory for this RC in the above directory
$ mkdir spark-1.1.1-rc2

# Download the voted binaries and add them to the directory
$ scp andrewor14@people.apache.org:~/public_html/spark-1.1.1-rc2/* spark-1.1.1-rc2

# NOTE: Remove any binaries you don't want to publish
# E.g. never push MapR and *without-hive artifacts to apache
$ rm spark-1.1.1-rc2/*mapr*
$ rm spark-1.1.1-rc2/*without-hive*
$ svn add spark-1.1.1-rc2
$ svn commit -m "Add spark-1.1.1-rc2" --username "andrewor14"
 
# Move the sub-directory in "dev" to the
# corresponding directory in "release"
$ export SVN_EDITOR=vim
$ svn mv https://dist.apache.org/repos/dist/dev/spark/spark-1.1.1-rc2 https://dist.apache.org/repos/dist/release/spark/spark-1.1.1

Verify that the resources are present at http://www.apache.org/dist/spark/ (a quick check is sketched below). It may take a while for them to become visible. They will be mirrored throughout the Apache network. There are a few remaining steps.
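A plain HTTP check is enough to poll for this; nothing here is specific to the release scripts:

Code Block
languagebash
# Prints 200 once the release directory is visible on the main server
$ curl -s -o /dev/null -w "%{http_code}\n" http://www.apache.org/dist/spark/spark-1.1.1/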

Remove Old Releases from Mirror Network

Spark always keeps two releases in the mirror network: the most recent release on the current and previous branches. To delete older versions, simply use svn rm. The downloads.js file in the website js/ directory must also be updated to reflect the changes (a sketch follows the code block below). For instance, the two releases should be 1.1.1 and 1.0.2, not 1.1.1 and 1.1.0.

Code Block
languagebash
$ svn rm https://dist.apache.org/repos/dist/release/spark/spark-1.1.0
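A hedged sketch of the corresponding website edit. The path to downloads.js within the site checkout is an assumption based on the js/ directory mentioned above; adjust to where the file actually lives.

Code Block
languagebash
# Check out the website and update the list of downloadable releases
$ svn co https://svn.apache.org/repos/asf/spark spark-site
$ vim spark-site/site/js/downloads.js   # drop 1.1.0; keep 1.1.1 and 1.0.2
$ svn commit -m "Remove references to Spark 1.1.0" spark-site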


...

 

Roll Back Release Candidates

If a release candidate does not pass, it is necessary to roll back the commits which advanced Spark's versioning.

Code Block
languagebash
# Check out the release branch from the Apache repo
$ git checkout branch-0.9
 
# Delete earlier tag. If you are using RC-based tags (v0.9.1-rc1) then skip this.
$ git tag -d v0.9.1
$ git push origin :v0.9.1

# Revert changes made by the Maven release plugin 
$ git revert HEAD --no-edit    # revert dev version commit
$ git revert HEAD~2 --no-edit  # revert release commit
$ git push apache HEAD:branch-0.9

 


 

Packaging and Wrap-Up for the Release

Update the Spark Apache Repository

Check out the tagged commit for the release candidate that passed and apply the correct version tag.

Code Block
languagebash
$ git checkout v1.1.1-rc2  # the RC that passed
$ git tag v1.1.1
$ git push apache v1.1.1
# Verify that the tag has been applied correctly
# If so, remove the old tag
$ git push apache :v1.1.1-rc2
$ git tag -d v1.1.1-rc2

Next, update remaining version numbers in the release branch. If you are doing a patch release, see the similar commit made after the previous release in that branch. For example, for branch 1.0, see this example commit.

In general, the rules are as follows (grep through the repository to find such occurrences):

  • References to the version just released: upgrade them to the next release version. If it is not a documentation-related version (e.g. inside spark/docs/ or inside spark/python/epydoc.conf), add -SNAPSHOT to the end.
  • References to the next version: ensure these already have -SNAPSHOT. (A grep sketch follows this list.)
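A minimal sketch of that grep pass, using the 1.1.1 release as the running example; the version numbers are placeholders for whatever release you are cutting.

Code Block
languagebash
# Occurrences of the version just released
# (most should become the next version, e.g. 1.1.2-SNAPSHOT)
$ grep -rn "1\.1\.1" . --include=pom.xml

# Occurrences of the next version; this flags any missing -SNAPSHOT
$ grep -rn "1\.1\.2" . | grep -v SNAPSHOT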

Update the EC2 Scripts

Upload the binary packages to the S3 bucket s3n://spark-related-packages (ask pwendell to do this). Then, change the init scripts in the mesos/spark-ec2 repository to pull the new binaries (see this example commit).

  • For Spark 1.1+, update branch v4+
  • For Spark 1.0, update branch v3+
  • For Spark 0.9, update branch v2+

You can audit the EC2 set-up by launching a cluster and running this audit script. Make sure you create the cluster with the default instance type (m1.xlarge), as sketched below.
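A hedged sketch of such a test launch; the key pair, key file, and cluster name are placeholders, and spark-ec2 is assumed to live under ec2/ in the Spark checkout.

Code Block
languagebash
# Launch a small test cluster with the default instance type (m1.xlarge)
$ ./ec2/spark-ec2 -k <key-pair-name> -i <key-file>.pem -s 2 launch release-audit

# Tear the cluster down once the audit is complete
$ ./ec2/spark-ec2 destroy release-audit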

Update the Spark Website

The website repository is located at https://svn.apache.org/repos/asf/spark. Ensure the docs were generated with the PRODUCTION=1 environment variable and with Java 7.

 

Code Block
languagebash
# Build the latest docs
$ git checkout v1.1.1
$ cd docs
$ JAVA_HOME=$JAVA_7_HOME PRODUCTION=1 jekyll build

# Copy the new documentation to apache
$ svn co https://svn.apache.org/repos/asf/spark
$ cp -R _site spark/site/docs/1.1.1

# Update the "latest" link
$ cd spark/site/docs
$ rm latest
$ ln -s 1.1.1 latest

 

Next, update the rest of the Spark website. See how the previous releases are documented. In particular, have a look at the changes to the *.md files in this commit (all the HTML file changes are generated by jekyll).

Code Block
languagebash
$ PRODUCTION=1 jekyll build

 

Code Block
languagebash
$ svn add 1.1.1
$ svn commit -m "Add docs for Spark 1.1.1" --username "andrewor14"

 

Then, create the release notes. The following commands create a list of contributors and identify large patches. Extra care must be taken to make sure commits from previous releases are not counted, since Git cannot easily associate commits that were backported into different branches.

Code Block
languagebash
# Determine PR numbers closed only in the new release
$ git log v1.1.1 | grep "Closes #" | cut -d " " -f 5,6 | grep Closes | sort > closed_1.1.1
$ git log v1.1.0 | grep "Closes #" | cut -d " " -f 5,6 | grep Closes | sort > closed_1.1.0
$ diff --new-line-format="" --unchanged-line-format="" closed_1.1.1 closed_1.1.0 > diff.txt

# Grep expression with all new patches
$ EXPR=$(cat diff.txt | awk '{ print "\\("$1" "$2" \\)"; }' | tr "\n" "|" | sed -e "s/|/\\\|/g" | sed "s/\\\|$//")

# Contributor list
$ git shortlog v1.1.1 --grep "$EXPR" > contrib.txt

# Large patch list (300+ lines)
$ git log v1.1.1 --grep "$EXPR" --shortstat --oneline | grep -B 1 -e "[3-9][0-9][0-9] insert" -e "[1-9][1-9][1-9][1-9] insert" | grep SPARK > large-patches.txt

Then, update the downloads page, and then the main page with a news item.

Create an Announcement

...

Once everything is working (EC2 scripts, website docs, website changes), create an announcement on the website and then send an e-mail to the mailing list. Enjoy an adult beverage of your choice, and congratulations on making a Spark release.

 

...

Miscellaneous

This section contains legacy information that was not used for the Spark 1.1.1 release. You may find it useful, but it is certainly not necessary to complete the release.


Steps to create the AMI useful for making releases
Code Block
languagebash
# Install necessary tools
$ sudo apt-get update --fix-missing
$ sudo apt-get install -y git openjdk-7-jdk openjdk-6-jdk maven rubygems python-epydoc gnupg-agent linkchecker libgfortran3
 
# Install Scala of the same version as that used by Spark
$ cd
$ wget http://www.scala-lang.org/files/archive/scala-2.10.3.tgz  
$ tar xvzf scala*.tgz
$ ln -s scala-2.10.3 scala

# Install SBT of a version compatible with the SBT of Spark (at least 0.13.1)
$ cd && mkdir sbt
$ cd sbt 
$ wget http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/sbt-launch/0.13.1/sbt-launch.jar
# Create /home/ubuntu/sbt/sbt with the following code
	SBT_OPTS="-Xms512M -Xmx1536M -Xss1M -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=256M"
	java $SBT_OPTS -jar `dirname $0`/sbt-launch.jar "$@"
$ chmod u+x /home/ubuntu/sbt/sbt
 
# Add stuff to ~/.bashrc
$ echo "export SCALA_HOME=/home/ubuntu/scala/" >> ~/.bashrc 
$ echo "export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64" >> ~/.bashrc 
$ echo "export JAVA_7_HOME=/usr/lib/jvm/java-7-openjdk-amd64" >> ~/.bashrc 
$ echo "export SBT_HOME=/home/ubuntu/sbt/" >> ~/.bashrc 
$ echo "export MAVEN_OPTS='-Xmx3g -XX:MaxPermSize=1g -XX:ReservedCodeCacheSize=1g'" >> ~/.bashrc
$ echo "export PATH='$SCALA_HOME/bin/:$SBT_HOME:$PATH'" >> ~/.bashrc
$ source ~/.bashrc
 
# Verify versions
$ java -version    # both Java 1.6 and Java 1.7 should be installed, but JAVA_HOME should point to Java 1.6
$ sbt sbt-version  # forces the download of SBT dependencies and prints the SBT version; verify that it is >= 0.13.1
$ scala -version   # verify that the Scala version is the same as the one used by Spark