...

You will need a GPG key to sign your artifacts (http://apache.org/dev/release-signing). If you are using the provided AMI, this is already installed. Otherwise, you can install it with sudo apt-get install gnupg on Ubuntu or from http://gpgtools.org on Mac OS X.

Code Block
languagebash
## CREATING A KEY
 
# Create new key. Make sure it uses RSA and 4096 bits
# Password is optional. DO NOT SET EXPIRATION DATE!
$ gpg --gen-key

# Confirm that key is successfully created
# If there is more than one key, be sure to set the default
# key through ~/.gnupg/gpg.conf
$ gpg --list-keys

## PUBLISHING THE KEY
# Generate public key to distribute to the GPG network
# <KEY_ID> is the 8-digit HEX characters next to "pub 4096R"
$ gpg --output <KEY_ID>.asc --export -a <KEY_ID>

# Copy generated key to Apache web space
# Eventually, key will show up on Apache people page
# (see https://people.apache.org/keys/committer/andrewor14.asc)
$ scp <KEY_ID>.asc <USER>@people.apache.org:~/

# Distribute the public key to a key server
$ gpg --send-key <KEY_ID>

 
# Log into http://id.apache.org and add your key fingerprint.
# To generate a key fingerprint:
$ gpg --fingerprint

# Add your key to the Spark KEYS file
$ svn co https://dist.apache.org/repos/dist/release/spark && cd spark
$ gpg --list-sigs <EMAIL> && gpg --armor --export <KEY_ID> >> KEYS
$ svn commit -m "Adding key to Spark KEYS file"
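
Once the key is in the KEYS file, you can sanity-check it by signing and verifying a test artifact with standard GPG commands (the file name below is just a placeholder):

Code Block
languagebash
# Sign a release artifact (produces spark-1.1.1-bin.tgz.asc)
$ gpg --armor --detach-sign spark-1.1.1-bin.tgz

# Verify the detached signature against the artifact
$ gpg --verify spark-1.1.1-bin.tgz.asc spark-1.1.1-bin.tgz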

(Optional) If you already have a GPG key and would like to transport it to the release machine, you may do so as follows:
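
As a minimal sketch using standard GPG commands (host, user, and file names are placeholders):

Code Block
languagebash
# On the original machine: export the private key (keep this file safe!)
$ gpg --export-secret-keys -a <KEY_ID> > private.asc
$ scp private.asc <USER>@<RELEASE_MACHINE>:~/

# On the release machine: import the key and delete the exported file
$ gpg --import private.asc && rm private.asc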

...

Spark always keeps two releases in the mirror network: the most recent release on each of the current and previous branches. To delete older versions, simply use svn rm. The downloads.js file in the website's js/ directory must also be updated to reflect the changes. For instance, after releasing 1.1.1, the two hosted releases should be 1.1.1 and 1.0.2, not 1.1.1 and 1.1.0.

Code Block
languagebash
$ svn rm https://dist.apache.org/repos/dist/release/spark/spark-1.1.0
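
The corresponding downloads.js change usually amounts to removing or updating the entry for the deleted release; a sketch of locating it in the website checkout (the grep pattern is illustrative):

Code Block
languagebash
# Find the stale entry in the website checkout
$ svn co https://svn.apache.org/repos/asf/spark && cd spark
$ grep -n "1.1.0" js/downloads.js
# Remove or update the matching entry in your editor, then commit
$ svn commit -m "Remove Spark 1.1.0 from the downloads page"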

Update the Spark Apache Repository

Check out the tagged commit for the release candidate that passed and apply the correct version tag.

Code Block
languagebash
$ git checkout v1.1.1-rc2 # the RC that passed
$ git tag v1.1.1
$ git push apache v1.1.1

# Verify that the tag has been applied correctly
# If so, remove the old tag
$ git push apache :v1.1.1-rc2
$ git tag -d v1.1.1-rc2
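
To verify the tag before deleting the RC tag, you can list the tags on the apache remote (plain git, nothing release-specific):

Code Block
languagebash
# Both v1.1.1 and v1.1.1-rc2 should point at the same commit
$ git ls-remote --tags apache | grep "v1.1.1"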

Next, update remaining version numbers in the release branch. If you are doing a patch release, see the similar commit made after the previous release in that branch. For example, for branch 1.0, see this example commit.

In general, the rules are as follows:

  • Grep through the repository to find such occurrences (a sample command is sketched after this list)
  • References to the version just released: upgrade them to the next release version. If it is not a documentation-related version (e.g. inside spark/docs/ or spark/python/epydoc.conf), add -SNAPSHOT to the end.
  • References to the next version: ensure these already have -SNAPSHOT.
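
As a rough sketch, something like the following can surface version strings that still need attention (the patterns are illustrative; treat the output as a starting point, not an authoritative list):

Code Block
languagebash
# Find files still referencing the version just released
$ grep -rl "1\.1\.1" . | grep -v -e "^\./docs" -e target

# Check that references to the next version already carry -SNAPSHOT
$ grep -r "1\.1\.2" . | grep -v SNAPSHOT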

Update the EC2 Scripts

Upload the binary packages to the S3 bucket s3n://spark-related-packages (ask pwendell to do this). Then, change the init scripts in the mesos/spark-ec2 repository to pull the new binaries (see this example commit).

  • For Spark 1.1+, update branch v4+
  • For Spark 1.0, update branch v3+
  • For Spark 0.9, update branch v2+

You can audit the EC2 setup by launching a cluster and running this audit script. Make sure you create the cluster with the default instance type (m1.xlarge).
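
As a sketch, launching and tearing down a small audit cluster might look like this (key pair and identity file are placeholders):

Code Block
languagebash
# Launch a small cluster with the default instance type
$ cd ec2
$ ./spark-ec2 -k <KEY_PAIR> -i <PEM_FILE> -s 2 launch release-audit

# ... run the audit script against the cluster, then tear it down
$ ./spark-ec2 -k <KEY_PAIR> -i <PEM_FILE> destroy release-audit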

Update the Spark Website

The website repository is located at https://svn.apache.org/repos/asf/spark. Ensure the docs were generated with the PRODUCTION=1 environment variable and with Java 7.

 

Code Block
languagebash
# Build the latest docs
$ git checkout v1.1.1
$ cd docs
$ JAVA_HOME=$JAVA_7_HOME PRODUCTION=1 jekyll build

# Copy the new documentation to apache
$ svn co https://svn.apache.org/repos/asf/spark
$ cp -R _site spark/site/docs/1.1.1

# Update the "latest" link
$ cd spark/site/docs
$ rm latest
$ ln -s 1.1.1 latest

 

Next, update the rest of the Spark website. See how the previous releases are documented. In particular, have a look at the changes to the *.md files in this commit (all the HTML file changes are generated by jekyll).
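
Since the HTML is generated, edit only the markdown sources and let jekyll rebuild the rest, for example:

Code Block
languagebash
# Regenerate the site HTML after editing the *.md sources
$ cd spark    # the svn checkout of the website
$ jekyll build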

 

Code Block
languagebash
$ svn add 1.1.1
$ svn commit -m "Add docs for Spark 1.1.1" --username "andrewor14"

 

Then, create the release notes. The contributors list can be automatically generated through this script. It accepts the tag that corresponds to the current release and another tag that corresponds to the previous release (not including maintenance releases). For instance, if you are releasing Spark 1.2.0, set the current tag to v1.2.0-rc2 and the previous tag to v1.1.0. Once you have generated the initial contributors list, it is highly likely that there will be warnings about author names not being properly translated. To fix this, run this other script, which fetches potential replacements from GitHub and JIRA. For instance:

Code Block
languagebash
$ cd release-spark/dev/create-release
# Set RELEASE_TAG and PREVIOUS_RELEASE_TAG
$ vim generate-contributors.py
# Generate initial contributors list, likely with warnings
$ ./generate-contributors.py
# Set JIRA_USERNAME, JIRA_PASSWORD, and GITHUB_API_TOKEN
$ vim translate-contributors.py
# Translate names generated in the previous step, reading from known_translations if necessary
$ ./translate-contributors.py

Additionally, if you wish to give more specific credit to the developers of larger patches, you may use the following commands to identify them. Take extra care to ensure that commits from previous releases are not counted, since git cannot easily associate commits that were backported into different branches.

Code Block
languagebash
# Determine PR numbers closed only in the new release
$ git log v1.1.1 | grep "Closes #" | cut -d " " -f 5,6 | grep Closes | sort > closed_1.1.1
$ git log v1.1.0 | grep "Closes #" | cut -d " " -f 5,6 | grep Closes | sort > closed_1.1.0
$ diff --new-line-format="" --unchanged-line-format="" closed_1.1.1 closed_1.1.0 > diff.txt

# Grep expression with all new patches
$ EXPR=$(cat diff.txt | awk '{ print "\\("$1" "$2" \\)"; }' | tr "\n" "|" | sed -e "s/|/\\\|/g" | sed "s/\\\|$//")

# Contributor list
$ git shortlog v1.1.1 --grep "$EXPR" > contrib.txt

# Large patch list (300+ lines)
$ git log v1.1.1 --grep "$EXPR" --shortstat --oneline | grep -B 1 -e "[3-9][0-9][0-9] insert" -e "[1-9][0-9][0-9][0-9] insert" | grep SPARK > large-patches.txt

Then, update the downloads page, followed by the main page with a news item.
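
For the news item, the simplest approach is to copy an existing post and adjust it; a sketch, assuming the website keeps dated markdown posts under news/_posts (verify against the actual checkout):

Code Block
languagebash
# Copy an existing news post as a template and edit it
$ cd spark
$ cp news/_posts/<EXISTING_POST>.md news/_posts/<DATE>-spark-1-1-1-released.md
$ vim news/_posts/<DATE>-spark-1-1-1-released.md
$ jekyll build
$ svn add news/_posts/<DATE>-spark-1-1-1-released.md && svn commit -m "Add news item for Spark 1.1.1"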

Create an Announcement

Once everything is working (EC2 scripts, website docs, website changes), create an announcement on the website and then send an e-mail to the mailing list. Enjoy an adult beverage of your choice, and congratulations on making a Spark release.

Miscellaneous

This section contains legacy information that was not used for the Spark 1.1.1 release. You may find it useful, but it is certainly not necessary to complete the release.


Steps to create the AMI useful for making releases

...