This document covers the process for managing Spark releases.
Prerequisites for Managing a Release
Create a GPG Key
```shell
$ sudo apt-get install gnupg
$ gpg --gen-key                     # Create new key
$ gpg --fingerprint                 # Get key digest
# Upload the digest to id.apache.org (from gpg --fingerprint)
$ gpg --send-key <KEY_ID>           # Distribute key to a public keyserver
$ gpg --output pwendell.asc --export -a <KEY_ID>
# Copy the public key to Apache web space, naming it <KEY_ID>.asc
# Create a FOAF file and add it via svn (see http://people.apache.org/foaf/)
#   -> it should include the key fingerprint
# Eventually the key will show up on the Apache people page
# (e.g. https://people.apache.org/keys/committer/pwendell.asc)
```
Get Access to Apache Nexus for Publishing Artifacts
- You have this access if and only if you can log in to repository.apache.org
- Install your LDAP credentials in your ~/.m2/settings.xml file, as described in the ASF publishing documentation
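A minimal sketch of the relevant `<servers>` entries in ~/.m2/settings.xml. The server ids shown (`apache.releases.https`, `apache.snapshots.https`) follow common ASF Maven conventions and the credential placeholders are illustrative; verify both against the official ASF publishing documentation before use.

```xml
<!-- Sketch only: server ids assumed from common ASF setups; check the ASF docs -->
<settings>
  <servers>
    <server>
      <id>apache.releases.https</id>
      <username>YOUR_ASF_LDAP_USERNAME</username>
      <password>YOUR_ASF_LDAP_PASSWORD</password>
    </server>
    <server>
      <id>apache.snapshots.https</id>
      <username>YOUR_ASF_LDAP_USERNAME</username>
      <password>YOUR_ASF_LDAP_PASSWORD</password>
    </server>
  </servers>
</settings>
```

Keep this file readable only by you (chmod 600), since it holds your LDAP password in plain text.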
Get "Push" Access to Apache Git Repository
Preparing the Code for a Release
Ensure Spark is Ready for a Release
- Check JIRA for remaining issues tied to the release
- Review and merge any blocking features
- Bump other remaining features to subsequent releases
- Ensure Spark versions are correct in the codebase
- Includes SBT/Maven builds, docs, and ec2 scripts
- See this example commit
- NOTE: The version in pom.xml files should be SPARK-VERSION_SCALA-VERSION-SNAPSHOT and will be changed automatically when cutting the release
- NOTE: The yarn-alpha module should have its version bumped here because it is not enabled when publishing
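As a quick sanity check on the version bump (a sketch, not part of the official process), you can list any pom.xml whose `<version>` does not match the expected SPARK-VERSION_SCALA-VERSION-SNAPSHOT string. The `demo` tree and `EXPECTED` value below are placeholders standing in for a real Spark checkout and version:

```shell
# Placeholder version string; in a real check use e.g. 0.9.0_2.10-SNAPSHOT
EXPECTED="0.9.0-SNAPSHOT"

# Demo tree standing in for a real checkout with multiple Maven modules
mkdir -p demo/core demo/examples
cat > demo/core/pom.xml <<EOF
<project><version>${EXPECTED}</version></project>
EOF
cat > demo/examples/pom.xml <<EOF
<project><version>${EXPECTED}</version></project>
EOF

# List any pom.xml that does NOT mention the expected version (grep -L)
MISMATCHED=$(grep -rL "<version>${EXPECTED}</version>" --include=pom.xml demo || true)
if [ -z "$MISMATCHED" ]; then
  echo "all poms OK"
else
  echo "version mismatch in: $MISMATCHED"
fi
# prints "all poms OK" for the demo tree above
```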
Check for dead links in the docs
```shell
$ cd $SPARK_HOME/docs
$ jekyll serve --watch
$ sudo apt-get install linkchecker
$ linkchecker -r 2 http://localhost:4000 --no-status --no-warnings
```
Checkout and Run Tests
```shell
$ git clone https://git-wip-us.apache.org/repos/asf/incubator-spark.git -b branch-0.8
$ cd incubator-spark
$ sbt/sbt assembly
$ export MAVEN_OPTS="-Xmx3g -XX:MaxPermSize=1g -XX:ReservedCodeCacheSize=1g"
$ mvn test
```
Run License Audit Tool
```shell
$ java -jar /path/to/apache-rat-0.10.jar --dir . --exclude "*.md" > rat_results.txt
$ vi rat_results.txt
$ # Look for source files that seem to be missing license headers
$ grep "???" rat_results.txt | grep -e '\.scala$' -e '\.java$' -e '\.py$' -e '\.sh$'
$ # Add missing headers if necessary
```
Create CHANGES.txt File
```shell
# Append to the CHANGES.txt file required by Apache.
# For a minor release, append to the existing CHANGES.txt file in the release branch.
# For a major release, copy the CHANGES.txt file from the last major release
# and append to it (shown below).
$ tail -n +3 CHANGES.txt > OLD_CHANGES.txt
$ echo "Spark Change Log" > CHANGES.txt
$ echo "" >> CHANGES.txt
$ echo "Release 0.9.0-incubating" >> CHANGES.txt
$ echo "" >> CHANGES.txt
$ # The command below is ugly; this will be much simpler
$ # once all PRs use the new merge format.
$ git log v0.8.0-incubating..HEAD \
    --grep "pull request" \
    --pretty="QQ %h %cd%nQQ %s%nQQ QQQ%b%nQQ" \
  | grep QQ | sed s/QQ// | sed "s/^ QQQ\(.*\)$/ [\1]/" >> CHANGES.txt
$ cat OLD_CHANGES.txt >> CHANGES.txt
$ rm OLD_CHANGES.txt
$ git add CHANGES.txt && git commit -m "Change log for release 0.9.0-incubating"
```
Cutting a Release Candidate
Overview
Cutting a release candidate involves two steps. First, we use the Maven release plug-in to create a release commit (a single commit where all of the version files have the correct number) and publish the code associated with that release to a staging repository in Maven. Second, we check out that release commit and package binary releases and documentation.
Release Script
- The process of creating releases has been automated via this release script
- Read and understand the script fully before you execute it. It will cut a Maven release, build binary releases and documentation, then copy the binary artifacts to a staging location on people.apache.org.
- NOTE: You must use git 1.7.x for this, or else you'll hit a nasty known bug
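Given the git 1.7.x requirement above, a small guard can fail fast before the release script does any work. This is a hypothetical helper for illustration, not part of the actual release script:

```shell
# Hypothetical guard (not in the real release script): refuse to proceed on
# git versions other than 1.7.x, which the release process requires.
check_git_version() {
  case "$1" in
    1.7.*) echo "OK" ;;
    *)     echo "unsupported git version: $1 (need 1.7.x)" ;;
  esac
}

# Example: check the locally installed git (empty string if git is missing)
check_git_version "$(git --version 2>/dev/null | awk '{print $3}')"
```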
Auditing a Staged Release
- The process of auditing a staged release has been automated via this release audit script
Moved permanently to http://spark.apache.org/release-process.html