...

This guide is optional for contributors. It is not necessary to use GitHub to contribute patches.


There are several ways to set up Git for committers and contributors. Contributors can safely set up Git any way they choose, but committers should take extra care since they can push new commits to the trunk at Apache, and various policies there make backing out mistakes problematic. To keep the commit history clean, take note of the use of `--squash` below when merging into `apache/trunk`.

Git setup for Committers

This describes setup for one local repo and two remotes. It allows you to push the code on your machine to either your GitHub repo or to git-wip-us.apache.org. You will want to fork GitHub's apache/hadoop to your own account on GitHub; this will enable pull requests of your own. Cloning this fork locally will set up "origin" to point to your remote fork on GitHub as the default remote. So if you perform `git push origin trunk` it will go to GitHub.

To attach to the Apache git repo do the following:

...

Start by reading https://help.github.com/articles/checking-out-pull-requests-locally/.
Remember that a pull request is equivalent to a remote GitHub branch with potentially a multitude of commits. In this case it is recommended to squash the remote commit history to have one commit per issue, rather than merging in a multitude of a contributor's commits. To do that, and to close the PR at the same time, it is recommended to use squash commits.

Merging a pull request is equivalent to a "pull" of a contributor's branch:

Code Block
git checkout trunk      # switch to local trunk branch
git pull apache trunk   # fast-forward to current remote HEAD
git pull --squash https://github.com/cuser/hadoop cbranch  # merge to trunk



The `--squash` option ensures all PR history is squashed into a single commit, and allows the committer to use his/her own message. Read the git help for merge or pull for more information about the `--squash` option. In this example we assume that the contributor's GitHub handle is "cuser" and the PR branch name is "cbranch". Next, resolve conflicts, if any, or ask the contributor to rebase on top of trunk if the PR has gone out of sync.
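
To see what the squash does before trying it on a real PR, the following is a minimal sketch that runs entirely in a throwaway local repository. The directory, the file name `file.txt`, the demo identity, and the three "wip" commits are assumptions made up for illustration; only the branch names "trunk" and "cbranch" come from the example above. It uses `git merge --squash` on a local branch, which is the merge step that `git pull --squash` performs after fetching.

```shell
#!/bin/sh
# Throwaway demo: a squash merge collapses a multi-commit branch into one commit.
set -e
demo=$(mktemp -d)                      # scratch directory, safe to delete afterwards
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/trunk # name the initial branch "trunk"
git config user.email "demo@example.invalid"   # hypothetical identity for the demo
git config user.name "Demo Committer"

echo "base" > file.txt
git add file.txt
git commit -q -m "initial commit"

# Simulate a contributor's PR branch with three small commits.
git checkout -q -b cbranch
for i in 1 2 3; do
    echo "change $i" >> file.txt
    git commit -q -a -m "wip $i"
done

# Squash-merge into trunk: the combined change is staged but not committed,
# so the committer supplies a single message of their own.
git checkout -q trunk
git merge -q --squash cbranch
git commit -q -m "HADOOP-XXXX description (cuser via your-apache-id)"

trunk_commits=$(git rev-list --count trunk)
echo "commits on trunk: $trunk_commits"
```

The script should report two commits on trunk (the initial commit plus the single squashed one), while `file.txt` contains all three of the contributor's changes.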

If you are ready to merge your own (committer's) PR you probably only need to merge (not pull), since you have a local copy that you've been working on. This is the branch that you used to create the PR.

Code Block
git checkout trunk      # switch to local trunk branch
git pull apache trunk   # fast-forward to current remote HEAD
git merge --squash feature/hadoop-xxxx



Remember to run regular patch checks, build with tests enabled, and change CHANGES.TXT (not applicable for Hadoop versions 2.8.0 and later) for the appropriate part of the project.

If everything is fine, you can now commit the squashed request along these lines:

Code Block
git commit -a -m "HADOOP-XXXX description (cuser via your-apache-id) closes apache/hadoop#ZZ"

...

When we want to reject a PR (close without committing), we can just issue an empty commit on trunk HEAD without merging the PR:

Code Block
git commit --allow-empty -m "closes apache/hadoop#ZZ *Won't fix*"
git push apache trunk
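
As a rough illustration of why this is safe, the sketch below makes an empty commit in a throwaway local repository and checks that it leaves the tree untouched. The scratch directory, the file name, and the demo identity are assumptions for illustration; there is no remote here, so nothing is pushed, and the `ZZ` placeholder is kept as-is.

```shell
#!/bin/sh
# Throwaway demo: an --allow-empty commit adds history but changes no files.
set -e
demo=$(mktemp -d)                      # scratch directory, safe to delete afterwards
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/trunk
git config user.email "demo@example.invalid"   # hypothetical identity for the demo
git config user.name "Demo Committer"

echo "base" > file.txt
git add file.txt
git commit -q -m "initial commit"

# The empty commit only carries the message.
git commit -q --allow-empty -m "closes apache/hadoop#ZZ *Won't fix*"

# Compare the tree object of the new commit with that of its parent.
head_tree=$(git rev-parse 'HEAD^{tree}')
parent_tree=$(git rev-parse 'HEAD~1^{tree}')
if [ "$head_tree" = "$parent_tree" ]; then
    echo "tree unchanged by the empty commit"
fi
```

Because both commits point at the same tree, the empty commit carries only its message, which is all that is needed for the "closes" reference to be seen.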

...

Read https://blogs.apache.org/infra/entry/improved_integration_between_apache_and. Comments and PRs with Hadoop issue handles should be posted to the mailing lists and JIRA. Hadoop issue handles must be in the form `HADOOP-YYYY` (all capitals). Usually it makes sense to file a JIRA issue first, and then create a PR with a description of the form:

Code Block
HADOOP-YYYY: <jira-issue-description>
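
A PR title can be checked against this form before the PR is opened. The helper below is a hypothetical sketch, not part of any Hadoop or Apache tooling: the function name `is_valid_pr_title` is made up, and it assumes that `YYYY` stands for a numeric issue number followed by a colon, a space, and a non-empty description.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the title starts with an upper-case
# HADOOP-<digits> handle followed by ": " and some description text.
is_valid_pr_title() {
    printf '%s\n' "$1" | grep -Eq '^HADOOP-[0-9]+: .+'
}

# Example titles (invented for illustration).
for title in "HADOOP-1234: Fix NPE in FileSystem" "hadoop-1234: fix npe" "Fix NPE"; do
    if is_valid_pr_title "$title"; then
        echo "ok:       $title"
    else
        echo "rejected: $title"
    fi
done
```

Only the first title passes; the lower-case handle and the title with no handle are rejected, matching the "all capitals" requirement above.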

...