
Note: This content was moved over from https://wiki.apache.org/hadoop/GithubIntegration



There are several ways to set up Git for committers and contributors. Contributors can safely set up Git any way they choose, but committers should take extra care: they can push new commits to the trunk at Apache, and various policies there make backing out mistakes problematic. To keep the commit history clean, take note of the use of `--squash` below when merging into `apache/trunk`.



Git setup

...

This describes setup for one local repo and two remotes. It allows you to push the code on your machine to either your own GitHub repo or the apache/hadoop GitHub repo. The official ASF repository is gitbox.apache.org; however, if you are a committer, the repository is writable through both GitBox and GitHub. You will want to fork GitHub's apache/hadoop to your own account on GitHub; this will enable pull requests of your own. Cloning this fork locally will set up "origin" to point to your remote fork on GitHub as the default remote. So if you perform `git push origin trunk` it will go to your fork.

To attach to the Apache git repo do the following:

Code Block
git remote add apache https://github.com/apache/hadoop.git


To check your remote setup:

...

Code Block
origin    https://github.com/your-github-id/hadoop.git (fetch)
origin    https://github.com/your-github-id/hadoop.git (push)
apache    https://github.com/apache/hadoop.git (fetch)
apache    https://github.com/apache/hadoop.git (push)
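As a rough sketch of the two-remote setup described above, the following can be run in a throwaway directory. Nothing is fetched, so it works offline. It assumes the GitHub URL for the `apache` remote, and `your-github-id` is a placeholder for your own account.

```shell
# Throwaway repository; no network access is needed to register remotes.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"

# "your-github-id" is a placeholder for your own GitHub account.
git remote add origin https://github.com/your-github-id/hadoop.git
# Assumption: the apache remote points at the GitHub copy of apache/hadoop.
git remote add apache https://github.com/apache/hadoop.git

# List the configured remotes with their fetch and push URLs.
git remote -v

# Print the URL of a single remote.
apache_url=$(git remote get-url apache)
echo "$apache_url"
```

In a real checkout you would clone your fork first, which creates `origin` for you, and only add the `apache` remote by hand.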


Now if you want to experiment with a branch, everything by default points to your GitHub account because origin is the default remote. You can work as normal using only GitHub until you are ready to merge with the apache remote. Some conventions will integrate with Apache Jira ticket numbers.

...


Once you are ready to commit to the apache remote, you can merge and push your changes directly or, better yet, create a PR. We recommend creating new branches under feature/ to help group ongoing work, especially now that, as of November 2015, forced updates are disabled on ASF branches. We hope to reinstate that ability on feature branches to aid development.

How to create a PR

...

Push your branch to GitHub:

Code Block
git checkout feature/hadoop-xxxx
git fetch apache
git rebase apache/trunk # to make it apply to the current trunk
git push origin feature/hadoop-xxxx
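To see what the rebase step achieves, here is a self-contained sketch that uses two local bare repositories as stand-ins for the `apache` and `origin` remotes. The stand-in repositories, file names, and commit messages are all assumptions made for the demo. After the rebase, the feature branch should sit directly on top of `apache/trunk`.

```shell
work=$(mktemp -d)

# Local bare repositories stand in for the two remotes (demo assumption).
git init -q --bare "$work/apache.git"
git init -q --bare "$work/fork.git"

git init -q "$work/local"
cd "$work/local"
git config user.name "Demo User"
git config user.email "demo@example.com"
git symbolic-ref HEAD refs/heads/trunk   # name the initial branch "trunk"
git remote add apache "$work/apache.git"
git remote add origin "$work/fork.git"

# Initial trunk commit, published to the stand-in apache remote.
echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"
git push -q apache trunk

# Feature work branches off trunk.
git checkout -q -b feature/hadoop-xxxx
echo feature > feature.txt
git add feature.txt
git commit -q -m "HADOOP-XXXX feature work"

# Meanwhile trunk moves ahead on the apache remote.
git checkout -q trunk
echo more > more.txt
git add more.txt
git commit -q -m "Unrelated trunk change"
git push -q apache trunk

# The steps from the section above.
git checkout -q feature/hadoop-xxxx
git fetch -q apache
git rebase -q apache/trunk
git push -q origin feature/hadoop-xxxx

# Count commits the branch is missing from trunk, and commits it adds.
behind=$(git rev-list --count feature/hadoop-xxxx..apache/trunk)
ahead=$(git rev-list --count apache/trunk..feature/hadoop-xxxx)
echo "behind=$behind ahead=$ahead"
```

A branch that is zero commits behind trunk can be squash-merged without the committer having to resolve conflicts first.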

...

  1. Go to your feature/hadoop-xxxx branch on GitHub. Since you forked it from GitHub's apache/hadoop, any PR will default to targeting apache/trunk.
  2. Click the green "Compare, review, and create pull request" button.
  3. You can edit the source and target of the PR if they aren't correct. The "base fork" should be apache/hadoop unless you are collaborating separately with one of the committers on the list. The "base" will be trunk. Don't submit a PR to one of the other branches unless you know what you are doing. The "head fork" will be your forked repo and the "compare" will be your `feature/hadoop-xxxx` branch.
  4. Click the "Create pull request" button and name the request "HADOOP-XXXX" (all caps). This will connect the comments of the PR to the mailing list and Jira comments.
  5. From now on the PR lives on GitHub's apache/hadoop repository. You use the commenting UI there.

...

How to run Jenkins precommit job for a PR

The precommit job runs automatically when a PR is opened and whenever your branch changes. If you are a committer and want to run the Jenkins precommit job manually, log in to https://ci-hadoop.apache.org/job/hadoop-multibranch and run the job corresponding to the pull request ID. If there is no such job, click "Scan Repository Now" to scan the pull requests. If you are not a committer, create an empty commit on your branch and push it to trigger the job.
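For contributors, an empty commit can be created with `git commit --allow-empty`. The sketch below runs in a throwaway repository and shows that such a commit moves the branch head without changing any files. The file name and commit messages are made up for the demo.

```shell
work=$(mktemp -d)
git init -q "$work"
cd "$work"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo change > file.txt
git add file.txt
git commit -q -m "HADOOP-XXXX some change"
tree_before=$(git rev-parse 'HEAD^{tree}')

# An empty commit adds a new commit without touching any file.
git commit -q --allow-empty -m "Empty commit to retrigger the precommit job"
tree_after=$(git rev-parse 'HEAD^{tree}')
commit_count=$(git rev-list --count HEAD)

echo "commits=$commit_count"
# In a real checkout you would then push the branch to your fork:
#   git push origin feature/hadoop-xxxx
```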

Merging a PR (for committers)

In most cases, clicking the "Squash and merge" button is fine. Before merging the PR, the committer must check the title and the commit message, and fix them if needed. You can add "Signed-off-by", "Reviewed-by", and "Co-authored-by" when merging the commit.

When you need to commit the change locally and push it, start by reading https://help.github.com/articles/checking-out-pull-requests-locally/.
Remember that a pull request is equivalent to a remote GitHub branch with potentially many commits. It is recommended to squash the remote commit history so that there is one commit per issue, rather than merging in all of the contributor's individual commits. Squash commits let you do that and close the PR at the same time.

Merging a pull request is equivalent to a "pull" of a contributor's branch:

Code Block
git checkout trunk      # switch to local trunk branch
git pull apache trunk   # fast-forward to current remote HEAD
git pull --squash https://github.com/cuser/hadoop cbranch  # merge to trunk



The --squash option ensures that all PR history is squashed into a single commit, and allows the committer to use his/her own message. Read the git help for merge or pull for more information about the --squash option. In this example we assume that the contributor's GitHub handle is "cuser" and the PR branch name is "cbranch". Next, resolve conflicts, if any, or ask the contributor to rebase on top of trunk if the PR has gone out of sync.

If you are ready to merge your own (committer's) PR you probably only need to merge (not pull), since you have a local copy that you've been working on. This is the branch that you used to create the PR.

Code Block
git checkout trunk      # switch to local trunk branch
git pull apache trunk   # fast-forward to current remote HEAD
git merge --squash feature/hadoop-xxxx



Remember to run regular patch checks, build with tests enabled, and change CHANGES.TXT (not applicable for Hadoop versions 2.8.0 and later) for the appropriate part of the project.

If everything is fine, you can now commit the squashed request along these lines:

Code Block
git commit -a -m "HADOOP-XXXX description (cuser via your-apache-id) closes apache/hadoop#ZZ"
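The effect of `--squash` can be checked in a throwaway repository. The following sketch assumes a feature branch with three small commits (the file names and work-in-progress messages are made up) and shows that trunk ends up with a single new commit carrying the committer's message. `ZZ` is left as a placeholder.

```shell
work=$(mktemp -d)
git init -q "$work"
cd "$work"
git config user.name "Demo Committer"
git config user.email "committer@example.com"
git symbolic-ref HEAD refs/heads/trunk   # name the initial branch "trunk"

echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"

# A feature branch with several small commits.
git checkout -q -b feature/hadoop-xxxx
for i in 1 2 3; do
  echo "step $i" >> feature.txt
  git add feature.txt
  git commit -q -m "work in progress $i"
done

# Squash: stage the combined changes on trunk without creating a commit.
git checkout -q trunk
git merge -q --squash feature/hadoop-xxxx

# One commit with the committer's own message; ZZ stays a placeholder.
git commit -q -m "HADOOP-XXXX description (cuser via your-apache-id) closes apache/hadoop#ZZ"

trunk_commits=$(git rev-list --count trunk)
echo "trunk_commits=$trunk_commits"
git log --oneline trunk
```

The three work-in-progress commits remain on the feature branch only; trunk sees one commit per issue.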



Here HADOOP-XXXX is all caps and ZZ is the pull request number on the apache/hadoop repository. Including `closes apache/hadoop#ZZ` will close the PR automatically. More information is found at https://help.github.com/articles/closing-issues-via-commit-messages. Next, push to gitbox.apache.org:

Code Block
git push apache trunk

...


Closing a PR without committing (for committers)

...

Code Block
git commit --allow-empty -m "closes apache/hadoop#ZZ *Won't fix*"
git push apache trunk

That will close PR ZZ on the GitHub mirror without merging it and without any code modifications in the master repository. Hadoop committers can now close GitHub pull requests directly. If you are a committer and don't have the privilege, you need to link your ASF and GitHub accounts via https://gitbox.apache.org/setup/

Apache/GitHub integration features

Read https://blogs.apache.org/infra/entry/improved_integration_between_apache_and. Comments and PRs with Hadoop issue handles should post to mailing lists and Jira. Hadoop issue handles must be in the form `HADOOP-YYYY` (all capitals). Usually it makes sense to file a JIRA issue first, and then create a PR with a description of the form:

Code Block
HADOOP-YYYY: <jira-issue-description>
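If you want to check a title before opening a PR, a small shell function along these lines may help. It is a hypothetical helper, not part of any Hadoop or ASF tooling, and it only checks the `HADOOP-YYYY: description` shape stated above. The sample titles are made up.

```shell
# Hypothetical helper: succeeds when the title looks like "HADOOP-1234: text".
is_valid_pr_title() {
  # Upper-case "HADOOP-", one or more digits, a colon, a space, then a description.
  printf '%s\n' "$1" | grep -Eq '^HADOOP-[0-9]+: .+'
}

# Made-up sample titles, for illustration only.
for title in "HADOOP-1234: Fix NPE in FileSystem" "hadoop-1234: fix npe" "HADOOP-1234"; do
  if is_valid_pr_title "$title"; then
    echo "ok:  $title"
  else
    echo "bad: $title"
  fi
done
```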

...