This page shares a few practices for using git/GitHub during Ozone development.

We have commits where:

  1. 'JIRA number + JIRA summary' is the first line of the commit message.
    1. If the git author field doesn't contain the original author, please add the 'Contributed by ...' text.
  2. There is only one commit per JIRA issue.
  3. The commit is signed by the committer (optional, but increases your karma).
  4. The commit contains the original author in the git author field (optional, but increases your karma).
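
The first rule can be checked mechanically. The following is a minimal sketch, assuming subjects follow the 'HDDS-1234. Summary' shape used in the examples on this page; check_subject is a hypothetical helper, not part of any Ozone tooling.

```shell
# Hypothetical helper: succeeds when the first line of a commit message
# starts with a JIRA key such as "HDDS-1234", followed by ". " and a summary.
check_subject() {
  printf '%s\n' "$1" | head -n 1 | grep -Eq '^[A-Z]+-[0-9]+\. .+'
}

if check_subject "HDDS-1234. Fixing all."; then
  echo "subject looks fine"
fi
if ! check_subject "Fixing all."; then
  echo "subject is missing the JIRA key"
fi
```

The pattern is deliberately loose: it only verifies the key and the separator, not that the summary matches the JIRA issue.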

Open a PR

To open a PR:

(1) Fork the hadoop repository (by clicking the fork button at https://github.com/apache/hadoop)

(2) Add your fork as a remote (the remote is named youruser here, to match the push command below)

git remote add youruser git@github.com:youruser/hadoop.git

(3) Create a branch and commits

git checkout -b HDDS-1234
#vi/intellij
git commit -m "HDDS-1234. Fixing all."

(4) Push your branch to your own fork

git push youruser HDDS-1234

Enumerating objects: 1220, done.
Counting objects: 100% (871/871), done.
Delta compression using up to 4 threads
Compressing objects: 100% (319/319), done.
Writing objects: 100% (683/683), 637.84 KiB | 42.52 MiB/s, done.
Total 683 (delta 249), reused 569 (delta 184)
remote: Resolving deltas: 100% (249/249), completed with 117 local objects.
remote: 
remote: Create a pull request for 'HDDS-1234' on GitHub by visiting:
remote:      https://github.com/youruser/hadoop/pull/new/HDDS-1234
remote: 
To github.com:youruser/hadoop.git
 * [new branch]              HDDS-1234 -> HDDS-1234

(5) Open the given URL to create the PR (https://github.com/youruser/hadoop/pull/new/HDDS-1234 in our case)


Recommendations:

  1. Always open a JIRA issue for your PR first.
  2. Use the 'JIRA key + summary' as the title of the PR.
  3. Copy the description from the JIRA (not required, but makes it easier to review).
  4. Put a link back to the JIRA in the description (not required, but makes it easy to check the JIRA).

Trying out a PR locally

The easiest way to check out a PR locally is to fetch it by its id:

git fetch origin refs/pull/541/head

Now, you have the PR branch under the FETCH_HEAD reference.

git log FETCH_HEAD
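
To see the mechanics without network access, the fetch can be reproduced against a local stand-in repository. This is only a sketch: the refs/pull/541/head ref is created by hand with git update-ref to imitate what GitHub publishes for a PR, and all names, paths and identities are hypothetical.

```shell
# Throwaway directories; nothing here touches a real repository.
work=$(mktemp -d)

# A local stand-in for the GitHub repository. On GitHub the PR ref already
# exists; here it is created by hand for illustration.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Example Contributor"
git config user.email "contributor@example.org"
git config commit.gpgsign false
echo "change" > file.txt
git add file.txt
git commit -q -m "HDDS-1234. Example change."
git update-ref refs/pull/541/head HEAD

# The reviewer's clone: fetch the PR by id and inspect FETCH_HEAD.
git clone -q "$work/upstream" "$work/review"
cd "$work/review"
git fetch -q origin refs/pull/541/head
fetched_subject=$(git log -1 --format=%s FETCH_HEAD)
echo "$fetched_subject"
```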

(Option 1) You can check it out and check the full history:

git checkout -b HDDS-1234 FETCH_HEAD

(Option 2) Or you can squash it to one commit to merge it to your branch:

git checkout -b HDDS-1234 trunk
git merge --squash FETCH_HEAD
git status
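
The effect of Option 2 can be tried in a throwaway repository. In this sketch a local branch named pr-branch (a hypothetical stand-in for FETCH_HEAD) carries two commits, and the squash merge turns them into a single commit on top of trunk.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/trunk   # name the initial branch "trunk"
git config user.name "Example Committer"
git config user.email "committer@example.org"
git config commit.gpgsign false

echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Stand-in for the PR: a branch with two work-in-progress commits.
git checkout -q -b pr-branch
echo "one" >> file.txt
git commit -q -a -m "wip 1"
echo "two" >> file.txt
git commit -q -a -m "wip 2"

# Option 2: squash the PR into a single commit on a branch created from trunk.
git checkout -q -b HDDS-1234 trunk
git merge -q --squash pr-branch
git commit -q -m "HDDS-1234. Fixing all."

commit_count=$(git rev-list --count HEAD)
echo "commits on HDDS-1234: $commit_count"
```

Note that git merge --squash only stages the changes; the explicit git commit afterwards is what creates the single commit.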

(Option 3) Or you can use the smart-apply-patch script from Yetus

./dev-support/bin/smart-apply-patch GH:562
  • It has additional checks and removes whitespace
  • It downloads the different commits one by one and applies them to the current branch (in case of a rebase conflict, you can start the review with Option 1)
  • It rewrites the commit history from the PR branch
  • It keeps the number of commits (no squash merge)

Merge a PR

There are two options to merge a PR to the main branch:

  1. Merge it from the github UI (con: cannot be signed by the developer, requires gitbox/github integration)
  2. Merge it with local git magic (con: the merge event is not visible on the UI)

With UI

Note: this step (and only this step) requires a binding between your github account and apache account, which can be done here: https://gitbox.apache.org/setup/


(1) Check the CI results and be sure that they are green

(2) Set the merge button to 'squash and merge' and click it.


(3) Double check the commit message and commit it.

With CLI

The easy way was mentioned before:

git fetch origin refs/pull/541/head
git merge --squash FETCH_HEAD
#.... review
git add .
git commit -m "..."

Unfortunately, with this approach you have to fill in both the commit message and the author yourself. For example:

git commit -m 'HDDS-1217. Refactor ChillMode rules and chillmode manager' --author 'Bharat Viswanadham <bharat@apache.org>'

There is a script to do it:

https://gist.github.com/elek/1c7b85fb6d5647e8d1b83fc10105592a

This script can be used to

  1. Retrieve the id/summary from the JIRA
  2. Retrieve the author information from the commit of the FETCH_HEAD (or any other commit)

commit_jira.sh HDDS-1217 
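
The author retrieval can also be done by hand with git log. The sketch below is not the linked script; it only illustrates the second step in a throwaway repository, where pr-branch is a hypothetical stand-in for FETCH_HEAD and the identities are made up.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Committer"
git config user.email "committer@example.org"
git config commit.gpgsign false
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"
base_branch=$(git symbolic-ref --short HEAD)

# Stand-in for the fetched PR head, authored by a contributor.
git checkout -q -b pr-branch
echo "change" >> file.txt
git commit -q -a -m "wip" --author="Example Contributor <contributor@example.org>"

# Squash onto the base branch and reuse the author of the PR commit.
git checkout -q "$base_branch"
git merge -q --squash pr-branch
pr_author=$(git log -1 --format='%an <%ae>' pr-branch)
git commit -q -m "HDDS-1217. Refactor ChillMode rules and chillmode manager" --author="$pr_author"

final_author=$(git log -1 --format='%an <%ae>')
final_committer=$(git log -1 --format='%cn <%ce>')
echo "author:    $final_author"
echo "committer: $final_committer"
```

With a real PR, replace pr-branch with FETCH_HEAD after the git fetch shown above.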

It's also a good practice to close the PR from the commit itself. With this approach the final commit will be visible on the PR page.

Just add the "Closes #1234" expression to the end of the commit message on a separate new line:

git commit --amend

 HDDS-1213. Support plain text S3 MPU initialization request.
 
 Closes #549
 # Please enter the commit message for your changes. Lines starting
 # with '#' will be ignored, and an empty message aborts the commit.
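
The same message can be produced without opening an editor. This is a minimal sketch in a throwaway repository (the identity is hypothetical): each -m option passed to git commit becomes its own paragraph, so the "Closes" line ends up separated from the summary by an empty line.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Committer"
git config user.email "committer@example.org"
git config commit.gpgsign false
echo "x" > file.txt
git add file.txt
git commit -q -m "HDDS-1213. Support plain text S3 MPU initialization request."

# Keep the existing subject and append the "Closes" line as a second paragraph.
subject=$(git log -1 --format=%s)
git commit -q --amend -m "$subject" -m "Closes #549"

message_body=$(git log -1 --format=%b)
echo "$message_body"
```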


Signing your commits

It's always a good practice to sign your commits.

(1) To generate your PGP key please follow this tutorial from github.

(2) To auto-sign all the commits, configure your git client:

  git config --global commit.gpgSign true                                               
  git config --global user.signingkey D51EA8F00EE79B28 

Note: the key id should be replaced by the value retrieved in the 12th step of the github tutorial.
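
If you prefer to enable signing for a single clone only, the same settings can be written without --global. A minimal sketch in a throwaway repository, reusing the example key id from above:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Without --global, the values are stored in this repository's .git/config only.
git config commit.gpgSign true
git config user.signingkey D51EA8F00EE79B28

sign_enabled=$(git config --get commit.gpgSign)
signing_key=$(git config --get user.signingkey)
echo "commit.gpgSign=$sign_enabled user.signingkey=$signing_key"
```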

Note: github can display a "verified" tag if you upload your public key to the github server. It only means that github verified that you have access to the email address included in the gpg key. If you trust github, that's enough.

If you trust the members of the Apache community, you should join the Apache web of trust (people who have already joined are listed here: https://wot.apache.org/)

Checking the log

Put this in your ~/.gitconfig:

[alias]
    lg = log --date=relative --color --graph --pretty=format:'%Cred%h%Creset %C(blue)%G?%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold yellow)<%an>%Creset %C(cyan)<%cn>%Creset' --abbrev-commit

And use 'git lg' instead of 'git log'.

Here you can see:

  1. The commit hash
  2. The signature state (G = good, N = none, E = signed, but key is not imported)
  3. Git branch and commit message
  4. Date of the commit
  5. Author of the commit (yellow)
  6. Committer of the commit (cyan)
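
The individual placeholders can be tried on their own. The sketch below uses a throwaway repository with hypothetical identities and prints only the signature state, the author and the committer of an unsigned commit.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Committer"
git config user.email "committer@example.org"
git config commit.gpgsign false
echo "x" > file.txt
git add file.txt
git commit -q -m "HDDS-1234. This is just an example commit." \
  --author="Example Contributor <contributor@example.org>"

# %G? = signature state, %an = author name, %cn = committer name
log_line=$(git log -1 --pretty=format:'%G? <%an> <%cn>')
echo "$log_line"
```

Because the commit is not signed, the signature state is reported as N.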

You can also check all the information without this alias:

git show HEAD --pretty=fuller --show-signature 

commit ade31258e6275ba3bf7e642b24f8e865dd32f640 (HEAD -> branchtest, elek/branchtest)
gpg: Signature made Wed 06 Mar 2019 11:53:17 AM CET
gpg:                using RSA key 1CEF33FA61800117BDB2E0E0D51EA8F00EE79B28
gpg: Good signature from "Marton Elek (CODE SIGNING KEY) <elek@apache.org>" [ultimate]
Author:     Márton Elek <elek@apache.org>
AuthorDate: Wed Mar 6 11:53:17 2019 +0100
Commit:     Márton Elek <elek@apache.org>
CommitDate: Wed Mar 6 11:53:17 2019 +0100

    HDDS-1234. This is just an example commit.

diff --git a/asd b/asd
new file mode 100644
index 00000000000..e69de29bb2d

About sign-off

Sign-off is an additional line in the git commit message (it is not the same as a cryptographic signature). It is required for Linux kernel development.

The sign-off message is not required by the Apache Software Foundation, as the lifecycle of the commits is simpler here.

I propose not using the sign-off feature, but if you do use it, please keep the original author of the patch in the sign-off header.

Author information

Developing apache projects is a community effort and we always give credit to the original author of the code. There are two options for this:

(1) Store the author in the author field of the commit

For example if the work is done by Xiaoyu

git commit --author="Xiaoyu Yao <xyao@apache.org>"

You can check the author and the committer with git log/git show:

git lg -1   --pretty=fuller --show-signature

* commit 85c9b106de8 (HEAD -> ozone-0.4)
| gpg: Signature made Thu 07 Mar 2019 10:29:08 AM CET
| gpg:                using RSA key 1CEF33FA61800117BDB2E0E0D51EA8F00EE79B28
| gpg: Good signature from "Marton Elek (CODE SIGNING KEY) <elek@apache.org>" [ultimate]
| Author:     Xiaoyu Yao <xyao@apache.org>
| AuthorDate: 10 seconds ago
| Commit:     Márton Elek <elek@apache.org>
| CommitDate: 10 seconds ago

And you can modify the author if it's wrong:

git commit --amend --author="Xiaoyu Yao <xyao@apache.org>"

Note: cherry-pick preserves the authorship information, but in case of a manual conflict resolution it's always good practice to check the final author information.
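
The cherry-pick behaviour is easy to verify. A minimal sketch in a throwaway repository with hypothetical identities: the picked commit keeps the original author, while the committer is the person who ran the cherry-pick.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Committer"
git config user.email "committer@example.org"
git config commit.gpgsign false
echo "base" > file.txt
git add file.txt
git commit -q -m "Initial commit"
base_branch=$(git symbolic-ref --short HEAD)

# A fix authored by a contributor on a separate branch.
git checkout -q -b feature
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "HDDS-1234. Fixing all." \
  --author="Example Contributor <contributor@example.org>"

# Let the base branch move on, then cherry-pick the fix onto it.
git checkout -q "$base_branch"
echo "other" > other.txt
git add other.txt
git commit -q -m "Unrelated change"
git cherry-pick feature > /dev/null

picked_author=$(git log -1 --format='%an <%ae>')
picked_committer=$(git log -1 --format='%cn <%ce>')
echo "author:    $picked_author"
echo "committer: $picked_committer"
```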

(2) The good old method:

If the email address is not available to you (the patch was attached to the JIRA by a non-committer), you can give credit in the commit message, using the 'Contributed by' postfix:

| * f048512bb89 N - HDFS-14192. Track missing DFS operations in Statistics and StorageStatistics. Contributed by Ayush Saxena. (7 weeks ago) <Inigo Goiri> <Inigo Goiri>
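
A small helper can keep the postfix consistent. This is only a sketch; with_credit is a hypothetical function, and the wording follows the log line above.

```shell
# Hypothetical helper: appends the "Contributed by" postfix to a commit subject.
with_credit() {
  printf '%s Contributed by %s.\n' "$1" "$2"
}

commit_subject=$(with_credit \
  "HDFS-14192. Track missing DFS operations in Statistics and StorageStatistics." \
  "Ayush Saxena")
echo "$commit_subject"
```

The result can be passed to git commit -m "$commit_subject".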

