This page documents the steps for working with git to contribute to or review Kafka code. These steps probably contain bugs, and there may be better recipes, so please help improve the page. If you want to push your commits without entering a password, see the Apache git wiki.

Kafka patch review tool

1. Setup

  1. Follow the instructions in the JIRA command line tool section below to set up the jira-python package
  2. Follow the instructions in the Reviewboard section below to set up the reviewboard python tools
  3. Install the argparse module

    Code Block
    On Linux -> sudo yum install python-argparse
    On Mac -> sudo easy_install argparse
    

2. Usage

Code Block
nnarkhed-mn:kafka-git-idea nnarkhed$ python kafka-patch-review.py --help
usage: kafka-patch-review.py [-h] -b BRANCH -j JIRA [-s SUMMARY]
                             [-d DESCRIPTION] [-r REVIEWBOARD] [-t TESTING]
                             [-v VERSION] [-db]

Kafka patch review tool

optional arguments:
  -h, --help            show this help message and exit
  -b BRANCH, --branch BRANCH
                        Tracking branch to create diff against
  -j JIRA, --jira JIRA  JIRA corresponding to the reviewboard
  -s SUMMARY, --summary SUMMARY
                        Summary for the reviewboard
  -d DESCRIPTION, --description DESCRIPTION
                        Description for reviewboard
  -r REVIEWBOARD, --rb REVIEWBOARD
                        Review board that needs to be updated
  -t TESTING, --testing-done TESTING
                        Text for the Testing Done section of the reviewboard
  -v VERSION, --version VERSION
                        Version of the patch
  -db, --debug          Enable debug mode

3. Upload patch

  1. Specify the branch against which the patch should be created (-b)
  2. Specify the corresponding JIRA (-j)
  3. Specify an optional summary (-s) and description (-d) for the reviewboard

Example:

Code Block
 python kafka-patch-review.py -b origin/trunk -j KAFKA-42

4. Update patch

  1. Specify the branch against which the patch should be created (-b)
  2. Specify the corresponding JIRA (-j)
  3. Specify the reviewboard to be updated (-r)
  4. Specify an optional summary (-s) and description (-d) for the reviewboard, if you want to update them
  5. Optionally specify a version of the patch (-v). The version is appended to the JIRA name to create a file named JIRA-<version>.patch, which makes it possible to upload multiple patches to the same JIRA. It has no bearing on the reviewboard update.

Example:

Code Block
python kafka-patch-review.py -b origin/trunk -j KAFKA-42 -r 14081

JIRA command line tool

1. Download the JIRA command line package

Install the jira-python package

Code Block
sudo easy_install jira-python

2. Configure JIRA username and password

Create a jira.ini file in your $HOME directory that contains your Apache JIRA username and password:

Code Block
nnarkhed-mn:~ nnarkhed$ cat ~/jira.ini
user=nehanarkhede
password=***********
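
A minimal sketch of creating this file from the shell. It writes to a scratch directory standing in for $HOME, and the user name and password values are placeholders. The owner-only permissions (umask 077, chmod 600) are an addition not found in the original instructions; they are a common precaution for a file that holds a plaintext password.

```shell
#!/bin/sh
# Sketch: create a jira.ini in a scratch directory standing in for $HOME.
# Replace demo_home with "$HOME" to create the real file.
set -eu

demo_home=$(mktemp -d)
jira_ini="$demo_home/jira.ini"

# Assumption: the file only needs to be readable by you, so create it
# with owner-only permissions before writing the password into it.
umask 077
cat > "$jira_ini" <<'EOF'
user=your-jira-username
password=your-jira-password
EOF
chmod 600 "$jira_ini"

# Show the resulting permissions and location.
ls -l "$jira_ini"
```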

Reviewboard

This is a quick tutorial on using Review Board with Kafka.

1. Install the post-review tool

If you are on RHEL, Fedora or CentOS, follow these steps:

Code Block
sudo yum install python-setuptools
sudo easy_install -U RBTools

If you are on Mac, follow these steps:

Code Block
sudo easy_install -U setuptools
sudo easy_install -U RBTools

For other platforms, follow the instructions here to set up the post-review tool.

2. Configure Stuff

Next, configure a few things to make it work.

First, set the Review Board URL to use. You can do this from within git:

Code Block
git config reviewboard.url https://reviews.apache.org

If you checked out using the git-wip HTTP URL, that URL confusingly won't work with Review Board, so you need to configure an override that uses the non-HTTP URL. You can do this by adding a config file like this:

Code Block
jkreps$ cat ~/.reviewboardrc
REPOSITORY = 'git://git.apache.org/kafka.git'
TARGET_GROUPS = 'kafka'
GUESS_FIELDS = True

FAQ

When I run the script, it throws the following error and exits:

Code Block
nnarkhed$ python kafka-patch-review.py -b trunk -j KAFKA-42
There don't seem to be any diffs

There are two possible causes:

  • The code is not committed to your local branch.
  • The -b branch does not point to the remote branch. In the example above, "trunk" is specified as the branch, but that is the local branch. The correct value for the -b (--branch) option is the remote branch, for example origin/trunk. "git branch -r" lists the remote branch names.
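
Both causes can be checked by hand before running the script. The sketch below does so in a throwaway repository; all paths, names and commit messages are hypothetical, and the git commands shown are an assumption about what is useful to inspect, not what kafka-patch-review.py itself runs.

```shell
#!/bin/sh
# Sketch: check both causes of "There don't seem to be any diffs"
# in a throwaway repository (all paths and names are hypothetical).
set -eu

work=$(mktemp -d)

# Stand-in for the Apache repository, with a single commit on trunk.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > README
git add README
git commit -q -m "initial commit"

# Contributor clone with a feature branch based on the remote branch.
git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"
git config user.name "Demo User"
git config user.email "demo@example.com"
git checkout -q -b xyz origin/trunk
echo "change" >> README

# Cause 1: the change is not committed yet, so the branch has
# no commits ahead of origin/trunk.
uncommitted_ahead=$(git rev-list --count origin/trunk..HEAD)

git commit -q -a -m "KAFKA-42 demo change"

# Cause 2: -b must name a remote branch; list the valid candidates.
remote_branches=$(git branch -r)

# After committing, the feature branch is one commit ahead.
committed_ahead=$(git rev-list --count origin/trunk..HEAD)

echo "remote branches:"
echo "$remote_branches"
echo "commits ahead before commit: $uncommitted_ahead, after commit: $committed_ahead"
```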

When I run the script, it throws the following error and exits:

Code Block
Error uploading diff

Your review request still exists, but the diff is not attached.

One of the most common root causes of this error is that the git remote branches are not up to date. Since the script already updates them, the error is probably due to some other problem. Run the script with the --debug option, which makes post-review run in debug mode and report the root cause of the issue.

Simple contributor workflow

This is the simple workflow. It works well for small feature development by people who don't have direct access to check in to the Apache repository. Let's assume you are working on a feature or bug called xyz:

1. Check out a new repository:

Code Block

  git clone https://git-wip-us.apache.org/repos/asf/kafka.git kafka

Or, if you already have a copy of the repository, just check for updates:

Code Block

  git fetch

2. Create and check out a feature branch to work in:

Code Block

  git checkout -b xyz remotes/origin/trunk

3. Do some work on this branch and periodically check in locally:

Code Block

  git commit -a

4. When done (or periodically) rebase your branch to take any changes from trunk:

Code Block

  git pull --rebase origin trunk

5. Make a patch containing your work and upload it to JIRA:

Code Block

  git format-patch origin/trunk --stdout > xyz-v1.patch
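
Steps 1 to 5 can be tried end to end without touching the real repository. The sketch below uses a local repository as a stand-in for Apache; every path, file name and commit message in it is hypothetical. It creates the patch against origin/trunk rather than the local trunk. That is an adjustment made here, on the assumption that a stale local trunk would otherwise pull upstream commits into the patch.

```shell
#!/bin/sh
# Sketch of the contributor workflow in a throwaway setup.
set -eu

work=$(mktemp -d)

# Stand-in for the Apache repository (hypothetical local path).
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Upstream Dev"
git config user.email "upstream@example.com"
echo "base" > core.txt
git add core.txt
git commit -q -m "initial commit"

# Steps 1-3: clone, branch from the remote trunk, commit locally.
git clone -q "$work/upstream" "$work/kafka"
cd "$work/kafka"
git config user.name "Palmer Eldritch"
git config user.email "peldritch@layoutsinc.com"
git checkout -q -b xyz remotes/origin/trunk
echo "xyz feature" > xyz.txt
git add xyz.txt
git commit -q -m "KAFKA-42 add xyz feature"

# Meanwhile, trunk moves forward upstream.
cd "$work/upstream"
echo "unrelated" > other.txt
git add other.txt
git commit -q -m "unrelated upstream change"

# Step 4: rebase the feature branch onto the updated trunk.
cd "$work/kafka"
git fetch -q origin
git pull -q --rebase origin trunk

# Step 5: export only the commits that are not yet on origin/trunk.
git format-patch origin/trunk --stdout > "$work/xyz-v1.patch"

# Each commit in the file starts with a "From <sha> ..." separator line.
patch_count=$(grep -c '^From [0-9a-f]* Mon Sep 17 00:00:00 2001$' "$work/xyz-v1.patch")
echo "patches in file: $patch_count"
```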

...

You will also want to ensure your username and email are set up correctly so that we correctly record the source of the contribution:

Code Block

git config --global user.name "Palmer Eldritch"
git config --global user.email "peldritch@layoutsinc.com"

Reviewer workflow:

This assumes you already have a copy of the repository.

1. Make sure your code is up-to-date:

Code Block

  git fetch

2. Check out the destination branch:

Code Block

  git checkout trunk

3. See what the patch will do:

Code Block

  git apply --stat xyz-v1.patch

4. See that the patch will apply cleanly (otherwise prod the contributor to rebase):

Code Block

  git apply --check xyz-v1.patch

6. Apply the patch to trunk

Code Block

  git am --signoff < xyz-v1.patch
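
The inspect, check and apply steps above can be chained so that the patch is only applied when it applies cleanly. A minimal sketch in a throwaway repository follows; the repository, the reviewer identity and the sample patch are all hypothetical.

```shell
#!/bin/sh
# Sketch of the reviewer steps against a locally generated sample patch.
set -eu

work=$(mktemp -d)

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Reviewer Name"
git config user.email "reviewer@example.com"
echo "base" > core.txt
git add core.txt
git commit -q -m "initial commit"

# Produce a sample patch on a scratch branch, then return to trunk.
git checkout -q -b xyz
echo "xyz feature" > xyz.txt
git add xyz.txt
git commit -q -m "KAFKA-42 add xyz feature"
git format-patch trunk --stdout > "$work/xyz-v1.patch"
git checkout -q trunk

# Summarize the patch, then verify it applies before touching trunk.
git apply --stat "$work/xyz-v1.patch"
if git apply --check "$work/xyz-v1.patch"; then
    # Apply it, adding the reviewer's Signed-off-by line to the commit.
    git am -q --signoff < "$work/xyz-v1.patch"
else
    echo "patch does not apply cleanly; ask the contributor to rebase" >&2
    exit 1
fi

last_subject=$(git log -1 --format=%s)
signoff_lines=$(git log -1 --format=%b | grep -c '^Signed-off-by: Reviewer Name <reviewer@example.com>$')
echo "applied: $last_subject (sign-off lines: $signoff_lines)"
```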

If you get an error that says "Patch does not have a valid e-mail address.", the patch might have been created with git diff, in which case you can apply it using:

Code Block

patch -p1 < xyz-v1.patch

If the am operation failed, you will also need to remove the .git/rebase-apply/ directory that it creates.

7. If things go wrong (tests fail, you find some problem, etc.), you can back out:

Code Block

  git reset --hard HEAD
  git clean -f
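
Note that git reset --hard HEAD only discards uncommitted changes, such as those left by patch -p1. A commit created by a successful git am stays in place. The sketch below shows the difference in a throwaway repository with hypothetical paths and file names. Resetting to origin/trunk to drop such a commit is an addition not found in the original steps, and it assumes the commit has not been pushed.

```shell
#!/bin/sh
# Sketch: what "git reset --hard HEAD" and "git clean -f" do and do not undo.
set -eu

work=$(mktemp -d)

# Stand-in for the Apache repository.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > core.txt
git add core.txt
git commit -q -m "initial commit"

git clone -q "$work/upstream" "$work/reviewer"
cd "$work/reviewer"
git config user.name "Demo User"
git config user.email "demo@example.com"

# Case 1: uncommitted changes, as left behind by "patch -p1".
echo "edited" >> core.txt    # modified tracked file
echo "new" > xyz.txt         # new untracked file
git reset -q --hard HEAD     # restores tracked files
git clean -q -f              # deletes untracked files
leftover_changes=$(git status --porcelain | wc -l | tr -d ' ')

# Case 2: a commit on trunk, as created by a successful "git am"
# (simulated here with a plain commit).
echo "new" > xyz.txt
git add xyz.txt
git commit -q -m "KAFKA-42 add xyz feature"
git reset -q --hard HEAD     # HEAD already is that commit, so it stays
ahead_after_head_reset=$(git rev-list --count origin/trunk..HEAD)
git reset -q --hard origin/trunk   # moves trunk back to the fetched state
ahead_after_origin_reset=$(git rev-list --count origin/trunk..HEAD)

echo "leftover uncommitted changes: $leftover_changes"
echo "commits ahead after reset to HEAD: $ahead_after_head_reset"
echo "commits ahead after reset to origin/trunk: $ahead_after_origin_reset"
```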

8. Push the change back to Apache:

Code Block

  git push origin trunk

Simple Committer Workflow

If you have commit access to the Apache repository, you will not be applying patches in the manner described in the reviewer workflow. Instead, once your patch has been reviewed, you will check it in yourself as follows:

  1. Create a branch to work on:

    Code Block
    
        git fetch
        git checkout -b xyz remotes/origin/trunk
      
  2. Implement the feature.
  3. Rebase:

    Code Block
    
        git rebase remotes/origin/trunk
      
  4. Post the change to JIRA and get it reviewed.
  5. Push the change back to Apache. Pick one of the following:
    • You should almost always collapse your work into a single check-in in order to avoid cluttering the upstream change-log:

      Code Block
      
           # assuming trunk is up-to-date with origin
           git checkout trunk
           git merge --squash xyz
           git commit -am "KAFKA-XXX xyz feature; reviewed by <reviewers>"
           git push origin trunk
         
    • If you are absolutely sure you want to preserve your local intermediate check-in history, push your feature branch to trunk directly instead of doing the merge above (or use merge without the squash option):

      Code Block
      
         # from feature branch xyz: push local xyz to the remote trunk
         git push origin xyz:trunk
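
The effect of the squash option can be seen in a throwaway repository. This is a minimal sketch with hypothetical file names and commit messages: three work-in-progress commits on xyz end up as a single commit on trunk.

```shell
#!/bin/sh
# Sketch: collapse a feature branch into one commit with "git merge --squash".
set -eu

work=$(mktemp -d)

git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Demo Committer"
git config user.email "committer@example.com"
echo "base" > core.txt
git add core.txt
git commit -q -m "initial commit"

# Three intermediate check-ins on the feature branch.
git checkout -q -b xyz
for i in 1 2 3; do
    echo "step $i" >> xyz.txt
    git add xyz.txt
    git commit -q -m "wip $i"
done

# Squash them into the index on trunk, then record a single commit.
git checkout -q trunk
git merge -q --squash xyz
git commit -q -m "KAFKA-XXX xyz feature; reviewed by <reviewers>"

trunk_commits=$(git rev-list --count trunk)
feature_commits=$(git rev-list --count xyz)
echo "commits on trunk: $trunk_commits, commits on xyz: $feature_commits"
```

The feature branch keeps its full history locally, so nothing is lost if you later need the intermediate commits.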
         

GitHub Workflow

Apache doesn't seem to provide a place to stash your work-in-progress branches, nor some of the nice social features GitHub has. This can be a problem for larger features. Here are instructions for using GitHub as a place to stash your work-in-progress changes.

Setting Up

1. As in the other workflows, begin by cloning Kafka (if you haven't already):

Code Block

  git clone https://git-wip-us.apache.org/repos/asf/kafka.git kafka

This sets up the remote alias "origin" automatically, which refers back to the Apache repo.

2. Create a new GitHub repository on your GitHub account to use for stashing changes. There are various ways to do this; I just forked the apache/kafka repo (https://github.com/apache/kafka), which creates a repo https://github.com/jkreps/kafka (where jkreps would be your user name).

3. Add a remote alias for GitHub to your local repository to avoid typing the full URL:

Code Block

  git remote add github https://github.com/<your_user>/kafka.git

Now you can push either to origin or to github.

Doing Work

1. You can create a branch named xyz in your local repository and check it out:

Code Block

  git checkout -b xyz remotes/origin/trunk

2. To set up a second machine to work on, you can clone the GitHub URL.

3. To save your branch to your GitHub repo, do:

Code Block

  git push github xyz

4. To pull these changes onto the other machine where you have a copy of the repository you can do:

Code Block

  git fetch github
  git checkout xyz
  git merge remotes/github/xyz
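
The two-remote setup can be rehearsed locally before involving GitHub. In this minimal sketch, local bare repositories stand in for the Apache repo and for your GitHub fork, and two clones stand in for the two machines; all paths and names are hypothetical.

```shell
#!/bin/sh
# Sketch: share a work-in-progress branch through a second remote.
set -eu

work=$(mktemp -d)

# Seed repository, then bare copies standing in for Apache and the fork.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/trunk
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > core.txt
git add core.txt
git commit -q -m "initial commit"
git clone -q --bare "$work/seed" "$work/apache.git"
git clone -q --bare "$work/seed" "$work/fork.git"

# First machine: add the extra remote, work on xyz, push it to the fork.
git clone -q "$work/apache.git" "$work/machine-a"
cd "$work/machine-a"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add github "$work/fork.git"
git checkout -q -b xyz remotes/origin/trunk
echo "wip" > xyz.txt
git add xyz.txt
git commit -q -m "wip on xyz"
git push -q github xyz

# Second machine: fetch the branch from the fork and pick it up.
git clone -q "$work/apache.git" "$work/machine-b"
cd "$work/machine-b"
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add github "$work/fork.git"
git fetch -q github
git checkout -q xyz                # creates local xyz from github/xyz
git merge -q remotes/github/xyz    # brings in later pushes; a no-op here

remotes=$(git remote | sort | tr '\n' ' ')
echo "remotes: $remotes"
```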

...