This page describes the steps required to contribute code changes to Kafka. Read Kafka's Contributing page before this one.
There may be bugs or possible improvements to this page, so help us improve it. Credit to the Spark project for tackling the issue of receiving contributions to an Apache project via GitHub pull requests. We have borrowed liberally from their process, tools, and documentation.
Overview
Generally, Kafka uses:
- JIRA to track logical issues, including bugs and improvements
- Kafka Improvement Proposals for planning major changes
- Confluence for documentation
- GitHub pull requests to manage the review and merge of specific code changes
That is, JIRA and Confluence are used to describe what should be fixed or changed, and high-level approaches, and pull requests describe how to implement that change in the project's source code.
JIRA
Find the existing Kafka JIRA ticket that the change pertains to.
Do not create a new JIRA ticket if your change addresses an existing one; add to the existing discussion and work instead.
To avoid conflicts, assign the JIRA ticket to yourself if you plan to work on it.
Look for existing pull requests that are linked from the JIRA ticket, to understand if someone is already working on it.
If the change is new, then it usually needs a new JIRA ticket. However, trivial changes, where "what should change" is virtually the same as "how it should change", do not require a JIRA ticket. Example: "Fix typos in Foo scaladoc".
If required, create a new JIRA ticket:
Provide a descriptive Title. "Update web UI" or "Problem in scheduler" is not sufficient. "Kafka support fails to handle empty queue during shutdown" is good.
Write a detailed Description. For bug reports, this should ideally include a short reproduction of the problem. For new features, it may include a design document (or a Kafka Improvement Proposal if it's a major change).
Set required fields: Type, Priority, Fix Versions and optionally Labels.
To avoid conflicts, assign the JIRA ticket to yourself if you plan to work on it. Leave it unassigned otherwise.
Do not include a patch file; pull requests are used to propose the actual change.
If the change is large, consider inviting discussion on the issue at dev@kafka.apache.org before implementing it. Note that changes that modify APIs or that are very visible to users also require following the KIP (Kafka Improvement Proposal) process.
Pull Request
Fork the GitHub repository at http://github.com/apache/kafka if you haven't already.
Clone your fork, create a new branch, and push commits to the branch (review the Kafka Coding Guidelines if you haven't already).
Consider whether documentation or tests need to be added or updated as part of the change, and add them as needed (doc changes should be submitted along with the code change in the same PR).
Run all tests as described in the project's README.
Open a pull request against the trunk branch of apache/kafka. (Only in special cases would the PR be opened against other branches.)
The PR title should usually be of the form `KAFKA-xxxx: Title`, where `KAFKA-xxxx` is the relevant JIRA id and `Title` may be the JIRA's title or a more specific title describing the PR itself. For trivial cases where a JIRA is not required (see JIRA section for more details), `MINOR:` or `HOTFIX:` can be used as the PR title prefix.
If the pull request is still a work in progress, and so is not ready to be merged, but needs to be pushed to GitHub to facilitate review, then add `[WIP]` after the JIRA id.
Consider identifying committers or other contributors who have worked on the code being changed. Find the file(s) in GitHub and click "Blame" to see a line-by-line annotation of who changed the code last, and check the Maintainers page. You can add `@username` in the PR description to ping them immediately.
Please state that the contribution is your original work and that you license the work to the project under the project's open source license.
A comment with information about the pull request will be added to the JIRA ticket.
- Change the status of the JIRA to "Patch Available" if it's ready for review.
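The PR title convention above can be checked mechanically before opening a pull request. The following is a minimal sketch, not an official Kafka tool: the function name `parse_pr_title`, the requirement that the JIRA id be numeric, and the exact placement of `[WIP]` (directly after the JIRA id, before the colon) are assumptions made for illustration.

```python
import re

# Title forms described above. Spacing rules and the numeric JIRA id
# are assumptions of this sketch, not rules stated by the project.
PR_TITLE = re.compile(
    r"^(?:(?P<jira>KAFKA-\d+)(?P<wip> \[WIP\])?"  # JIRA id, optional WIP marker
    r"|(?P<prefix>MINOR|HOTFIX))"                 # or a trivial-change prefix
    r": (?P<title>\S.*)$"                         # colon, space, non-empty title
)


def parse_pr_title(title):
    """Return a dict describing the title, or None if it does not match."""
    match = PR_TITLE.match(title)
    if match is None:
        return None
    return {
        "jira": match.group("jira"),        # e.g. "KAFKA-1234", or None
        "prefix": match.group("prefix"),    # "MINOR", "HOTFIX", or None
        "wip": match.group("wip") is not None,
        "title": match.group("title"),
    }


# Hypothetical titles, used only to exercise the function.
for candidate in (
    "KAFKA-1234: Handle empty queue during shutdown",
    "KAFKA-1234 [WIP]: Handle empty queue during shutdown",
    "MINOR: Fix typos in Foo scaladoc",
    "Update web UI",
):
    print(candidate, "->", parse_pr_title(candidate))
```

A title that returns `None` here does not follow any of the forms listed above, although reviewers remain the final judge of what is acceptable.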
The Jenkins automatic pull request builder will run unit and integration tests on your branch (after merging it to the target branch, typically `trunk`). Once ready, the PR `checks` box will be updated with the test results along with links to the full results on Jenkins (we have separate builds in order to test multiple Java and Scala versions).
Investigate and fix failures caused by the pull request.
Fixes can simply be pushed to the same branch from which you opened your pull request.
Please address feedback via additional commits instead of amending existing commits. This makes it easier for the reviewers to know what has changed since the last review. All commits will be squashed into a single one by a script as part of the merge process.
Jenkins will automatically re-test when new commits are pushed.
Despite our efforts, Kafka may have flaky tests at any given point, which may cause a build to fail. You can trigger a new build by adding the following comment: "retest this please". If the failure is unrelated to your pull request and you have been able to run the tests locally successfully, please mention it in the pull request.
In addition to unit and integration tests, we also have a set of system tests that run on a nightly basis. For large, impactful or risky changes, it is preferable to run the system tests before merging the pull request. Currently that can be done via a manually triggered job in a Jenkins instance provided by Confluent (ask for access in the pull request). In the near future, we will also be able to run the system tests in Travis (progress can be tracked via ).
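The fork, branch, and push steps at the start of this section map onto a short sequence of git commands. The sketch below only prints that sequence; it does not run anything. The function name `contribution_commands`, the placeholder user name, the `.git` URL suffix, the local directory name `kafka`, and the choice to name the branch after the JIRA id are all assumptions for illustration.

```python
import shlex


def contribution_commands(github_user, branch):
    """Return the git commands for starting work on a fork, as strings."""
    # Assumed fork location: your GitHub account's copy of apache/kafka.
    fork_url = f"https://github.com/{github_user}/kafka.git"
    commands = [
        # Clone your fork into a local directory named "kafka".
        ["git", "clone", fork_url, "kafka"],
        # Create a work branch; naming it after the JIRA id is a convention
        # assumed here, not a stated requirement.
        ["git", "-C", "kafka", "checkout", "-b", branch],
        # Commit your work on the branch before this step, then publish it
        # to your fork ("origin" is the remote that git clone creates).
        ["git", "-C", "kafka", "push", "origin", branch],
    ]
    return [shlex.join(cmd) for cmd in commands]


# Placeholder values; substitute your own user name and JIRA id.
for line in contribution_commands("your-github-user", "KAFKA-xxxx"):
    print(line)
```

Once the branch is pushed, the pull request itself is opened through the GitHub web interface against the `trunk` branch of apache/kafka.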
The Review Process
Other reviewers, including committers, may comment on the changes and suggest modifications. Changes can be added by simply pushing more commits to the same branch.
Please add a comment and "@" the reviewer in the PR if you have addressed reviewers' comments (no notification is sent otherwise).
Patches can be applied locally following the comments on the JIRA ticket, for example: git pull https://github.com/[contributor-name]/kafka KAFKA-xxxx.
Lively, polite, rapid technical debate is encouraged from everyone in the community. The outcome may be a rejection of the entire change.
Reviewers can indicate that a change looks suitable for merging with a comment such as: "I think this patch looks good". Kafka uses the LGTM convention for indicating the strongest level of technical sign-off on a patch: simply comment with the word "LGTM". It specifically means: "I've looked at this thoroughly and take as much ownership as if I wrote the patch myself". If you comment LGTM you will be expected to help with bugs or follow-up issues on the patch. Consistent, judicious use of LGTMs is a great way to gain credibility as a reviewer with the broader community.
The JIRA ticket status will be changed from "Patch Available" to "In Progress" if the pull request needs more work.
Sometimes, other changes will be merged which conflict with your pull request's changes. The PR can't be merged until the conflict is resolved. This can be resolved with "git fetch origin" followed by "git merge origin/trunk" and resolving the conflicts by hand, then pushing the result to your branch.
Try to be responsive to the discussion rather than let days pass between replies.
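The two git recipes mentioned in this section (pulling a contributor's branch for local review, and merging trunk to resolve conflicts) can be scripted. The sketch below is one way to do that under stated assumptions: the function names, the `dry_run` switch, and the separate `upstream` and `fork` remote parameters are hypothetical. Both remotes default to `origin` to match the commands quoted above; if `origin` is your fork, point `upstream` at a remote that tracks apache/kafka instead.

```python
import subprocess


def review_pull(contributor, branch):
    """Command for pulling a contributor's branch, as in the example above."""
    return [["git", "pull", f"https://github.com/{contributor}/kafka", branch]]


def sync_with_trunk(branch, upstream="origin", fork="origin"):
    """Commands for merging the latest trunk into a pull request branch."""
    return [
        ["git", "fetch", upstream],
        ["git", "merge", f"{upstream}/trunk"],
        # Push only after conflicts are resolved and the merge is committed.
        ["git", "push", fork, branch],
    ]


def run(commands, dry_run=True):
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError on a non-zero exit, so a
            # conflicting merge stops the sequence before the push.
            subprocess.run(cmd, check=True)


# Placeholder values; nothing is executed in dry-run mode.
run(review_pull("contributor-name", "KAFKA-xxxx"))
run(sync_with_trunk("KAFKA-xxxx"))
```

When a real run stops at the merge step, resolve the conflicts by hand, commit the merge, and then push the branch yourself.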
Closing Your Pull Request / JIRA
If a change is accepted, it will be merged and the pull request will automatically be closed, along with the associated JIRA if any
Note that in the rare case that you are asked to open a pull request against a branch other than trunk, you will have to close the pull request manually.
The JIRA will be Assigned to the primary contributor to the change as a way of giving credit. If the JIRA isn't closed and/or Assigned promptly, comment on the JIRA.
If your pull request is ultimately rejected, please close it promptly, because committers can't close PRs directly.
Pull requests will be automatically closed by an automated process at Apache after about a week if a committer has made a comment like "mind closing this PR?" This means that the committer is specifically requesting that it be closed.
If a pull request has gotten little or no attention, consider improving the description or the change itself and ping likely reviewers again after a few days. Consider proposing a change that's easier to include, like a smaller and/or less invasive change.
- If a pull request is closed because it is deemed not the right approach to resolve a JIRA, then leave the JIRA open. However, if the review makes it clear that the issue identified in the JIRA is not going to be resolved by any pull request (not a problem, won't fix), then also resolve the JIRA.