This page is a work in progress; please see Patch submission and review in the meantime. This page documents the various steps required to contribute Kafka code changes. It should be read after Kafka's Contributing page.

There may be bugs or possible improvements to this page, so help us improve it. Credit to the Spark project for tackling the issue of receiving contributions to an Apache project via GitHub pull requests. We have borrowed liberally from their process, tools, and documentation. 

...

  1. Find the existing Kafka JIRA ticket that the change pertains to.

    1. Do not create a new JIRA ticket if creating a change to address an existing ticket in JIRA; add to the existing discussion and work instead.

    2. To avoid conflicts, assign the JIRA ticket to yourself if you plan to work on it.

    3. Look for existing pull requests that are linked from the JIRA ticket, to understand if someone is already working on it.

  2. If the change is new, then it usually needs a new JIRA ticket. However, trivial changes, where "what should change" is virtually the same as "how it should change" do not require a JIRA ticket. Example: "Fix typos in Foo scaladoc"

  3. If required, create a new JIRA ticket (the critical fields to fill in are shown below; more detailed guidance can be found here):

    1. Provide a descriptive Title. "Update web UI" or "Problem in scheduler" is not sufficient. "Kafka support fails to handle empty queue during shutdown" is good.

    2. Write a detailed Description. For bug reports, this should ideally include a short reproduction of the problem. For new features, it may include a design document (or a Kafka Improvement Proposal if it's a major change).

    3. Set required fields: Type, Priority, Fix Versions and optionally Labels. Please only set Fix Version if you are confident you and committers agree that the issue needs to be fixed by the specified version. Please note that issues considered critical by one organization are not always considered universally critical.

    4. To avoid conflicts, assign the JIRA ticket to yourself if you plan to work on it. Leave it unassigned otherwise.

    5. Do not include a patch file; pull requests are used to propose the actual change.

  4. If the change is large, consider inviting discussion on the issue at dev@kafka.apache.org before proceeding to implement it. Note that changes that modify APIs or that are very visible to users also require following the KIP (Kafka Improvement Proposal) process.

Pull Request

  1. Fork the GitHub repository at http://github.com/apache/kafka if you haven't already.

  2. Clone your fork, create a new branch, push commits to the branch (review the Kafka Coding Guidelines, if you haven't already).

  3. Consider whether documentation or tests need to be added or updated as part of the change, and add them as needed (doc changes should be submitted along with code change in the same PR).

  4. Run all tests as described in the project's README.

  5. Open a pull request against the trunk branch of apache/kafka. (Only in special cases would the PR be opened against other branches.)

    1. The PR title should usually be of the form KAFKA-xxxx: Title, where KAFKA-xxxx is the relevant JIRA id and Title may be the JIRA's title or a more specific title describing the PR itself. For trivial cases where a JIRA is not required (see JIRA section for more details), MINOR: or HOTFIX: can be used as the PR title prefix.

    2. If the pull request is still a work in progress, and so is not ready to be merged, but needs to be pushed to GitHub to facilitate review, then add [WIP] after the JIRA id.

    3. Consider identifying committers or other contributors who have worked on the code being changed. Find the file(s) in GitHub and click "Blame" to see a line-by-line annotation of who changed the code last, and check the Maintainers page. The easiest approach is to simply follow GitHub's automatic suggestions. You can add @username in the PR description to ping them immediately.

    4. Please state that the contribution is your original work and that you license the work to the project under the project's open source license.

  6. A comment with information about the pull request will be added to the JIRA ticket.

  7. Change the status of the JIRA to "Patch Available" if it's ready for review.

The following steps are planned, but not yet configured.

  1. To change the JIRA status to "Patch Available", click the "Submit Patch" button in JIRA, and then in the resulting dialog, click "Submit Patch".
  2. The project uses Apache Jenkins for continuous testing on Linux AMD64 and ARM64 build nodes. A CI job is started automatically for each new pull request and rerun each time a new commit is pushed. If you need to trigger a new run for any reason, you can either tag a committer to trigger a new build or just push a new commit.

  3. Once ready, the PR `checks` box will be updated with the test results along with links to the full results on Jenkins (we have separate builds in order to test multiple Java and Scala versions).

  4. Investigate and fix failures caused by the pull request.

  5. The Jenkins automatic pull request builder will test your changes

    1. If it is your first contribution, Jenkins will wait for confirmation before building your code and post "Can one of the admins verify this patch?"

    2. A committer can authorize testing with a comment like "ok to test"

    3. A committer can automatically allow future pull requests from a contributor to be tested with a comment like "Jenkins, add to whitelist"

  6. Once ready, the test results will be posted on the pull request, along with a link to the full results on Jenkins.

  7. Watch for the results, and investigate and fix failures promptly

    1. Fixes can simply be pushed to the same branch from which you opened your pull request.

    2. Please address feedback via additional commits instead of amending existing commits. This makes it easier for the reviewers to know what has changed since the last review. All commits will be squashed into a single one by the committer via GitHub's squash button or by a script as part of the merge process.

    3. Jenkins will automatically re-test when new commits are pushed. If the tests failed for reasons unrelated to the change (e.g. Jenkins outage), then a committer can request a re-test with "Jenkins, retest this please". Ask if you need a test restarted.

    4. Despite our efforts, Kafka may have flaky tests at any given point, which may cause a build to fail. You need to ping committers to trigger a new build. If the failure is unrelated to your pull request and you have been able to run the tests locally successfully, please mention it in the pull request.

  8. In addition to unit and integration tests, we also have a set of system tests that run on a nightly basis. For large, impactful or risky changes, it is preferable to run the system tests before merging the pull request. Currently that can be done via a manually triggered job in a Jenkins instance provided by Confluent (ask for access in the pull request). 
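The first steps of the pull request procedure (clone your fork, create a new branch) could be sketched roughly as follows. This is only an illustration: the function name start_topic_branch, the local directory name kafka, and the choice of naming the branch after the JIRA id are assumptions, not part of the Kafka tooling.

```shell
# Illustrative helper (hypothetical name): clone a fork and start a topic branch.
#   $1  URL of your fork, e.g. https://github.com/<your-username>/kafka.git
#   $2  branch name; using the JIRA id (KAFKA-xxxx) is an assumption, not a rule
# Note: on success the calling shell is left inside the cloned directory.
start_topic_branch() {
  fork_url="$1"
  branch="$2"

  # Clone the fork into a local directory named "kafka".
  git clone "$fork_url" kafka || return 1
  cd kafka || return 1

  # Create the new branch from whatever the fork's default branch points at.
  git checkout -b "$branch"
}

# Possible usage (needs network access and an existing fork):
#   start_topic_branch "https://github.com/<your-username>/kafka.git" KAFKA-xxxx
#   ...edit files, then: git add <files> && git commit -m "KAFKA-xxxx: Title"
#   git push origin KAFKA-xxxx
```

Pushing further commits to the same branch afterwards updates the open pull request, which matches the advice above to address feedback via additional commits.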

The Review Process

  • Other reviewers, including committers, may comment on the changes and suggest modifications. Changes can be added by simply pushing more commits to the same branch.

  • Please add a comment and "@" the reviewer in the PR if you have addressed reviewers' comments. Even though GitHub sends notifications when new commits are pushed, it is helpful to know that the PR is ready for review once again.

  • Patches can be applied locally following the comments on the JIRA ticket, for example: git pull https://github.com/[contributor-name]/kafka KAFKA-xxxx.

  • Lively, polite, rapid technical debate is encouraged from everyone in the community. The outcome may be a rejection of the entire change.

  • Reviewers can indicate that a change looks suitable for merging by approving it via GitHub's review interface. This indicates the strongest level of technical sign-off on a patch and it means: "I've looked at this thoroughly and take as much ownership as if I wrote the patch myself". If you approve a pull request, you will be expected to help with bugs or follow-up issues on the patch. Consistent, judicious use of pull request approvals is a great way to gain credibility as a reviewer with the broader community. Kafka reviewers will typically include the acronym LGTM in their approval comment; this was the convention used to approve pull requests before the "approve" feature was introduced by GitHub. The JIRA ticket status will be changed from "Patch Available" to "In Progress" if the pull request needs more work.

  • Sometimes, other changes will be merged which conflict with your pull request's changes. The PR can't be merged until the conflict is resolved. This can be resolved with "git fetch origin" followed by "git merge origin/trunk" and resolving the conflicts by hand, then pushing the result to your branch.

  • Try to be responsive to the discussion rather than let days pass between replies.
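The conflict-resolution bullet above could be sketched as a small helper. This is an illustration under assumptions: the function name merge_latest_trunk is made up, it uses the merge form of the command, and it assumes the remote named origin carries an up-to-date trunk branch.

```shell
# Illustrative helper (hypothetical name): bring the latest trunk into the
# currently checked-out topic branch. Returns non-zero if the fetch fails or
# the merge stops on conflicts.
merge_latest_trunk() {
  git fetch origin || return 1

  if git merge origin/trunk; then
    echo "Merged cleanly; push the result to your branch."
  else
    # List the files that still contain conflict markers.
    echo "Conflicts need to be resolved by hand in:"
    git diff --name-only --diff-filter=U
    echo "After editing: git add <files> && git commit, then push."
    return 1
  fi
}

# Possible usage, from inside your clone with the topic branch checked out:
#   merge_latest_trunk && git push origin KAFKA-xxxx
```

Merging, rather than rewriting history, keeps earlier commits intact, so reviewers can still see what changed since their last review.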

Closing Your Pull Request / JIRA

  • If a change is accepted, it will be merged and the pull request will automatically be closed, along with the associated JIRA if any

    • Note that in the rare case you are asked to open a pull request against a branch besides trunk, you will actually have to close the pull request manually.

    • The JIRA will be Assigned to the primary contributor to the change as a way of giving credit. If the JIRA isn't closed and/or Assigned promptly, comment on the JIRA.

  • If your pull request is ultimately rejected, please close it promptly, because committers can't close PRs directly.

    • Pull requests will be automatically closed by an automated process at Apache after about a week if a committer has made a comment like "mind closing this PR?" This means that the committer is specifically requesting that it be closed.

  • If a pull request has gotten little or no attention, consider improving the description or the change itself and ping likely reviewers again after a few days. Consider proposing a change that's easier to include, like a smaller and/or less invasive change.

  • If a pull request is closed because it is deemed not the right approach to resolve a JIRA, then leave the JIRA open. However, if the review makes it clear that the issue identified in the JIRA is not going to be resolved by any pull request (not a problem, won't fix), then also resolve the JIRA.

...