...

Other projects can use this document as inspiration for their own approach to issue management, and to see how the capabilities of GitHub Issues and Discussions can make issue management more efficient.

Why GitHub Issues/Discussions

Personal comment from Apache Airflow team:

...

Especially recently, with GitHub Discussions added to the mix and the ability to convert issues into discussions (and back) if they are not real issues.

Best Practices

This chapter describes best practices for issue management with GitHub Issues, based on the experience of Apache Airflow.

GitHub Issues and Discussions

GitHub Discussions are great because, by definition, they are not issues that should be closed but discussions that might die out or be converted into real issues once we conclude that they are real issues. We have found GitHub Discussions pretty useful and active, and more and more users are opening discussions rather than issues. This keeps the "issues" down to the "real" issues. We have also implemented our GitHub issue templates so that they suggest users open a discussion rather than an issue if they cannot provide enough information or a reproduction scenario.

Using templates for GitHub Issues

We recently introduced really nice templates for GitHub Issues. This is another benefit of GitHub Issues: the Issue Forms work really well and do a FANTASTIC job of improving the quality of our issues. For example, in the forms we instruct users who have no reproducible steps to open a GitHub Discussion instead, and this has already happened multiple times. One of the options in the issue form configuration is to provide a "BUTTON" instead of a form for some types of issues, which links to an external site.

...

  • with the standard MARKDOWN templates we had many issues where people did not provide useful information (version of Airflow, operating system etc.)
  • we had a number of issues where users would simply delete the markdown template content straight away and replace it with their own issue description, without "reproducible steps", or would really just ask a question about their deployment problems without even attempting to investigate them.
  • we ended up with many "discussion" kinds of questions posted as issues. We would then "convert" such issues into discussions, but that required a maintainer's comment and explanation. Mostly it was because the users did not even know they can (and should) open a discussion instead.
  • the markdown templates were difficult to read and fill in - it was not clear what you should do with the parts which were relevant - we left instructions in the comments, which were sometimes left and sometimes deleted, so generally the issues were "structur-ish" rather than "structured".
  • often people opened Airflow 1.10 issues even though it reached End-Of-Life in June (they should open discussions instead, which is still great: even if there is no way we will handle such issues, either we or other users can still help them with a workaround or even directing t)

These are the basic design principles we followed when designing the new templates, based on those observations:

  • We defined forms with required fields that cannot be "skipped".
  • When you do not fill in a "textarea" entry, it is marked as "Not Provided" rather than deleted, which is clear information that something is missing.
  • We added helpful comments and hints, as well as an explanation of the cases in which you should use GitHub Discussions instead of issues, with a direct link to open a GitHub Discussion (this includes the lack of ability to select version 1.10).
  • We've added a logo and a welcoming message to "soften" the more formalized form entry.
  • We made it clear that if there are no reproduction steps, the users should open a GitHub Discussion instead.
  • We've added more issue types - we've separated "Airflow Core", "Airflow Providers" (we have more than 70 providers that extend Airflow's capabilities) and "Airflow Docs" issues - automatically applying the correct labels, so we do not have to triage and assign issues to different "types" ourselves - users do it for us when they open an issue of the specific type.
  • We also created a "maintainer only" issue type which allows entering pretty much free-form information (for tasks/todos etc.), and we've added a required "checkbox" to confirm you are a maintainer, to discourage people from using it to raise their "free form" questions there. We wanted it to be "easy" for committers to enter such a "free form" issue but "not easy" for users to skip structured information - at the same time guiding them to use "Discussions", where it is much "easier" to enter any content and ask questions.
  • What is more, this structured form will allow us to automate some things if we find it is needed. For example, if someone submits an entry without providing "reproduction steps", we can write a bot to automatically convert such an issue into a discussion, or automatically close an issue if someone opens a "free-form" one while not being a maintainer.
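
The automation idea in the last point could look roughly like the sketch below. It is only an illustration under stated assumptions, not a bot the Airflow project runs: it assumes the form answers arrive in the issue body as "### <field label>" headings followed by the answer, and that unanswered fields carry a placeholder text. The field label "How to reproduce", the placeholder strings, the "maintainer-only" type name and all function names are hypothetical.

```python
import re
from typing import Dict, Optional

# Assumption: placeholder texts that mark an unanswered form field.
EMPTY_MARKERS = {"", "not provided", "_no response_"}


def parse_form_body(body: str) -> Dict[str, str]:
    """Split an issue-form body into {field label: answer}."""
    # Each field is assumed to be rendered as a "### Label" heading
    # followed by the user's answer.
    parts = re.split(r"^### +(.+?) *$", body, flags=re.MULTILINE)
    # parts = [text before first heading, label1, answer1, label2, answer2, ...]
    return {
        label.strip(): answer.strip()
        for label, answer in zip(parts[1::2], parts[2::2])
    }


def is_missing(answer: Optional[str]) -> bool:
    """True when the field is absent or only holds a placeholder."""
    return answer is None or answer.strip().lower() in EMPTY_MARKERS


def triage_action(issue_type: str, body: str, author_is_maintainer: bool) -> str:
    """Return "close", "convert-to-discussion" or "keep" for a new issue."""
    # Free-form issues are reserved for maintainers (hypothetical type name).
    if issue_type == "maintainer-only":
        return "keep" if author_is_maintainer else "close"
    # Structured issues without reproduction steps become discussions.
    fields = parse_form_body(body)
    if is_missing(fields.get("How to reproduce")):
        return "convert-to-discussion"
    return "keep"
```

The decision is kept as a pure function on purpose: the part that actually talks to GitHub (closing or converting) is left out, so the rules can be checked without any API access.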

Dealing with security reports inside GitHub Issues

GitHub Issue templates can be configured to allow different kinds of issues. One of the entry types might be a link to another place, including a link to the security policy https://github.com/apache/airflow/security/policy which clearly states that no GH issues should be opened, but that the regular ASF security process should be followed (with an email to security@a.o).

Approach for triaging issues

  • We triage and respond to issues pretty quickly and "aggressively". That is, when there is not enough information or the issue is very likely caused by an external factor, we move the issue to Discussions, explaining what is missing, what the author should do and what information should be provided, and adding that we might consider moving it back to an issue as soon as more information is provided. I found that moving issues to discussions in this case works much better for motivating the user to add more information (or saves the hassle of maintaining status and closing the issues later).
  • when the user raises an issue which is really a question, we actively and quickly redirect the user to "Discussions" rather than an issue.
  • we have an automated stale-bot that closes inactive issues and PRs (30 days of inactivity = notice, + 7 days = closing)
  • we have a triage team that meets virtually from time to time and actively reviews and classifies the issues (adds labels), and also runs some stats on which areas are "under-staffed". They meet semi-regularly, discuss, and send out summaries.

  • the rule we have is that we do not need issues at all. People are encouraged (in the docs and workshops) to open PRs directly rather than issues - we always refer to the PR#, not the issue, in changelogs
  • we mark the issues that are simple as "good-first-issue", and they then land in http://github.com/apache/airflow/contribute . More often than not we have people commenting "Hey I want to implement this, can you assign me?", which we do pretty much immediately when they ask. That often works and we have new contributors :)
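
The stale-bot timeline mentioned above (30 days of inactivity = notice, + 7 days = closing) can be written down as a small decision function. This is a minimal sketch and not the configuration of the bot we use: the function name and return values are hypothetical, and it assumes that "last activity" means the last human activity, so the bot's own notice does not reset the clock.

```python
from datetime import datetime, timedelta

NOTICE_AFTER = timedelta(days=30)  # inactivity before the stale notice
CLOSE_AFTER = timedelta(days=7)    # further inactivity before closing


def stale_action(last_activity: datetime, now: datetime) -> str:
    """Return "none", "notice" or "close" for an issue or PR."""
    idle = now - last_activity
    if idle >= NOTICE_AFTER + CLOSE_AFTER:
        return "close"
    if idle >= NOTICE_AFTER:
        return "notice"
    return "none"
```

With these thresholds, an issue untouched for 30 to 36 days gets the notice, and one untouched for 37 days or more is closed.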

Recruiting contributors

We use the opportunity of users opening issues to actively recruit new contributors.

  • we continuously encourage new users to contribute, and we add more committers, especially in the areas that are "under-staffed" (recently the UI committers "team" and the "Kubernetes" team have greatly increased in capacity, and it immediately improved the situation there)

  • what helps there is that some of those committers are full-time employed or part-time paid as freelancers by important stakeholders in the project (Astronomer, Google). Also those stakeholders are fully aware of the value it brings, so they gladly pay the committers for their community effort, even if it is not directly responding to their needs
  • we added an "Are you willing to submit PR?" question to the issue template. When the issue is relatively simple and the user says "yes", we assign the user to it. When the answer is missing, we actively ask the user whether they are willing to submit a PR. More often than not, the users are willing to when encouraged (at least initially).

  • we have a "really quick to start" development environment for Airflow (called Breeze) that we continuously improve to make it easier to start contributing.
  • we run semi-regular workshops for new contributors - for example today we have the "first time contributor's workshop" https://airflowsummit.org/sessions/2021/workshop-contributing-apache-airflow/ - 3 hours of hands-on teaching in which we show new contributors how to contribute. This is, I think, the 5th or 6th time we have done it (we have had a few physical events, and over the last 1.5 years we have had, I think, 4 online ones). This time 20 people signed up, from literally all over the world (and BTW, all proceeds from that cheap 50 USD workshop go to the Apache Software Foundation as a donation), and we mention it to the contributors. Another example is the PyCon Taiwan Sprint, where we held an 8-hour workshop: 
  • We have "community" days at the Summit where we have talks encouraging people to contribute and we often send people to those. Examples here:

    https://airflowsummit.org/sessions/2021/contributing-journey-becoming-leading-contributor/ - the road of Kaxil, a PMC member of Airflow, through committership
    https://airflowsummit.org/sessions/2021/contributing-first-steps/ - the first steps by a fresh contributor to Airflow who shared his experiences
    https://airflowsummit.org/sessions/2021/dont-have-to-wait/  - "You don't have to wait for someone to fix it for you"  - the talk from one of the committers to Airflow, Leah and her co-worker Rachel

  • And we have quite a few more talks for those who want to start contributing to Airflow:

    https://airflowsummit.org/sessions/2021/guide-airflow-architecture/  - The newcomer's guide to Airflow Architecture

Future

GitHub Issues were already super-useful when we switched 2 years ago, but now, with Issue Forms and GitHub Discussions together, they are GREAT. I am also discussing with GitHub the possibility of using the (optional) new "tabular" GitHub Issues experience https://github.blog/2021-06-23-introducing-new-github-issues/ that they introduced recently. It is in the Private beta stage now and not yet available for Public projects, but they promised an October-ish time frame for making it available to Public projects (I also got the promise that ASF is first on the Beta list to try it when it becomes available for Public projects). From what I saw in the demo I got from them, this will enable all kinds of automation and management that we currently miss. You will be able to see the issues in a spreadsheet-like form, add custom attributes, and build all kinds of automation around the issues more easily. This will help us enormously with automated triaging of the issues.

We are also waiting for Codespaces General Availability (https://github.com/features/codespaces), and our development environment is prepared to be used there out-of-the-box. This will make the path for new contributors even easier, as they can start contributing code straight from the GitHub UI.

How to migrate

Here is the approach that the Apache Airflow project took to migrate from JIRA. It is likely applicable to other issue management systems, especially since it is more about community engagement than tools.

...