Description:

Currently, the incubator-mxnet repo has over 800 open issues, with new ones opened every day. The goal is to ease this triage burden and handle developers' issues in an appropriate manner. Labelling lets experts in each area address the problems users face, and we already employ the label bot to help ease the issue/pull request process. The repository's history of previously labelled issues and pull requests opens up an interesting use case for this data: based upon it, we can provide insights and label predictions for new issues and pull requests. This mechanism gives those who raise an issue a faster response, and it allows existing and new contributors who want to help out to filter for their areas of expertise, which in turn welcomes new developers.

Proposal:

This prediction service offered by the label bot can be useful to the community for labelling issues and pull requests, restricted to labels whose measured accuracy meets an agreed threshold. The bot will then either apply labels directly or post label recommendations on newly opened issues and pull requests.
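
As a rough sketch of the recommendation step, the bot could filter the model's scores by a confidence cutoff and apply the surviving labels through GitHub's issue-labels REST endpoint. The issue number, token placeholder, scores, and 0.5 cutoff below are all illustrative assumptions:

    import requests

    # Hypothetical model output: label -> confidence score.
    predictions = {"Bug": 0.91, "Performance": 0.64, "Cuda": 0.12}

    # Keep only labels the model is reasonably confident about
    # (the 0.5 cutoff is an illustrative choice, not a tuned value).
    confident = [label for label, score in predictions.items() if score >= 0.5]

    # Apply the labels via GitHub's REST API:
    # POST /repos/{owner}/{repo}/issues/{issue_number}/labels
    resp = requests.post(
        "https://api.github.com/repos/apache/incubator-mxnet/issues/42/labels",
        headers={"Authorization": "token <GITHUB_TOKEN>"},
        json={"labels": confident},
    )
    resp.raise_for_status()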

Data Analysis:

Note: Training data here is limited (~13,000 issues, both closed and open), and after the data cleaning process we expect this number to be reduced considerably further. We also have to consider that not all issues have been labelled, and even where labels exist, not every label that applies to an issue has been added.

Metrics:

Multi-label Classification:
Accurately predicting at least one of an issue's labels: ~87% of issues
Accurately predicting all of an issue's labels (i.e. an exact match of the predicted label set): ~20% of issues
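
A minimal sketch of how these two figures can be computed, assuming predictions and ground truth are binary indicator matrices (rows = issues, columns = labels):

    import numpy as np
    from sklearn.metrics import accuracy_score

    # Toy indicator matrices: 3 issues x 4 labels (1 = label present).
    y_true = np.array([[1, 0, 0, 1],
                       [0, 1, 0, 0],
                       [1, 1, 0, 0]])
    y_pred = np.array([[1, 0, 0, 0],
                       [0, 1, 0, 0],
                       [0, 0, 1, 0]])

    # Exact match: for multilabel input, sklearn's accuracy_score
    # computes subset accuracy, i.e. every label must match.
    exact_match = accuracy_score(y_true, y_pred)

    # At least one correct: some predicted label is among the true labels.
    at_least_one = ((y_true == 1) & (y_pred == 1)).any(axis=1).mean()

    print(f"exact match: {exact_match:.2f}, at least one: {at_least_one:.2f}")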


How was the data collected:

The labels below were chosen as the model's initial prediction targets. For each label, evaluation is restricted to the issues relevant to that label: issues where either the model predicted the label or the label was actually present on the issue. The accuracy shown below is the fraction of those issues where the label the model predicted was among the issue's actual labels.


*** The accuracy metric was computed using sklearn's accuracy_score method ***

(https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html#sklearn.metrics.accuracy_score)
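
A sketch of that per-label computation under one reading of the setup above; the per-issue label sets and the helper function are hypothetical:

    from sklearn.metrics import accuracy_score

    # Hypothetical per-issue label sets (predicted vs. actual).
    predicted = [{"Bug"}, {"Test", "flaky"}, {"Performance"}, {"Doc"}]
    actual    = [{"Bug"}, {"Test"},          {"Performance"}, {"Installation"}]

    def per_label_accuracy(label):
        """Accuracy over issues where `label` was predicted or actually present."""
        y_true, y_pred = [], []
        for pred, act in zip(predicted, actual):
            if label in pred or label in act:
                y_true.append(int(label in act))
                y_pred.append(int(label in pred))
        return accuracy_score(y_true, y_pred)

    print(per_label_accuracy("Test"))   # 1.0: predicted and actually present
    print(per_label_accuracy("flaky"))  # 0.0: predicted but not actually present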

Classification Accuracy:

Label          Accuracy   Issue Count
Performance    100%       87
Test           99.59%     245
Question       97.02%     302
Doc            90.32%     155
Installation   84.07%     113
Example        80.81%     99
Bug            78.66%     389
Build          69.87%     156
onnx           69.57%     23
scala          67.24%     58
gluon          44.38%     160
flaky          42.78%     194
Feature        32.24%     335
C++            29.33%     75
ci             28.30%     53
Cuda           22.09%     86

*** In-depth analysis with precision, recall, and F1 ***

Classification report with per-label precision, recall, and F1 score
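
Such a report can be generated with sklearn's classification_report; a minimal sketch on toy multilabel indicator data (the matrices below are made up for illustration):

    import numpy as np
    from sklearn.metrics import classification_report

    labels = ["Bug", "Test", "Performance"]

    # Toy indicator matrices standing in for the real evaluation data.
    y_true = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 1], [0, 1, 0]])
    y_pred = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 1], [0, 1, 0]])

    # Prints per-label precision, recall, and F1, plus averaged summaries.
    print(classification_report(y_true, y_pred, target_names=labels))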

Data Insights:


Motivations/Conclusion:

This shows which labels the model can reliably provide given a chosen accuracy threshold. The bot should usually be able to attach at least one correct label to a new issue, but it may not always deliver all the labels associated with that issue.
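
As an illustration of applying such a threshold to the table above (the 80% cutoff is an arbitrary example, not a decided value):

    # Per-label accuracy from the Classification Accuracy table (as fractions).
    label_accuracy = {
        "Performance": 1.0000, "Test": 0.9959, "Question": 0.9702,
        "Doc": 0.9032, "Installation": 0.8407, "Example": 0.8081,
        "Bug": 0.7866, "Build": 0.6987, "onnx": 0.6957, "scala": 0.6724,
        "gluon": 0.4438, "flaky": 0.4278, "Feature": 0.3224,
        "C++": 0.2933, "ci": 0.2830, "Cuda": 0.2209,
    }

    THRESHOLD = 0.80  # illustrative cutoff, not a decided value

    # Labels the bot would apply automatically vs. only recommend.
    auto_apply = [l for l, acc in label_accuracy.items() if acc >= THRESHOLD]
    recommend_only = [l for l, acc in label_accuracy.items() if acc < THRESHOLD]

    print("auto-apply:", auto_apply)
    print("recommend only:", recommend_only)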