Description:

Currently, the incubator-mxnet repo has over 800 issues, with new ones created every day. The goal is to ease this process and handle developers' issues appropriately. With labelling, experts in their respective areas can provide the help that users need. We already employ the label bot to ease the issue and pull request process. Because the repository contains issues and pull requests that have previously been labelled, an interesting use case for this data opens up: we can use it to provide insights and to predict labels on new issues and pull requests. This mechanism should give those who raise an issue a faster response, and it lets existing and new contributors who want to help out and welcome new users filter for their areas of expertise.


Background:



Note: Training data here is limited (~13,187 issues), and we expect this number to drop considerably after the data cleaning process.

Metrics

Multi-label Classification:
Accuracy in predicting at least one correct label for an issue: ~87%
Accuracy in predicting all labels for an issue (the model may not be able to recover every label on an issue): ~20%
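
The two figures above can be computed along the following lines. This is a minimal sketch, not the bot's actual evaluation code: the function name `multilabel_accuracy` and the sample data are hypothetical, and it assumes "all labels" means that every actual label of an issue appears among the predicted labels.

```python
from typing import Sequence, Set, Tuple


def multilabel_accuracy(
    actual: Sequence[Set[str]], predicted: Sequence[Set[str]]
) -> Tuple[float, float]:
    """Return (at_least_one, all_labels) accuracy over a list of issues.

    at_least_one: share of issues where some predicted label is an actual label.
    all_labels:   share of issues where every actual label was predicted
                  (assumed reading of "all labels in an issue").
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(actual)
    any_hits = sum(1 for a, p in zip(actual, predicted) if a & p)
    all_hits = sum(1 for a, p in zip(actual, predicted) if a <= p)
    return any_hits / n, all_hits / n


# Hypothetical sample: four issues with their actual and predicted labels.
actual = [{"Bug", "Build"}, {"Question"}, {"Doc"}, {"Feature", "gluon"}]
predicted = [{"Bug"}, {"Question"}, {"Example"}, {"Feature", "gluon"}]

at_least_one, all_labels = multilabel_accuracy(actual, predicted)
print(f"at least one label: {at_least_one:.0%}, all labels: {all_labels:.0%}")
```

The first issue counts toward the "at least one" figure but not toward the "all labels" figure, which illustrates why the second number can be much lower than the first.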

(Last collected issue id: 13193. Date: 11/8)
(These are the labels initially chosen for prediction by the model; testing is done on issues carrying these labels.)


Results for accurately predicting a label (the model predicted a label, and it was one of the actual labels in the repo):


Classification Accuracy:

Label          Accuracy   Issue Count
Performance    100%       87
Test           99.59%     245
Question       97.02%     302
Doc            90.32%     155
Installation   84.07%     113
Example        80.81%     99
Bug            78.66%     389
Build          69.87%     156
onnx           69.57%     23
scala          67.24%     58
gluon          44.38%     160
flaky          42.78%     194
Feature        32.24%     335
C++            29.33%     75
ci             28.30%     53
Cuda           22.09%     86
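
Per-label figures like those in the table above could be derived roughly as follows. This is a sketch under stated assumptions rather than the method actually used: the helper name `per_label_accuracy` and the sample data are hypothetical, and it assumes the denominator for each label is the number of issues for which the model predicted that label.

```python
from collections import defaultdict
from typing import Dict, Sequence, Set, Tuple


def per_label_accuracy(
    actual: Sequence[Set[str]], predicted: Sequence[Set[str]]
) -> Dict[str, Tuple[float, int]]:
    """Map each predicted label to (accuracy, number of predictions).

    A prediction counts as correct when the predicted label is one of the
    issue's actual labels. Using the prediction count as the denominator
    is an assumption made for this sketch.
    """
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for issue_labels, predicted_labels in zip(actual, predicted):
        for label in predicted_labels:
            totals[label] += 1
            if label in issue_labels:
                hits[label] += 1
    return {label: (hits[label] / totals[label], totals[label]) for label in totals}


# Hypothetical sample: three issues with their actual and predicted labels.
sample_actual = [{"Bug"}, {"Bug", "Build"}, {"Doc"}]
sample_predicted = [{"Bug"}, {"Build"}, {"Bug"}]

results = per_label_accuracy(sample_actual, sample_predicted)
# Print labels from most to least accurate, mirroring the table's ordering.
for label, (accuracy, count) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{label:<14}{accuracy:>8.2%}{count:>6}")
```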