...

The label bot will provide predictions to label certain issues and pull requests. We will gather metrics and accuracy figures, and given a certain accuracy threshold we can have the label bot label an issue as it sees fit. We want to ensure, to the best degree possible, that all issues labelled by the bot carry the correct labels. If the bot cannot meet this threshold, it can instead recommend its predicted labels to the user. Note: labelling is not permanent and labels can always be removed; however, we would strive to keep such removals to a minimum.

Data Analysis


Our dataset consists of all the issues on the repository, both open and closed. To inform label prediction, we gather the titles, descriptions, and labels of the issues on the MXNet repository. We will retrain our model on new issues and pull requests every 24 hours so that the dataset stays up to date and ready to predict labels for new issues. We will set specific target labels that we are interested in predicting (e.g. feature request, doc, bug, ...) on new issues.
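The data-gathering step above can be sketched as follows. This is a minimal illustration, not the bot's actual implementation: the field names (`title`, `description`, `labels`) and the target-label set are assumptions, and in practice the records would come from the GitHub API rather than in-memory dicts.

```python
# Hypothetical sketch: build (text, labels) training pairs from issue records.
# Field names and target labels are illustrative, not the deployed configuration.
TARGET_LABELS = {"Feature request", "Doc", "Bug"}  # labels the model predicts

def build_dataset(issues):
    """Keep only issues carrying at least one target label; the text fed to
    the model is the title concatenated with the description."""
    texts, labels = [], []
    for issue in issues:
        kept = [l for l in issue["labels"] if l in TARGET_LABELS]
        if kept:
            texts.append(issue["title"] + " " + issue["description"])
            labels.append(kept)
    return texts, labels

issues = [
    {"title": "Crash in conv op", "description": "segfault on GPU", "labels": ["Bug", "C++"]},
    {"title": "Typo in tutorial", "description": "fix the docs", "labels": ["Doc"]},
    {"title": "Misc question", "description": "how do I ...", "labels": ["Question"]},
]
texts, labels = build_dataset(issues)
```

Non-target labels (like "C++" and "Question" above) are simply filtered out of the training targets; the retraining job would rebuild this dataset every 24 hours.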

...

How was the label accuracy measured:

The labels below were chosen for initial prediction by the model. Only issues specific to these labels are tested on; in other words, either the label under test was predicted by the model, or it was an actual label on the issue. The accuracy shown below denotes cases where the model predicted a label and that label was among the actual labels in the repo.

...

Classification report with precision, recall, and f1 score

Label | Precision | Recall | F1 Score | Count
Performance | 100% | 100% | 100% | 87
Test | 99.59% | 100% | 99.8% | 245
Question | 100% | 97.02% | 98.49% | 302
Doc | 100% | 90.32% | 94.92% | 155
Installation | 100% | 84.07% | 91.35% | 113
Example | 100% | 80.81% | 89.39% | 99
Bug | 100% | 78.66% | 88.06% | 389
Build | 100% | 69.87% | 82.26% | 156
onnx | 80% | 84.21% | 82.05% | 23
scala | 86.67% | 75% | 80.41% | 58
gluon | 62.28% | 60.68% | 61.47% | 160
flaky | 96.51% | 43.46% | 59.93% | 194
Feature | 32.43% | 98.18% | 48.76% | 335
C++ | 55% | 38.6% | 45.36% | 75
ci | 48.39% | 40.54% | 44.12% | 53
Cuda | 22.09% | 100% | 36.19% | 86

Data Insights:


Motivations/Conclusion:

This shows us which labels the model can predict accurately. Given a certain accuracy threshold, the bot has the potential to label an issue, provided its prediction surpasses this value. The bot would help determine at least one label for new issues, but may not always be able to deliver all the labels associated with an issue. As a result, we would be able to accurately provide labels to new issues.
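The threshold-gated behaviour described above can be sketched as a simple decision rule: labels whose predicted confidence clears the threshold are applied directly, while weaker predictions are surfaced only as recommendations to the user. The threshold values and the function name here are illustrative assumptions, not the deployed settings.

```python
# Hypothetical sketch of the bot's decision rule. The 0.9 apply threshold
# and 0.5 recommendation floor are placeholder values, not tuned settings.
def act_on_prediction(scores, apply_threshold=0.9, recommend_floor=0.5):
    """Split per-label confidence scores into labels the bot applies itself
    and labels it merely recommends to the user."""
    applied = {label for label, s in scores.items() if s >= apply_threshold}
    recommended = {label for label, s in scores.items()
                   if recommend_floor <= s < apply_threshold}
    return applied, recommended

applied, recommended = act_on_prediction({"Bug": 0.95, "Doc": 0.6, "ci": 0.2})
print(applied, recommended)  # {'Bug'} {'Doc'}
```

Since labelling is reversible, a per-label threshold could be tuned from the precision column above (e.g. high-precision labels like Performance or Bug are safer to auto-apply than Cuda or Feature).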