...
Classification Accuracy:

| Label | Accuracy | Issue Count |
| --- | --- | --- |
| Performance | 100% | 87 |
| Test | 99.59% | 245 |
| Clojure | 98.90% | 12 (Test set: 1000) |
| Java | 98.50% | 2 (Test set: 1000) |
| Python | 98.30% | 170 (Test set: 1000) |
| C++ | 97.20% | 2 (Test set: 1000) |
| Scala | 96.30% | 40 (Test set: 1000) |
| Question | 97.02% | 302 |
| Doc | 90.32% | 155 |
| Installation | 84.07% | 113 |
| Example | 80.81% | 99 |
| Bug | 78.66% | 389 |
| Build | 69.87% | 156 |
| onnx | 69.57% | 23 |
| gluon | 44.38% | 160 |
| flaky | 42.78% | 194 |
| Feature | 32.24% | 335 |
| ci | 28.30% | 53 |
| Cuda | 22.09% | 86 |
Language Detection from Code Snippets in Issues:
*** In-depth analysis: classification report with precision, recall, and F1 score ***
| Label | Precision | Recall | F1 Score | Count |
| --- | --- | --- | --- | --- |
| Performance | 100% | 100% | 100% | 87 |
| Test | 99.59% | 100% | 99.80% | 245 |
| Clojure | 98.31% | 98.90% | 98.61% | 12 (Test set: 1000) |
| Python | 98.70% | 98.30% | 98.50% | 170 (Test set: 1000) |
| Question | 100% | 97.02% | 98.49% | 302 |
| Java | 97.24% | 98.50% | 97.87% | 2 (Test set: 1000) |
| C++ | 98.28% | 97.20% | 97.74% | 2 (Test set: 1000) |
| Scala | 97.37% | 96.30% | 96.84% | 40 (Test set: 1000) |
| Doc | 100% | 90.32% | 94.92% | 155 |
| Installation | 100% | 84.07% | 91.35% | 113 |
| Example | 100% | 80.81% | 89.39% | 99 |
| Bug | 100% | 78.66% | 88.06% | 389 |
| Build | 100% | 69.87% | 82.26% | 156 |
| onnx | 80% | 84.21% | 82.05% | 23 |
| gluon | 62.28% | 60.68% | 61.47% | 160 |
| flaky | 96.51% | 43.46% | 59.93% | 194 |
| Feature | 32.43% | 98.18% | 48.76% | 335 |
| ci | 48.39% | 40.54% | 44.12% | 53 |
| Cuda | 22.09% | 100% | 36.19% | 86 |
...
Precision
...
Precision here represents how often the classifier was correct when it predicted a given label, i.e. the fraction of correct predictions out of all the times it predicted that label.
...
The F1 score balances precision and recall: it is their harmonic mean, so it is high only when both are high.
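As a quick sanity check against the report above, the F1 for the flaky label can be reproduced directly from its precision and recall:

```python
# F1 is the harmonic mean of precision and recall.
# Values taken from the 'flaky' row of the report above.
precision, recall = 0.9651, 0.4346
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.2%}")  # -> F1 = 59.93%, matching the table
```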
The language-detection model was trained on large amounts of data pulled from a wide array of repositories, which is why the metrics are especially strong for programming languages. We use MXNet for deep learning to learn the distinguishing features of the languages we consider (the programming languages present in the repo). Specifically, the model was trained on code snippets from the data files available here: https://github.com/aliostad/deep-learning-lang-detection/tree/master/data.
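As a rough illustration of that approach, here is a minimal sketch of a character-level language classifier in MXNet Gluon. It is not the bot's actual architecture; the snippet encoding, layer sizes, and label ids below are all assumptions.

```python
# Minimal character-level language classifier sketch (illustrative only).
import mxnet as mx
from mxnet import autograd, gluon, nd
from mxnet.gluon import nn

NUM_LANGS = 5   # e.g. Clojure, Java, Python, C++, Scala
VOCAB = 128     # raw ASCII byte values
SEQ_LEN = 256   # snippets clipped or zero-padded to this length

net = nn.Sequential()
net.add(nn.Embedding(VOCAB, 32),           # chars -> 32-d vectors
        nn.Dense(128, activation='relu'),  # Dense flattens (batch, seq, 32)
        nn.Dense(NUM_LANGS))               # one logit per language
net.initialize(mx.init.Xavier())

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 1e-3})

def encode(snippet):
    """Map a code snippet to a fixed-length vector of ASCII codes."""
    ids = [min(ord(c), VOCAB - 1) for c in snippet[:SEQ_LEN]]
    return nd.array(ids + [0] * (SEQ_LEN - len(ids)))

# One illustrative training step on a tiny fake batch.
x = nd.stack(encode("def f(x): return x"), encode("(defn f [x] x)"))
y = nd.array([2, 0])  # hypothetical ids: 2 = Python, 0 = Clojure
with autograd.record():
    loss = loss_fn(net(x), y)
loss.backward()
trainer.step(batch_size=2)
```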
Motivations/Conclusion:
We do notice possible overfitting here, particularly for the Performance label. However, looking further into the issues labeled Performance, we see that similar words and phrases recur across them (in most cases the label word itself, plus words like "speed"). Given these results, we can see which labels the model predicts accurately. By setting an accuracy threshold, the bot can apply a label to a new issue only when the model's prediction surpasses that value. As a result, we would be able to accurately provide labels to new issues.
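A minimal sketch of how such a threshold could gate the bot's labeling, assuming the model exposes per-label probabilities (the function name and scores below are hypothetical):

```python
# Hypothetical gating logic: the bot applies a label only when the
# model's predicted probability clears a confidence threshold.
THRESHOLD = 0.9  # assumed cutoff; in practice it could be tuned per label

def labels_to_apply(label_probs, threshold=THRESHOLD):
    """label_probs: dict mapping label name -> predicted probability."""
    return [label for label, p in label_probs.items() if p >= threshold]

# Example with made-up scores for a new issue:
print(labels_to_apply({"Performance": 0.97, "Feature": 0.41, "Bug": 0.93}))
# -> ['Performance', 'Bug']
```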
...