Takaya Saito, Marc Rehmsmeier.
Abstract
The precision-recall plot is more informative than the ROC plot when evaluating classifiers on imbalanced datasets, but fast and accurate curve calculation tools for precision-recall plots are currently not available. We have developed Precrec, an R library that aims to overcome this limitation of the plot. Our tool provides fast and accurate precision-recall calculations together with multiple functionalities that work efficiently under different conditions.
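As a rough illustration of why class imbalance matters, the sketch below compares one ROC point and one precision-recall point for the same classifier behaviour on a balanced and an imbalanced dataset. The counts and the helper name `confusion_rates` are hypothetical and are not taken from the paper or from Precrec.

```python
def confusion_rates(tp, fp, fn, tn):
    """Return (recall, false positive rate, precision) from confusion-matrix counts."""
    recall = tp / (tp + fn)      # true positive rate: y-axis of the ROC plot
    fpr = fp / (fp + tn)         # false positive rate: x-axis of the ROC plot
    precision = tp / (tp + fp)   # y-axis of the precision-recall plot
    return recall, fpr, precision


# Hypothetical classifier: finds 80% of positives, mislabels 10% of negatives.
balanced = confusion_rates(tp=80, fp=10, fn=20, tn=90)        # 100 pos, 100 neg
imbalanced = confusion_rates(tp=80, fp=1000, fn=20, tn=9000)  # 100 pos, 10000 neg

for name, (recall, fpr, precision) in [("balanced", balanced),
                                       ("imbalanced", imbalanced)]:
    print(f"{name:>10}: recall={recall:.2f} fpr={fpr:.2f} precision={precision:.3f}")
```

Under these assumed counts the ROC coordinates (recall and false positive rate) are identical for both datasets, while precision drops sharply on the imbalanced one, which is the kind of difference only the precision-recall plot makes visible.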
Year: 2016 PMID: 27591081 PMCID: PMC5408773 DOI: 10.1093/bioinformatics/btw570
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Results of evaluating precision–recall curves calculated by five different tools for three test sets – C1, C2 and C3. (A) The plot shows manually calculated points for C1 (red), C2 (green) and C3 (blue). Each test set contains three different test categories: SE (start and end positions), Ip (intermediate position and interpolation) and Rg (x and y ranges). In addition, each category has 3–5 individual test items. The remaining plots show the calculated curves with successes/total per category for (B) ROCR, (C) AUCCalculator, (D) PerfMeas, (E) PRROC and (F) Precrec.
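The interpolation tested in the Ip category can be sketched as follows. This is a minimal sketch, assuming the commonly used formulation of non-linear interpolation in which false positives are taken to grow linearly with true positives between two neighbouring points; it is not claimed to be the implementation used by any of the five tools. The function name `interpolate_pr` and the example counts are hypothetical.

```python
def interpolate_pr(tp_a, fp_a, tp_b, fp_b, n_pos):
    """Interpolate (recall, precision) points between two points A and B.

    A and B are given as true/false positive counts; n_pos is the total
    number of positives. Assumes A has at least one predicted positive.
    """
    if tp_b <= tp_a:
        raise ValueError("point B must have more true positives than point A")
    if tp_a + fp_a == 0:
        raise ValueError("precision is undefined at a point with no predictions")

    slope = (fp_b - fp_a) / (tp_b - tp_a)  # false positives gained per extra TP
    points = []
    for x in range(tp_b - tp_a + 1):
        tp = tp_a + x
        fp = fp_a + slope * x
        points.append((tp / n_pos, tp / (tp + fp)))
    return points


# Hypothetical points: A = (TP 5, FP 5), B = (TP 10, FP 30), 20 positives in total.
curve = interpolate_pr(tp_a=5, fp_a=5, tp_b=10, fp_b=30, n_pos=20)
for recall, precision in curve:
    print(f"recall={recall:.2f} precision={precision:.3f}")
```

With these assumed counts the interpolated precision at recall 0.30 is 0.375, whereas a straight line between the two end points would give 0.45, so linear interpolation would overestimate the curve in this case.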
Benchmarking results of the five tools, in milliseconds
| Tool | Curve | AUC | NL | 100 | 1000 | 1 million |
|---|---|---|---|---|---|---|
| ROCR | Yes | No | No | 5.4 | 6.8 | (2.6 s) |
| AUCCalculator | Yes | Yes | Yes | 105 | 216 | (33 min) |
| PerfMeas | Yes | Yes | No | 0.2 | 0.4 | 763 |
| PRROC | Yes | Yes | Yes | 348 | (74 s) | (123 days)ᵃ |
| PRROC (step=1) | Yes | Yes | No | 7.9 | 96 | (6.3 h)ᵃ |
| PRROC (AUC) | No | Yes | Yes | 23.7 | 236 | (4 min) |
| Precrec | Yes | Yes | Yes | 6.4 | 6.8 | 463 |
Tool: We ran PRROC (step = 1) with minStepSize = 1 and PRROC (AUC) without curve calculation. Curve: curve calculation. AUC: AUC calculation. NL: non-linear interpolation. 100, 1000, 1 million: test dataset size. We tested each case 10 times and report the mean processing time. Times are in milliseconds unless indicated otherwise.
ᵃ We tested these cases only once.
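The timing procedure described in the table notes (10 runs per case, mean reported in milliseconds) could be reproduced along the following lines. This is only a sketch under stated assumptions: `mean_runtime_ms` and `pr_points` are hypothetical names, the workload is a plain Python stand-in that ignores tied scores, and none of it corresponds to the benchmarked R and Java tools or to the authors' benchmarking code.

```python
import random
import time
from statistics import mean


def mean_runtime_ms(func, repeats=10):
    """Run func `repeats` times and return the mean wall-clock time in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)


def pr_points(scores, labels):
    """Return (recall, precision) at every rank; stand-in workload, ties ignored."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    tp = fp = 0
    points = []
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((tp / n_pos, tp / (tp + fp)))
    return points


# Hypothetical test set of 1000 random scores with roughly balanced labels.
rng = random.Random(0)
scores = [rng.random() for _ in range(1000)]
labels = [rng.random() < 0.5 for _ in range(1000)]

elapsed = mean_runtime_ms(lambda: pr_points(scores, labels), repeats=10)
print(f"mean over 10 runs: {elapsed:.3f} ms")
```

Timings obtained this way depend on the machine and interpreter, so they are not comparable with the figures in the table.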