Yan Yuan, Wanhua Su, Mu Zhu.
Abstract
BACKGROUND: The area under the receiver operating characteristic curve (AUC) is frequently used as a performance measure for medical tests. It is a threshold-free measure that is independent of the disease prevalence rate. We evaluate the utility of the AUC against an alternate measure called the average positive predictive value (AP), in the setting of many medical screening programs where the disease has a low prevalence rate.
Keywords: area under the ROC curve; average positive predictive value; biomarker; low prevalence rate; mammography
Year: 2015 PMID: 25941668 PMCID: PMC4403252 DOI: 10.3389/fpubh.2015.00057
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
Figure 1. Prostate cancer example. Top 15 biomarkers according to the AP. Biomarkers are not labeled unless they are explicitly mentioned in the text.
Figure 2. Prostate cancer example. Histograms for biomarkers that are ranked differently by the AP and by the AUC. Red and yellow histograms represent cases and controls, respectively. Pair (A) (8355.562, 7819.751) scored similarly on the AUC-scale but very differently on the AP-scale. Pair (B) (9149.121, 5074.164) scored somewhat similarly on the AP-scale but very differently on the AUC-scale.
Figure 3. Prostate cancer example. Comparison of ROC curves for biomarkers that are ranked differently by the AP and by the AUC. Pair (A) (8355.562, 7819.751), which scored similarly on the AUC-scale but very differently on the AP-scale, is shown in (A). Pair (B) (9149.121, 5074.164), which scored somewhat similarly on the AP-scale but very differently on the AUC-scale, is shown in (B).
Prostate cancer example.
| Biomarkers | AUC | AP | |||||
|---|---|---|---|---|---|---|---|
| A | 8355.562 | 0.849 | 0.783 | 0.783 | 0.856 | 0.606 | 0.571 |
| | 7819.751 | 0.850 | 0.857 | 0.857 | 0.802 | 0.370 | 0.062 |
| B | 5074.164 | 0.886 | 0.869 | 0.869 | 0.833 | 0.306 | 0.043 |
| | 9149.121 | 0.832 | 0.793 | 0.793 | 0.822 | 0.512 | 0.225 |
A simple thought experiment showing changes in the estimated AUC and AP as a result of artificially inflating the number of control subjects (.
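The thought experiment can be imitated with a few lines of Python. This is only a sketch under stated assumptions: the scores are hypothetical, the AUC is estimated with the Mann-Whitney statistic, and the AP is estimated as the mean, over cases, of the positive predictive value at the threshold equal to each case's score. Copying every control subject `k` times leaves the estimated AUC unchanged, while the estimated AP falls as the controls become more numerous.

```python
def auc(cases, controls):
    """Mann-Whitney estimate of P(case score > control score); ties count 1/2."""
    wins = sum((x > y) + 0.5 * (x == y) for x in cases for y in controls)
    return wins / (len(cases) * len(controls))


def average_precision(cases, controls):
    """Mean over cases of the PPV at the threshold set to that case's score."""
    total = 0.0
    for c in cases:
        tp = sum(x >= c for x in cases)      # cases at or above the threshold
        fp = sum(y >= c for y in controls)   # controls at or above the threshold
        total += tp / (tp + fp)
    return total / len(cases)


# Hypothetical scores, chosen only for illustration (not from the paper).
cases = [9.0, 7.0, 4.0]
controls = [8.0, 6.0, 5.0, 3.0, 2.0, 1.0]

results = {}
for k in (1, 10, 100):
    inflated = controls * k  # every control subject copied k times
    results[k] = (auc(cases, inflated), average_precision(cases, inflated))
    print(f"k={k:>3}  AUC={results[k][0]:.3f}  AP={results[k][1]:.3f}")
```

The AUC compares each case with each control, so duplicating controls scales numerator and denominator equally. The PPV at any threshold, by contrast, has the false-positive count in its denominator, which grows with `k`.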
A screening test partitions a sample of .
| Score | Total | |
| Partition | ||
| Diseased | ||
| Non-diseased | ||
| Total |
The broken bars (¦) illustrate the case where all those with scores ≥.
Diagnostic accuracy of digital and film mammography using a seven-point malignancy scale after 455 days of follow-up [adapted from Table .
| Mammography type | Malignancy score | 7 | 6 | 5 | 4 | 3 | 2 | 1 | Total |
|---|---|---|---|---|---|---|---|---|---|
| Digital | Category total | 11 | 29 | 69 | 1061 | 2224 | 6588 | 32588 | 42570 |
| | Cancers | 10 | 18 | 25 | 85 | 49 | 25 | 122 | 334 |
| Film | Category total | 17 | 29 | 70 | 942 | 2291 | 6910 | 32486 | 42745 |
| | Cancers | 13 | 24 | 25 | 74 | 35 | 33 | 131 | 335 |
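
The AUC and AP can be estimated directly from grouped counts such as these. The sketch below makes two assumptions that the table itself does not state: tied scores count one half in the AUC (the Mann-Whitney convention), and the AP is the sum, over the seven thresholds, of the positive predictive value times the increment in the true positive fraction. The function and variable names are illustrative. Under these assumptions the estimates should round to the AUC and AP values reported for the breast cancer example.

```python
def auc_grouped(cancers, totals):
    """AUC from counts per ordered category (highest score first); ties count 1/2."""
    non = [t - c for c, t in zip(cancers, totals)]
    n1, n0 = sum(cancers), sum(non)
    below = n0  # non-cancers scoring strictly lower than the current category
    wins = 0.0
    for c, u in zip(cancers, non):
        below -= u
        wins += c * (below + 0.5 * u)
    return wins / (n1 * n0)


def ap_grouped(cancers, totals):
    """AP as the sum over thresholds of PPV times the increment in TPF."""
    n1 = sum(cancers)
    cum_cancers = cum_total = 0
    ap = 0.0
    for c, t in zip(cancers, totals):
        cum_cancers += c
        cum_total += t
        ap += (cum_cancers / cum_total) * (c / n1)
    return ap


# Counts copied from the table, malignancy score 7 down to 1.
digital_totals = [11, 29, 69, 1061, 2224, 6588, 32588]
digital_cancers = [10, 18, 25, 85, 49, 25, 122]
film_totals = [17, 29, 70, 942, 2291, 6910, 32486]
film_cancers = [13, 24, 25, 74, 35, 33, 131]

for name, cancers, totals in (
    ("Digital", digital_cancers, digital_totals),
    ("Film", film_cancers, film_totals),
):
    print(f"{name}: AUC={auc_grouped(cancers, totals):.3f}  "
          f"AP={ap_grouped(cancers, totals):.3f}")
```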
Breast cancer example (see .
| Mammography type | AUC | AP | SE of AP (asymptotic) | SE of AP (P-bootstrap) | SE of AP (NP-bootstrap) |
|---|---|---|---|---|---|
| Digital | 0.753 | 0.144 | 0.0197 | 0.0197 | 0.0194 |
| Film | 0.735 | 0.166 | 0.0219 | 0.0216 | 0.0215 |
Film versus digital mammography. P-bootstrap, parametric bootstrap; NP-bootstrap, non-parametric bootstrap. A total of 5000 bootstrap samples were generated for each bootstrap method.
AUC, AP, DR, and FPF for three tests from Wald and Bestwick [(.
| Test | AUC | AP (π = 0.5) | AP (π ≈ 0.09) | AP (π ≈ 0.01) | DR at FPF 0.05 | FPF at DR 50% |
|---|---|---|---|---|---|---|
| SDA = SDU | 0.75 | 0.74 | 0.26 | 0.04 | 0.24 | 0.17 |
| SDA = 1.5SDU | 0.75 | 0.79 | 0.42 | 0.16 | 0.39 | 0.11 |
| SDA = 2SDU | 0.75 | 0.81 | 0.51 | 0.29 | 0.47 | 0.07 |
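
The pattern in this table can be explored with a binormal model. The sketch below assumes, without confirmation from the source, that scores are Gaussian in both groups, with the unaffected group standardized to mean 0 and SD 1 and the affected group shifted so that the AUC equals 0.75. DR is read as the true positive fraction, and the AP is approximated by a midpoint-rule integral of the positive predictive value over DR. All function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # unaffected scores: mean 0, SD 1 (an assumption)


def separation(auc, sd_a):
    """Mean of the affected group that gives the requested AUC."""
    return STD.inv_cdf(auc) * sqrt(1.0 + sd_a ** 2)


def dr_at_fpf(fpf, mean_a, sd_a):
    """Detection rate at the cutoff that yields the given false positive fraction."""
    cutoff = STD.inv_cdf(1.0 - fpf)
    return 1.0 - NormalDist(mean_a, sd_a).cdf(cutoff)


def fpf_at_dr(dr, mean_a, sd_a):
    """False positive fraction at the cutoff that yields the given detection rate."""
    cutoff = NormalDist(mean_a, sd_a).inv_cdf(1.0 - dr)
    return 1.0 - STD.cdf(cutoff)


def ap(prevalence, mean_a, sd_a, steps=5000):
    """Midpoint-rule integral of PPV over the detection rate."""
    affected = NormalDist(mean_a, sd_a)
    total = 0.0
    for i in range(steps):
        dr = (i + 0.5) / steps
        fpf = 1.0 - STD.cdf(affected.inv_cdf(1.0 - dr))
        total += prevalence * dr / (prevalence * dr + (1.0 - prevalence) * fpf)
    return total / steps


for ratio in (1.0, 1.5, 2.0):  # SDA expressed in units of SDU
    mean_a = separation(0.75, ratio)
    aps = [ap(p, mean_a, ratio) for p in (0.5, 0.09, 0.01)]
    print(f"SDA = {ratio} SDU:",
          "AP =", [round(a, 2) for a in aps],
          "DR at FPF 0.05 =", round(dr_at_fpf(0.05, mean_a, ratio), 2),
          "FPF at DR 50% =", round(fpf_at_dr(0.5, mean_a, ratio), 2))
```

Holding the AUC fixed while widening the affected distribution moves more affected subjects into the extreme upper tail, which is where a low-prevalence screening test operates.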