Wei Zhao, Kirk E Hevener, Stephen W White, Richard E Lee, James M Boyett.
Abstract
BACKGROUND: The receiver operating characteristic (ROC) curve is widely used to evaluate virtual screening (VS) studies. However, the method fails to address the "early recognition" problem specific to VS. Although many other metrics that emphasize "early recognition", such as RIE, BEDROC, and pROC, have been proposed, there are no rigorous statistical guidelines for determining thresholds and performing significance tests. In addition, these metrics have not been compared under a statistical framework to better understand their performance.
Year: 2009 PMID: 19619306 PMCID: PMC2722655 DOI: 10.1186/1471-2105-10-225
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Empirical distributions of ranking scores under the null hypothesis that 10 actives are uniformly distributed among 1000 compounds in total. (A) AU-ROC; smooth lines indicate the normal distribution with mean = 0.5005 and variance = 0.0084; (B) RIE; (C) BEDROC; (D) pROC; (E) SLR, where 10 × log(1000) − SLR is Gamma(10, 1) distributed; and (F) the perfect linear relationship between RIE and BEDROC.
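The null setup described in the Figure 1 caption can be approximated with a short simulation. The following is a minimal sketch, not the authors' code: it assumes rank 1 is the top of the list, that AU-ROC is computed in its usual Mann-Whitney form, and that SLR is the sum of the natural logs of the actives' ranks (which is what the Gamma(10, 1) statement in the caption implies). The function names `au_roc`, `slr`, and `simulate_null` are hypothetical.

```python
import math
import random
import statistics


def au_roc(active_ranks, n_total):
    """AU-ROC from 1-based ranks of the actives (rank 1 = best), Mann-Whitney form."""
    n = len(active_ranks)
    # Number of (active, decoy) pairs in which the decoy is ranked ahead of the active.
    decoys_ahead = sum(active_ranks) - n * (n + 1) / 2
    return 1.0 - decoys_ahead / (n * (n_total - n))


def slr(active_ranks):
    """Sum of log ranks of the actives (assumed definition of SLR)."""
    return sum(math.log(r) for r in active_ranks)


def simulate_null(n_actives=10, n_total=1000, n_sim=20000, seed=0):
    """Draw active ranks uniformly at random and collect the null scores."""
    rng = random.Random(seed)
    aucs, shifted_slr = [], []
    for _ in range(n_sim):
        ranks = rng.sample(range(1, n_total + 1), n_actives)
        aucs.append(au_roc(ranks, n_total))
        # n * log(N) - SLR is approximately Gamma(n, 1) under the null.
        shifted_slr.append(n_actives * math.log(n_total) - slr(ranks))
    return aucs, shifted_slr


if __name__ == "__main__":
    aucs, shifted_slr = simulate_null()
    n, big_n = 10, 1000
    # Exact null variance of the Mann-Whitney AU-ROC: (N + 1) / (12 n (N - n)).
    print("AU-ROC mean/var:", statistics.mean(aucs), statistics.variance(aucs))
    print("theoretical var:", (big_n + 1) / (12 * n * (big_n - n)))
    # A Gamma(10, 1) variable has mean 10 and variance 10.
    print("shifted SLR mean/var:", statistics.mean(shifted_slr), statistics.variance(shifted_slr))
```

In this form the null mean of AU-ROC is exactly 0.5, and the variance formula evaluates to about 0.0084 for 10 actives in 1000 compounds, in line with the caption. The caption's mean of 0.5005 may reflect a slightly different AU-ROC convention that this excerpt does not show.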
Figure 2. Null distribution of BEDROC for different lists of compounds: (A) n = 5, 95% threshold is 0.20; (B) n = 10, 95% threshold is 0.16; (C) n = 20, 95% threshold is 0.14; and (D) n = 100, 95% threshold is 0.17.
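A 95% null threshold of the kind reported in Figure 2 can be estimated by Monte Carlo. The sketch below is illustrative only and rests on assumptions not stated in the caption: a list of 1000 compounds in total (borrowed from Figure 1), α = 20, rank 1 as the top of the list, and BEDROC taken as the exponentially weighted sum of active ranks rescaled linearly between its worst and best possible values. That rescaling is also why RIE and BEDROC are perfectly linearly related for a fixed number of actives. The names `bedroc` and `null_threshold` are hypothetical.

```python
import math
import random


def bedroc(active_ranks, n_total, alpha=20.0):
    """BEDROC as the min-max rescaled sum of exp(-alpha * rank / N) over actives."""
    n = len(active_ranks)

    def weight_sum(ranks):
        return sum(math.exp(-alpha * r / n_total) for r in ranks)

    s = weight_sum(active_ranks)
    s_max = weight_sum(range(1, n + 1))                      # all actives at the top
    s_min = weight_sum(range(n_total - n + 1, n_total + 1))  # all actives at the bottom
    return (s - s_min) / (s_max - s_min)


def null_threshold(n_actives, n_total=1000, alpha=20.0, level=0.95,
                   n_sim=5000, seed=0):
    """Empirical quantile of BEDROC when actives are ranked uniformly at random."""
    rng = random.Random(seed)
    scores = sorted(
        bedroc(rng.sample(range(1, n_total + 1), n_actives), n_total, alpha)
        for _ in range(n_sim)
    )
    # Index of the requested quantile, clamped to the last element.
    return scores[min(int(level * n_sim), n_sim - 1)]


if __name__ == "__main__":
    for n in (5, 10, 20, 100):
        print(n, round(null_threshold(n), 2))
```

An observed BEDROC above the simulated threshold would be significant at the 5% level under these assumptions. The exact values depend on the list size, α, and the number of simulations.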
Comparison of the "earliness" of different metrics in detecting early recognition for different numbers of actives (n = 5, 10, 20, 100).
| Metric 1 | Metric 2 | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% | 100% |
| n = 5 | |||||||||||
| pROC | AU-ROC | 0.80 | 0.80 | 0.70 | 0.70 | 0.70 | 0.56 | 0.42 | 0.42 | 0.42 | 0.32 |
| BEDROC | AU-ROC | 0.86 | 0.86 | 0.72 | 0.72 | 0.72 | 0.46 | 0.28 | 0.28 | 0.28 | 0.18 |
| SLR | AU-ROC | 0.79 | 0.79 | 0.72 | 0.72 | 0.72 | 0.58 | 0.43 | 0.43 | 0.43 | 0.32 |
| SLR | pROC | 0.52 | 0.52 | 0.52 | 0.52 | 0.52 | 0.52 | 0.51 | 0.51 | 0.51 | 0.52 |
| SLR | BEDROC | 0.44 | 0.44 | 0.47 | 0.47 | 0.47 | 0.61 | 0.66 | 0.66 | 0.66 | 0.67 |
| pROC | BEDROC | 0.46 | 0.46 | 0.48 | 0.48 | 0.48 | 0.60 | 0.65 | 0.65 | 0.65 | 0.67 |
| n = 10 | |||||||||||
| pROC | AU-ROC | 0.78 | 0.77 | 0.73 | 0.67 | 0.61 | 0.52 | 0.46 | 0.40 | 0.35 | 0.32 |
| BEDROC | AU-ROC | 0.81 | 0.87 | 0.76 | 0.62 | 0.49 | 0.38 | 0.30 | 0.23 | 0.20 | 0.18 |
| SLR | AU-ROC | 0.76 | 0.76 | 0.73 | 0.67 | 0.61 | 0.54 | 0.47 | 0.41 | 0.36 | 0.33 |
| SLR | pROC | 0.54 | 0.52 | 0.54 | 0.54 | 0.55 | 0.54 | 0.54 | 0.54 | 0.54 | 0.54 |
| SLR | BEDROC | 0.50 | 0.35 | 0.45 | 0.54 | 0.61 | 0.66 | 0.67 | 0.69 | 0.70 | 0.70 |
| pROC | BEDROC | 0.51 | 0.37 | 0.45 | 0.55 | 0.61 | 0.65 | 0.68 | 0.69 | 0.69 | 0.69 |
| n = 20 | |||||||||||
| pROC | AU-ROC | 0.77 | 0.75 | 0.71 | 0.66 | 0.58 | 0.52 | 0.45 | 0.39 | 0.35 | 0.33 |
| BEDROC | AU-ROC | 0.85 | 0.84 | 0.71 | 0.57 | 0.44 | 0.34 | 0.26 | 0.21 | 0.18 | 0.17 |
| SLR | AU-ROC | 0.77 | 0.75 | 0.71 | 0.66 | 0.60 | 0.54 | 0.47 | 0.41 | 0.36 | 0.34 |
| SLR | pROC | 0.51 | 0.52 | 0.53 | 0.53 | 0.53 | 0.53 | 0.53 | 0.53 | 0.53 | 0.53 |
| SLR | BEDROC | 0.40 | 0.38 | 0.50 | 0.58 | 0.65 | 0.69 | 0.71 | 0.72 | 0.72 | 0.72 |
| pROC | BEDROC | 0.41 | 0.38 | 0.48 | 0.58 | 0.63 | 0.66 | 0.69 | 0.70 | 0.70 | 0.70 |
| n = 100 | |||||||||||
| pROC | AU-ROC | 0.75 | 0.73 | 0.67 | 0.61 | 0.54 | 0.48 | 0.42 | 0.38 | 0.35 | 0.34 |
| BEDROC | AU-ROC | 0.87 | 0.79 | 0.62 | 0.47 | 0.36 | 0.28 | 0.22 | 0.19 | 0.17 | 0.16 |
| SLR | AU-ROC | 0.75 | 0.72 | 0.68 | 0.62 | 0.56 | 0.50 | 0.44 | 0.39 | 0.36 | 0.35 |
| SLR | pROC | 0.55 | 0.55 | 0.56 | 0.56 | 0.56 | 0.57 | 0.57 | 0.58 | 0.58 | 0.58 |
| SLR | BEDROC | 0.32 | 0.42 | 0.56 | 0.64 | 0.69 | 0.72 | 0.74 | 0.74 | 0.74 | 0.74 |
| pROC | BEDROC | 0.34 | 0.43 | 0.56 | 0.63 | 0.68 | 0.70 | 0.72 | 0.73 | 0.72 | 0.72 |
The top early recognitions are represented as percentages.
Comparison of the earliness of wSLR and BEDROC (α = 20) for different powers (β = 0, 0.5, 1, 2, 5, 10) and numbers of actives (n = 5, 10, 20, 100).
| β | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% | 100% |
| n = 5 | ||||||||||
| 0 | 0.44 | 0.44 | 0.47 | 0.47 | 0.47 | 0.61 | 0.66 | 0.66 | 0.66 | 0.67 |
| 0.5 | 0.50 | 0.50 | 0.50 | 0.50 | 0.50 | 0.58 | 0.62 | 0.62 | 0.62 | 0.62 |
| 1 | 0.53 | 0.53 | 0.49 | 0.49 | 0.49 | 0.55 | 0.58 | 0.58 | 0.58 | 0.58 |
| 2 | 0.60 | 0.60 | 0.48 | 0.48 | 0.48 | 0.50 | 0.52 | 0.52 | 0.52 | 0.52 |
| 5 | 0.65 | 0.65 | 0.40 | 0.40 | 0.40 | 0.40 | 0.42 | 0.42 | 0.42 | 0.43 |
| 10 | 0.65 | 0.65 | 0.34 | 0.34 | 0.34 | 0.34 | 0.36 | 0.36 | 0.36 | 0.38 |
| n = 10 | ||||||||||
| 0 | 0.50 | 0.35 | 0.45 | 0.54 | 0.61 | 0.66 | 0.67 | 0.69 | 0.70 | 0.70 |
| 0.5 | 0.56 | 0.40 | 0.47 | 0.55 | 0.60 | 0.63 | 0.65 | 0.65 | 0.66 | 0.66 |
| 1 | 0.60 | 0.42 | 0.48 | 0.54 | 0.58 | 0.60 | 0.61 | 0.61 | 0.61 | 0.61 |
| 2 | 0.68 | 0.46 | 0.47 | 0.51 | 0.53 | 0.55 | 0.55 | 0.56 | 0.56 | 0.56 |
| 5 | 0.77 | 0.43 | 0.37 | 0.38 | 0.39 | 0.41 | 0.42 | 0.43 | 0.44 | 0.44 |
| 10 | 0.83 | 0.36 | 0.29 | 0.31 | 0.33 | 0.34 | 0.35 | 0.36 | 0.37 | 0.37 |
| n = 20 | ||||||||||
| 0 | 0.40 | 0.38 | 0.50 | 0.58 | 0.65 | 0.69 | 0.71 | 0.72 | 0.72 | 0.72 |
| 0.5 | 0.46 | 0.41 | 0.51 | 0.58 | 0.63 | 0.65 | 0.67 | 0.67 | 0.67 | 0.67 |
| 1 | 0.48 | 0.44 | 0.51 | 0.56 | 0.60 | 0.61 | 0.61 | 0.62 | 0.62 | 0.62 |
| 2 | 0.55 | 0.46 | 0.49 | 0.53 | 0.55 | 0.56 | 0.57 | 0.57 | 0.57 | 0.57 |
| 5 | 0.60 | 0.41 | 0.40 | 0.42 | 0.44 | 0.46 | 0.46 | 0.47 | 0.47 | 0.47 |
| 10 | 0.56 | 0.33 | 0.30 | 0.32 | 0.33 | 0.36 | 0.37 | 0.38 | 0.39 | 0.39 |
| n = 100 | ||||||||||
| 0 | 0.32 | 0.42 | 0.56 | 0.64 | 0.69 | 0.72 | 0.74 | 0.74 | 0.74 | 0.74 |
| 0.5 | 0.36 | 0.45 | 0.57 | 0.63 | 0.67 | 0.69 | 0.70 | 0.70 | 0.70 | 0.69 |
| 1 | 0.39 | 0.46 | 0.56 | 0.60 | 0.63 | 0.65 | 0.65 | 0.65 | 0.65 | 0.65 |
| 2 | 0.44 | 0.48 | 0.56 | 0.59 | 0.61 | 0.61 | 0.61 | 0.61 | 0.61 | 0.61 |
| 5 | 0.47 | 0.42 | 0.46 | 0.47 | 0.48 | 0.49 | 0.50 | 0.50 | 0.50 | 0.51 |
| 10 | 0.45 | 0.33 | 0.34 | 0.36 | 0.38 | 0.39 | 0.40 | 0.40 | 0.41 | 0.41 |
The top early recognitions are represented as percentages.
Figure 3. Statistical power of different metrics for . Black line, AU-ROC; blue line, SLR; turquoise line, BEDROC; and red line, pROC.
Figure 4. Statistical power of different metrics at different . Black line, AU-ROC; blue line, SLR; turquoise line, BEDROC; and red line, pROC.