Yuan Tian1, Tina Pesaran1, Adam Chamberlin1, R Bryn Fenwick1, Shuwei Li1, Chia-Ling Gau1, Elizabeth C Chao1,2, Hsiao-Mei Lu1, Mary Helen Black1, Dajun Qian3.
Abstract
Many in silico predictors of genetic variant pathogenicity have been previously developed, but there is currently no standard application of these algorithms for variant assessment. Using 4,094 ClinVar-curated missense variants in clinically actionable genes, we evaluated the accuracy and yield of benign and deleterious evidence in 5 in silico meta-predictors, as well as agreement of SIFT and PolyPhen2, and report the derived thresholds for the best performing predictor(s). REVEL and BayesDel outperformed all other meta-predictors (CADD, MetaSVM, Eigen), with higher positive predictive value, comparable negative predictive value, higher yield, and greater overall prediction performance. Agreement of SIFT and PolyPhen2 resulted in slightly higher yield but lower overall prediction performance than REVEL or BayesDel. Our results support the use of gene-level rather than generalized thresholds, when gene-level thresholds can be estimated. Our results also support the use of 2-sided thresholds, which allow for uncertainty, rather than a single, binary cut-point for assigning benign and deleterious evidence. The gene-level 2-sided thresholds we derived for REVEL or BayesDel can be used to assess in silico evidence for missense variants in accordance with current classification guidelines.
Year: 2019 PMID: 31484976 PMCID: PMC6726608 DOI: 10.1038/s41598-019-49224-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Prediction performance of in silico evidence assignment (2,153 variants in 20 genes). PPV, NPV, YR, and OPP are performance statistics with ranks in parentheses (see footnote a).
| Method | TP | TN | FP | FN | NE | PPV | NPV | YR | OPP |
|---|---|---|---|---|---|---|---|---|---|
| SIFT/PolyPhen2 agreement | 888 | 649 | 177 | 54 | 385 | 0.834 (11) | 0.923 (9) | 0.821 (1) | 0.861 (10) |
| Gene-level thresholds | |||||||||
| CADD | 746 | 707 | 60 | 38 | 602 | 0.926 (7) | 0.949 (7) | 0.720 (9) | 0.871 (8) |
| MetaSVM | 848 | 746 | 57 | 39 | 463 | 0.937 (5) | 0.950 (6) | 0.785 (4) | 0.894 (4) |
| Eigen | 850 | 761 | 51 | 27 | 464 | 0.943 (3) | 0.966 (2) | 0.784 (5) | 0.901 (3) |
| REVEL | 858 | 784 | 40 | 33 | 438 | 0.955 (2) | 0.960 (3) | 0.797 (3) | 0.907 (2) |
| BayesDel | 859 | 798 | 39 | 40 | 417 | 0.957 (1) | 0.952 (5) | 0.806 (2) | 0.908 (1) |
| Generalized thresholds | |||||||||
| CADD | 563 | 525 | 75 | 18 | 972 | 0.882 (10) | 0.967 (1) | 0.549 (11) | 0.819 (11) |
| MetaSVM | 848 | 697 | 75 | 64 | 469 | 0.919 (8) | 0.916 (11) | 0.782 (6) | 0.875 (6) |
| Eigen | 747 | 684 | 80 | 31 | 611 | 0.903 (9) | 0.957 (4) | 0.716 (10) | 0.865 (9) |
| REVEL | 846 | 673 | 52 | 60 | 522 | 0.942 (4) | 0.918 (10) | 0.758 (7) | 0.876 (5) |
| BayesDel | 825 | 672 | 58 | 52 | 546 | 0.934 (6) | 0.928 (8) | 0.746 (8) | 0.874 (7) |
a. All performance statistics, except those for SIFT/PolyPhen2 agreement, were evaluated by leave-one-out cross-validation. Ranks in parentheses order the 11 comparison methods from the highest (1) to the lowest (11) value of each statistic. The p-values of Monte Carlo permutation tests for the difference in OPP between gene-level and generalized thresholds were 0.0005 for CADD (OPP: 0.871 vs. 0.819), 0.09 for MetaSVM (OPP: 0.894 vs. 0.875), 0.002 for Eigen (OPP: 0.901 vs. 0.865), 0.006 for REVEL (OPP: 0.907 vs. 0.876), and 0.003 for BayesDel (OPP: 0.908 vs. 0.874). TN, true negative; FN, false negative; TP, true positive; FP, false positive; NE, no evidence; PPV, positive predictive value; NPV, negative predictive value; YR, yield rate; OPP, overall prediction performance.
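The PPV, NPV, and YR columns can be recomputed from the count columns. The sketch below is a minimal illustration and not the authors' code. PPV and NPV use their standard definitions. Treating YR as the fraction of variants that received any evidence (that is, were not NE) is an assumption, although it reproduces the tabulated values for the rows shown. OPP is left out because its formula is not stated in this section. The function name `performance` is hypothetical.

```python
def performance(tp, tn, fp, fn, ne):
    """Return (PPV, NPV, YR) from evidence-assignment counts."""
    total = tp + tn + fp + fn + ne
    ppv = tp / (tp + fp)  # positive predictive value
    npv = tn / (tn + fn)  # negative predictive value
    # Assumption: yield rate = share of variants assigned any evidence (not NE).
    yr = (tp + tn + fp + fn) / total
    return ppv, npv, yr


# Counts (TP, TN, FP, FN, NE) copied from three rows of the table.
rows = {
    "SIFT/PolyPhen2 agreement": (888, 649, 177, 54, 385),
    "REVEL (gene-level)": (858, 784, 40, 33, 438),
    "BayesDel (gene-level)": (859, 798, 39, 40, 417),
}

for name, counts in rows.items():
    ppv, npv, yr = performance(*counts)
    print(f"{name}: PPV={ppv:.3f} NPV={npv:.3f} YR={yr:.3f}")
```

Each row's five counts sum to 2,153, which matches the number of variants given in the table caption.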
Gene-level thresholds for assigning benign and deleterious in silico evidence in missense variants (see footnote a).
| Gene | REVEL TBE | REVEL TDE | BayesDel TBE | BayesDel TDE |
|---|---|---|---|---|
| | 0.359 | 0.689 | −0.180 | 0.216 |
| | 0.514 | 0.731 | −0.076 | 0.248 |
| | 0.628 | 0.824 | 0.147 | 0.425 |
| | 0.581 | 0.974 | 0.080 | 0.500 |
| | 0.438 | 0.727 | −0.032 | 0.277 |
| | 0.515 | 0.762 | 0.026 | 0.329 |
| | 0.326 | 0.597 | −0.328 | 0.047 |
| | 0.417 | 0.649 | −0.176 | 0.127 |
| | 0.109 | 0.815 | 0.107 | 0.423 |
| | 0.562 | 0.862 | 0.085 | 0.426 |
| | 0.556 | 0.881 | 0.095 | 0.419 |
| | 0.214 | 0.661 | −0.078 | 0.263 |
| | 0.013 | 0.511 | −0.531 | 0.012 |
| | 0.261 | 0.605 | −0.191 | 0.077 |
| | 0.400 | 0.705 | −0.082 | 0.268 |
| | 0.481 | 0.732 | −0.122 | 0.300 |
| | 0.349 | 0.597 | −0.233 | 0.038 |
| | 0.425 | 0.704 | −0.108 | 0.180 |
| | 0.536 | 0.667 | −0.003 | 0.132 |
| | 0.703 | 0.970 | 0.244 | 0.561 |
a. The 2-sided thresholds, denoted as TBE and TDE, are the lower and upper limits of REVEL or BayesDel scores for assigning BE and DE, respectively. Gene-level thresholds for BE and DE were estimated at probabilities of pathogenicity 0.2 and 0.8, respectively. BE, benign evidence; DE, deleterious evidence.
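One way to read the 2-sided thresholds is as a three-way rule: a score at or below TBE yields benign evidence, a score at or above TDE yields deleterious evidence, and anything in between yields no evidence. The sketch below illustrates that reading. It is not the authors' implementation: the function name `assign_evidence`, the inclusive comparisons at the boundaries, and the `"NE"` label for the middle zone are assumptions. The threshold values are taken from the first row of the table, whose gene name is not shown in this section.

```python
def assign_evidence(score, tbe, tde):
    """Map a predictor score to 'BE', 'DE', or 'NE' using 2-sided thresholds."""
    if tbe >= tde:
        raise ValueError("TBE must be lower than TDE")
    if score <= tbe:  # assumption: boundary value counts as benign evidence
        return "BE"
    if score >= tde:  # assumption: boundary value counts as deleterious evidence
        return "DE"
    return "NE"       # score falls in the uncertain zone between thresholds


# REVEL thresholds from the first table row (gene name not shown here).
revel_tbe, revel_tde = 0.359, 0.689
for score in (0.20, 0.50, 0.90):
    print(score, assign_evidence(score, revel_tbe, revel_tde))
```

The same function works for BayesDel, whose thresholds can be negative, for example `assign_evidence(-0.25, -0.180, 0.216)`.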
Figure 1. Assessment of in silico evidence in missense variants. The OPP statistics are reported for each of the 20 genes using gene-level thresholds. The OPP values for the 20 genes combined were 0.871, 0.894, 0.901, 0.907 and 0.908 for CADD, MetaSVM, Eigen, REVEL and BayesDel, respectively. P-values for pairwise comparisons were each estimated from a Monte Carlo permutation test with 10,000 permutations. OPP, overall prediction performance.
Figure 2. Variation in thresholds for assigning benign and deleterious in silico evidence across 20 genes. (a) Gene-level 2-sided thresholds and their 90% confidence intervals (CI) for REVEL. (b) Gene-level 2-sided thresholds and their 90% confidence intervals (CI) for BayesDel. Thresholds for BE and DE are represented by green and red dots, respectively. BE, benign evidence; DE, deleterious evidence.