Abhishek Niroula, Mauno Vihinen.
Abstract
Computational tools are widely used for interpreting variants detected in sequencing projects. The choice of these tools is critical for reliable variant impact interpretation for precision medicine and should be based on systematic performance assessment. The performance of the methods varies widely in different performance assessments, for example due to the contents and sizes of test datasets. To address this issue, we obtained 63,160 common amino acid substitutions (allele frequency ≥1% and <25%) from the Exome Aggregation Consortium (ExAC) database, which contains variants from 60,706 genomes or exomes. We evaluated the specificity, the capability to detect benign variants, for 10 variant interpretation tools. In addition to overall specificity of the tools, we tested their performance for variants in six geographical populations. PON-P2 had the best performance (95.5%) followed by FATHMM (86.4%) and VEST (83.5%). While these tools had excellent performance, the poorest method predicted more than one third of the benign variants to be disease-causing. The results allow choosing reliable methods for benign variant interpretation, for both research and clinical purposes, as well as provide a benchmark for method developers.
Year: 2019 PMID: 30742610 PMCID: PMC6386394 DOI: 10.1371/journal.pcbi.1006481
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Specificities of variant interpretation tools.
| | All variants (n = 63,197) | | | | Variants predicted by all tools (n = 7,268) | | |
|---|---|---|---|---|---|---|---|
| Tools | VUS | Benign | Harmful | Specificity | Benign | Harmful | Specificity |
| PON-P2 | 21,373 | 34,529 | 1,626 | 0.955 | 6,655 | 613 | 0.916 |
| VEST | 1,168 | 22,614 | 4,480 | 0.835 | 5,984 | 1,284 | 0.823 |
| FATHMM | 5,531 | 43,005 | 6,766 | 0.864 | 6,287 | 981 | 0.865 |
| PROVEAN | 3,908 | 45,868 | 13,421 | 0.774 | 5,712 | 1,556 | 0.786 |
| PPH2 | 6,386 | 37,124 | 13,602 | 0.732 | 5,404 | 1,864 | 0.744 |
| LRT | 19,333 | 31,736 | 12,128 | 0.724 | 5,465 | 1,803 | 0.752 |
| MA | 8,044 | 39,493 | 15,660 | 0.716 | 5,306 | 1,962 | 0.730 |
| CADD | 0 | 40,659 | 22,538 | 0.643 | 4,539 | 2,729 | 0.625 |
| SIFT | 5,099 | 36,808 | 21,290 | 0.634 | 4,868 | 2,400 | 0.670 |
| MT2 | 15,313 | 30,632 | 17,252 | 0.640 | 4,764 | 2,504 | 0.655 |
a. All variants having AF ≥1% and <25% in at least one population and not present in the training dataset for the method. After excluding cases in the training datasets, the total number of variants was 57,528 for PON-P2, 28,262 for VEST, 55,302 for FATHMM, and 57,112 for PPH2.
b. Variants classified as benign or harmful. Variants present in the training dataset of any of the tools were excluded. All variants that were automatically annotated without making predictions were excluded.
c. Variants for which the predictions were not available, were ambiguous, or were predicted to have unknown significance.
d. Variants present in the training datasets were excluded.
e. Variants were not classified into benign and harmful by the program. A cutoff of 0.5 was used so that variants with a score greater than or equal to 0.5 were classified as harmful, otherwise benign.
f. The HumVar version of PolyPhen-2 was used because its performance was higher than that of the HumDiv version.
g. Variants were not classified into benign and harmful by the program. A cutoff of 20 was used so that variants with a score greater than or equal to 20 were grouped as harmful, otherwise benign. The authors have recommended a cutoff ranging from 10 to 20. The highest cutoff was used so that the highest possible specificity was obtained.
h. Variants that were automatically detected to be harmful or benign were not included in the classified cases, as they are not real predictions by the tool but annotations based on known data.
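The specificity column can be related to the counts with a short sketch. It assumes that specificity is the number of benign predictions divided by all classified (benign plus harmful) predictions, with VUS cases left out; this assumption reproduces the listed values for the rows used here. The function names `specificity` and `classify` are illustrative and do not come from the study.

```python
def specificity(benign: int, harmful: int) -> float:
    """Share of classified variants (all assumed truly benign) predicted benign."""
    classified = benign + harmful
    if classified == 0:
        raise ValueError("no classified variants")
    return benign / classified


def classify(score: float, cutoff: float) -> str:
    """Thresholding as described in the footnotes: score >= cutoff is harmful."""
    return "harmful" if score >= cutoff else "benign"


# (benign, harmful) counts from the "All variants" columns of the table.
counts = {
    "PON-P2": (34529, 1626),
    "FATHMM": (43005, 6766),
    "CADD": (40659, 22538),
}

for tool, (benign, harmful) in counts.items():
    print(f"{tool}: {specificity(benign, harmful):.3f}")
# PON-P2: 0.955
# FATHMM: 0.864
# CADD: 0.643

# A score exactly at the cutoff counts as harmful (cutoff of 20 from the footnotes).
print(classify(20.0, cutoff=20.0))  # harmful
```

Because every variant in the dataset is treated as benign, each harmful call is a false positive, which is why a higher cutoff can only raise the specificity.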
Fig 1. Performance of variant tolerance predictors.
Specificities of 10 prediction tools for variants with different AFs. The black horizontal line indicates performance for all variants (AF ≥1% and <25%). The variants with AF <1% have low AF in the whole dataset but have higher AF in at least one of the populations. MA, MutationAssessor; MT2, MutationTaster2; PPH2, PolyPhen-2.
Fig 2. Performance of variant tolerance predictors for variants in ethnic groups.
Specificities of prediction tools for common variants (AF ≥1% and <25%) in different populations. AFR, African; AMR, American; EAS, East Asian; FIN, Finnish; NFE, Non-Finnish European; OTH, Other; SAS, South Asian; MA, MutationAssessor; MT2, MutationTaster2; PPH2, PolyPhen-2.
Fig 3. Analysis of unique and non-unique variants in populations.
(A) Performance of tools on unique and non-unique variants with different minor allele frequencies in different populations. AFR, African; AMR, American; EAS, East Asian; FIN, Finnish; NFE, Non-Finnish European; SAS, South Asian. The unique dataset contains variants with AF ≥1% and <25% in the specific population but <1% in all other populations, and the non-unique dataset consists of the remaining variants. The differences are shown by the lines containing the values for each population. (B) Fractions of unique and non-unique variants in relation to AF. The colors for AF ranges are shown to the right. (C) Specificities of prediction tools on unique and non-unique variants (AF 1–5%) for each ancestry group. Unique variants have AF ≥1% in a specific ancestry group but AF <1% in all other ancestry groups. Non-unique variants have AF ≥1% in more than one ancestry group.
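The unique versus non-unique split in panel A can be sketched as follows. This is only an illustration of the stated definition, not the authors' code: the function name `unique_population` and the example allele frequencies are hypothetical.

```python
from typing import Dict, Optional

LOW, HIGH = 0.01, 0.25  # common-variant range from the caption: AF >= 1% and < 25%


def unique_population(afs: Dict[str, float]) -> Optional[str]:
    """Return the population in which a variant is unique, or None if non-unique.

    A variant is unique to a population when its AF is within [LOW, HIGH)
    there and below LOW in every other population.
    """
    common = [pop for pop, af in afs.items() if LOW <= af < HIGH]
    if len(common) != 1:
        return None
    pop = common[0]
    others_rare = all(af < LOW for other, af in afs.items() if other != pop)
    return pop if others_rare else None


# Hypothetical allele frequencies per population for two variants.
variant_a = {"AFR": 0.03, "AMR": 0.002, "EAS": 0.0, "FIN": 0.0, "NFE": 0.001, "SAS": 0.0}
variant_b = {"AFR": 0.03, "AMR": 0.02, "EAS": 0.0, "FIN": 0.0, "NFE": 0.001, "SAS": 0.0}

print(unique_population(variant_a))  # AFR
print(unique_population(variant_b))  # None
```

A variant that is common in one population but at 25% or above in another is also returned as `None`, since the other populations are required to stay below 1%.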
Fig 4. Performance of variant tolerance predictors for variants in males and females.
Results are shown for all variants in males, in females, and in both, as well as for unique variants in the male (AF ≥1% in males but <1% in females) and female (AF ≥1% in females but <1% in males) populations.
Fig 5. Chromosome-wise performance of tools.
Variants on chromosome Y were excluded because there were only 3 variants. MA, MutationAssessor; MT2, MutationTaster2; PPH2, PolyPhen-2.