Chiara Cimmaruta, Valentina Citro, Giuseppina Andreotti, Ludovica Liguori, Maria Vittoria Cubellis, Bruno Hay Mele.
Abstract
BACKGROUND: Severity gradation of missense mutations is a major challenge for exome annotation. The deleteriousness predictors most frequently used to filter variants found by next-generation sequencing produce not only qualitative predictions but also numerical scores. It has never been tested whether these scores correlate with disease severity.
Keywords: Bioinformatics; Clinical informatics; Fabry disease; Rare disease; Variant analysis
Year: 2018 PMID: 30497360 PMCID: PMC6266955 DOI: 10.1186/s12859-018-2416-7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Accuracy Indexes
| Category | Predictor | Raw accuracy | Balanced accuracy | | Matthews correlation coefficient |
|---|---|---|---|---|---|
| B | SIFT | 0.749 | 0.549 | 0.241 | 0.092 |
| B | LRT | 0.794 | 0.576 | 0.280 | 0.162 |
| B | MutationAssessor | 0.191 | 0.460 | 0.247 | −0.106 |
| B | FATHMM | 0.846 | 0.500 | 0.000 | 0.000 |
| B | PROVEAN | 0.737 | 0.557 | 0.258 | 0.103 |
| Meta | MetaSVM | 0.846 | 0.500 | 0.000 | 0.000 |
| Meta | MetaLR | 0.846 | 0.500 | 0.000 | 0.000 |
| Meta | M-CAP | 0.846 | 0.500 | 0.000 | 0.000 |
| ML | Polyphen2_HDIV | 0.771 | 0.592 | 0.310 | 0.175 |
| ML | Polyphen2_HVAR | 0.691 | 0.621 | 0.341 | 0.188 |
| ML | MutationTaster | 0.194 | 0.433 | 0.230 | −0.156 |
| ML | FATHMM-MKL | 0.829 | 0.505 | 0.063 | 0.022 |
Accuracy indexes measuring the ability to differentiate severe from mild GLA mutations for all the predictors used by wANNOVAR. Categories are B for “biologically based prediction method”, ML for “Machine Learning based prediction method”, and Meta for “Meta prediction method”
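The indexes in the table can be derived from a two-class confusion matrix. The following is a minimal sketch, not the authors' code: the function name `accuracy_indexes` and the counts are hypothetical, and the convention of reporting a Matthews correlation coefficient of 0 when its denominator vanishes is an assumption. It illustrates why a predictor that assigns every mutation to the larger class would show a high raw accuracy together with a balanced accuracy of 0.5 and a coefficient of 0, the pattern seen in rows such as FATHMM and MetaSVM.

```python
import math


def accuracy_indexes(tp, fp, tn, fn):
    """Return (raw accuracy, balanced accuracy, MCC) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    raw = (tp + tn) / total

    # Balanced accuracy is the mean of the per-class recalls.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    balanced = (sensitivity + specificity) / 2

    # Matthews correlation coefficient; assumed to be 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return raw, balanced, mcc


# Hypothetical counts: 80 mutations in one class, 20 in the other,
# and a predictor that assigns every mutation to the larger class.
raw, balanced, mcc = accuracy_indexes(tp=80, fp=20, tn=0, fn=0)
print(raw, balanced, mcc)
```

Raw accuracy rewards the trivial predictor because it only reflects class imbalance, whereas balanced accuracy and the Matthews correlation coefficient do not.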
Fig. 1. Distribution of residual activities for phenotypically annotated GLA mutations. The boxplot shows the distribution of residual activity in the subpopulations of mutations with severe and mild effects. The red bars represent outliers
Fig. 2. Distribution of rank scores for mutations with null residual activity. The boxplot shows the distribution of the rank scores for all the predictors used by wANNOVAR. The red bars represent outliers. Predictor category label is B for “biologically based prediction method”, ML for “Machine Learning based prediction method”, Meta for “Meta prediction method” and Cons for “Conservation scoring tool”
Fig. 3. Rank scores for mutations with residual activity equal to or greater than that of wild-type alpha-galactosidase. The histograms show, for each of the wANNOVAR predictors, the rank scores of the six mutations whose residual activity is greater than or equal to that of wild-type alpha-galactosidase. Mutations are color coded and detailed in the inset
Correlations
| Category | Name | Pearson’s r | p-value |
|---|---|---|---|
| B | SIFT | −0.493 | 7.87E-19 |
| B | LRT | −0.486 | 2.76E-18 |
| B | MutationAssessor | −0.573 | 5.22E-26 |
| B | FATHMM | −0.054 | 1.85E-01 |
| B | PROVEAN | −0.546 | 1.86E-23 |
| Meta | VEST3 | | 1.08E-42 |
| Meta | MetaSVM | 0.285 | 1.00E+00 |
| Meta | MetaLR | −0.482 | 5.77E-18 |
| Meta | M-CAP | −0.255 | 8.09E-06 |
| ML | POLYPHEN2 HDIV | | 1.67E-38 |
| ML | POLYPHEN2 HVAR | | 4.53E-35 |
| ML | MutationTaster | −0.499 | 2.42E-19 |
| ML | CADD | −0.595 | 1.78E-28 |
| ML | DANN | −0.388 | 8.51E-12 |
| ML | FATHMM-MKL | −0.434 | 1.35E-14 |
| ML | GenoCanyon | −0.282 | 7.95E-07 |
| Cons | GERP++ | −0.405 | 9.34E-13 |
| Cons | phyloP7way vertebrate | −0.441 | 4.79E-15 |
| Cons | phyloP20way mammalian | −0.214 | 1.54E-04 |
| Cons | phastCons7way vertebrate | −0.486 | 2.55E-18 |
| Cons | phastCons20way mammalian | −0.256 | 7.35E-06 |
| Cons | SiPhy 29way logOdds | −0.389 | 7.65E-12 |
Pearson’s r correlation coefficient between rank scores and residual activities, together with the associated p-value for significance scoring, for all the predictors used by wANNOVAR. Bold text is used for the highest correlations. Categories are B for “biologically based prediction method”, ML for “Machine Learning based prediction method”, Meta for “Meta prediction method” and Cons for “Conservation scoring tool”
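The coefficient reported in the table can be computed directly from paired values. The sketch below is illustrative only: the helper `pearson_r` and the two short lists are hypothetical toy data, not values from the study, and the p-value is not computed here because it requires a t-distribution that the Python standard library does not provide. A negative coefficient, as in most rows of the table, means that higher rank scores go with lower residual activity.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r between two equally long sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one sequence is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data: rank scores rise while residual activity falls.
rank_scores = [0.1, 0.3, 0.5, 0.7, 0.9]
residual_activity = [90.0, 70.0, 50.0, 30.0, 10.0]
r = pearson_r(rank_scores, residual_activity)
print(round(r, 3))
```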