Silvia Benevenuta, Emidio Capriotti, Piero Fariselli.
Abstract
Identifying and annotating pathogenic variants is a major challenge in human genetics, especially for non-coding variants. Several tools have been developed and used to predict the functional effect of genetic variants. However, the calibration of their predictions has received little attention. Calibration refers to the idea that if a model assigns a group of variants a probability P of being pathogenic, then a fraction P of that group should be observed to be truly pathogenic. For instance, among the variants to which a well-calibrated classifier assigns a probability close to 0.7, approximately 70% should actually belong to the pathogenic class. Poorly calibrated algorithms can be misleading and potentially harmful for clinical decision-making.
Supplementary information: Supplementary data are available at Bioinformatics online.
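The definition of calibration in the abstract can be checked empirically with a reliability (calibration) curve. The following is a minimal sketch, not the authors' code: the function name `reliability_curve`, the use of ten equal-width bins and the toy data are all assumptions made for illustration.

```python
def reliability_curve(probs, labels, n_bins=10):
    """Per non-empty bin: (mean predicted probability, observed pathogenic fraction, count).

    probs  : predicted probabilities in [0, 1]
    labels : 1 for pathogenic, 0 for benign
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    sums = [0.0] * n_bins
    positives = [0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(probs, labels):
        # Equal-width bins on [0, 1]; p == 1.0 is placed in the last bin.
        b = min(int(p * n_bins), n_bins - 1)
        sums[b] += p
        positives[b] += y
        counts[b] += 1
    return [
        (sums[b] / counts[b], positives[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]


# Hypothetical toy data: ten variants scored 0.72 (seven pathogenic),
# four variants scored 0.25 (one pathogenic).
toy_probs = [0.72] * 10 + [0.25] * 4
toy_labels = [1] * 7 + [0] * 3 + [1, 0, 0, 0]
for mean_p, observed, n in reliability_curve(toy_probs, toy_labels):
    print(f"predicted ~{mean_p:.2f}  observed {observed:.2f}  (n={n})")
```

For a well-calibrated predictor the two printed columns agree within sampling noise in every bin; systematic gaps between them indicate miscalibration.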
Year: 2021 PMID: 33492342 PMCID: PMC8023678 DOI: 10.1093/bioinformatics/btaa943
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. (A) ROC curves of PhD-SNPg, FATHMM-MKL, CADD, DANN and Eigen on the complete dataset (both coding and non-coding variants). DeepSea was evaluated only on the subset of non-coding variants, since it was developed to score only those. AUCs for coding and non-coding variants are reported in Supplementary Table S2. True- and false-positive rates are defined in the Supplementary Materials. (B) Calibration curves of the predictors on coding and non-coding variants. CADD and Eigen scores were mapped to the [0, 1] range using a sigmoid transformation, 1/(1 + exp(-A·x + B)), where x is the raw score. The best parameters were A = 1, B = 2.5 for CADD and A = 1, B = 1.63/0.05 for Eigen (coding and non-coding variants were transformed separately, since Eigen provides two different sets of scores).
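The sigmoid transformation mentioned in the caption can be sketched as follows. This assumes the formula is read as 1/(1 + exp(-A·x + B)), with A multiplying the raw score x; the function name `sigmoid_transform` is hypothetical, and the source does not say which flavour of CADD or Eigen score is used as input, so the example scores are arbitrary.

```python
import math


def sigmoid_transform(score, a, b):
    """Map a raw score x into (0, 1) via 1 / (1 + exp(-a * x + b)).

    Assumption: `a` is a slope applied to the score and `b` an offset,
    following one reading of the figure caption.
    """
    z = -a * score + b
    # Evaluate 1 / (1 + exp(z)) without overflowing for large |z|.
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


# Parameters reported in the caption for CADD: A = 1, B = 2.5.
for raw in (-5.0, 0.0, 2.5, 10.0):  # arbitrary illustrative raw scores
    print(raw, round(sigmoid_transform(raw, a=1.0, b=2.5), 4))
```

Under this reading, a raw score equal to B/A maps to 0.5, and the transformation preserves the ranking of variants, so it leaves ROC curves unchanged while making the scores comparable to probabilities.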
Brier scores of the methods on the dataset
| Predictor | BS (coding) | BS (non-coding) | BS (all) |
|---|---|---|---|
| PhD-SNPg | 0.10/0.10 | 0.03/0.03 | 0.07/0.07 |
| DANN | 0.24/0.09 | 0.27/0.05 | 0.25/0.07 |
| FATHMM | 0.17/0.15 | 0.07/0.04 | 0.14/0.12 |
| DeepSea | – | 0.43/0.08 | – |
| Eigen (a) | 0.14/0.07 | 0.06/0.04 | 0.11/0.06 |
| CADD (a) | 0.06/0.05 | 0.04/0.03 | 0.05/0.05 |
Note: Each cell reports the Brier score (BS) of the method before/after isotonic calibration.
(a) Uncalibrated scores for Eigen and CADD are obtained after sigmoid transformation.
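The two quantities behind the table, the Brier score and isotonic calibration, can be illustrated with a small standard-library sketch. It is not the authors' pipeline: the function names are hypothetical, the isotonic fit is a plain pool-adjacent-violators step function (library implementations typically interpolate between steps), and the toy scores are invented. The calibrated Brier score here is computed on the same toy data used for fitting, which the source does not claim to do.

```python
from itertools import groupby


def brier_score(probs, labels):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def fit_isotonic(scores, labels):
    """Pool-adjacent-violators fit.

    Returns a list of (upper score bound, calibrated probability) steps,
    sorted by score, with non-decreasing calibrated probabilities.
    """
    pairs = sorted(zip(scores, labels))
    blocks = []  # each block: [sum of labels, count, largest score in block]
    for s, group in groupby(pairs, key=lambda t: t[0]):
        ys = [y for _, y in group]  # identical scores share one block
        blocks.append([float(sum(ys)), len(ys), s])
        # Merge backwards while the previous block mean is not below the new one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]
        ):
            total, count, hi = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = hi
    return [(hi, total / count) for total, count, hi in blocks]


def predict_isotonic(model, score):
    """Step-function lookup: value of the first step whose upper bound covers the score."""
    for hi, value in model:
        if score <= hi:
            return value
    return model[-1][1]


# Hypothetical toy data: raw predictor outputs and true classes (1 = pathogenic).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]

model = fit_isotonic(toy_scores, toy_labels)
calibrated = [predict_isotonic(model, s) for s in toy_scores]
print("BS before:", brier_score(toy_scores, toy_labels))
print("BS after: ", brier_score(calibrated, toy_labels))
```

A lower Brier score is better, with 0 for perfect probabilistic predictions. Because isotonic calibration is monotone, it does not reorder the variants, so it can lower the Brier score without improving discrimination.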