Pedro Barbosa1,2, Marta Ribeiro3, Maria Carmo-Fonseca2, Alcides Fonseca1,4.
Abstract
Hypertrophic cardiomyopathy (HCM) is a common heart disease associated with sudden cardiac death. Early diagnosis is critical to identify patients who may benefit from implantable cardioverter defibrillator therapy. Although genetic testing is an integral part of the clinical evaluation and management of patients with HCM and their families, in many cases the genetic analysis fails to identify a disease-causing mutation. This is in part due to difficulties in classifying newly detected rare genetic variants as well as variants-of-unknown-significance (VUS). Multiple computational algorithms have been developed to predict the potential pathogenicity of genetic variants, but their relative performance in HCM has not been comprehensively assessed. Here, we compared the performance of 39 currently available prediction tools in distinguishing between high-confidence HCM-causing missense variants and benign variants, and we developed an easy-to-use tool to perform variant prediction benchmarks based on annotated VCF files (VETA). Our results show that tool performance increases after HCM-specific calibration of thresholds. After excluding potential biases due to type I circularity issues, we identified ClinPred, MISTIC, FATHMM, MPC and MetaLR as the five best-performing tools in discriminating HCM-associated variants. We propose combining these tools in order to prioritize unknown HCM missense variants that should be closely followed up in the clinic.
Keywords: computational pathogenicity prediction; genetic testing; hypertrophic cardiomyopathy; missense variant interpretation; prediction tool comparison; variants-of-unknown-significance
Year: 2022 PMID: 36061567 PMCID: PMC9433717 DOI: 10.3389/fcvm.2022.975478
Source DB: PubMed Journal: Front Cardiovasc Med ISSN: 2297-055X
Figure 1. Distribution of HCM-associated variants (Pathogenic/Likely pathogenic) with a review status of > 1 star in ClinVar (N = 768). (A) Number and proportion of overall variants per gene. (B) Number and proportion of overall variants per category. (C) Category of variants located in the MYH7 gene. (D) Category of variants located in the MYBPC3 gene.
Prediction tools analyzed in this study.
| Category | Tool | Reference threshold |
|---|---|---|
| Protein predictors | SIFT | <0.01 |
| | MutPred | >0.5 |
| | PolyPhen-2 HDIV | >0.978 |
| | PolyPhen-2 HVAR | >0.978 |
| | Mutation Assessor | >1.935 |
| | Condel | >0.98 |
| | VEST4 | >0.764 |
| | MutationTaster2 | >0.5 |
| | FATHMM | <-4.14 |
| | PROVEAN | <-2.5 |
| | MetaSVM | >0.5 |
| | MetaLR | >0.5 |
| | M-CAP | >0.025 |
| | REVEL | >0.644 |
| | MPC | >1.360 |
| | MTR | <0.5 |
| | PrimateAI | >0.790 |
| | ClinPred | >0.5 |
| | MISTIC | >0.5 |
| | cVEP | >0.5 |
| | MVP | >0.7 |
| | VARITY | >0.75 |
| | MutFormer | >0.5 |
| | EVE | >0.5 |
| | MutScore | >0.5 |
| Conservation scores | phastCons | >0.99 |
| | phyloP | >7.367 |
| | SiPhy | >12.7 |
| | GERP | >4.4 |
| | CDTS | <10 |
| Consequence-agnostic predictors | GWAVA | >0.4 |
| | FATHMM-MKL | >0.5 |
| | DANN | >0.9 |
| | Eigen | >1 |
| | ReMM | >0.984 |
| | CAPICE | >0.02 |
| | CADD | >25.3 |
| Disease-specific predictors | CardioVAI | >2 |
| | CardioBoost | >0.9 |
If a reference threshold was not found, the decision boundary was set to 0.5 for tools with a score range between 0 and 1.
cVEP outputs categorical labels (e.g., Pathogenic, Likely_benign). To allow benchmarking, we transformed categories into numerical predictions as follows: Benign: 0; Likely_benign: 0.25; Likely_pathogenic: 0.75; Pathogenic: 1. VUS classifications were treated as NaN. Since these transformations represent artificial numeric predictions, this tool was only used in the first comparison, where tools are evaluated according to reference cut-offs. Downstream analyses (e.g., best threshold analysis, ROC curves) did not include cVEP.
For EVE, we tested running the benchmarks with the categorical classifications at three different uncertainty thresholds (20, 82, 87), transforming the categorical classifications as we did for cVEP. None of these annotations improved classification compared with using the raw EVE numeric score. For the initial performance assessment, we set the EVE threshold to 0.5.
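The categorical-to-numeric mapping applied to cVEP (and to EVE's categorical classifications) can be sketched as follows. This is a minimal illustration of the mapping described above, not VETA's actual code:

```python
import math

# Mapping of cVEP categorical labels to artificial numeric predictions,
# as described in the footnote above; VUS is treated as NaN.
CVEP_MAP = {
    "Benign": 0.0,
    "Likely_benign": 0.25,
    "Likely_pathogenic": 0.75,
    "Pathogenic": 1.0,
}

def cvep_to_numeric(label: str) -> float:
    """Convert a cVEP categorical label to a numeric prediction.

    VUS (and any unrecognized label) maps to NaN, so it is excluded
    from benchmark metrics rather than counted as a prediction.
    """
    return CVEP_MAP.get(label, math.nan)
```

For example, `cvep_to_numeric("Likely_pathogenic")` returns 0.75, while `cvep_to_numeric("VUS")` returns NaN.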
Figure 2. Workflow of the study. The number of variants in each dataset is shown.
Figure 3. Performance of prediction tools in classifying HCM missense variants using fixed thresholds for the ClinVar (A), SHaRe (B) and Walsh_2017 (C) datasets. For each dataset, the numbers of pathogenic/likely pathogenic (N pos) and benign/likely benign (N neg) variants are indicated. Tools were ranked according to the weighted normalized MCC (weighted_norm_mcc).
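The weighted normalized MCC used for ranking can be sketched as below. This is a hypothetical reconstruction: the normalization of MCC from [-1, 1] to [0, 1] and the weighting by the fraction of variants a tool actually scored are assumptions, and VETA's exact formula may differ.

```python
def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    return num / den if den else 0.0

def weighted_norm_mcc(tp: int, tn: int, fp: int, fn: int, n_total: int) -> float:
    """Map MCC to [0, 1] and down-weight tools with missing predictions.

    Assumed reconstruction of the 'weighted_norm_mcc' metric: tools that
    leave many variants unscored are penalized proportionally.
    """
    n_pred = tp + tn + fp + fn            # variants the tool actually scored
    norm = (mcc(tp, tn, fp, fn) + 1) / 2  # rescale [-1, 1] -> [0, 1]
    return norm * (n_pred / n_total)      # penalize missing predictions
```

A perfect classifier that scores every variant gets 1.0; the same classifier scoring only half the dataset gets 0.5.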
Figure 4. Performance of prediction tools in classifying HCM missense variants using ROC curve analysis for the ClinVar (A), SHaRe (B) and Walsh_2017 (C) datasets. For each dataset, the numbers of pathogenic/likely pathogenic (N pos) and benign/likely benign (N neg) variants are indicated. Tools were ranked according to the area under the ROC curve (auROC). The number (n) of variants predicted by each tool is indicated. Tools with more than 50% missing predictions were not included. (D) Differences in the metrics when evaluating with auROC and weighted normalized MCC. For comparison, auROC values were weighted by the fraction of variants predicted by each tool.
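The auROC itself, and the coverage weighting applied in panel D, can be sketched as follows. The rank-sum identity (auROC equals the probability that a random positive scores above a random negative) is standard; the `coverage_weighted_auroc` helper is an illustrative name for the weighting described in the caption, not VETA's API.

```python
def auroc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank-sum identity:
    auROC = P(score of a random positive > score of a random negative),
    counting ties as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def coverage_weighted_auroc(scores, labels, n_total):
    """auROC weighted by the fraction of variants the tool predicted,
    as described for panel D (hypothetical helper)."""
    return auroc(scores, labels) * (len(scores) / n_total)
```

A tool with perfect discrimination (auROC = 1.0) that scored only 4 of 8 variants would receive a weighted value of 0.5.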
Adjusted thresholds that maximize performance for HCM variants at different levels of importance given to precision and recall (β = 0.5, 1, and 1.5; assignment of the surviving adjusted values to individual β columns was lost in extraction, so they are listed in their original order).

| Tool | Reference threshold | Adjusted threshold(s) (95% bootstrap interval) |
|---|---|---|
| – | 0.5 | 0.52 (0.366, 0.832); 0.41 (0.242, 0.533) |
| CAPICE | 0.02 | 0.06 (0.016, 0.078); 0.02 (0.009, 0.058) |
| – | 0.5 | 0.543 (0.499, 0.611); 0.514 (0.395, 0.544) |
| REVEL | 0.644 | 0.596 (0.533, 0.679) |
| MPC | 1.36 | – |
| – | 0.5 | 0.629 (0.547, 0.658); 0.509 (0.346, 0.606) |
| – | 0.5 | 0.501 (0.38, 0.582) |
| FATHMM | −4.14 | – |
| PrimateAI | 0.79 | – |
| CADD | 25.3 | – |
| VARITY | 0.75 | – |
| PROVEAN | −2.5 | −2.582 (−2.932, −2.182) |
| – | 0.5 | – |
| – | 0.468 | 0.47 (0.463, 0.561) |
| – | 0.5 | – |
| CardioVAI | 2 | 2.53 (1.515, 2.837) |
| DANN | 0.9 | – |
| MVP | 0.7 | – |
| – | 0.001 | 0.0 (0.001, 0.038) |
| Eigen | 1 | – |
| SiPhy | 12.17 | 11.823 (10.859, 13.018) |
| phyloP | 7.367 | – |
| PolyPhen-2 HVAR | 0.978 | – |
| FATHMM-MKL | 0.5 | 0.464 (0.436, 0.881) |
| ReMM | 0.984 | 0.98 (0.943, 0.989) |
| PolyPhen-2 HDIV | 0.978 | – |
| GERP | 4.4 | – |
| Mutation Assessor | 1.935 | 1.106 (0.915, 2.258) |
| M-CAP | 0.025 | – |
| CDTS | 10 | – |
| phastCons | 0.99 | 0.7 (0.532, 1.0) |
| MetaSVM | 0.5 | – |
| EVE | 0.5 | 0.291 (0.264, 0.515) |
| VEST4 | 0.764 | – |
| GWAVA | 0.5 | – |
| MutationTaster2 | 0.5 | 0.99 (0.228, 0.992); 0.23 (0.033, 0.987) |
The 95% percentile values of the bootstrap distribution are also displayed. MutPred and CardioBoost were not included since they did not predict the minimum number of variants (N = 50) in the minority class required by VETA for threshold analysis.
Tool names in bold represent those that display minimally useful predictive power (> 0.70 weighted normalized MCC) across the different datasets (Figure 5).
Numbers in bold represent cases for which the reference threshold lies outside the 95% percentile values of the bootstrap distribution of adjusted thresholds.
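The 95% percentile intervals reported in the table can be obtained with a standard percentile bootstrap: resample the variants with replacement, recompute the statistic (here, the adjusted threshold) on each resample, and take the outer percentiles. A minimal sketch, with function names and resampling details that are illustrative rather than VETA's actual code:

```python
import random

def bootstrap_percentile_interval(values, stat, n_boot=1000, alpha=0.05, seed=42):
    """Percentile bootstrap interval for an arbitrary statistic.

    Draws n_boot resamples (with replacement, same size as the input),
    applies `stat` to each, and returns the (alpha/2, 1 - alpha/2)
    percentiles of the resulting distribution.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    boots = sorted(
        stat([rng.choice(values) for _ in values]) for _ in range(n_boot)
    )
    lo = boots[int((alpha / 2) * n_boot)]
    hi = boots[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

With alpha = 0.05 this yields the 2.5th and 97.5th percentiles, i.e. a 95% interval like those shown above.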
Figure 5. Performance of prediction tools using adjusted thresholds on each dataset (ClinVar, SHaRe, and Walsh_2017). Optimized thresholds at β = 0.5 minimize false positives (benign variants predicted as pathogenic). Optimized thresholds at β = 1 give equal importance to false positives and false negatives. Optimized thresholds at β = 1.5 minimize false negatives (pathogenic variants predicted as benign). Tools highlighted in blue were selected as the best by averaging the ranks across the three datasets.
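Threshold adjustment of this kind amounts to scanning candidate cut-offs and keeping the one that maximizes the F-β score, where β < 1 favors precision and β > 1 favors recall. A sketch of the idea (the exhaustive scan over observed scores is an assumption; VETA's search strategy may differ):

```python
def fbeta(precision: float, recall: float, beta: float) -> float:
    """F-beta score: beta < 1 weights precision, beta > 1 weights recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def best_threshold(scores, labels, beta):
    """Return the score cut-off (score >= t means 'pathogenic') that
    maximizes F-beta over all observed candidate thresholds."""
    best_t, best_f = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f = fbeta(prec, rec, beta)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f
```

Running `best_threshold` three times with β = 0.5, 1, and 1.5 reproduces the three optimization settings shown in the figure.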
Figure 6. Performance of prediction tools after addressing circularity issues. Tools were ranked using the weighted normalized MCC on new test datasets (A–C). (A) Variants identified as present in the training sets of the tools highlighted in blue were removed from the merged ClinVar, SHaRe, and Walsh_2017 datasets. (B) HCM ClinVar variants submitted after the tools highlighted in blue were developed. (C) Variants in the whole of ClinVar irrespective of disease context. The tools selected as best performers for HCM are highlighted in red (bold).
Figure 7. High-confidence prioritization of HCM-associated VUS based on the predictions of the five top-performing tools (ClinPred, MISTIC, FATHMM, MPC and MetaLR). On the left, 63 variants for which 100% of the tools predict pathogenicity. On the right, variants predicted to be benign by more than 50% of the tools.
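The consensus rule illustrated in the figure can be sketched as a simple vote over the five selected tools. Tool names come from the study; the function name and the binary 0/1 prediction encoding are illustrative:

```python
# The five best-performing tools identified in the study.
TOOLS = ["ClinPred", "MISTIC", "FATHMM", "MPC", "MetaLR"]

def prioritize(predictions: dict) -> str:
    """Classify a VUS by tool agreement (1 = pathogenic, 0 = benign):
    'high-confidence pathogenic' if 100% of tools call it pathogenic,
    'likely benign' if more than 50% call it benign, else 'unresolved'.
    """
    calls = [predictions[t] for t in TOOLS]
    if all(c == 1 for c in calls):
        return "high-confidence pathogenic"
    if sum(1 for c in calls if c == 0) > len(calls) / 2:
        return "likely benign"
    return "unresolved"
```

Under this rule a 4-of-5 pathogenic vote remains unresolved, which matches the conservative intent of prioritizing only unanimous calls for clinical follow-up.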