Robert Wang, Yaqiong Wang, Zhiqiang Hu.
Abstract
Accurate interpretation of genomic variants that alter RNA splicing is critical to precision medicine. We present a computational framework, Prediction of variant Effect on Percent Spliced In (PEPSI), that predicts the splicing impact of coding and noncoding variants for the Fifth Critical Assessment of Genome Interpretation (CAGI5) "Vex-seq" challenge. PEPSI is a random forest regression model trained on multiple layers of features associated with sequence conservation and regulatory sequence elements. Compared to other splicing defect prediction tools from the literature, our framework integrates secondary structure information when predicting variants that disrupt splicing regulatory elements (SREs). We applied our model to classify splice-disrupting variants among 2,094 single-nucleotide polymorphisms from the Exome Aggregation Consortium, using the model-predicted change in percent spliced in (ΔPSI) for each tested variant. Benchmarking our model against widely used state-of-the-art tools, we demonstrate that PEPSI achieves comparable sensitivity and precision. We also show that secondary structure context can help resolve several cases in which the change in SRE counts does not match the direction of the ΔPSI measured for the tested variant.
Keywords: CAGI; RNA secondary structure; alternative splicing; splice-disrupting variants; splicing regulatory elements
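The classification step mentioned in the abstract (calling a variant splice-disrupting from its predicted ΔPSI, then scoring the calls by sensitivity and precision) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that ΔPSI is the variant PSI minus the reference PSI, that PSI is expressed as a fraction between 0 and 1, and that a variant is called splice-disrupting when |ΔPSI| reaches a cutoff. The cutoff of 0.1, the function names, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def delta_psi(psi_ref: float, psi_alt: float) -> float:
    """Change in percent spliced in (assumed convention: variant minus reference)."""
    return psi_alt - psi_ref


def call_splice_disrupting(dpsi: Sequence[float], cutoff: float = 0.1) -> List[bool]:
    """Call a variant splice-disrupting when |dPSI| >= cutoff (hypothetical cutoff)."""
    return [abs(d) >= cutoff for d in dpsi]


def sensitivity_precision(predicted: Sequence[bool],
                          actual: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, precision) of predicted calls against reference calls."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision


# Made-up dPSI values for four variants (not data from the paper).
predicted_dpsi = [0.25, -0.02, -0.30, 0.05]   # e.g. output of a regression model
measured_dpsi = [0.20, -0.15, -0.25, 0.01]    # e.g. experimentally measured values

predicted_calls = call_splice_disrupting(predicted_dpsi)
measured_calls = call_splice_disrupting(measured_dpsi)
sens, prec = sensitivity_precision(predicted_calls, measured_calls)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

Because the sign of ΔPSI is kept until the cutoff step, the same values can also be used to check whether a predicted change points in the same direction as the measured one.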
Year: 2019 PMID: 31074545 PMCID: PMC7288985 DOI: 10.1002/humu.23790
Source DB: PubMed Journal: Hum Mutat ISSN: 1059-7794 Impact factor: 4.878