Lucie Grodecká, Emanuele Buratti, Tomáš Freiberger.
Abstract
For more than three decades, researchers have known that consensus splice sites alone are not sufficient regulatory elements to provide complex splicing regulation. Other regulators, so-called splicing regulatory elements (SREs), are needed. Most importantly, their sequence variants often underlie the development of various human disorders. However, due to their variable location and high degeneracy, these regulatory sequences are also very difficult to recognize and predict. Many different approaches aiming to identify SREs have been tried, often leading to the development of in silico prediction tools. While these tools were initially expected to help identify splicing-affecting mutations in genetic diagnostics, we are still quite far from meeting this goal. In fact, most of these tools cannot accurately discern SRE-affecting pathological variants from those not affecting splicing. Nonetheless, several recent evaluations have given appealing results (namely for the EX-SKIP, ESRseq and Hexplorer predictors). In this review, we aim to summarize the history of the different approaches to SRE prediction and provide additional validation of these tools based on patients' clinical data. Finally, we evaluate their usefulness in diagnostic settings and discuss the challenges that have yet to be met.
Keywords: evaluation of prediction tools; in silico predictions; mutation; pre-mRNA splicing; splicing aberration; splicing regulatory elements; variants of unknown significance
Year: 2017 PMID: 28758972 PMCID: PMC5578058 DOI: 10.3390/ijms18081668
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Selected individual SRE-prediction tools.
| Prediction Tool | Principle | Website | Reference | Evaluation |
|---|---|---|---|---|
| ESE-finder | SELEX (in vitro selection of ligands) | | | |
| ESRseq | testing of all possible k-mers for positive and negative splicing influences, based on QUEPASA method | | | |
| FAS-ESS | analysis of random sequence silencing properties in the minigene settings | | | |
| Hexplorer | statistical comparison of hexamer sequence motifs | | | |
| PESX | statistical comparison of octamer sequence motifs | | | |
| RESCUE-ESE | statistical comparison of hexamer sequence motifs | | | |
| SPANR | splicing code, machine learning | | | |
| SpliceAid2 | database of in vitro proved splicing factors binding sites | | | |
Figure 1. Individual approaches to SRE prediction, showing the first release of the online tools (* denotes tools that are accessible via the Sroogle engine; ˜ denotes tools that are accessible via the Human Splicing Finder (HSF) engine).
Selected tools and online engines combining multiple SRE-prediction tools.
| Prediction Tool | Included Tools | Website | Reference | Evaluation |
|---|---|---|---|---|
| EX-SKIP | PESE and PESS | | | |
| HOT-SKIP | PESE and PESS, FAS-ESS, RESCUE-ESE, EIEs and IIEs, NI-ESE and NI-ESS | | | |
| Human Splicing Finder (HSF) | ESE-finder | | | |
| Sroogle | ESE-finder, RESCUE-ESE, FAS-ESS, PESE and PESS, other SRE predictions according to Voelker | | | |
Evaluation of ESRseq, Hexplorer and EX-SKIP predictors on patients’ RNA-based results.
| Gene | cDNA Variant | Exon | Effect on Exon Skipping | ΔESRseq (threshold −0.5) | Hexplorer: ΔHZEI (threshold −0.5) | EX-SKIP: ESS/ESE mut/wt (threshold 1) |
|---|---|---|---|---|---|---|
| | c.5123C > A | 18 | increased | −2.574 | −10.85 | 1.05 |
| | c.5434C > G | 23 | increased | 0.558 | −1.28 | 1.09 |
| | c.5453A > G | 23 | increased | −2.176 | −15.02 | 1.43 |
| | c.5096G > A | 18 | none | −1.731 | −0.22 | 0.93 |
| | c.5116G > A | 18 | none | 1.582 | 2.67 | 0.86 |
| | c.5411T > A | 23 | none | 0.66 | 4.33 | 0.83 |
| | c.231T > G | 3 | increased | 1.65 | 4.76 | 0.97 |
| | c.439C > T | 5 | increased | −2.69 | −13.06 | 1.77 |
| | c.7992T > A | 18 | increased | −1.11 | 0.00 | 1.00 |
| | c.8257_8259delCTT | 18 | increased | −0.51 | 0.52 | 0.98 |
| | c.9234C > T | 24 | increased | −1.24 | −12.07 | 1.12 |
| | c.223 > C | 3 | none | 0.37 | −11.19 | 0.97 |
| | c.433_435delGTT | 5 | none | −0.23 | 6.46 | 0.51 |
| | c.7994A > G | 18 | none | −1.21 | 0.24 | 1.02 |
| | c.8182G > A | 18 | none | −1.65 | 0.00 | 0.98 |
| | c.9216G > 1 | 24 | none | −2.88 | −10.94 | 1.04 |
| | c.557A > T | 5 | increased | −2.43 | −12.21 | 1.25 |
| | c.528T > A | 5 | none | 1.17 | 0.70 | 0.90 |
| | c.5287C > T | 37 | increased | −0.70 | −16.84 | 1.16 |
| | c.5308A > T | 37 | none | −0.05 | −2.85 | 1.05 |
| True calls | | | | 70.0% | 70.0% | 70.0% |
| Sensitivity | | | | 80.0% | 70.0% | 70.0% |
| Specificity | | | | 60.0% | 70.0% | 70.0% |
To further evaluate the three prediction tools, we retrieved from the literature 20 variants detected in the BRCA1, BRCA2, NF1 and DMD genes (10 inducing aberrant splicing and 10 harmless at the level of splicing). In all these cases, nonsense-mediated decay was either prevented or not expected. For easy comparison, we used the same thresholds as described in Soukarieh et al. (shown in the table headings) [15], except for the original Hexplorer threshold, which was not applicable to our data, so we used a threshold of −0.5 instead. The true calls are shown in bold. ΔESRseq: score difference between predicted mutant and wild type ESRseq score. ΔHZEI: score difference between predicted mutant and wild type HZEI score.
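The summary rows of the table can be recomputed from the individual scores. The following is a minimal sketch, not the authors' procedure: it assumes that a variant is called "skipping-inducing" when ΔESRseq or ΔHZEI falls below −0.5, or when the EX-SKIP ratio exceeds 1. The direction of each comparison is an assumption; the text only gives the threshold values. The names `VARIANTS`, `PREDICTORS` and `evaluate` are illustrative.

```python
# Rows transcribed from the table above, in table order:
# (exon skipping increased?, dESRseq, dHZEI, EX-SKIP ESS/ESE mut/wt ratio)
VARIANTS = [
    (True, -2.574, -10.85, 1.05), (True, 0.558, -1.28, 1.09),
    (True, -2.176, -15.02, 1.43), (False, -1.731, -0.22, 0.93),
    (False, 1.582, 2.67, 0.86), (False, 0.66, 4.33, 0.83),
    (True, 1.65, 4.76, 0.97), (True, -2.69, -13.06, 1.77),
    (True, -1.11, 0.00, 1.00), (True, -0.51, 0.52, 0.98),
    (True, -1.24, -12.07, 1.12), (False, 0.37, -11.19, 0.97),
    (False, -0.23, 6.46, 0.51), (False, -1.21, 0.24, 1.02),
    (False, -1.65, 0.00, 0.98), (False, -2.88, -10.94, 1.04),
    (True, -2.43, -12.21, 1.25), (False, 1.17, 0.70, 0.90),
    (True, -0.70, -16.84, 1.16), (False, -0.05, -2.85, 1.05),
]

# Assumed decision rules: True means "predicted to increase exon skipping".
PREDICTORS = {
    "ESRseq": lambda row: row[1] < -0.5,
    "Hexplorer": lambda row: row[2] < -0.5,
    "EX-SKIP": lambda row: row[3] > 1.0,
}


def evaluate(predict, rows):
    """Return true-call rate, sensitivity and specificity for one predictor."""
    tp = sum(1 for r in rows if r[0] and predict(r))
    fn = sum(1 for r in rows if r[0] and not predict(r))
    tn = sum(1 for r in rows if not r[0] and not predict(r))
    fp = sum(1 for r in rows if not r[0] and predict(r))
    return {
        "true_calls": (tp + tn) / len(rows),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    for name, rule in PREDICTORS.items():
        m = evaluate(rule, VARIANTS)
        print(f"{name}: true calls {m['true_calls']:.1%}, "
              f"sensitivity {m['sensitivity']:.1%}, "
              f"specificity {m['specificity']:.1%}")
```

Under these assumed rules the sketch reproduces the percentages in the last three rows of the table, which suggests the comparison directions are the intended ones.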
Summary of clinical classification guidelines that apply to SRE-affecting variants.
| Class | | Observation | Reference |
|---|---|---|---|
| 5: pathogenic | • | assay on mRNA from patients' tissue samples | |
| | AND | no wt transcript detected from variant allele | |
| | AND | aberrant transcripts introduce PTC | |
| | OR | deletion disrupting protein conformation | only in [ |
| | OR | damaging effect on the gene or gene product | |
| | AND | other lines of evidence supporting variant pathogenicity ² (stronger than for class 4) | |
| | • | lab assays based on mRNA (e.g., minigenes) | |
| | AND | variant-specific abrogated function | |
| | AND | additional frequency/co-segregation/clinical data, additional molecular/mechanistic evidences from other sources, supporting variant pathogenicity | |
| 4: probably pathogenic | • | assay on mRNA from patients' tissue samples | |
| | AND | damaging effect on the gene or gene product | |
| | AND | other lines of evidence supporting variant pathogenicity (milder than for class 5) | |
| | • | lab assays based on mRNA (e.g., minigenes) | |
| | AND | variant-specific abrogated function | |
| | AND | additional frequency/co-segregation/clinical data, additional molecular/mechanistic evidences from other sources, supporting variant pathogenicity | |
| | • | minigene assays | |
| | AND | complete aberrant and frameshifting effect/ | |
| 3: uncertain pathogenicity | | all variants that do not fall into other classes | |
| | e.g., | aberrant transcripts produce deletion | |
| | e.g., | change in the level of alternative transcripts | |
| | e.g., | leaky aberrant splicing | |
| | e.g., | contradictory benign and pathogenic criteria | |
| 2: likely not pathogenic | • | assay on mRNA from patients' tissue samples | |
| | AND | no associated mRNA aberration detected | |
| | AND | analysis including NMD inhibition | only [ |
| | AND | other lines of evidence disproving variant pathogenicity | only [ |
| | • | lab assays based on mRNA | |
| | AND | variant-specific proficient function | |
| | AND | additional frequency/co-segregation/clinical data, additional molecular/mechanistic evidences from other sources, disproving variant pathogenicity | |
| 1: not pathogenic | • | lab assays based on mRNA | |
| | AND | variant-specific proficient function | |
| | AND | additional frequency/co-segregation/clinical data, additional molecular/mechanistic evidences from other sources, disproving variant pathogenicity | |
¹ Richards et al. combine multiple individual pathogenic and benign criteria to reach the classification [61]; ² Walker et al. recommend predicted/expected SRE aberrations only for research testing [59]. • denotes a single set of criteria for each class. A variant has to meet all the criteria in at least one set to be included in the respective class.
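The inclusion rule stated in the footnote (all criteria of at least one set) amounts to an "any of all" test. The sketch below illustrates only that logic. The criterion labels and the `CLASS_4_SETS` encoding are simplified, hypothetical paraphrases of two of the class 4 sets in the table, not an official encoding of any guideline.

```python
def meets_class(observed, criteria_sets):
    """True if every criterion of at least one set is among the observed findings."""
    observed = set(observed)
    return any(set(criteria) <= observed for criteria in criteria_sets)


# Hypothetical, simplified labels for two of the class 4 criteria sets.
CLASS_4_SETS = [
    {"patient mRNA assay", "damaging effect", "other supporting evidence"},
    {"lab mRNA assay", "abrogated function", "additional supporting data"},
]

# A variant meeting the whole second set qualifies for the class.
print(meets_class(
    {"lab mRNA assay", "abrogated function", "additional supporting data"},
    CLASS_4_SETS,
))

# A variant meeting parts of both sets, but neither completely, does not.
print(meets_class({"patient mRNA assay", "abrogated function"}, CLASS_4_SETS))
```

Representing each set as a Python `set` keeps the subset test (`<=`) close to the wording of the rule; OR-alternatives inside a set, as in class 5, would need to be expanded into separate sets first.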
The criteria listed in the table were extracted from several publications that were written for somewhat differing purposes. Walker et al. published an evaluation and update of clinical classification guidelines previously designed by Spurdle et al., which were specifically aimed at potential effects on splicing [59,62]. On the other hand, Thompson et al. and Richards et al. dealt with the overall pathogenic potential of novel variants, including direct effects on protein coding [60,61]. While Thompson et al. limited their criteria to mismatch repair genes [60], Richards et al. designed criteria for the classification of all variants identified in genes that cause Mendelian disorders [61]. Finally, we have included one classification proposal derived from a publication that thoroughly evaluated the reliability of minigenes [48]. Of note, we did not include in the table the splicing-specific classification designed by Houdayer et al. [17], as it recommended a specific re-classification of class 3 variants (or variants of unknown significance) into three other classes (1S, 2S, 3S).
Concerning the in silico predictions, Richards et al. propose using them as a supportive criterion if all the in silico programs tested agree on the prediction [61]. However, they do not directly mention predictions of SREs in their publication, describing only the splice site predictors. In comparison, Walker et al. allow the use of SRE predictors in general. If a variant causes a loss or creation of the same SRE predicted by at least two of the three programs used, then these guidelines propose that it should be experimentally tested (as research testing) [59]. The other guidelines mentioned in this table do not propose the use of SRE predictions. Please note that only an effect on splicing is taken into account in this table. The effect of a variant on protein coding per se should always be considered as well.
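The two-of-three agreement rule described for Walker et al. can be sketched as a simple vote count. This is an illustration under assumptions, not the guideline's implementation: the program names are placeholders, and an SRE change is reduced to a hypothetical `(element, change)` label, whereas a real comparison would presumably also need to match the position of the element.

```python
from collections import Counter


def needs_research_testing(predictions, min_agree=2):
    """Flag a variant when at least `min_agree` programs predict the same SRE change.

    `predictions` maps a program name to the set of (element, change) events
    it predicts for the variant, e.g. ("ESE", "loss").
    """
    counts = Counter()
    for events in predictions.values():
        counts.update(set(events))  # each program votes once per event
    return any(n >= min_agree for n in counts.values())


# Placeholder program names and predictions, for illustration only.
agreeing = {
    "program_A": {("ESE", "loss")},
    "program_B": {("ESE", "loss"), ("ESS", "creation")},
    "program_C": set(),
}
disagreeing = {
    "program_A": {("ESE", "loss")},
    "program_B": {("ESS", "creation")},
    "program_C": set(),
}

print(needs_research_testing(agreeing))
print(needs_research_testing(disagreeing))
```

Setting `min_agree` to the number of programs would instead approximate the stricter "all programs agree" condition that Richards et al. attach to in silico evidence.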