Jian-Min Chen, Jin-Huan Lin, Emmanuelle Masson, Zhuan Liao, Claude Férec, David N Cooper, Matthew Hayden.
Abstract
INTRODUCTION: 5' splice site GT>GC or +2T>C variants have been frequently reported to cause human genetic disease and are routinely scored as pathogenic splicing mutations. However, we have recently demonstrated that such variants in human disease genes may not invariably be pathogenic. Moreover, we found that no splicing prediction tools appear to be capable of reliably distinguishing those +2T>C variants that generate wild-type transcripts from those that do not.
Keywords: +2T>C variant; 5' splice site; Full-length gene splicing assay; GT>GC variant; in silico splicing prediction; in vitro functional analysis
Year: 2020 PMID: 32655299 PMCID: PMC7324893 DOI: 10.2174/1389202921666200210141701
Source DB: PubMed Journal: Curr Genomics ISSN: 1389-2029 Impact factor: 2.236
Comparison of SpliceAI-predicted and experimentally demonstrated functional effects of the seven disease-causing GT>GC (+2T>C) variants that generated wild-type transcripts.

| Transcript (RefSeq) | Chromosome | Position | Reference base | Variant (a) | Wild-type transcript level (b) | SpliceAI Delta score (+2T>C) | SpliceAI Delta score (+2T>A) | SpliceAI Delta score (+2T>G) |
|---|---|---|---|---|---|---|---|---|
| NM_001234.4 | 3 | 8733992 | T | c.114+2T>C | 11 (c) | 0.9 | 1 | 1 |
| NM_000733.3 | 11 | 118313876 | T | c.520+2T>C | 1-5 (d) | 0.99 | 0.99 | 0.99 |
| NM_000074.2 | X | 136654432 | T | c.346+2T>C | 15 (d) | 0.95 | 0.97 | 0.97 |
| NM_004006.2 | X | 31657988 | A | c.8027+2T>C | 10 (d) | 0.63 | 0.99 | 0.99 |
| NM_000533.4 | X | 103788512 | T | c.696+2T>C | 8 (c) | 0.74 | 1 | 1 |
| NM_000112.3 | 5 | 149960981 | T | c.-26+2T>C | 5 (d) | 0.9 | 0.99 | 0.99 |
| NM_003122.3 | 5 | 147828020 | A | c.194+2T>C (e) | 10 (c) | 0.35 | 0.99 | 1 |

(a) Nomenclature in accordance with Human Genome Variation Society (HGVS) recommendations [23].
(b) Expressed as the level of the variant allele-derived wild-type transcripts relative to that of the wild-type allele-derived wild-type transcripts.
(c) Expression level determined here by ImageJ using gel photos from the original publications.
(d) Expression level as described in the original publications.
(e) Identical to the SPINK1 IVS3+2T>C substitution in Table .
Comparison of SpliceAI-predicted and experimentally demonstrated functional effects of the 19 engineered GT>GC (+2T>C) substitutions that generated wild-type transcripts.

| Transcript (RefSeq) | Chromosome | Position | Reference base | Substitution (a) | Wild-type transcripts generated (b) | SpliceAI Delta score (+2T>C) | SpliceAI Delta score (+2T>A) | SpliceAI Delta score (+2T>G) |
|---|---|---|---|---|---|---|---|---|
| NM_213607.2 | 17 | 44899861 | T | IVS1+2T>C | Yes | 0.82 | 0.82 | 0.82 |
| NM_001079862.2 | 2 | 119368307 | T | IVS2+2T>C | Yes | 0.86 | 1 | 1 |
| NM_145261.3 | 3 | 180985924 | A | IVS5+2T>C | Yes (42%) | 0.03 | 0.99 | 0.95 |
| NM_033085.2 | X | 151716227 | T | IVS1+2T>C | Yes (84%) | 0.08 | 0.96 | 1 |
| NM_000804.3 | 11 | 72139484 | T | IVS4+2T>C | Yes | 0.45 | 1 | 1 |
| NM_003865.2 | 3 | 57199760 | A | IVS1+2T>C | Yes (2%) | 0.81 | 0.98 | 0.98 |
| NM_172138.1 | 19 | 39269823 | T | IVS5+2T>C | Yes (5%) | 0.05 | 0.84 | 0.73 |
| NM_000572.3 | 1 | 206770905 | A | IVS3+2T>C | Yes | 0.61 | 1 | 1 |
| NM_000900.4 | 12 | 14884211 | A | IVS2+2T>C | Yes (80%) | 0.97 | 0.99 | 0.99 |
| NM_001199163.1 | 17 | 63830503 | T | IVS6+2T>C | Yes (56%) | 0.31 | 0.98 | 1 |
| NM_001199163.1 | 17 | 63831228 | T | IVS8+2T>C | Yes (56%) | 0.21 | 1 | 1 |
| NM_001199163.1 | 17 | 63831618 | T | IVS10+2T>C | Yes (46%) | 0.83 | 1 | 1 |
| NM_000975.5 | 1 | 23692761 | T | IVS2+2T>C | Yes | 0 | 0.87 | 0.86 |
| NM_000975.5 | 1 | 23693915 | T | IVS3+2T>C | Yes | 0.74 | 1 | 1 |
| NM_001030.4 | 1 | 153991225 | T | IVS2+2T>C | Yes (63%) | 0.67 | 1 | 1 |
| NM_001030.4 | 1 | 153991678 | T | IVS3+2T>C | Yes | 0.98 | 1 | 1 |
| NM_203472.2 | 15 | 101277340 | A | IVS1+2T>C | Yes | 0.81 | 1 | 1 |
| NM_203472.2 | 15 | 101274418 | A | IVS5+2T>C | Yes (14%) | 0.79 | 1 | 1 |
| NM_003122.3 | 5 | 147828020 | A | IVS3+2T>C (c) | Yes | 0.35 | 0.99 | 1 |

(a) In accordance with the traditional IVS (InterVening Sequence; i.e., an intron) nomenclature as previously described [16].
(b) Expression level (in parentheses), determined by quantitative RT-PCR analysis, was available for all +2T>C substitutions that generated only wild-type transcripts under the experimental conditions described in [16].
(c) Identical to the SPINK1 c.194+2T>C variant in Supplementary Table and Table .
Comparison of SpliceAI-predicted and experimentally demonstrated functional effects of all possible single nucleotide substitutions in the +2 positions of 12 BRCA1 introns*.

| Intron (a) | Position | Reference base | +2T>C functional effect (b) | +2T>C SpliceAI Delta score | +2T>A functional effect (b) | +2T>A SpliceAI Delta score | +2T>G functional effect (b) | +2T>G SpliceAI Delta score |
|---|---|---|---|---|---|---|---|---|
| 2 | 43124015 | A | Non-functional | 0.9 | Non-functional | 0.9 | Non-functional | 0.9 |
| 3 | 43115724 | A | Non-functional | 0.97 | Non-functional | 0.98 | Non-functional | 0.98 |
| 4 | 43106454 | A | Non-functional | 0.65 | Non-functional | 0.65 | Non-functional | 0.65 |
| 5 | 43104866 | A | Non-functional | 0.67 | Non-functional | 0.67 | Non-functional | 0.67 |
| 15 | 43070926 | A | Non-functional | 0.99 | Non-functional | 0.99 | Non-functional | 0.99 |
| 16 | 43067606 | A | Non-functional | 0.74 | Intermediate | 0.74 | Non-functional | 0.74 |
| 17 | 43063872 | A | Non-functional | 0.9 | Non-functional | 0.9 | Non-functional | 0.9 |
| 18 | 43063331 | A | Functional | 0.53 | Intermediate | 0.98 | Non-functional | 0.98 |
| 19 | 43057050 | A | Non-functional | 0.82 | Non-functional | 1 | Non-functional | 1 |
| 20 | 43051061 | A | Non-functional | 0.9 | Non-functional | 0.99 | Non-functional | 0.99 |
| 21 | 43049119 | A | Intermediate | 0.96 | Non-functional | 0.99 | Non-functional | 0.99 |
| 22 | 43047641 | A | Intermediate | 0.93 | Non-functional | 0.93 | Missing data | 0.93 |

*Experimental data were extracted from [19].
(a) In accordance with NM_007294.3.
(b) “Non-functional” was interpreted as meaning that no wild-type transcripts were generated whereas “functional” and “intermediate” were held to imply the generation of wild-type transcripts.
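
The interpretation rule in footnote b can be written as a small lookup. The following is a minimal sketch, not the authors' code: the function name `generates_wild_type` is hypothetical, the handling of "Missing data" as "no call" is an assumption, and the labels are copied from the first functional-effect column of the table, which is assumed here to correspond to the +2T>C substitution.

```python
# Labels that footnote b treats as implying wild-type transcripts.
WILD_TYPE_LABELS = {"functional", "intermediate"}


def generates_wild_type(label):
    """Apply footnote b: True/False, or None when no experimental call exists."""
    key = label.strip().lower()
    if key == "missing data":  # assumption: treated as "no call"
        return None
    if key == "non-functional":
        return False
    if key in WILD_TYPE_LABELS:
        return True
    raise ValueError(f"unexpected label: {label!r}")


# Intron number -> experimental label, copied from the first functional-effect
# column of the BRCA1 table (assumed to be the +2T>C substitution).
first_column = {
    2: "Non-functional", 3: "Non-functional", 4: "Non-functional",
    5: "Non-functional", 15: "Non-functional", 16: "Non-functional",
    17: "Non-functional", 18: "Functional", 19: "Non-functional",
    20: "Non-functional", 21: "Intermediate", 22: "Intermediate",
}

# Introns whose substitution is read as generating wild-type transcripts.
wild_type_introns = [i for i, lab in first_column.items() if generates_wild_type(lab)]
print(wild_type_introns)
```

Under these assumptions, three of the 12 introns fall into the wild-type-generating group.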
Performance metrics of SpliceAI as a predictor for splice site disruption on 103 variants from dataset 2.

| | | | | | |
|---|---|---|---|---|---|
| 16% | 67% | 84% | 70% | 0.79 | 0.41 |
Overall correlation rates between SpliceAI-predicted and experimentally demonstrated functional effects of the GT>GC variants in the context of three datasets*.

| Dataset | Correlation rate |
|---|---|
| Variants that generated wild-type transcripts | |
| Dataset 1 (45 disease-causing variants) | 43% (3/7) |
| Dataset 2 (103 variants analyzed by FLGSA) | 84% (16/19) |
| Dataset 3 (12 | 33% (1/3) |
| Variants that did not generate wild-type transcripts | |
| Dataset 1 (45 disease-causing variants) | 89% (34/38) |
| Dataset 2 (103 variants analyzed by FLGSA) | 68% (57/84) |
| Dataset 3 (12 | 67% (6/9) |

*A SpliceAI Delta score (donor loss) of 0.85 was used as the threshold for defining whether or not wild-type transcripts were generated.
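
To illustrate how the 0.85 threshold turns scores into a correlation rate, here is a minimal sketch, not the authors' code. It uses the +2T>C scores listed above for the variants that experimentally generated wild-type transcripts. The helper names are hypothetical, and treating a score strictly below 0.85 as "predicted to generate wild-type transcripts" is an assumption about how boundary values were handled.

```python
THRESHOLD = 0.85  # donor-loss Delta score cut-off stated in the footnote

# +2T>C scores copied from the tables of variants that experimentally
# generated wild-type transcripts.
dataset1_scores = [0.9, 0.99, 0.95, 0.63, 0.74, 0.9, 0.35]
dataset2_scores = [0.82, 0.86, 0.03, 0.08, 0.45, 0.81, 0.05, 0.61, 0.97, 0.31,
                   0.21, 0.83, 0, 0.74, 0.67, 0.98, 0.81, 0.79, 0.35]


def predicts_wild_type(score, threshold=THRESHOLD):
    """Assumed rule: a score below the threshold predicts wild-type transcripts."""
    return score < threshold


def correlation_rate(scores):
    """Count predictions that agree with the experimental wild-type outcome."""
    hits = sum(predicts_wild_type(s) for s in scores)
    return hits, len(scores)


for name, scores in [("Dataset 1", dataset1_scores), ("Dataset 2", dataset2_scores)]:
    hits, total = correlation_rate(scores)
    print(f"{name}: {hits}/{total} = {hits / total:.0%}")
```

With these inputs the sketch gives 3/7 and 16/19, matching the rates reported for the wild-type-generating variants in Datasets 1 and 2.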
Nine +2T positions for which all three possible nucleotide substitutions had a consistent SpliceAI Delta score of <0.85.

| Transcript (RefSeq) | Chromosome | Position | Reference base | SpliceAI Delta score (+2T>C) | SpliceAI Delta score (+2T>A) | SpliceAI Delta score (+2T>G) |
|---|---|---|---|---|---|---|
| NM_001015878.1 | 19 | 57235060 | T | 0.8 | 0.8 | 0.8 |
| NM_213607.2 | 17 | 44899861 | T | 0.82 | 0.82 | 0.82 |
| NM_001446.4 | 6 | 122779869 | T | 0.83 | 0.84 | 0.84 |
| NM_172138.1 | 19 | 39269823 | T | 0.05 | 0.84 | 0.73 |
| NM_001003693.1 | 6 | 31708136 | T | 0.81 | 0.81 | 0.81 |
| NM_001003693.1 | 6 | 31710420 | T | 0.3 | 0.3 | 0.3 |
| NM_001199163.1 | 17 | 63830191 | T | 0.76 | 0.77 | 0.77 |
| NM_000975.5 | 1 | 23695910 | T | 0.59 | 0.59 | 0.59 |
| NM_203472.2 | 15 | 101272762 | A | 0.64 | 0.64 | 0.64 |
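
The selection criterion for this table (all three substitutions at a +2T position scoring below 0.85) can be checked with a short filter. This is a minimal sketch with a hypothetical helper name, `all_below`, using the scores listed above; it is not taken from the original analysis.

```python
THRESHOLD = 0.85

# (transcript, chromosome, position) -> scores for (+2T>C, +2T>A, +2T>G),
# copied from the table above.
positions = {
    ("NM_001015878.1", "19", 57235060): (0.8, 0.8, 0.8),
    ("NM_213607.2", "17", 44899861): (0.82, 0.82, 0.82),
    ("NM_001446.4", "6", 122779869): (0.83, 0.84, 0.84),
    ("NM_172138.1", "19", 39269823): (0.05, 0.84, 0.73),
    ("NM_001003693.1", "6", 31708136): (0.81, 0.81, 0.81),
    ("NM_001003693.1", "6", 31710420): (0.3, 0.3, 0.3),
    ("NM_001199163.1", "17", 63830191): (0.76, 0.77, 0.77),
    ("NM_000975.5", "1", 23695910): (0.59, 0.59, 0.59),
    ("NM_203472.2", "15", 101272762): (0.64, 0.64, 0.64),
}


def all_below(scores, threshold=THRESHOLD):
    """True when every substitution at the position scores below the threshold."""
    return all(s < threshold for s in scores)


consistent = [key for key, scores in positions.items() if all_below(scores)]
print(len(consistent))
```

A position such as the one scored (0.86, 1, 1) in the table of engineered substitutions would be rejected by the same filter, since none of its scores falls below 0.85.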