Jiazi Tan1, Jia Xin Jessie Ho1, Zhensheng Zhong2, Shufang Luo1, Gang Chen2, Xavier Roca3.
Abstract
Accurate recognition of splice sites is essential for pre-messenger RNA splicing. Mammalian 5' splice sites are mainly recognized by canonical base-pairing to the 5' end of U1 small nuclear RNA, yet we described multiple noncanonical base-pairing registers by shifting base-pair positions or allowing one-nucleotide bulges. By systematic mutational and suppressor U1 analyses, we prove three registers involving asymmetric loops and show that two-nucleotide bulges but not longer can form in this context. Importantly, we established that a noncanonical uridine-pseudouridine interaction in the 5' splice site/U1 helix contributes to the recognition of certain 5' splice sites. Thermal melting experiments support the formation of noncanonical registers and uridine-pseudouridine interactions. Overall, we experimentally validated or discarded the majority of predicted noncanonical registers, to derive a list of 5' splice sites using such alternative mechanisms that is much different from the original. This study allows not only the mechanistic understanding of the recognition of a wide diversity of mammalian 5' splice sites, but also the future development of better splice-site scoring methods that reliably predict the effects of disease-causing mutations at these sequences.
Year: 2016 PMID: 26969736 PMCID: PMC4856993 DOI: 10.1093/nar/gkw163
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Noncanonical base-pairing registers between 5′ss and U1 snRNA. (A–I) Boxes represent exons. 5′ss consensus and nonconsensus nucleotides are shown in red and black, respectively. 5′ss positions are numbered according to convention, with −3 to −1 at the end of the exon, and +1 to +8 at the beginning of the intron. U1 snRNA is schematically depicted above and below the 5′ss, with its 5′ end spanning A1 to G11. In each case the canonical base-pairing scheme is shown below for comparison with the noncanonical register. Ψ, pseudouridine; Dot, 2,2,7-trimethylguanosine cap at the 5′ end of U1. Vertical lines between nucleotides represent base pairs. (A) Mammalian consensus 5′ss using canonical base-pairing. (B) 5′ss that uses the shifted register (10). (C) 5′ss that uses a register with a bulge at position +3 (or +2) (16). (D) 5′ss predicted to use asymmetric-loop (Asym loop) 1 (+3/+4) register. (E) 5′ss predicted to use asymmetric loop 1 (+4/+5). (F) 5′ss predicted to use asymmetric loop Ψ. (G) 5′ss predicted to use a register with a bulge of two nucleotides at +3 and +4. (H) 5′ss predicted to use a register with a bulge of two nucleotides at +4 and +5. (I) 5′ss predicted to use a register with a bulge of three nucleotides at +3, +4 and +5.
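The position numbering and the canonical register described in this caption can be sketched in code. This is an illustrative sketch, not the authors' method: the function names are hypothetical, the unmodified U1 5′-end sequence AUACUUACCUG is taken from the thermal melting table, the position mapping is anchored on the +5G−C4 pair named in the Figure 5 caption, and only Watson-Crick pairs are counted (pseudouridine is treated as U, and wobble pairs are ignored).

```python
# Illustrative sketch only; all helper names are hypothetical.
U1_5P_END = "AUACUUACCUG"  # U1 positions 1..11 (A1 to G11), unmodified

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def number_5ss(site: str) -> dict[int, str]:
    """Map a 5'ss written as 'EXON/INTRON' to conventional positions.

    Exon nucleotides get -n..-1 and intron nucleotides +1..+m (no position 0).
    """
    exon, intron = site.upper().split("/")
    positions = {i - len(exon): nt for i, nt in enumerate(exon)}
    positions.update({i + 1: nt for i, nt in enumerate(intron)})
    return positions


def u1_partner(pos: int) -> int:
    """U1 position facing a 5'ss position in the canonical register.

    Anchored on +5G-C4: intron position p faces U1 position 9 - p,
    and exon position -n faces U1 position 8 + n.
    """
    return 9 - pos if pos > 0 else 8 - pos


def canonical_pairs(site: str) -> list[int]:
    """Return the 5'ss positions that form Watson-Crick pairs with U1."""
    paired = []
    for pos, nt in sorted(number_5ss(site).items()):
        partner = u1_partner(pos)
        if 1 <= partner <= len(U1_5P_END):
            if (nt, U1_5P_END[partner - 1]) in WATSON_CRICK:
                paired.append(pos)
    return paired
```

For the consensus site `CAG/GUAAGUAU`, `canonical_pairs` reports all eleven positions as paired; replacing +5G with C removes position +5 from the list.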
Figure 2. Asymmetric loop 1 (+3/+4). (A–C) Schematics depict the minigenes with gray boxes as flanking exons and lines as introns, highlighting the middle exons (white box) and flanking intronic sequences (dashed lines) with the test 5′ss. U1 snRNA is schematically drawn and its 5′ end is shown base paired in the asymmetric loop 1 (+3/+4). Arrows show the tested point mutations. Bottom gel images show the radioactive RT-PCR results of transfections with the mutant minigenes and suppressor U1s indicated above each lane. The identity of the various spliced mRNAs is indicated on the left. In all gel panels, the mean percentage and standard deviation (SD) of middle exon inclusion are shown below each lane, which were derived from three experimental replicates (samples from independent transfections). Exon inclusion percentages are different between two experiments if the means and standard deviations do not overlap. (A) ABCC12 UMV minigene. From large to small, bands represent use of a cryptic 5′ss at position 76, correct exon 17 inclusion, use of a cryptic 5′ss 20 nt upstream of the test 5′ss, and exon skipping. (B) SLC5A8 UMV minigene. From large to small, bands represent correct exon 7 inclusion, use of a cryptic 5′ss 52 nt upstream of the test 5′ss, and exon skipping. (C) SMN1 minigene. Bands represent correct exon 7 inclusion, exon skipping and use of a cryptic 5′ss 50 nucleotides upstream of the normal 5′ss of SMN1/2 exon 6 (10).
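The comparison rule stated in this caption (means and standard deviations must not overlap) can be read as a check on mean ± SD intervals. The sketch below shows that reading under stated assumptions: the function names are hypothetical, the sample standard deviation is used because the source does not say which estimator was applied, and the replicate values are invented for illustration, not taken from the paper.

```python
from statistics import mean, stdev


def summarize(inclusion_pcts):
    """Return (mean, sample SD) for replicate exon-inclusion percentages."""
    return mean(inclusion_pcts), stdev(inclusion_pcts)


def intervals_overlap(a, b):
    """True if the mean +/- SD intervals of two experiments overlap."""
    mean_a, sd_a = summarize(a)
    mean_b, sd_b = summarize(b)
    return (mean_a - sd_a) <= (mean_b + sd_b) and (mean_b - sd_b) <= (mean_a + sd_a)


# Hypothetical replicate values (percent exon inclusion), not from the paper.
wild_type = [82.0, 85.0, 88.0]
mutant = [40.0, 45.0, 50.0]

# Under this reading, the two experiments differ when the intervals are disjoint.
differ = not intervals_overlap(wild_type, mutant)
```

This criterion is a descriptive rule of thumb rather than a formal significance test.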
Figure 3. Asymmetric loop 1 (+4/+5). (A–D) Schematics of the UMV (A and C) and SMN1/2 (B and D) splicing minigenes as in Figure 2, indicating the 5′ss/U1 base-pairing in the asymmetric-loop 1 (+4/+5) register (A and B) and in the asymmetric-loop Ψ register (C and D). Bottom gel images show the radioactive RT-PCR results of transfections with the mutant minigenes and suppressor U1s indicated above each lane. The identity of the various spliced mRNAs is indicated on the left. In all gel panels, the mean percentage and SD of middle exon inclusion are shown below each lane, which were derived from three experimental replicates (samples from independent transfections). (A) RNF170 UMV minigene. Bands on the left indicate exon 3 inclusion and skipping. (B) SMN1/2 minigene analysis of asymmetric loop 1 (+4/+5). Bands are labeled as in Figure 2C. (C) PIK3R4 UMV minigene. Bands on the left indicate exon 5 inclusion and skipping. (D) SMN1/2 minigene analysis of asymmetric loop Ψ. Bands are labeled as in Figure 2C.
Figure 4. Bulges of 2 or more nucleotides in heterologous SMN1/2 context. (A–C) Gel images of the RT-PCR results with different SMN1/2 minigenes and suppressor U1s as indicated on top of each lane. Splicing products with exon 7 inclusion or skipping are schematically shown on the left as in Figure 1. In all gel panels, the mean percentage and SD of middle exon inclusion are shown below each lane, which were derived from three experimental replicates (samples from independent transfections). Exon inclusion percentages are different between two experiments if the means and standard deviations do not overlap. (A) SMN1/2 minigenes with the bulge 2 (+3/+4) (b2) and bulge 3 (+3/+4/+5) (b3) model 5′ss are shown in the bottom schematics, with the base-pairing to U1 above and below the 5′ss, respectively. The two A to C transversions to inactivate the SMN1/2 intronic silencer (-ISS) as well as the 5′ss point mutations are indicated by arrows. Gel images on top show very low or no exon inclusion. (B) Gel image of the mutational analysis for the bulge 2 (+3/+4) in SMN1-ISS context. (C) Base-pairing schematic and gel image for the analysis of the bulge 2 (+4/+5) 5′ss in SMN2.
5′ss that unexpectedly show recognition by U1 via the canonical register
| Gene | Exon | Sequence | Predicted Register | ΔΔG | ΔΔG |
|---|---|---|---|---|---|
| | 5 | AAG/GUU | Asymmetric loop 1 (+3/+4) | −4.5 | −1.6 |
| | 9 | CAG/GUU | Asymmetric loop 1 (+3/+4) | −4.2 | −2 |
| | 15 | CAG/GUU | Asymmetric loop 1 (+3/+4) | −2.6 | 0 |
| | 6 | GAG/GUU | Bulge 2 (+3/+4) | −1.8 | −1 |
| | 4 | GAG/GUA | Bulge 2 (+3/+4) or (+4/+5) | −2.1 | −1.8 |
| | 3 | CAG/GUU | Asymmetric loop 1 (+3/+4) | −4.2 | −2 |
(a) ΔΔG, mean energetic advantage of the noncanonical over the canonical register in 1 M NaCl at 37°C.
(b) +5G is highlighted in bold, and +4U is underlined.
(c) Base-pairing in both noncanonical and canonical registers is shown in Figure 5A.
Figure 5. Contribution of a noncanonical U–Ψ interaction to 5′ss selection. (A and D) Schematics of the indicated potential base-pairing registers for the 5′ss in the DNAI1 (A) and CCDC132 -4A (D) minigenes. Yellow lines indicate the +5G−C4 base pair in canonical register, and question marks represent the potential +4U−Ψ5 interaction. (B,C,E,F) Gel images of the RT-PCR results for the mutational analyses (B and E) and suppressor U1 experiments (C and F) with different UMV DNAI1 (B and C) and CCDC132 -4A (E and F) minigenes and suppressor U1s as indicated on top of each lane. Splicing products with middle exon inclusion or skipping are schematically shown between panels. In all gel panels, the mean percentage and SD of middle exon inclusion are shown below each numbered lane, which were derived from three experimental replicates (samples from independent transfections). Exon inclusion percentages are different between two experiments if the means and standard deviations do not overlap.
Summary of thermal melting data
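| Oligonucleotide | Name | Sequence (5′ to 3′) (e) | Predicted ΔG, RNAst (a) (kcal/mol) | ΔG with U1–1 (b) (kcal/mol) | Tm with U1–1 (b) (°C) | ΔG with U1–2 (c) (kcal/mol) | Tm with U1–2 (c) (°C) |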
|---|---|---|---|---|---|---|---|
| U1 unmodified | U1–1 | AUACUUACCUG | – | – | – | – | |
| U1 modified | U1–2 | AUACΨΨACCUG | – | – | – | – | |
| Consensus | 5′ss-1 | ACAGGUAAGUAUA | −15.2 (d) | −14.7 | 66.5 | −14.5 | 68.9 |
| Control consensus | 5′ss-2 | ACAGGUAAGUCCA | −12.6 | −13.2 | 56.4 | −13.4 | 59.4 |
| Control consensus +5C | 5′ss-3 | ACAGGUAACUCCA | −8.5 | −8.4 | 38.5 | −9.4 | 45.3 |
| Shifted +1 (NC) | 5′ss-4 | ACAGUCAAGUAUA | −6.5 | −6.8 | 29.4 | −7.6 | 34.2 |
| Shifted +1 +6C (C) | 5′ss-5 | ACAGUCAA | −1.6 | NDT (f) | NDT | NDT | NDT |
| Bulge -1 (NC) | 5′ss-6 | ACAGAGUAAGUAUA | −11.4 | −10.6 | 47.4 | −11.2 | 50.3 |
| Bulge -1 -2U (C) | 5′ss-7 | ACA | −8.2 | −9.1 | 40.7 | −9.3 | 43.3 |
| Asymmetric loop 1 +3/+4 (NC) | 5′ss-8 | ACAGGUUUAGUAUA | −10.0 | −10.1 | 44.7 | −11.2 | 48.7 |
| Asymmetric loop 1 +3/+4 +6C (C) | 5′ss-9 | ACAGGUUUA | −6.1 | NDT | NDT | NDT | NDT |
| Asymmetric loop 1 +4/+5 (NC) | 5′ss-10 | ACAGGUAUUGUAUA | −10.0 | −9.7 | 44.9 | −11.1 | 49.3 |
| Asymmetric loop 1 +4/+5 +6C (C) | 5′ss-11 | ACAGGUAUU | −7.7 | −8.0 | 36.1 | −8.7 | 40.8 |
| Asymmetric loop 1 Ψ (NC) | 5′ss-12 | ACAGGUUGUAUA | −8.7 | −8.2 | 44.3 | −8.5 | 46.5 |
| Asymmetric loop 1 Ψ +6C (C) | 5′ss-13 | ACAGGUUGU | −6.1 | −7.4 | 31.1 | −7.5 | 32.7 |
| Bulge 2 +3/+4 (NC) | 5′ss-14 | ACAGGUCUAAGUAUA | −10.1 | −10.6 | 46.2 | −10.9 | 48.3 |
| Bulge 2 +3/+4 +7C (C) | 5′ss-15 | ACAGGUCUAA | −6.1 | NDT | NDT | NDT | NDT |
| Bulge 2 +4/+5 (NC) | 5′ss-16 | ACAGGUAUUAGUAUA | −10.5 | −9.9 | 44.3 | −10.8 | 47.6 |
| Bulge 2 +4/+5 +7C (C) | 5′ss-17 | ACAGGUAUUA | −7.7 | −7.9 | 35.6 | −8.8 | 40.9 |
| Bulge 3 +3/+4/+5 (NC) | 5′ss-18 | ACAGGUUCUAAGUAUA | −9.7 | −10.0 | 43.8 | −10.1 | 44.7 |
| Bulge 3 +3/+4/+5 +8C (C) | 5′ss-19 | ACAGGUUCUAA | −6.1 | NDT | NDT | NDT | NDT |
| Control weak | 5′ss-20 | ACAGGUUUUUUUU | −6.1 | NDT | NDT | NDT | NDT |
| U–Ψ | 5′ss-21 | ACAGGUUUGGUUA | −7.4 (g) | −8.9 | 40.2 | −9.9 | 44.9 |
| U–Ψ +4C | 5′ss-22 | ACAGGUUCGGUUA | −7.4 (h) | −8.4 | 38.2 | −8.8 | 40.4 |
| U–Ψ +5C | 5′ss-23 | ACAGGUUUCGUUA | −6.1 (i) | −7.6 | 34.5 | −8.1 | 36.9 |
(a) Free energy predicted by RNAstructure (RNAst) in 1 M NaCl at 37°C. All the experimental free energy values shown were measured in 1 M NaCl at 37°C.
(b) Values obtained by pairing the corresponding 5′ss oligonucleotide with the unmodified U1–1.
(c) Values obtained by pairing the corresponding 5′ss oligonucleotide with the pseudouridylated U1–2.
(d) Values are averages derived from three separate measurements.
(e) Nucleotide difference between canonical (C) versus noncanonical (NC) oligonucleotides is underlined.
(f) NDT, No Defined Transition.
(g) Predicted ΔG values for both the asymmetric-loop register and the canonical register with +5G−C4 are −7.4 kcal/mol.
(h) Predicted ΔG for the canonical register with +5G−C4 drops to −6.5 kcal/mol.
(i) This predicted ΔG is for the canonical register with only 5 base pairs and no +5G−C4.
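The effect of U1 pseudouridylation on duplex stability can be illustrated by differencing the measured values for the two U1 oligonucleotides. The sketch below is illustrative only and rests on an assumption about the flattened table: the four measured columns are read as ΔG with U1–1, melting temperature with U1–1, ΔG with U1–2, and melting temperature with U1–2. The numbers are copied from four table rows; the names `ROWS` and `psi_effect` are hypothetical.

```python
# Illustrative sketch; column interpretation is an assumption (see lead-in).
# Each tuple: (name, dG with U1-1, Tm with U1-1, dG with U1-2, Tm with U1-2)
ROWS = [
    ("Consensus", -14.7, 66.5, -14.5, 68.9),
    ("U-Psi", -8.9, 40.2, -9.9, 44.9),
    ("U-Psi +4C", -8.4, 38.2, -8.8, 40.4),
    ("U-Psi +5C", -7.6, 34.5, -8.1, 36.9),
]


def psi_effect(row):
    """Return (name, change in dG, change in Tm) on going from U1-1 to U1-2.

    A negative dG change means the pseudouridylated duplex is more stable.
    """
    name, dg_unmod, tm_unmod, dg_psi, tm_psi = row
    return name, round(dg_psi - dg_unmod, 1), round(tm_psi - tm_unmod, 1)


effects = [psi_effect(row) for row in ROWS]
```

Under this reading, the U–Ψ oligonucleotide shows the largest stabilization of the four rows with modified U1, and the +4C substitution reduces it.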
Numbers and distribution of predicted non-canonical 5′ss
| Register | Position | Noncanonical 5′ss (a) | 5′ss with +5G (b) | Updated noncanonical 5′ss (c) | Average ΔΔG (d) | 5′ss with ΔΔG ≤ −1 (e) |
|---|---|---|---|---|---|---|
| Total | | 10 248 | 4766 | 4270 | −1.32 | 2703 |
| | | | | 3176 | −1.35 | 2073 |
| | | | | 3119 | −1.35 | 2019 |
| | All | 5766 | 3386 | 2358 | −1.24 | 1448 |
| | −1 | 2913 | 2891 | 0 | NA (j) | 0 |
| | +2 | 1 | 0 | 1 | −4.9 | 1 |
| | All | 1179 | 361 | 818 | −1.69 | 625 |
| | +3/+4/+5 | 52 | 0 | 52 | −1.73 | 51 |
| | With +2 (l) | 9 | 5 | 4 | −0.78 | 2 |
| | All | 1067 | 389 | 678 | −1.15 | 395 |
| | Others (m) | 91 | 67 | 24 | −0.35 | 1 |
| | All | 472 | 56 | 416 | −1.37 | 235 |
| | All | 1764 | 574 | 0 | NA | 0 |
(a) Predictions from (16).
(b) 5′ss from the original predictions with a G at +5, which would base pair in canonical register.
(c) Curated list in which we removed the +5G 5′ss, the bulge 1 (−1) and bulges and asymmetric loops longer than 2.
(d) Predicted ΔΔG = ΔG1 − ΔG2 (in 1 M NaCl at 37°C); mean energetic advantage of the bulge over the canonical register.
(e) Predicted ΔΔG ≤ −1 (in 1 M NaCl at 37°C); number of 5′ss in which the bulge register confers a substantial advantage over the canonical register.
(f) Reliable list excludes the provisional registers with long bulges and asymmetric loops.
(g) Validated list includes only registers directly proven by experiments.
(h) Pseudouridine at either position 5 or 6 of the 5′ end of U1 snRNA.
(i) Bold indicates registers with experimental validation here or in (16).
(j) NA, not applicable.
(k) (+2/+3) Either 5′ss position +2 or +3 is bulged.
(l) Group of asymmetric-loop 1 registers for GC 5′ss whereby the loop involves position +2 and others.
(m) Three registers with bulges at either both U1 pseudouridines, at 5′ss positions +2/+3 or at +5/+6.
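The ΔΔG definition and the ≤ −1 cutoff given in these footnotes can be sketched as follows. This is an illustrative sketch under stated assumptions: ΔG1 is taken to be the noncanonical register and ΔG2 the canonical register (so that a negative ΔΔG favors the noncanonical register, as the footnote wording implies), units are taken to be kcal/mol, the function names are hypothetical, and the final ratio assumes that the count in the last column is a subset of the updated list in the Total row.

```python
def ddg(dg_noncanonical: float, dg_canonical: float) -> float:
    """ddG = dG1 - dG2, assuming dG1 is noncanonical and dG2 is canonical."""
    return dg_noncanonical - dg_canonical


def substantial_advantage(ddg_value: float, threshold: float = -1.0) -> bool:
    """True when the noncanonical register meets the ddG <= -1 cutoff."""
    return ddg_value <= threshold


# Counts copied from the Total row of the table.
updated_total = 4270      # updated noncanonical 5'ss
with_advantage = 2703     # 5'ss with ddG <= -1

# Share of the updated list meeting the cutoff (assumes a subset relation).
fraction = with_advantage / updated_total
```

As a check against the thermal melting footnotes, equal predicted ΔG values of −7.4 for both registers give ΔΔG = 0, which does not meet the cutoff.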