Clemens Plaschka1,2, Pei-Chun Lin3, Clément Charenton4, Kiyoshi Nagai5. 1. MRC Laboratory of Molecular Biology, Cambridge, UK. clemens.plaschka@imp.ac.at. 2. Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria. clemens.plaschka@imp.ac.at. 3. MRC Laboratory of Molecular Biology, Cambridge, UK. pclin@mrc-lmb.cam.ac.uk. 4. MRC Laboratory of Molecular Biology, Cambridge, UK. 5. MRC Laboratory of Molecular Biology, Cambridge, UK. kn@mrc-lmb.cam.ac.uk.
Abstract
The spliceosome catalyses the excision of introns from pre-mRNA in two steps, branching and exon ligation, and is assembled from five small nuclear ribonucleoprotein particles (snRNPs; U1, U2, U4, U5, U6) and numerous non-snRNP factors1. For branching, the intron 5' splice site and the branch point sequence are selected and brought by the U1 and U2 snRNPs into the prespliceosome1, which is a focal point for regulation by alternative splicing factors2. The U4/U6.U5 tri-snRNP subsequently joins the prespliceosome to form the complete pre-catalytic spliceosome. Recent studies have revealed the structural basis of the branching and exon-ligation reactions3; however, the structural basis of the early events in spliceosome assembly remains poorly understood4. Here we report the cryo-electron microscopy structure of the yeast Saccharomyces cerevisiae prespliceosome at near-atomic resolution. The structure reveals an induced stabilization of the 5' splice site in the U1 snRNP, and provides structural insights into the functions of the human alternative splicing factors LUC7-like (yeast Luc7) and TIA-1 (yeast Nam8), both of which have been linked to human disease5,6. In the prespliceosome, the U1 snRNP associates with the U2 snRNP through a stable contact with the U2 3' domain and a transient yeast-specific contact with the U2 SF3b-containing 5' region, leaving its tri-snRNP-binding interface fully exposed. The results suggest mechanisms for 5' splice site transfer to the U6 ACAGAGA region within the assembled spliceosome and for its subsequent conversion to the activation-competent B-complex spliceosome7,8. Taken together, the data provide a working model to investigate the early steps of spliceosome assembly.
To gain structural insights into early spliceosome assembly, we prepared the yeast prespliceosome A complex on UBC4 pre-mRNA carrying a mutation in the pre-mRNA branch point (BP) sequence, which was previously used to stall the A complex9 (UACUAAC to UACAAAC, where A is the BP adenosine) (Extended Data Fig. 1a,b; Methods). The purified A complex contained stoichiometric amounts of the U1 and U2 snRNP proteins (Extended Data Fig. 1b), and was used to determine cryo-EM densities of the A complex at 4.0 Å (U1 snRNP, map A2) and 4.9-10.4 Å (U2 snRNP, maps A1 and A3) resolution (Extended Data Figs 1c,d,e, 2; Methods). From these densities we could build a near-complete atomic model of the A complex (Fig. 1, Supplementary Videos 1 and 2, Supplementary File, Extended Data Fig. 1f), comprising 34 proteins, U1 and U2 snRNAs, and 34 nucleotides of pre-mRNA. The final model lacks the mobile cap-binding complex, Prp5 and the U1 subunit Prp40 (Extended Data Fig. 1b,d,e and Extended Data Table 1). The elongated U1 and U2 snRNPs bind the pre-mRNA 5' splice site (5'SS) and BP sequences, respectively, and associate in a parallel manner to form the A complex (Fig. 2a). The U1 snRNP structure contains all essential regions of U1 snRNA and 16 proteins (Fig. 1). The U1 snRNP ‘core’ is highly similar to its human counterpart (Extended Data Figs 3 and 4; ref. 10), comprising the seven-membered Sm ring and orthologues of the human U1 snRNP proteins (Snp1, human U1-70k; Mud1, human U1A; Yhc1, human U1C), and is bound to the peripheral yeast U1 proteins Luc7, Nam8, Prp39, Prp42, Snu56, and Snu71 (ref. 11) (Extended Data Figs 3, 4). The U2 snRNP has a bipartite structure as observed in B complex8, comprising the SF3b subcomplex (‘5' region’) and the U2 3' domain/SF3a subcomplex (‘3' region’) that are organized around the 5' and 3' regions of U2 snRNA, respectively (Figs 1, 2a, Extended Data Fig. 5). The conformation of the U2 5' region is unchanged from the B complex8, where the pre-mRNA BP sequence is base-paired with U2 snRNA and the BP adenosine is bulged out and accommodated in a pocket formed by U2 SF3b subunits Hsh155 and Rds3. After we completed the A complex structure, the cryo-EM structure of the free yeast U1 snRNP was reported12. This model is in good agreement with the U1 snRNP in our A complex structure, but there are important differences12.
Extended Data Figure 1
Biochemical characterization and cryo-EM of the prespliceosome A complex.
a. Mutation of the UBC4 pre-mRNA branch point sequence (UACUAAC to UACAAAC, where A is the BP adenosine) stalls splicing before the first step, as described9. Splicing reactions were carried out for 30 min at 23 °C in yeast extract using wild-type (lane 1) or mutant (U/A, lane 2) pre-mRNA (see Methods for details). This experiment was performed three times. The asterisk indicates a degradation product. For gel source data see Supplementary Fig. 1a.
b. Protein analysis of purified A complex (SDS-PAGE stained with Coomassie blue). The U2-associated Prp5 protein is sub-stoichiometric and not observed in the A complex structure. The purification and analysis of protein compositions were performed at least five times with similar results. For gel source data see Supplementary Fig. 1b.
c. Cryo-EM micrograph of the A complex. Scale bar, 100 nm. d. 2D class averages of the A complex were determined in RELION 2.1 (refs 39,40), and reveal a bipartite architecture, comprising the U1 snRNP and the U2 snRNP 3' and 5' regions, respectively. e. Composite cryo-EM density of the A complex shown in two orthogonal views (compare Fig. 1). The respective densities used for modeling the U1 snRNP (A2, gray), the U2 3' region (A1, cyan), and the U2 5' region (A3, green) are coloured and superimposed on a transparent outline of the full A3 map (Methods). The overall resolution of each map as well as the percentage from the cleaned dataset of 153,556 particles are shown in parentheses. Non-modelled regions are indicated and putatively assigned. f. Composite cryo-EM density with the final A complex model superimposed in a cartoon representation. The path of 40 nucleotides of the disordered UBC4 pre-mRNA intron is indicated. A complex components are coloured as in Fig. 1. Views as in panel e.
Extended Data Figure 2
Cryo-EM image classification and refinement.
a. Image processing workflow for analysis of the A complex cryo-EM data set (Methods). To visualize differences between the reconstructions the U1 snRNP (gray), U2 3' (cyan) and U2 5' regions (green) are coloured. For each round of three-dimensional classification, the percentage of the data and the type of soft-edged mask are indicated. The type of mask and overall resolution are indicated for each 3D refinement (blue box). b. Orientation distribution plots for all particles that contribute to the respective A1, A2, and A3 cryo-EM reconstructions. c. Gold-standard Fourier shell correlation (FSC = 0.143) of the respective A1, A2, and A3 cryo-EM reconstructions. d. Two views of the composite A complex cryo-EM density (maps A1, A2, and A3) coloured by local resolution as determined by ResMap43. e. As panel d, but for a central slice through the composite A complex cryo-EM map.
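The FSC = 0.143 criterion quoted in panel c can be illustrated with a minimal sketch. It assumes a half-map FSC curve sampled at increasing spatial frequency and reports the resolution at which the curve first drops below the threshold, using linear interpolation between samples. The function name and the toy curve are hypothetical and are not taken from the A1, A2, or A3 maps; this is not the authors' processing pipeline.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.143):
    """Return the resolution (in Å) where the FSC curve first falls below
    `threshold`, or None if it never does.

    freqs: spatial frequencies in 1/Å, in ascending order
    fsc:   FSC value for each frequency shell
    """
    if len(freqs) != len(fsc):
        raise ValueError("freqs and fsc must have the same length")
    for i in range(1, len(freqs)):
        above, below = fsc[i - 1], fsc[i]
        if above >= threshold > below:
            # Linear interpolation between the two shells that bracket the threshold.
            frac = (above - threshold) / (above - below)
            f_cross = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None


# Toy curve (hypothetical values, for illustration only).
toy_freqs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
toy_fsc = [1.00, 0.95, 0.80, 0.50, 0.20, 0.05]

print(f"{resolution_at_threshold(toy_freqs, toy_fsc):.2f} Å")
```

A curve that never crosses the threshold returns None, which is a reminder that the reported resolution depends on the FSC being sampled out to a sufficiently high spatial frequency.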
Figure 1
Prespliceosome A complex structure.
Two orthogonal views of the yeast A complex structure. Subunits are coloured according to snRNP identity (U1, shades of purple, U2, shades of green), and the pre-mRNA intron (black) and its 5' exon (orange) are highlighted. Orthologous human proteins are shown after the backslash. The location of the cap-binding complex (CBC) is indicated with a brown oval (Extended Data Figure 1d,e).
Extended Data Table 1
Summary of the components modelled into the A complex cryo-EM densities.
Figure 2
5'SS recognition and implications for alternative splicing.
a. The A complex U1-U2 snRNP interfaces (A and B) and RNA network are shown as cartoons, and are superimposed on transparent surfaces of prespliceosome proteins. The U2 subunit Hsh155 surface (gray oval), which interacts with the tri-snRNP in the B complex, is freely accessible in the A complex. The U1 snRNP proteins Nam8 (orange, human TIA-1), Luc7 (purple, human LUC7-like, LUC7L) and Yhc1 (magenta, human U1C) are shown as ribbons. The remaining densities are likely accounted for by Nam8 (see c) and the cap-binding complex (CBC). Branch point, BP; β-propeller B and C, BPB and BPC; RNA-recognition motif, RRM. b. The pre-mRNA 5'SS is recognized by the U1 snRNA 5' end, and is stabilized by Luc7 and Yhc1. Notably, the Yhc1 ZnF and Luc7 ZnF2 domains are arranged with pseudo-C2 symmetry around the U1-5'SS helix. c. Nam8 binds the U1 snRNP through its linker (yellow), RNA recognition motif 3 (RRM3, light orange) and C-terminal regions (orange), while its RRM1 and RRM2 domains are mobile and project towards the intron, to bind uridine-rich sequences downstream of the pre-mRNA 5'SS (dashed line), like its human counterpart TIA-1 (ref. 18). Nam8 contacts the Yhc1 (human U1C) C-terminus, and human TIA-1 also interacts biochemically with human U1C (ref. 18). Snu56 (blue), Prp39 (magenta), Prp42 (violet), and Hsh49 (light green) are shown as transparent ribbon models and other protein and U1 snRNA elements were removed for clarity.
Extended Data Figure 3
Details of the U1 snRNP.
a. U1 snRNP structure with subunits coloured as in Fig. 1, except for Nam8 (orange), Snu56 (light blue), Snu71 (blue), Luc7 (dark purple), Mud1 (red) and U1 snRNA (various). The pre-mRNA nucleotides are labelled relative to the first nucleotide (+1) of the intron. The Nam8 RRM1 and RRM2 domains are flexible and project downstream of the 5'SS. The protein attributed to Luc7 in the free U1 snRNP structure12 was re-assigned to Snu71. Stem loop, SL; RNA recognition motif, RRM; Zn-finger, ZnF; N-terminus, N-term; C-terminus, C-term. In the structure we do not observe any evidence that the C-terminal tails of SmB, SmD1, and SmD3 interact with the 5'SS, consistent with their absence in the human 5'SS–minimal U1 snRNP crystal structure10. b. Representative regions of the sharpened U1 snRNP density determined at 4 Å resolution (map A2) are superimposed on the refined coordinate model. The density reveals side-chain details, and here segments from the Prp42 N-terminus (TPR repeat 1), the Sm ring subunit SmB, and the Snu56 α-helical domain are shown. c. The A2 cryo-EM density is shown superimposed on the coordinate models of a selection of U1 snRNP proteins: Luc7, Snu71, Yhc1, and Prp39. In the structure most of Snu71 is disordered, except for a small N-terminal domain (residues 2-43) that binds between the Prp42 N-terminus and the Snu56 KH-like fold, consistent with protein crosslinking12. Functional regions and disordered domains are indicated. d. The U1 snRNA–pre-mRNA 5' splice site (U1–5'SS) model is superimposed on its cryo-EM density (map A2). A secondary structure diagram of the U1–5'SS interaction is shown underneath the model. The register of the 5'SS is shifted by one nucleotide compared to the minimal human 5'SS–U1 snRNP crystal structure, due to an additional nucleotide in yeast U1 snRNA10 (U11). Lines indicate Watson–Crick base pairs and dots pseudouridine (ψ)-containing base pairs. e. The Prp39-Prp42 heterodimer is coloured to indicate each of their respective TPR repeats. f. Cryo-EM density of U1 snRNA from maps A2 (dark gray) and A3 (light gray) without (top) and with the superimposed coordinate model of yeast U1 snRNA (bottom). The model is labelled and coloured according to functional regions of U1 snRNA (5' end, pink; H helix, cyan; SL1, dark blue; SL2-1, green; SL3-1, light blue; SL2-2 and SL3-2 to -7, gray; 3' end and Sm site, yellow). Stem loop, SL. g. Secondary structure diagram of U1 snRNA. Bold letters indicate residues included in the model, lines indicate Watson–Crick base pairs, and dots G–U wobble and pseudouridine (ψ)-containing base pairs. Compare panel e. The conserved U1 snRNA ‘core’ is outlined with a gray box. The region of the putative phosphate backbone model of part of the U1 SL3-7 region is indicated with a gray box.
Extended Data Figure 4
Comparisons of yeast and human U1 snRNPs and implications for alternative splicing.
a. Formation of the U1–5'SS helix induces stable binding of Luc7. In the absence of a pre-mRNA 5'SS in the free U1 snRNP density (left, EMD-8622), Luc7 and the U1 5' end are disordered. Upon 5'SS recognition at the U1 5' end (center, map A2), Luc7 becomes ordered and stabilizes the U1–5'SS interaction, suggesting a mechanism for the selection of weak 5'SS sequences. The free U1 snRNP and the 5'SS-bound (map A2) cryo-EM densities are superimposed on the right. Although the long α-helical density next to Luc7 cannot be assigned with confidence, protein-protein crosslinking data12 and protein secondary structure prediction are consistent with either Prp40 or Snu71. Based on additional biochemical data on the interaction between the α-helical Prp40 FF1 domain and Luc7 ZnF2 (ref. 52), we speculate that the Prp40 FF1 domain is the most likely candidate for this density. b. Comparison of the yeast U1 snRNP ‘core’ with the human U1 snRNP crystal structure (PDB ID 3CW1). Protein and RNA (top) and RNA only (bottom) are shown side-by-side (left and center) and superimposed by a global alignment in PyMOL (right). Coloured as in Extended Data Fig. 3a. c. The yeast U1 snRNP model suggests regulatory mechanisms for human alternative splicing factors. The human homologues of the peripheral yeast U1 proteins may function through stabilization of the U1–5'SS interaction (region 1), of the U1-U2 3' region interface (region 2), or the U1-U2 5' interface (region 3). The yeast U1 snRNP ‘core’ is shown superimposed on a surface representation of the U1 snRNP model (top), compared with the similarly coloured human U1 snRNP (below). Interaction sites with the U2 snRNP are labelled (top). d. The location of yeast U1 snRNP components with homology to human splicing factors are indicated in the U1 snRNP structure. The Prp39-Prp42 heterodimer (human PRPF39 homodimer), Nam8 (ref. 18) (human TIA-1 and TIA-R), Luc7 (ref. 53) (human LUC7-like 1-3), and the Yhc1 C-terminus (human U1C) have clear counterparts in the human system. The yeast-specific U1 snRNA insertions may be replaced in the human system by alternative splicing factors that modulate interactions with the U2 5' region. e. Schematic model of the yeast E complex based on the U1 snRNP structure and biochemical data22. Luc7, Snu71, and Prp40 form a heterotrimer in vitro52, and their interacting regions may be located near unassigned density (compare Extended Data Fig. 1e) at the tip of an unassigned 40-residue α-helix next to Luc7 ZnF2. This helix is likely to belong to the U1 subunit Snu71 or Prp40, consistent with protein crosslinking12 and protein secondary structure prediction. Prp40 could then bind the yeast branch point binding protein (BBP, human SF1), which in turn interacts with Mud2 (human U2AF65) to tether the pre-mRNA branch point sequence in the E complex22.
Extended Data Figure 5
Conformational flexibility of the U2 snRNP.
a. Two defined positions of the U1 snRNP-U2 3' region could be identified relative to the U2 5' region. A complex models were fitted into class 2 and 4 from Round 2 of three-dimensional image classification (compare Extended Data Fig. 2a). The classes are aligned via their U2 5' region, illustrating their relative flexibility. b. Cartoon schematic of observed positions of the U2 3' region relative to the U2 5' region in A complex (left), B complex8 (center), and Bact complex (right, modelled from ref. 54). While in B complex the U2 3' region is free, in A and Bact complexes its position is influenced by interactions with Prp39 as well as Syf1 and Clf1, respectively. c. The U2 snRNP subunit Lea1 (human U2A’) aids to position the U2 snRNP 3' domain in different spliceosome states. In our A complex structure, the Prp39 TPR repeat T1 contacts the helical C-terminus of Lea1. In the yeast C complex structure, non-modeled density for the Syf1 N-terminus binds a neighboring but non-overlapping surface of Lea1 (PDB ID 5LJ5). In C*/P complex55 (PDB ID 6EXN), the Syf1 N-terminus binds yet another Lea1 surface and the U2 3' domain is repositioned relative to its C complex location. Together, this suggests that the Lea1 provides multiple interfaces that can be used to position the U2 3' domain in different spliceosomal complexes. d. Fit of the U2 3’ region coordinate model to the A1 cryo-EM density. The dashed black separates the U2 3’domain (Sm ring, Msl1 and Lea1 subunits, and U2 snRNA) and the SF3a subcomplex (Prp9, Prp11, and Prp21). Two orthogonal views are shown. See Supplementary Video 2. e. Fit of the U2 5’ region coordinate model to the A3 cryo-EM density. Density consistent with U2 snRNA stem IIa/b and the branch helix is observed. Two density thresholds are shown side-by-side (left, 0.0163; right, 0.0121), and orthogonal views are shown underneath. See Supplementary Video 2.
The first 10 nucleotides of U1 snRNA are disordered in the free U1 snRNP12, but become ordered in our A complex structure by pairing with the pre-mRNA 5'SS (Fig. 2a,b). Additional density appeared adjacent to the U1–5'SS helix, into which we could build a newly ordered Yhc1 peptide (human U1C) that contacts the 5'SS phosphate backbone (+5 and +6 positions, the ‘Yhc1 5'SS loop’) and a near-complete model of Luc7 (the protein attributed to Luc7 in ref. 12 is re-assigned here to Snu71) (Extended Data Figs 3a,c, 4a). While Luc7 is disordered in the free U1 snRNP, it associates stably with the U1–5'SS helix in the A complex (Extended Data Fig. 4a), suggesting a mechanism for the selection of weak 5'SS sequences13. In our structure Luc7 is anchored by its N-terminal α-helix 1 to the Sm ring subunit SmE, and its C3H-type Zn-finger 1 (ZnF1) domain binds where the 5' exon emerges from the U1–5'SS helix, in excellent agreement with RNA-protein crosslinks13 (Fig. 2b). The adjacent Luc7 C2H2-type ZnF2 contacts the U1–5'SS helix minor groove and the U1 snRNA phosphate backbone (nucleotides U5-C8). This interaction mirrors that between the Yhc1 ZnF domain and the 5'SS nucleotides +1 to +4 downstream of the 5'SS junction10 (Fig. 2b). Thus, Yhc1 and Luc7 make no base-specific interactions with the U1–5'SS helix, and instead cradle the U1–5'SS helix phosphate backbone to stabilize 5'SS binding. Consistent with the structure, weakening of any of these interactions can impair splicing and bypass the requirement for Prp28 helicase activity13–16.
The A complex structure provides the first structural insights into the functions of the human alternative splicing factors LUC7-like (yeast Luc7) and TIA-1 (yeast Nam8) (Extended Data Fig. 4c,d). Luc7 and its human homologues LUC7-like 1-3 are highly conserved, suggesting that the LUC7L N-terminal α-helix also anchors it to the SmE protein and that the invariant ZnF2 helix α8 similarly stabilizes the U1–5'SS helix to promote the inclusion of weak alternative splice sites13 (Fig. 2b, Extended Data Figs 3c, 6a). The yeast U1 snRNP subunit Nam8 and its human homologue TIA-1 contain three RNA recognition motif (RRM) domains and a C-terminal Gln-rich (Q-rich) extension (Extended Data Fig. 6b). Human TIA-1 binds to uridine-rich sequences downstream of the 5'SS predominantly through its RRM2 (refs 17,18) to allow the use of weak 5'SSs. The Nam8 RRM2 shows high sequence similarity to the TIA-1 RRM2, including the nearly identical RNP1 and RNP2 motifs, indicating that Nam8 binds uridine-rich sequences also through its RRM2 (Extended Data Fig. 6b). In the A complex structure the Nam8 RRM3 and its C-terminal region bind in a cavity of the Prp39-Prp42 heterodimer and contact the Yhc1 C-terminal region near the U1–5'SS helix (Fig. 2c). From this location, Nam8 could project its mobile RRM2 domain to bind uridine-rich intron sequences downstream of the 5'SS, consistent with crosslinking experiments17, and thereby promote meiotic pre-mRNA splicing19 (Fig. 2a,c).
Extended Data Figure 6
Luc7 and Nam8 sequence alignments.
a. The Luc7 (human LUC7-like) amino acid sequence alignment comparing Saccharomyces cerevisiae, Kluyveromyces lactis, Schizosaccharomyces pombe, Danio rerio, Xenopus tropicalis, Mus musculus, Bos taurus, and Homo sapiens was generated with Clustal Omega and visualized with ESPript 3 (refs 56,57). For the human sequence, LUC7-like 1 was used. Secondary structure elements are indicated above the sequence and derive from the A complex structure (purple) or PSIPRED58 secondary structure prediction (gray). Modelled regions (dashed line) and the Zn-coordinating residues of Zn-finger 1 and 2 (ZnF, asterisk) are indicated. Invariant or conserved residues are highlighted with a red box or red letter font, respectively. b. As panel a but for Nam8 (human TIA-1) comparing Saccharomyces cerevisiae, Kluyveromyces lactis, Schizosaccharomyces pombe, Drosophila melanogaster, Danio rerio, Xenopus tropicalis, Mus musculus, Bos taurus, and Homo sapiens amino acid sequences. RNA recognition motif, RRM; ribonucleoprotein domain, RNP.
In the A complex, the U1 snRNP binds to the U2 snRNP through two interfaces, A and B (Fig. 2a). In interface A, the N-terminal helices α1-2 of the U1 protein Prp39 stably bind the U2 3' domain subunit Lea1 (human U2A') (Fig. 2a; Extended Data Fig. 5). The Prp39-Prp42 heterodimer binds Yhc1 to anchor the U2 snRNP 3' domain to the U1 snRNP. Similar interactions were observed biochemically between the human alternative splicing factor PRPF39 homodimer and U1C (ref. 12; yeast Yhc1), suggesting that PRPF39 may likewise contact the human U2 3' domain, though it is not an obligate component of the human A complex20 (Fig. 2a). Different, non-overlapping Lea1 surfaces are used to interact with the NTC protein Syf1 in the yeast C and C*/P complex conformations of the spliceosome21 (Extended Data Fig. 5c), suggesting that Lea1 aids in repositioning the U2 3' domain in multiple stages of splicing. Interface B is transient and found only in a subset of cryo-EM images (Extended Data Figs 2a, 5a,b). It involves weak interactions between the yeast-specific U1 snRNA stem–loop (SL) 3-3 and the U2 SF3b Rse1 subunit β-propellers B and C (BPB, BPC) and the U2 SF3a Prp9 C-terminus. The pre-mRNA 5'SS and BP branching reactants are positioned ~150 Å apart in the A complex, with 40 nucleotides of the UBC4 intron looped out in-between (Fig. 2a, Extended Data Fig. 1e,f). The surprisingly small interfaces between U1 and U2 snRNPs orient the snRNPs relative to each other, and this may facilitate 5'SS transfer in the assembled spliceosome and the subsequent dissociation of the U1 snRNP, consistent with structural and biochemical data7,8. While the precise U1–U2 snRNP interfaces may differ in the human A complex, a key function of U1–U2 (alternative) splicing factors could be to ensure that U1 and the U1–5'SS helix are oriented correctly relative to the U2 snRNP.
Prior to A complex formation, the yeast Msl5-Mud2 heterodimer recognizes the BP sequence through Msl5 and binds the U1 snRNP subunit Prp40 (human PRPF40) in the E complex, looping out the intron between the 5'SS and BP sequences22 (Extended Data Fig. 4e). While Prp40 was not identified in the free U1 snRNP12 or in our A complex structure, Prp40 crosslinks to Luc7 and Snu71 (ref. 12) and unassigned cryo-EM density in the A complex may indicate its peripheral location near Luc7 (Extended Data Figs 1e, 4a,e). Msl5-Mud2 may then be destabilized by the Sub2 helicase, allowing the Prp5 helicase to remodel U2 snRNA for the stable association of the U2 snRNP with the BP sequence in the A complex9. Prp5 was shown to physically interact with the U2 SF3b subunit Hsh155 HEAT repeats 1-6 and 9-12 (ref. 23) and with U2 snRNA at and surrounding the branchpoint-interacting stem–loop9. Thus, after Prp5 activity, Prp5 needs to dissociate to fully expose the Hsh155 HEAT repeats 11-13 together with the U2 snRNA 5' end in the A complex, to allow for the subsequent U4/U6.U5 tri-snRNP association to assemble the spliceosome7–9 (Fig. 2a).
The A complex structure also provides new insights into formation of the fully assembled
pre-B complex spliceosome, which requires integration of the tri-snRNP with the A
complex. The subsequent Prp28 helicase-mediated transfer of the 5'SS from U1 to
U6 snRNA and destabilization of the U1 snRNP produces the B complex spliceosome24. We first modelled a fully assembled yeast
spliceosome, by superimposing the U2 snRNP SF3b-containing domains of the yeast A
complex (this study) and the yeast B complex structure8. As in the B complex structure8, the
U2 snRNP would associate with tri-snRNP via U2/U6 helix II and Prp3 (Extended Data Fig. 7). The modelling shows that the
U1 snRNP would clash with large parts of the Brr2-containing ‘helicase’
domain (‘U1–B complex’; Extended
Data Figs. 7b, 8b), which may be
relieved owing to their known flexibilities8
(Extended Data Fig. 5a). However, the known
binding site for Prp28 at the U5 Prp8 N-terminal domain (Prp8N) observed in
human tri-snRNP25 would be sterically occluded by
the pre-bound B-complex proteins7,8,26. We
therefore considered an alternative model for the assembled yeast ‘pre-B
complex’ spliceosome, by combining the available data from yeast and human
systems8,25,27,28 (Fig. 3a, Extended Data Figs. 7a, 8a).
First, the isolated human25 and yeast
tri-snRNP26,29 structures differ in their protein composition and conformation,
indicating that different complexes accumulate at steady-state. In the human tri-snRNP
structure25 the BRR2 helicase is held near
SNU114 by the SAD1 protein and PRP28 is bound to the PRP8 N-terminal domain
(PRP8N). In the yeast tri-snRNP26,29 and yeast and human B complex
structures7,8 Brr2 is repositioned and loaded onto its U4 snRNA substrate and the B
complex proteins replace Prp28 at the Prp8N domain, ready for spliceosome
activation. Second, in humans, an ATPase-deficient PRP28 helicase stalls spliceosome
assembly at the pre-B complex stage, prior to disruption of the U1–5'SS
interaction28 and this complex comprises the
U1 and U2 snRNPs, a loosely associated tri-snRNP, and SAD1 (ref. 28). Third, in yeast, Sad1 is essential for splicing and is very
transiently associated with tri-snRNP27. Given
the high conservation of the major spliceosome components in yeast and humans, the yeast
spliceosome may likewise assemble with a human-like tri-snRNP that contains Prp28, Sad1,
and a repositioned Brr2 helicase25,28. Based on these assumptions, we modelled a yeast
pre-B complex spliceosome that comprises all five snRNPs with a combined molecular
weight of ~3.1 megadalton and with only minor clashes (Fig. 3a, Extended Data Fig.
7a,b). Notably, this model indicates that the U2 snRNP positions the U1 snRNP
to deliver the U1–5'SS helix to the exposed U6 ACAGAGA stem in tri-snRNP,
only ~20 Å away, where Prp28 is likely to mediate 5'SS transfer,
consistent with protein-RNA crosslinks30 (Fig. 3b). This suggests that repositioning of the
Brr2 helicase onto U4 snRNA would coincide with U1 snRNP release due to a steric clash,
rendering Brr2 competent for spliceosome activation only after successful 5’SS
transfer (see Extended Data Figs 7b and 8a for details). The model thus indicates a new
molecular checkpoint to couple 5'SS transfer with U1 snRNP release and formation
of the B complex (Extended Data Figs 7b and 8a).
Extended Data Figure 7
Details of the pre-B complex model.
a. Multiple views of the pre-B complex model, generated by combining functional and structural data from yeast and human systems8,25. The mobility of the U1 snRNP relative to the U2 snRNP in A complex (this study) as well as of the U2 snRNP relative to tri-snRNP in the B complex structure8 are indicated (left). The pre-B model contained only minor clashes, and a clash between the highly flexible Prp28 C-terminal RecA-2 lobe (from human tri-snRNP25) and the highly flexible U6 snRNA 5' stem loop (from yeast B complex8) may be resolved by small movements of either domain. See Methods for details on the pre-B model. b. Structural comparisons of the yeast pre-B model (this study) and the yeast B complex structure (PDB ID 5NRL, ref. 8) suggest the existence of a molecular checkpoint to couple 5'SS transfer to U1 snRNP release and formation of the activation-competent B complex. In the pre-B model (left) Sad1 tethers Brr2 through its interaction with the conserved Brr2 PWI domain51, and the U1 snRNP and its U1–5'SS helix are positioned near the U6 ACAGAGA region and the helicase Prp28. Subsequent to Prp28-mediated 5'SS transfer, Brr2 is repositioned onto its U4 snRNA substrate, guided by the B complex-specific proteins (right). In this conformation the Brr2 helicase and its associated factors would clash with the U1 snRNP, consistent with U1 snRNP destabilization and release in yeast and human B complexes7,8. Brr2 is now ready to initiate spliceosome activation and formation of the active site in the Bact complex. Regions that are changed between pre-B and B complex models (black outline) and the clash between the Brr2-containing ‘helicase’ domain and the U1 snRNP in B complex (red ‘X’) are indicated. The lower right panel would conform to the alternative ‘U1-B complex’ model.
Extended Data Figure 8
Model for early splicing events.
a. Cartoon schematic of proposed early splicing events, detailing (I) assembly of pre-B complex spliceosome from A complex and the U4/U6.U5 tri-snRNP and (II) the subsequent conversion to the pre-catalytic B complex spliceosome. In the pre-B model the mobile U1 snRNP is next to Prp28, which is bound at the Prp8N domain. To initiate 5'SS transfer, Prp28 could clamp the pre-mRNA at or next to the U1–5'SS helix to destabilize it and to hand over the 5'SS to the U6 ACAGAGA region of tri-snRNP, consistent with protein-RNA crosslinks30. Formation of the U6–5'SS interaction may induce the binding of the B complex proteins to replace Prp28 at the Prp8N domain and induce the large movement of Brr2 to its B complex location on U4 snRNA (Fig. 5). The U1 snRNP, now loosely tethered to U2, may dissociate from B complex due to the steric clash with the Brr2-containing ‘helicase’ domain8 (Extended Data Fig. 7b). In agreement with this, the human pre-B complex converts to a B complex-like state in presence of a 5'SS oligonucleotide, which coincides with U1 snRNP release28. This model can explain how Brr2 is kept inactive to prevent premature U4/U6 duplex unwinding26. The model thereby implies the existence of a molecular checkpoint, coupling 5'SS transfer from U1 to U6 snRNA with Brr2 helicase repositioning and U1 snRNP release to generate the activation-competent B complex spliceosome. b. Cartoon schematic of an alternative model for spliceosome assembly and 5'SS transfer that relies only on the yeast A complex (this work), tri-snRNP26,29, and B complex structures8. In this model the tri-snRNP that associates with A complex already contains the Brr2 helicase bound to the U4 snRNA substrate and the yeast B complex proteins at the Prp8 N-terminal (Prp8N) domain. The tri-snRNP then binds the A complex (transition I, ‘Assembly’), requiring a significant readjustment to avoid a steric clash of the Brr2-containing ‘helicase’ domain and the U1 snRNP (‘U1-B complex’). 
The Prp28 helicase is then recruited to U1 snRNP directly as the Prp28-binding site on the Prp8 N-terminal domain in human tri-snRNP is occupied by B complex proteins25. Prp28 then disrupts the U1-5'SS helix, leading to 5'SS transfer (transition II, ‘Transfer’). Similar to the ‘pre-B complex’ assembly model in panel a, the U1 snRNP, now freed from the 5'SS, may then be released due to a steric clash with the Brr2-containing ‘helicase’ domain. This model does not require Sad1. Compare to panel a.
Figure 3
Spliceosome assembly and 5'SS transfer.
a. One of the two alternative pre-B-complex models, suggesting that the U2 snRNP
orients the U1 snRNP to deliver the pre-mRNA 5'SS to the U6 ACAGAGA stem.
The model was obtained by superposing the yeast A (this study) and B complex
structures (PDB 5NRL) and by modifying the locations of Brr2, U4 Sm ring, Sad1,
and Prp28 to resemble a human-like pre-B complex conformation based on
biochemical data and the human U4/U6.U5 tri-snRNP structure (PDB 3JCR) (see
Methods). Coloured as in Fig. 1 and ref. 8.
b. The pre-B complex RNA network and the Prp28 helicase are shown as cartoons and are superimposed on transparent surfaces of spliceosome proteins. Prp28 is positioned at the Prp8 N-terminal domain as in human tri-snRNP25 and may clamp onto the pre-mRNA near the U1–5'SS helix to destabilize it and transfer the 5'SS from U1 snRNA to the U6 snRNA ACAGAGA stem (red arrow), which are separated by ~20 Å in the pre-B model.
In summary, our prespliceosome structure reveals how the U1 and U2 snRNPs recognize the two reactants of the branching reaction and associate together with tri-snRNP into the fully assembled spliceosome. The results further suggest how the human alternative splicing factors LUC7-like and TIA-1 may influence splice site selection.
Methods
Prespliceosome preparation and purification
To obtain the prespliceosome A complexes for structural study, we prepared yeast Saccharomyces cerevisiae containing a genomic TAPS affinity tag on the U2 snRNP subunit Hsh155, essentially as described31. Yeast were then grown in a 120 L fermenter, and splicing extract was prepared using the liquid nitrogen method, essentially as described32. Capped UBC4 pre-mRNA containing a point mutation (U to A) two nucleotides upstream of the branch point adenosine (BP) and three MS2 stem loops at the 3'-end was produced by in vitro transcription9,33. The RNA product was labelled with Cy5 at its 3'-end to monitor complex purification34. The pre-mRNA substrate was bound to MS2-MBP fusion protein and added to an in vitro splicing reaction carried out for 90 min at 23 °C, essentially as described33. The reaction mixture was then centrifuged through a 40% glycerol cushion in buffer A (20 mM HEPES (pH 7.9), 50 mM KCl, 0.2 mM EDTA, 1 mM DTT, 0.04% NP-40). The cushion was diluted with buffer A containing 1% glycerol, and applied to amylose resin (NEB) pre-washed with buffer B (20 mM HEPES (pH 7.9), 75 mM KCl, 5% glycerol, 0.2 mM EDTA, 1 mM DTT, 0.03% NP-40). After 12 h incubation at 4 °C, the resin was washed with buffer B and eluted in buffer B containing 50 mM KCl and 12 mM maltose. Fractions containing A complex were pooled and applied to Strep-Tactin resin (GE Healthcare), pre-washed with buffer B, and incubated for 4 h at 4 °C. The resin was washed with buffer B containing 2 mM MgCl2, and eluted with buffer B containing 50 mM KCl, 2.5 mM desthiobiotin, and 2 mM MgCl2. The A complex fractions were pooled and crosslinked using 1.1 mM BS3 (Sigma) on ice for 1 h, and subsequently quenched with 50 mM ammonium bicarbonate. The sample was concentrated to ~0.4 mg mL⁻¹ and immediately used for EM sample preparation. Mass spectrometry (not shown) indicated that homogeneous A complex was purified, containing sub-stoichiometric amounts of Prp5 (Extended Data Fig. 1b). 
The splicing assay in Extended Data Fig. 1a was carried out as for A complex purification, but in a volume of 25 µL and in the absence of MS2-MBP fusion protein, and was visualized after 30 min of splicing at 23 °C on a denaturing 14% polyacrylamide TBE gel with a Typhoon scanner (GE Healthcare).
Electron microscopy
For cryo-EM analysis the A complex sample was applied to R2/2 holey carbon grids (Quantifoil), precoated with a 5–7 nm homemade carbon film. Grids were glow-discharged for 20 s before deposition of 2.5 µL sample (~0.4 mg mL⁻¹), and subsequently blotted for 2–3.5 s and vitrified by plunging into liquid ethane with a Vitrobot Mark III (FEI) operated at 4 °C and 100% humidity. Cryo-EM data were acquired on three separate FEI Titan Krios microscopes (datasets 1-3) operated in EFTEM mode at 300 keV, each equipped with a K2 Summit direct detector (Gatan) and a GIF Quantum energy filter (slit width of 20 eV, Gatan). Datasets 1 and 2 were recorded using ‘Krios 1’ and ‘Krios 2’ at the MRC-LMB, respectively, and dataset 3 using ‘Krios 2’ at the Astbury Biostructure Laboratory (University of Leeds). For dataset 1, 5,935 movies were acquired using EPU (FEI) with a defocus range of –0.4 µm to –4.4 µm at a nominal magnification of 105,000× (1.13 Å pixel⁻¹). The camera was operated in ‘counting’ mode with a total exposure time of 13 s fractionated into 20 frames, a dose rate of 4.25 e⁻ pixel⁻¹ s⁻¹, and a total dose of 43 e⁻ Å⁻² per movie. Dataset 2 was collected in the same manner, except that 727 movies were recorded using SerialEM35, at a nominal magnification of 105,000× (1.14 Å pixel⁻¹), a total exposure time of 8 s fractionated into 20 frames, a dose rate of 4.33 e⁻ pixel⁻¹ s⁻¹ and a total dose of 27 e⁻ Å⁻² per movie. Dataset 3 was collected with EPU (FEI) similar to dataset 1, except that 2,745 movies were collected at a nominal magnification of 130,000× (1.07 Å pixel⁻¹), a total exposure time of 8 s fractionated into 20 frames, a dose rate of 7.94 e⁻ pixel⁻¹ s⁻¹ and a total dose of 56 e⁻ Å⁻² per movie.
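The exposure settings quoted above can be cross-checked with simple arithmetic. The sketch below is not part of the original workflow; it assumes only the standard relation that total dose per unit area equals dose rate times exposure time divided by the pixel area, and the function and variable names are illustrative. With the values copied from the text, the computed totals agree with the reported ones to within about 1 e⁻ Å⁻².

```python
# Consistency check of the reported exposure settings (values copied from the text).
# Assumed relation: total dose (e-/A^2) = dose rate (e-/pixel/s) * exposure (s) / pixel area (A^2)

N_FRAMES = 20  # every movie was fractionated into 20 frames

DATASETS = {
    # name: (dose rate e-/pixel/s, exposure s, pixel size A, reported total dose e-/A^2)
    "dataset 1": (4.25, 13.0, 1.13, 43.0),
    "dataset 2": (4.33, 8.0, 1.14, 27.0),
    "dataset 3": (7.94, 8.0, 1.07, 56.0),
}


def total_dose(rate_e_per_px_s, exposure_s, pixel_size_A):
    """Return the accumulated dose per square angstrom for one movie."""
    return rate_e_per_px_s * exposure_s / pixel_size_A ** 2


for name, (rate, exposure, pixel, reported) in DATASETS.items():
    dose = total_dose(rate, exposure, pixel)
    per_frame = dose / N_FRAMES
    print(f"{name}: {dose:.1f} e-/A^2 total, {per_frame:.2f} e-/A^2 per frame "
          f"(reported total: {reported:.0f})")
```

The per-frame dose printed here is a derived quantity that is not stated in the text; it is shown only because dose weighting operates on individual frames.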
Image processing
Movies were aligned using MOTIONCOR2 (ref. 36) with 5 × 5 patches and applying a theoretical dose-weighting model to individual frames. CTF parameters were estimated using Gctf37. Resolution is reported based on the gold-standard Fourier shell correlation (FSC) (0.143 criterion) as described38 and B-factors were determined and applied automatically in RELION 2.1 (refs 39,40). Particles from dataset 1 were automatically picked using Gautomatch (Kai Zhang) and screened manually, and were then extracted in RELION with a 560² pixel box size and pre-processed. Particles from datasets 2 and 3 were picked and pre-processed in the same way, and were then rescaled to the pixel size of dataset 1 (1.13 Å pixel⁻¹) in RELION 2.1 by Fourier cropping during particle extraction with a 560² pixel box. For rescaling, we first calculated 3D refinements in RELION 2.1 for each dataset (1-3) and performed real space correlation fits in UCSF Chimera to identify scaling factors for datasets 2 and 3 relative to dataset 1. Because the absolute magnification values differed slightly for the different microscopes, we re-determined the CTF values for datasets 2 and 3 using the new pixel sizes with Gctf37, and then re-extracted and rescaled the particles to the 560² pixel box. Combining datasets 1-3 yielded a total dataset of 406,272 particles that were used for subsequent processing. The first 22,319 particles from dataset 1 were used to generate an ab initio 3D reference for the A complex using default parameters and three classes in cryoSPARC41 (Extended Data Fig. 2a). The complete dataset (1-3) was subjected to a ‘heterogeneous’ (multi-reference) refinement in cryoSPARC using default parameters and four classes: the ab initio A complex reference and three ‘junk’ references (Extended Data Fig. 2a; Round 1). Class 1 contained 153,570 particles (37.8%, percentage of particles from the full dataset) and was used for a 3D refinement in RELION 2.1 with a soft mask in the shape of the A complex. 
This yielded a density (map A1) with an overall resolution of 4.9 Å and a B-factor of -188 Å², comprising U1 snRNP and the U2 snRNP 3' region (Extended Data Figs 1e,d, 2, 9). To improve the U1 snRNP density, we prepared a soft mask enveloping the U1 snRNP with the volume eraser in UCSF Chimera42 and RELION 2.1 (refs 39,40). This allowed the focused refinement of the U1 snRNP (map A2) from the same 153,570 particles to an overall resolution of 4.0 Å and a B-factor of -146 Å² (Extended Data Figs 1e,d, 2, 9). In A complex the U2 snRNP 5' region is flexible relative to the U1 snRNP and the U2 3' region (Extended Data Fig. 2). To position the U2 snRNP 5' region in the A complex, we used a soft mask surrounding the U2 5' region and carried out 3D classification without image alignment with six classes (Round 2, Extended Data Fig. 2a). This revealed a class with defined U2 5' region from 19,937 particles (4.9%) that could be refined to an overall resolution of 10.4 Å (Extended Data Figs 2, 9). Local resolution was estimated using ResMap43 (Extended Data Fig. 2d,e).
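The particle bookkeeping in this section can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the particle counts are copied from the text, whereas the labels and the helper name `percent_of_total` are assumptions made for this example. It expresses each subset as a percentage of the combined dataset, matching the 37.8% and 4.9% quoted above.

```python
# Particle counts copied from the text; labels are illustrative, not official names.
TOTAL_PARTICLES = 406_272  # combined datasets 1-3

SUBSETS = {
    "Round 1, class 1 (used for maps A1 and A2)": 153_570,
    "Round 2, class with defined U2 5' region": 19_937,
}


def percent_of_total(n_particles, total=TOTAL_PARTICLES):
    """Return a subset size as a percentage of the full dataset."""
    return 100.0 * n_particles / total


for label, count in SUBSETS.items():
    print(f"{label}: {count} particles = {percent_of_total(count):.1f}% of the full dataset")
```

Both percentages are taken relative to the full 406,272-particle dataset, as stated in the text, rather than relative to the parent class from the previous round.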
Extended Data Figure 9
Data collection, refinement statistics, and validation.
a. Cryo-EM data collection and refinement statistics of the A complex structure. Maps A1 and A3 were used to position the U2 snRNP 3' and 5' regions, respectively (see Methods). b. FSC between the A2 cryo-EM density and the refined A complex U1 snRNP coordinate model.
Structural Modeling
We prepared a composite model of the A complex by combining the A1-3 densities (Extended Data Fig. 1e,f). Model building was carried out in COOT44. The U1 snRNP coordinates were refined into the sharpened A2 density in PHENIX45 using the phenix.real_space_refine routine, and applying secondary structure, rotamer, nucleic acid, and metal ion restraints. Homology models for yeast Yhc1, Snp1, and Mud1 were generated using MODELLER46 from the human U1 snRNP crystal structures10 (PDB ID 4PJO, 4PKD) and were fitted and manually adjusted in the A2 map. The yeast B complex U5 Sm ring model was used as the initial model for the U1 Sm ring, and was manually adjusted in the A2 density. Initial models for Prp39 and Prp42 were generated by I-TASSER47 and were subsequently adjusted and extended manually. The Prp39 N-terminal residues 47-339 were modelled as poly-alanine due to a lower local resolution of ~5–6 Å (Extended Data Figs 2d,e, 3c). Snu56, the Yhc1 C-terminus, and the Snu71 N-terminus were modelled de novo, where Yhc1 residues 48-82 and 135-142 were modelled as poly-alanine. To build the Luc7 model a C3H-type ZnF (from PDB ID 1RGO) for ZnF1 and a C2H2-type ZnF (from Yhc1) for ZnF2 were used to guide modelling in the A2 density, with a local resolution of 4–5 Å (Extended Data Fig. 3c). The helices connecting Luc7 ZnF1 and ZnF2 (α5-7) were modelled as poly-alanine, and assigned based on density connectivity. The U1 snRNP protein model is in excellent agreement with biochemical and protein crosslinking results12. The U1 snRNA model was generated based on similarity to U1 snRNA in the human U1 snRNP crystal structures (PDB ID 3CW1, 4PJO, 4PKD) and according to the yeast U1 snRNA secondary structure prediction48. All basepairing U1 snRNA regions (helix H, SL1, SL2-1, -2, SL3-1, -2, -3, -4, -5, -6), except for the SL3-7 and the tip of SL3-3, were modelled (Extended Data Fig. 3f,g). 
The human SL1 loop (PDB ID 4PKD) was rigid-body-fitted together with the homology model of the yeast Snp1 (described above), and the human U1 snRNA sequence was replaced with that from yeast. The loops connecting SL2-1 to SL2-2 as well as SL3-3 to SL3-4 and SL3-4 to SL3-5 and the tips of SL2-2, SL3-3, -4, and -5 were not built, due to a lower local resolution (~4.5 Å). The location of a region of U1 snRNA SL3-7 was modelled as a phosphate backbone only and may correspond to the sequence surrounding residues 378-391 and 428-440. The U1 snRNA–pre-mRNA 5' splice site helix was modelled de novo, and the UBC4 pre-mRNA contained 12 nucleotides, ten from the intron (+1 to +10) and two from the 5' exon (–1 to –2). The U2 snRNP 3' region (U2 3' domain and SF3a subcomplexes) from the yeast B complex structure (PDB ID 5NRL) was fitted into the A1 density using UCSF Chimera42, and the positions of Lea1, Msl1 and U2 snRNA residues 139-1169 were adjusted as a rigid body in COOT44. The U2 snRNP 5' region from the yeast B complex structure (PDB ID 5NRL) was fitted into the A3 density in UCSF Chimera. This provided an excellent fit, suggesting that the U2 5' region structure is not changed significantly from that observed in the yeast B complex8. To generate the complete A complex model, the refined U1 snRNP model and the U2 snRNP 3' region were fitted into the A3 density in UCSF Chimera, together with the fitted U2 snRNP 5' region. The final model comprises 34 proteins, U1 and U2 snRNAs, and the pre-mRNA substrate. To generate the pre-B complex model we modified and combined structural models using COOT44, based on structural and biochemical data from yeast and human systems8,25,28. We first superimposed our A complex structure on the yeast B complex structure8 using the U2 SF3b-containing domain. The free human tri-snRNP structure (PDB ID 3JCR), which is likely to resemble the pre-B conformation7,25, was used to model the yeast tri-snRNP in the pre-B complex conformation. 
We first removed the B complex proteins from the yeast B complex structure, because they are absent in the purified human pre-B complex28. Human pre-B instead contained the PRP28 helicase and SAD1, and we therefore added crystal structures of the yeast Prp28 helicase49 (PDB ID 4W7S) and yeast Sad150 (PDB ID 4MSX) in their human tri-snRNP locations25. We then positioned the U4 Sm ring and Brr2 as in the human tri-snRNP structure, where the Brr2 PWI domain makes a conserved contact with Sad1 (ref. 51). We removed a Snu66 peptide bound to Brr2 from the model, since its binding at this site is uncertain in the pre-B complex conformation. Several minor differences remain between the free human tri-snRNP structure25 and the pre-B complex model, and these were not modelled. The final pre-B model contained only minor clashes, and one observed clash between the highly flexible Prp28 RecA-2 lobe25 and the flexible U6 snRNA 5' stem loop8,26 could be resolved by a minor repositioning of either domain. The final pre-B model comprises 66 proteins, five snRNAs, the pre-mRNA substrate, and has a combined molecular weight of ~3.1 MDa. Figures were generated with PyMol (http://www.pymol.org) and UCSF Chimera.
Data availability
Three-dimensional cryo-EM density maps A1, A2, and A3 have been deposited in the Electron Microscopy Data Bank under the accession numbers EMD-4363, EMD-4364, and EMD-4365, respectively. The coordinate file of the A complex has been deposited in the Protein Data Bank under the accession number 6G90.
Biochemical characterization and cryo-EM of the prespliceosome A complex.
a. Mutation of the UBC4 pre-mRNA branch point sequence (UACUAAC to UACAAAC, where A is the BP adenosine) stalls splicing before the first step, as described9. Splicing reactions were carried out for 30 min at 23 ºC in yeast extract using wild-type (lane 1) or mutant (U/A, lane 2) pre-mRNA (see Methods for details). This experiment was performed three times. The asterisk indicates a degradation product. For gel source data see Supplementary Fig. 1a.
b. Protein analysis of purified A complex (SDS-PAGE stained with Coomassie blue). The U2-associated Prp5 protein is sub-stoichiometric and not observed in the A complex structure. The purification and analysis of protein compositions were performed at least five times with similar results. For gel source data see Supplementary Fig. 1b.
c. Cryo-EM micrograph of the A complex. Scale bar, 100 nm. d. 2D class averages of the A complex were determined in RELION 2.1 (refs 39,40), and reveal a bipartite architecture, comprising the U1 snRNP and the U2 snRNP 3' and 5' regions, respectively. e. Composite cryo-EM density of the A complex shown in two orthogonal views (compare Fig. 1). The respective densities used for modeling the U1 snRNP (A2, gray), the U2 3' region (A1, cyan), and the U2 5' region (A3, green) are coloured and superimposed on a transparent outline of the full A3 map (Methods). The overall resolution of each map as well as the percentage from the cleaned dataset of 153,556 particles are shown in parentheses. Non-modelled regions are indicated and putatively assigned. f. Composite cryo-EM density with the final A complex model superimposed in a cartoon representation. The path of 40 nucleotides of the disordered UBC4 pre-mRNA intron is indicated. A complex components are coloured as in Fig. 1. Views as in panel e.
Cryo-EM image classification and refinement.
a. Image processing workflow for analysis of the A complex cryo-EM data set (Methods). To visualize differences between the reconstructions the U1 snRNP (gray), U2 3' (cyan) and U2 5' regions (green) are coloured. For each round of three-dimensional classification, the percentage of the data and the type of soft-edged mask are indicated. The type of mask and overall resolution are indicated for each 3D refinement (blue box). b. Orientation distribution plots for all particles that contribute to the respective A1, A2, and A3 cryo-EM reconstructions. c. Gold-standard Fourier shell correlation (FSC = 0.143) of the respective A1, A2, and A3 cryo-EM reconstructions. d. Two views of the composite A complex cryo-EM density (maps A1, A2, and A3) coloured by local resolution as determined by ResMap43. e. As panel d, but for a central slice through the composite A complex cryo-EM map.
Details of the U1 snRNP.
a. U1 snRNP structure with subunits coloured as in Fig. 1, except for Nam8 (orange), Snu56 (light blue), Snu71 (blue), Luc7 (dark purple), Mud1 (red) and U1 snRNA (various). The pre-mRNA nucleotides are labelled relative to the first nucleotide (+1) of the intron. The Nam8 RRM1 and RRM2 domains are flexible and project downstream of the 5'SS. The protein attributed to Luc7 in the free U1 snRNP structure12 was re-assigned to Snu71. Stem loop, SL; RNA recognition motif, RRM; Zn-finger, ZnF; N-terminus, N-term; C-terminus, C-term. In the structure we do not observe any evidence that the C-terminal tails of SmB, SmD1, and SmD3 interact with the 5’SS, consistent with their absence in the human 5’SS–minimal U1 snRNP crystal structure10. b. Representative regions of the sharpened U1 snRNP density determined at 4 Å resolution (map A2) are superimposed on the refined coordinate model. The density reveals side-chain details, and here segments from the Prp42 N-terminus (TPR repeat 1), the Sm ring subunit SmB, and the Snu56 α-helical domain are shown. c. The A2 cryo-EM density is shown superimposed on the coordinate models of a selection of U1 snRNP proteins: Luc7, Snu71, Yhc1, and Prp39. In the structure most of Snu71 is disordered, except for a small N-terminal domain (residues 2-43) that binds between the Prp42 N-terminus and the Snu56 KH-like fold, consistent with protein crosslinking12. Functional regions and disordered domains are indicated. d. The U1 snRNA–pre-mRNA 5' splice site (U1–5'SS) model is superimposed on its cryo-EM density (map A2). A secondary structure diagram of the U1–5'SS interaction is shown underneath the model. The register of the 5'SS is shifted by one nucleotide compared to the minimal human 5'SS–U1 snRNP crystal structure, due to an additional nucleotide in yeast U1 snRNA10 (U11). Lines indicate Watson–Crick base pairs and dots pseudouridine (ψ)-containing base pairs. e. 
The Prp39-Prp42 heterodimer is coloured to indicate each of their respective TPR repeats. f. Cryo-EM density of U1 snRNA from maps A2 (dark gray) and A3 (light gray) without (top) and with the superimposed coordinate model of yeast U1 snRNA (bottom). The model is labelled and coloured according to functional regions of U1 snRNA (5' end, pink; H helix, cyan; SL1, dark blue; SL2-1, green; SL3-1, light blue; SL2-2 and SL3-2 to -7, gray; 3’end and Sm site, yellow). Stem loop, SL. g. Secondary structure diagram of U1 snRNA. Bold letters indicate residues included in the model, lines indicate Watson–Crick base pairs, and dots G–U wobble and pseudouridine (ψ)-containing base pairs. Compare panel e. The conserved U1 snRNA ‘core’ is outlined with a gray box. The region of the putative phosphate backbone model of part of the U1 SL3-7 region is indicated with a gray box.
Comparisons of yeast and human U1 snRNPs and implications for alternative splicing.
a. Formation of the U1–5'SS helix induces stable binding of Luc7. In the absence of a pre-mRNA 5'SS in the free U1 snRNP density (left, EMD-8622), Luc7 and the U1 5' end are disordered. Upon 5'SS recognition at the U1 5' end (center, map A2), Luc7 becomes ordered and stabilizes the U1–5'SS interaction, suggesting a mechanism for the selection of weak 5'SS sequences. The free U1 snRNP and the 5'SS-bound (map A2) cryo-EM densities are superimposed on the right. Although the long α-helical density next to Luc7 cannot be assigned with confidence, protein-protein crosslinking data12 and protein secondary structure prediction are consistent with either Prp40 or Snu71. Based on additional biochemical data on the interaction between the α-helical Prp40 FF1 domain and Luc7 ZnF2 (ref. 52), we speculate that the Prp40 FF1 domain is the most likely candidate for this density. b. Comparison of the yeast U1 snRNP ‘core’ with the human U1 snRNP crystal structure (PDB ID 3CW1). Protein and RNA (top) and RNA only (bottom) are shown side-by-side (left and center) and superimposed by a global alignment in PyMOL (right). Coloured as in Extended Data Fig. 3a. c. The yeast U1 snRNP model suggests regulatory mechanisms for human alternative splicing factors. The human homologues of the peripheral yeast U1 proteins may function through stabilization of the U1–5'SS interaction (region 1), of the U1-U2 3' region interface (region 2), or the U1-U2 5' interface (region 3). The yeast U1 snRNP ‘core’ is shown superimposed on a surface representation of the U1 snRNP model (top), compared with the similarly coloured human U1 snRNP (below). Interaction sites with the U2 snRNP are labelled (top). d. The location of yeast U1 snRNP components with homology to human splicing factors are indicated in the U1 snRNP structure. The Prp39-Prp42 heterodimer (human PRPF39 homodimer), Nam8 (ref. 18) (human TIA-1 and TIA-R), Luc7 (ref. 
53) (human LUC7-like 1-3), and the Yhc1 C-terminus (human U1C) have clear counterparts in the human system. The yeast-specific U1 snRNA insertions may be replaced in the human system by alternative splicing factors that modulate interactions with the U2 5' region. e. Schematic model of the yeast E complex based on the U1 snRNP structure and biochemical data22. Luc7, Snu71, and Prp40 form a heterotrimer in vitro52, and their interacting regions may be located near unassigned density (compare Extended Data Fig. 1e) at the tip of an unassigned 40-residue α-helix next to Luc7 ZnF2. This helix is likely to belong to the U1 subunit Snu71 or Prp40, consistent with protein crosslinking12 and protein secondary structure prediction. Prp40 could then bind the yeast branch point binding protein (BBP, human SF1), which in turn interacts with Mud2 (human U2AF65) to tether the pre-mRNA branch point sequence in the E complex22.
Conformational flexibility of the U2 snRNP.
a. Two defined positions of the U1 snRNP-U2 3' region could be identified relative to the U2 5' region. A complex models were fitted into class 2 and 4 from Round 2 of three-dimensional image classification (compare Extended Data Fig. 2a). The classes are aligned via their U2 5' region, illustrating their relative flexibility. b. Cartoon schematic of observed positions of the U2 3' region relative to the U2 5' region in A complex (left), B complex8 (center), and Bact complex (right, modelled from ref. 54). While in B complex the U2 3' region is free, in A and Bact complexes its position is influenced by interactions with Prp39 as well as Syf1 and Clf1, respectively. c. The U2 snRNP subunit Lea1 (human U2A’) helps to position the U2 snRNP 3' domain in different spliceosome states. In our A complex structure, the Prp39 TPR repeat T1 contacts the helical C-terminus of Lea1. In the yeast C complex structure, non-modeled density for the Syf1 N-terminus binds a neighboring but non-overlapping surface of Lea1 (PDB ID 5LJ5). In C*/P complex55 (PDB ID 6EXN), the Syf1 N-terminus binds yet another Lea1 surface and the U2 3' domain is repositioned relative to its C complex location. Together, this suggests that Lea1 provides multiple interfaces that can be used to position the U2 3' domain in different spliceosomal complexes. d. Fit of the U2 3' region coordinate model to the A1 cryo-EM density. The dashed black line separates the U2 3' domain (Sm ring, Msl1 and Lea1 subunits, and U2 snRNA) and the SF3a subcomplex (Prp9, Prp11, and Prp21). Two orthogonal views are shown. See Supplementary Video 2. e. Fit of the U2 5' region coordinate model to the A3 cryo-EM density. Density consistent with U2 snRNA stem IIa/b and the branch helix is observed. Two density thresholds are shown side-by-side (left, 0.0163; right, 0.0121), and orthogonal views are shown underneath. See Supplementary Video 2.
Luc7 and Nam8 sequence alignments.
a. The Luc7 (human LUC7-like) amino acid sequence alignment comparing Saccharomyces cerevisiae, Kluyveromyces lactis, Schizosaccharomyces pombe, Danio rerio, Xenopus tropicalis, Mus musculus, Bos taurus, and Homo sapiens was generated with Clustal Omega and visualized with ESPript 3 (refs 56,57). For the human sequence, LUC7-like 1 was used. Secondary structure elements are indicated above the sequence and derive from the A complex structure (purple) or PSIPRED58 secondary structure prediction (gray). Modelled regions (dashed line) and the Zn-coordinating residues of Zn-finger 1 and 2 (ZnF, asterisk) are indicated. Invariant or conserved residues are highlighted with a red box or red letter font, respectively. b. As panel a but for Nam8 (human TIA-1) comparing Saccharomyces cerevisiae, Kluyveromyces lactis, Schizosaccharomyces pombe, Drosophila melanogaster, Danio rerio, Xenopus tropicalis, Mus musculus, Bos taurus, and Homo sapiens amino acid sequences. RNA recognition motif, RRM; ribonucleoprotein domain, RNP.
Details of the pre-B complex model.
a. Multiple views of the pre-B complex model, generated by combining functional and structural data from yeast and human systems8,25. The mobility of the U1 snRNP relative to the U2 snRNP in A complex (this study) as well as of the U2 snRNP relative to tri-snRNP in the B complex structure8 are indicated (left). The pre-B model contained only minor clashes, and a clash between the highly flexible Prp28 C-terminal RecA-2 lobe (from human tri-snRNP25) and the highly flexible U6 snRNA 5' stem loop (from yeast B complex8) may be resolved by small movements of either domain. See Methods for details on the pre-B model. b. Structural comparisons of the yeast pre-B model (this study) and the yeast B complex structure (PDB ID 5NRL, ref. 8) suggest the existence of a molecular checkpoint to couple 5'SS transfer to U1 snRNP release and formation of the activation-competent B complex. In the pre-B model (left) Sad1 tethers Brr2 through its interaction with the conserved Brr2 PWI domain51, and the U1 snRNP and its U1–5'SS helix are positioned near the U6 ACAGAGA region and the helicase Prp28. Subsequent to Prp28-mediated 5'SS transfer, Brr2 is repositioned onto its U4 snRNA substrate, guided by the B complex-specific proteins (right). In this conformation the Brr2 helicase and its associated factors would clash with the U1 snRNP, consistent with U1 snRNP destabilization and release in yeast and human B complexes7,8. Brr2 is now ready to initiate spliceosome activation and formation of the active site in the Bact complex. Regions that are changed between pre-B and B complex models (black outline) and the clash between the Brr2-containing ‘helicase’ domain and the U1 snRNP in B complex (red ‘X’) are indicated. The lower right panel would conform to the alternative ‘U1-B complex’ model.
Supplementary Material
Supplementary Information containing original gel data (Supplementary Fig. 1), two supplementary videos (Supplementary Videos 1 and 2) and a PyMol session of the A complex coloured as in Fig. 1 (PDB coordinate file: 6G90) are available in the online version of the paper.