U4/U6.U5 tri-snRNP represents a substantial part of the spliceosome before activation. A cryo-electron microscopy structure of Saccharomyces cerevisiae U4/U6.U5 tri-snRNP at 3.7 Å resolution led to an essentially complete atomic model comprising 30 proteins plus U4/U6 and U5 small nuclear RNAs (snRNAs). The structure reveals striking interweaving interactions of the protein and RNA components, including extended polypeptides penetrating into subunit interfaces. The invariant ACAGAGA sequence of U6 snRNA, which base-pairs with the 5'-splice site during catalytic activation, forms a hairpin stabilized by Dib1 and Prp8 while the adjacent nucleotides interact with the exon binding loop 1 of U5 snRNA. Snu114 harbours GTP, but its putative catalytic histidine is held away from the γ-phosphate by hydrogen bonding to a tyrosine in the amino-terminal domain of Prp8. Mutation of this histidine to alanine has no detectable effect on yeast growth. The structure provides important new insights into the spliceosome activation process leading to the formation of the catalytic centre.
Pre-mRNA splicing is catalyzed by an intricate molecular machine called the spliceosome and proceeds by a two-step trans-esterification mechanism, analogous to group II intron self-splicing[1]. The spliceosome is assembled on pre-mRNA by the ordered addition of small nuclear ribonucleoprotein particles (snRNPs) and numerous proteins, including the nineteen complex (NTC) and the nineteen-related (NTR) complex[2-4]. Initially U1 and U2 snRNPs recognise the pre-mRNA 5′-splice site (5′SS) and branch point (BP), respectively. Recruitment of U4/U6.U5 tri-snRNP produces the fully assembled but catalytically inactive complex B[1]. U1 snRNP is displaced from the 5′SS by Prp28 (ref. 5), the 5′SS pairs with the ACAGAGA sequence in U6 snRNA, and Brr2 helicase unwinds the extensively base-paired U4/U6 snRNAs to release U4 snRNA with its associated proteins[6,7]. This allows U6 snRNA to base-pair with U2 snRNA generating the group II intron-like catalytic RNA core[8-11]. The 2′-OH group of the BP adenosine attacks the 5′SS, producing exon1 and lariat intron-exon2 intermediates, and after further remodeling to complex C*, U5 snRNA loop 1 aligns exons 1 and 2 for the second trans-esterification[12,13]. The spliced mRNA product is released and the residual Intron Lariat Spliceosome (ILS) is disassembled, recycling the snRNPs for subsequent rounds of splicing[1].
Electron microscopic studies of spliceosomes at different stages of the splicing cycle revealed low-resolution pictures of these complexes[14]. Taking advantage of the recent revolution in cryoEM single particle analysis[15] we reported the organisation of the proteins, and U5 snRNA and U4/U6 snRNAs in S. cerevisiae U4/U6.U5 tri-snRNP based on a cryoEM map at 5.9 Å[16]. In Schizosaccharomyces pombe cell extract, ILS complexes containing U2, U5 and U6 snRNAs accumulate through inefficient disassembly[17,18]. 
The structure of this endogenous U2.U6.U5 spliceosomal complex was determined by single particle cryoEM at 3.6 Å resolution[19,20]. This was an important breakthrough that revealed the overall architecture of a spliceosomal complex with the striking structures of NTC and NTR[19,20]. The absence of spliced mRNA and step 2 factors[4] from this complex[19] confirms that it is the post-splicing ILS[18]. The structure also revealed the important features of the well-established group II intron-like catalytic RNA core[8-11] remaining after spliced mRNA is released[20].
Here we present an essentially complete atomic model of S. cerevisiae U4/U6.U5 tri-snRNP[21,22] based on a cryoEM density map at 3.7 Å overall resolution, revealing the architectural and mechanistic principles of spliceosome activation.
Overall structure
We collected a new dataset on a Titan Krios microscope using the Gatan K2 Summit direct electron detector (Methods). The overall resolution of the tri-snRNP map was improved from 5.9 Å to 3.7 Å (Extended Data Fig. 1). Using a modified masked refinement with signal subtraction[23], we obtained more homogeneous 3.6, 3.7 and 4.2 Å reconstructions for the Body, Foot and Head domains, respectively and improved the resolution of the Arm domain from 10 Å to 6-7.5 Å (Extended Data Fig. 2). The new maps enabled us to build a near-complete atomic model of the yeast tri-snRNP containing 30 proteins, U4/U6 and U5 snRNAs (Fig. 1) revealing an amazing web of interactions between components of the complex (Extended Data Fig. 3).
Extended Data Figure 1
Image processing procedures
a, Representative micrograph. b, Representative 2D class averages obtained from reference-free 2D classification. c, Classification and refinement procedures used in this study.
Extended Data Figure 2
Local and overall resolutions of tri-snRNP maps
Local resolution estimation by Resmap[55] of a, the overall 3.7 Å map and b, maps of the head, body and foot domains obtained from masked refinements with signal subtraction[23]. c, Gold-standard FSC curves for the overall map and the maps of the head, body and foot domains obtained from masked refinements. Their resolutions are estimated at FSC=0.143. d, e, f and g, FSC curves of model versus map and cross-validation of model refinement by half-maps for the Body, Foot, Head and Overall maps, respectively. The red curves show FSC between the atomic model and the half-map it was refined against (half1) and the blue curves show FSC between the atomic model and the other half-map (half2) it was not refined against. The black curves show FSC between the atomic model and the sum map which the model was refined against.
Figure 1
Three orthogonal views of a near-complete atomic model of the Saccharomyces cerevisiae U4/U6.U5 tri-snRNP
Inset shows four subdomains.
Extended Data Figure 3
Representative EM density for different components of the map
a, Snu114 in the Foot domain with a bound GTP (magenta). The inset shows the GTP-binding pocket. b, Brr2 in the Head domain with a bound single-stranded region of U4 snRNA. The inset shows the density in the RNA binding tunnel. c, Density for Prp8 large and RNase-like domains. The inset shows the density in the core of Prp8. d, e and f, Prp3, Prp31 and Prp6 densities, respectively, with extended polypeptides.
Prp8
A complete atomic model of Prp8 is now built except the unstructured N-terminus and inter-domain linkers. The α-helix (αRT1) at the N-terminus of the reverse transcriptase (RT) domain in the crystal structure[24] extends further and forms a helix bundle (HB) with three additional long helices appended to the RT domain (Fig. 2a and 2b). Residues 108-733 form a predominantly α-helical N-terminal domain. Stems I and II of U5 snRNA are coaxially stacked[16] and an extra variable stem protrudes from the three-way junction (Extended Data Fig. 4). A long slightly bent C-terminal α-helix (residues 703-735) of the N-terminal domain fits into the minor groove of the co-axially stacked stems I and II, which is tightly harnessed in the major groove by a polypeptide loop (residues 535-543) protruding from the N-terminal domain (Fig. 2c). The conserved loop 1 of U5 snRNA, which aligns the exons during the second trans-esterification reaction[12,13], points towards the most positively charged and conserved surface of Prp8 in the Thumb/Linker domain, part of the active site cavity[24]. In active spliceosomes the BP+2 nucleotide cross-links to Prp8 between residues 1585 and 1598, on the cavity surface (C. M. Norman and A.J.N., unpublished observations). This region is disordered in the Prp8-Aar2 complex[24] whereas in U4/U6.U5 tri-snRNP it forms a helix-turn-helix (the α-finger) and contacts U54-U55 of U4 snRNA near the three-way junction (Fig. 2b).
Figure 2
Prp8 and U4/U6 and U5 snRNAs
a, Domain structure of Prp8[24]: HB, helix bundle; RT, Reverse Transcriptase-like; Endo, Endonuclease-like; RH, RNaseH-like; JM, Jab1/MPN. b, Prp8 makes extensive interactions with U4/U6 and U5 snRNAs. c, The α-helix (residues 703-735) of the N-terminal domain fits into the minor groove of U5 snRNA and an extended polypeptide (residue 535-543) fits into the major groove on the opposite face, harnessing the RNA helix firmly in place. d, orthogonal view. U5 snRNA loop 1 interacts with the single-stranded region of U6 snRNA. e, The region around the ACAGAGA sequence forms a hairpin and is sandwiched between the Large and N-terminal domains of Prp8 and Dib1.
Extended Data Figure 4
Secondary structure of the snRNAs in tri-snRNP
a, U4/U6 snRNA; c, U5 snRNA. The colored nucleotides with red, green and blue background were built de novo into our EM density. The region near the ACAGAGA sequence of U6 snRNA forms a stem-loop that was not predicted previously. b, d, Representative EM density for U4/U6 snRNA duplex and U5 snRNA, respectively.
The 5′-stem-loop of U6 snRNA interacts with the N-terminal domain of Prp8 and the adjacent single-stranded region pairs with the exon binding U5 snRNA loop 1 (Fig. 2d). The small highly conserved protein Dib1 (ref. 25) binds to the helix bundle and α-finger of Prp8, and a long polypeptide of Prp31. U6 snRNA forms a short stem-loop, involving part of the ACAGAGA sequence, which is sandwiched between Dib1 and the Prp8 Large domain (residues 1648-1653) (Fig. 2d and e; Extended Data Fig. 4a).
Snu114
We built a near complete atomic model of Snu114 comprising five domains (D1-D5) similar to EF-G/EF-2 (ref. 26,27). The relative arrangement of D1-D3 closely resembles that of EF-G/EF-2, whereas D4 and D5 pack more compactly (Fig. 3a). The guanine nucleotide density is consistent with GTP bound via canonical interactions with the surrounding residues (Fig. 3b; Extended Data Fig. 3a and 5a-e). In most GTPases the glutamine residue in switch II loop places a water molecule at the γ-phosphate of GTP and hydrolyses the phosphate ester[28]. As in EF-Tu, EF-G and their eukaryotic counterparts, the catalytic glutamine residue is replaced by histidine in Snu114 (ref. 26) (Extended Data Fig. 5e). In U4/U6.U5 tri-snRNP, His218 is hydrogen-bonded to Tyr403 of Prp8 preventing the His218 side chain from rotating towards the γ-phosphate of GTP and hence keeping the GTPase inactive (Fig. 3c). In EF-G and EF-Tu, GTP is hydrolysed when this histidine is repositioned by a hydrogen-bond with a phosphate in the sarcin-ricin loop of the ribosome[29,30] (Fig. 3d). The extensive interactions between Snu114 and the N-terminal domain of Prp8 are conserved between U4/U6.U5 tri-snRNP and the S. pombe ILS[19] (Extended Data Fig. 5f). The hydrogen-bond between His218 of Snu114 and Tyr403 of Prp8 is maintained by the equivalent residues in ILS[19] (Extended Data Fig. 5d). The GTP binding site of Snu114 is at the interface with the N-terminal domain of Prp8 leaving insufficient room for U5 snRNA or any proteins to access the GTPase active site and act like the sarcin-ricin loop[29,30] or GTPase activating protein (GAP)[28]. Since the structure suggests no obvious mechanism for Snu114 GTPase activation we investigated the function of Snu114 by mutagenesis. With the His218Arg mutation yeast shows only a mild temperature-sensitive phenotype, confirming earlier results[31] (Extended Data Fig. 5g) whereas the equivalent mutation in EF-Tu reduces cognate tRNA-induced GTPase activity 10⁵-fold[32]. 
Surprisingly, yeast containing the His218Ala mutant of Snu114 shows no apparent phenotype (Extended Data Fig. 5g) whilst the equivalent mutation in EF-Tu reduces the rate of GTP hydrolysis more than 10⁶-fold[32]. Furthermore, mutations of Tyr403 (Tyr403Phe and Tyr403Ala) in Prp8, which hydrogen-bonds with His218 in Snu114, have no apparent phenotype (Extended Data Fig. 5h). These results raise the possibility that Snu114-bound GTP may not be hydrolysed during splicing.
Figure 3
Snu114 and its interaction with Prp8 and U5 snRNA
a, Snu114, the N-terminal domain of Prp8 and U5 snRNA stems I and II form a stable domain in the Foot domain in U4/U6.U5 tri-snRNP. GTP is bound in the GTPase active site at the interface with the Prp8 N-terminal domain. b, Canonical interactions of GTP with surrounding residues in Snu114. c, The catalytic His218, hydrogen bonded to Tyr403 in Prp8, points away from the GTP γ-phosphate. d, Activation of EF-G GTPase upon binding to the sarcin-ricin (SR) loop in the ribosome. His87 moves closer to the γ-phosphate and places a water molecule[29].
Extended Data Figure 5
Interactions of Snu114 with guanine nucleotides and the N-terminal domain of Prp8 in the S. cerevisiae U4/U6.U5 tri-snRNP and S. pombe ILS complexes
a, Conformation of the Snu114(Cwf10)-bound GDP refined in the S. pombe ILS spliceosomal complex[19,20] (red, PDB 3JB9), was overlaid on GDPs found in other guanine-nucleotide binding proteins (grey, PDB coordinates: 1DAR, 2E1R, 2WRI, 1Z0I, 5CA8, 1XTQ, 4YLG, 1SF8, 5BXQ). b, Guanine nucleotide refined as GDP in Snu114 of the S. cerevisiae U4/U6.U5 tri-snRNP (blue) is overlaid on GDPs found in the PDB coordinates as in a. c, Conformation of guanine nucleotide refined as GTP in Snu114 of the S. cerevisiae U4/U6.U5 tri-snRNP (blue) agrees well with GTP or GTP analogs in other guanine-nucleotide binding proteins (PDB code: 2BV3, 2DY1, 2J7K, 4YW9, 1ASO, 1LF0 (grey)). d, Superposition of the active site of Snu114-GTP and Cwf10-GDP. e, Superposition of the GDP-bound EF-G (2WRI), GMP-PCP bound EF-G (4JUW) and Snu114 (S. cerevisiae tri-snRNP) active sites. His218 (His78 in EF-G) positions a water molecule crucial for GTP hydrolysis. f, Comparison of the Prp8 N-terminal domain, Snu114 and U5 snRNA in the S. cerevisiae U4/U6.U5 complex and S. pombe ILS complex. g, Growth of serial dilutions of yeast strains carrying wild-type Snu114, His218Arg or His218Ala Snu114 mutants at different temperatures. Cells were spotted on YPD plates and grown at 14°C for 10 days, 30°C and 37°C for 2 days. h, Growth of serial dilutions of yeast strains carrying wild-type Prp8, Tyr403Phe and Tyr403Ala mutants. Cells were spotted on YPD plates and grown at 14°C for 9 days, 30°C for 3 days. This yeast strain does not survive at 37°C and thus is not shown.
The guanine nucleotide in the post-splicing ILS is interpreted as GDP[19] but its conformation is distinct from that of GDP in other GTPases (Extended Data Fig. 5a). In contrast the conformation of the Snu114-bound GTP in tri-snRNP superimposes well with GTP or non-hydrolysable GTP analogues in other GTPases (Extended Data Fig. 5c). When we refined our structure of Snu114 with GDP the resulting GDP conformation is similar to that observed in ILS (Extended Data Fig. 5b). The proposed guanine nucleotide-dependent regulatory role of Snu114 is based on the effects of XDP, XTP and a non-hydrolysable XTP analogue on the XTP binding mutant (D271N) of Snu114 (ref. 33,34). Mutations in the GTPase domain prevent the interaction of Snu114 with Prp8, blocking U5 snRNP assembly[35]. The observed effect of XDP, XTP and non-hydrolysable XTP may be due to XTP-induced stabilization and association of Snu114 mutants with Prp8.
The U4/U6 di-snRNP
The extensively base-paired U4/U6 snRNAs form a three-way helix junction (Extended Data Fig. 4a and 4b). Snu13, bound to the k-turn motif[36], is wedged between U4 5′-stem-loop and the U4/U6 snRNA helix II and packed against the Prp4 WD40 domain[37] (Fig. 4a). Prp3 makes extensive interactions with the Prp4 WD40 domain, the basket handle-like structure and Snu13, and forms a long α-helix sitting in the minor groove of U4/U6 helix II. After forming a short α-helix, Prp3 folds back to form a long α-helix binding across the major groove of U4/U6 helix II (Fig. 4b; Extended Data Fig. 3d). These latter two Prp3 helices and the connecting loop interact extensively with the RNaseH-like domain of Prp8 and Brr2 N-terminal domain. Prp3 further extends to form a ferredoxin-like domain[38], which packs against the Prp4 WD40 domain[37]. Masked classification of the Arm domain reveals extra density for the N-terminus of Prp3 extending towards the LSm protein ring (Extended Data Fig. 6a and 6b). The 3′-end of U6 snRNA binds to the central hole of the LSm protein ring while the preceding single-stranded region binds to the ferredoxin-like domain of Prp3 (ref. 38). The Nop and coiled-coil domains of Prp31 interact with Snu13 whereas the k-turn motif of U4 5′-stem-loop is sandwiched between Snu13 and Prp31 (ref. 36,39) (Fig. 4c). The extended polypeptide chain of Prp31 runs between the phosphate backbone of U4 5′-stem and Dib1, and forms a small domain together with Prp6 which is surrounded by the three-way RNA helix junction and the α-finger and helix bundle of Prp8 (Fig. 4d).
Figure 4
Interactions of U4/U6 snRNAs with proteins
a, Overview of U4/U6 di-snRNP. b, the extraordinary structure of Prp3 and its multiple interactions with U4/U6 snRNA, Prp4, Snu13, the RNaseH-like domain of Prp8, Brr2 N-terminal domain and the LSm core domain. c, The C-terminal region of Prp31 extends along U4 snRNA 5′-stem towards the three-way junction. d, The C-terminal extension of Prp31 makes multiple interactions with U4/U6 snRNAs, Dib1, Prp8 α-finger and the N-terminal extension of Prp6. e, the Prp4 WD40 domain and Prp31 interact with the C-terminal TPR domain of Prp6.
Extended Data Figure 6
Conformational flexibility of tri-snRNP observed by classification
a, Different conformations of the Arm domain demonstrated by the unsharpened maps of the three major classes (purple, magenta and red) obtained from masked classification of the Arm domain alone followed by masked refinement with the Body and Arm domains. The Body domain was included in the refinement because the arm domain is too small for accurate alignments. b, The sharpened map of one of the three classes with Prp3 and LSm models shown. In the improved domain maps for the Arm domain, extra density for the N-terminal helix of Prp3 could be observed to extend to the LSm proteins. c, The sharpened map of the tri-snRNP and the locations of Snu66 and Prp8. d, The open and closed conformations of the Head and Foot domains of the tri-snRNP observed by global classification. The unsharpened maps for the two major classes obtained from global classification with finer angular sampling (1.8°) followed by 3D auto-refinement are shown. The open and closed states are indicated. e, Superposition of the unsharpened maps of the open (grey) and closed (yellow) states shown in d. The arrows indicate the rotations of the head and foot domains.
The C-terminus of the Prp6 TPR repeats[40] interacts with the Prp4 WD40 domain, Snu13, Prp31 and the tip of U4 5′-stem-loop (Fig. 4e) while an extended N-terminal polypeptide of Prp6 packs against the RNaseH-like domain of Prp8 and interacts with the small C-terminal domain of Prp31, the Prp8 α-finger, and U4/U6 snRNA three-way junction and then wraps around the Prp8 helix bundle (Fig. 4d; Extended Data Fig. 3d-f). The numerous interactions that Prp6 makes with U4/U6 snRNP components and Prp8 reflect its importance for tri-snRNP assembly[41].
Brr2
The single-stranded region of U4 snRNA (Extended Data Fig. 3b and 4a) extending from stem I, enters the active site of Brr2 N-terminal helicase cassette near the strand-separating β-hairpin and passes through the channel between the RecA1, RecA2, Ratchet and WH domains[42] (Extended Data Fig. 7a-c). The N-terminal domain (NTD) of Brr2 extends towards U4/U6 stem II and contacts the long helix of Prp3 running along the phosphate backbone of U4 snRNA. Brr2 inserts a loop of the NTD into the minor groove of U4/U6 stem II (Extended Data Fig. 7b and 7d). These interactions may guide U4/U6 stem II during unwinding. Snu13, Prp4 WD40, Prp3 ferredoxin, and Prp31 Nop and coiled-coil domains assemble together while the long α-helices and stretched polypeptide chains of Prp3 and Prp31 extend from these domains and interact with U4/U6 stem II and U4 5′-stem-loop, respectively[39]. These long α-helices and extended polypeptides may function like elastic bands to accommodate conformational changes and partial strand separation of the U4/U6 duplex as Brr2 translocates along U4 snRNA and unwinds U4/U6 stem I (ref. 16,43). Brr2 forms a stable complex with the Jab1/MPN domain of Prp8 (ref. 42), which is attached to the RNaseH-like domain of Prp8 via a long flexible linker, enabling both Brr2 and U4/U6 di-snRNP to detach from the main body of Prp8 during unwinding.
Extended Data Figure 7
Brr2 helicase and its U4/U6 snRNA substrate
a, domain structure of Brr2 helicase comprising the N-terminal domain and two helicase cassettes. Individual domains of N-terminal helicase cassette (NHC) are colour-coded. b, Extensive interactions of Brr2 with U4/U6 snRNA and Prp3. The single-stranded region of U4 snRNA extending from stem I enters the active site near the β-finger (red). c, 3′ stem of U4 snRNA interacts with the HLH domain of NHC. d, The N-terminal domain (NTD) of Brr2 interacts with a long helix of Prp3 and inserts a loop into U4/U6 Stem II. e, Snu66 has a long extended region that wraps around both helicase cassettes of Brr2.
The improved map of the Head domain at 4.5-5Å resolution, obtained by masked refinement, enabled us to build most of the Snu66 structure as poly-Ala chains. Its N-terminal region forms a globular domain that interacts with Prp8 endonuclease-like and Brr2 N-terminal ratchet domains. This is followed by a long helix wedged between Prp8 Jab1/MPN and Brr2 N-terminal HLH domains while its C-terminus wraps around Brr2, forming extensive interactions with the Brr2 C-terminal cassette (Extended Data Fig. 7e), fully consistent with yeast two-hybrid and co-immunoprecipitation assays[44]. Interestingly, our global classification approach showed “open” and “closed” conformations of the Head and Foot domains (Extended Data Fig. 6c-e). In the “closed” conformation, the globular domain of Snu66 contacts the N-terminal domain of Prp8, which in turn interacts with Snu114.
Insight into spliceosome activation
A comparison of the U4/U6.U5 tri-snRNP with B[45] and BΔU1 complexes[46] shows that U2 snRNP docks with tri-snRNP where the LSm complex, Prp3 and Prp6 are located, whilst U1 snRNP sits on top of U2 snRNP (Fig. 5a). The components of NTC/NTR are also detected by mass spectrometry in complex B[3]. We compared the structures of our tri-snRNP and the post-splicing ILS[19] by overlaying the large domain of Prp8 together with Snu114 and the U5 core domain. This shows that NTC and NTR can associate with tri-snRNP without clashing and contact U2 snRNP (Fig. 5a and 5b). In complex B, U2 snRNP interacts with U4/U6.U5 tri-snRNP[46], but when NTC and NTR dock with tri-snRNP, U2 snRNP is passed to NTC and NTR, and U2 Sm domain and U2B”/U2A’ complex associate with Aquarius (Cwf11), Syf1 (Cwf3) and Isy1 (Cwf12)[47] as revealed in the S. pombe ILS[19] (S. pombe protein names are shown in italics in parentheses).
Figure 5
B complex formation and activation mechanism
a, U4/U6.U5 tri-snRNP fits into the EM envelope of human complex B[45] (reproduced from ref. 45 with permission), showing that U2 snRNP binds near the LSm core domain, Prp6 and Prp3. b, Overlay of the Prp8 large domain between tri-snRNP and the ILS[19] shows how NTC/NTR might bind to complex B and interact with U2 snRNP so that U2 snRNP can be passed to the NTC/NTR complex. c, A comparison of the tri-snRNP and the ILS[19] structures shows rotation of the Foot domain with respect to the Prp8 large domain. Upon rotation, Prp8 residues 602-614 will clash with Dib1 and ACAGAGA helix, causing them to dissociate thus liberating the ACAGAGA sequence to bind the 5′-splice site.
The S. cerevisiae U4/U6.U5 tri-snRNP and S. pombe ILS structures reveal that the Foot domains of the two structures, containing the Prp8 N-terminal domain, U5 snRNA stem-loop 1 and Snu114, superpose very well showing that they form a stable structural unit (Extended Data Fig. 5f). Overlay of their Prp8 large domains shows that the Foot domain rotates as a rigid body, by 30° between the two structures, causing U5 loop 1 to move closer toward the Prp8 α-finger in the post-splicing ILS[19] (Fig. 5c). NTC forms extensive interfaces with both the N-terminal and Large domains of Prp8, hence the rotation of the Foot domain may be caused by NTC. When the Foot domain of the U4/U6.U5 tri-snRNP structure rotates by 30° (as in the post-splicing ILS) Prp8 residues 602-614 clash with Dib1 and ACAGAGA helix, forcing Dib1 to dissociate from the large domain of Prp8 and liberating the ACAGAGA sequence to bind the 5′SS. It is known from the U4 snRNA cs1 mutation and its suppressor in U6 snRNA (U6-Dup)[48] that the pairing between 5′SS and the ACAGAGA sequence is a checkpoint for the unwinding of the U4/U6 snRNA duplex by Brr2. Thus, conformational toggling of the Prp8 N-terminal domain could couple 5′SS recognition to U4/U6 unwinding. Suppressors of the U4-cs1 mutation[49] suggest the allosteric changes required for Brr2 activation. Interestingly these suppressors form four clusters in Prp8: one at the interface between the RT domain and D3 of Snu114, one at the interface between the helix bundle and Prp31, one on the surface of endonuclease-like domain near the ACAGAGA hairpin and one on the surface of the N-terminal domain where the 5′-stem-loop of U6 snRNA binds, near the interface with Snu66 that undergoes a transition between the Open and Closed forms (Extended Data Fig. 6c-e). It is possible that NTC/NTR play important roles in inducing allosteric changes that trigger the unwinding of U4/U6 snRNA by Brr2 (ref. 50).
The cryo-EM structures of S. cerevisiae U4/U6.U5 tri-snRNP and S. pombe ILS[19] have provided a wealth of new information about the architecture and conformational changes of these spliceosomal assemblies. Functional studies based on these new structural insights should greatly enhance our understanding of spliceosome activation and catalysis.
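The 30° rigid-body rotation of the Foot domain described above is the kind of quantity that can be estimated from two sets of equivalent atom positions once a reference domain has been superposed. The following is a minimal sketch, assuming a least-squares (Kabsch) fit; the text does not state which superposition method was used, and the coordinates here are synthetic points rather than atoms from the deposited models. The function names `kabsch_rotation` and `rotation_angle_deg` are illustrative.

```python
import numpy as np

def kabsch_rotation(P, Q):
    """Least-squares proper rotation R such that R @ p_i best matches q_i.

    P and Q are (N, 3) arrays of equivalent points; both are centred first,
    so any translation between the two sets is ignored.
    """
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                      # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T

def rotation_angle_deg(R):
    """Rotation angle (degrees) encoded by a 3x3 rotation matrix."""
    cos_theta = (np.trace(R) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

# Synthetic stand-in for a domain: random points rotated by 30 degrees
# about the z axis and then translated (hypothetical data).
rng = np.random.default_rng(1)
domain_a = rng.normal(size=(50, 3))
theta = np.radians(30.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
domain_b = domain_a @ Rz.T + np.array([5.0, -2.0, 1.0])

angle = rotation_angle_deg(kabsch_rotation(domain_a, domain_b))
print(f"recovered rotation: {angle:.2f} degrees")
```

With real structures, the two point sets would be equivalent atoms (for example Cα positions) of the moving domain, taken after the reference domain of both models has been placed in a common frame.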
Methods
Statistics
No statistical methods were used to predetermine sample size.
Sample preparation
Tri-snRNP sample was prepared as described in our published protocol[16].
Electron microscopy
Aliquots of 3.5 μl of purified yeast tri-snRNP were applied to Quantifoil Cu R1.2/1.3, 400 mesh grids which were coated with 6 nm-thick homemade carbon film and glow-discharged in N-amylamine. The grids were blotted for 2 s at 4°C, plunged into liquid ethane by an FEI Vitrobot MKIII at 100% humidity and loaded onto a Titan Krios transmission electron microscope operated at 300 kV. Zero-loss-energy images were collected manually on a Gatan K2-Summit detector in super-resolution counting mode at a calibrated magnification of 35,714x (pixel size of 1.43 Å) and a dose rate of ~2.5 electrons per Å² per second (Extended Data Fig. 1a). We used a slit width of 20 eV on a GIF Quantum energy filter. Each image was exposed for a total of 16 seconds and dose-fractionated into 20 movie frames. A defocus range of 0.5-3.5 μm was used.
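The exposure settings above imply a few derived quantities that are often useful when reading such a protocol. The short sketch below works them out from the stated values only, assuming the quoted dose rate is constant over the exposure and is expressed per Å² at the specimen; the function and key names are illustrative.

```python
# Values quoted in the text above.
DOSE_RATE = 2.5       # electrons per Å^2 per second (approximate)
EXPOSURE_S = 16.0     # total exposure per image, seconds
N_FRAMES = 20         # movie frames per exposure
PIXEL_SIZE_A = 1.43   # calibrated pixel size, Å

def exposure_summary(dose_rate, exposure_s, n_frames, pixel_size):
    """Derive total dose, per-frame dose, frame time and Nyquist limit."""
    total_dose = dose_rate * exposure_s
    return {
        "total_dose_e_per_A2": total_dose,
        "dose_per_frame_e_per_A2": total_dose / n_frames,
        "frame_time_s": exposure_s / n_frames,
        # Finest spacing representable at this sampling (two pixels).
        "nyquist_A": 2.0 * pixel_size,
    }

summary = exposure_summary(DOSE_RATE, EXPOSURE_S, N_FRAMES, PIXEL_SIZE_A)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the accumulated dose is about 40 electrons per Å², or about 2 electrons per Å² per frame, and the Nyquist limit of 2.86 Å lies beyond the 3.7 Å overall resolution reported for the map.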
Image processing
MOTIONCORR[51] was used for whole-image drift correction of the movie frames of each micrograph, and contrast transfer function (CTF) parameters of the corrected micrographs were estimated using CTFFIND4 (ref. 52). All subsequent processing steps were done using RELION[53] unless otherwise stated. A subset of ~5000 particles was picked manually, extracted using a 380² pixel box and subjected to reference-free 2D classification. Some of the resulting 2D class averages were low-pass filtered to 20 Å and used as references for automatic particle picking of the whole dataset of 2477 micrographs. The automatically picked particles were screened manually to remove false positives, aggregation and ice contamination, resulting in an initial set of 473,827 particles for reference-free 2D classification. We selected 438,602 particles from good 2D classes for the 3D classification (Extended Data Fig. 1b and 1c), which was run for 25 iterations, using an angular sampling of 7.5°, a regularisation parameter T of 4 and a 60 Å low-pass filtered initial model from our previous reconstruction[16]. A subset of 140,155 particles was selected for the first 3D auto-refinement. Particle-based beam-induced motion correction and radiation-damage weighting (particle polishing) were performed on these particles[54]. Auto-refinement of the polished particles resulted in a reconstruction at 3.7 Å overall resolution with an estimated angular accuracy of 1.1°.
Local resolution analysis by Resmap[55] showed a range of resolution from 3.0 Å in the core to 10 Å in the Arm domain and part of the Head domain, indicating conformational heterogeneity within the complex. As previously observed, the four domains of the structure, particularly the Head and Arm domains, are flexible. 
We employed two classification/refinement approaches: a local approach to improve the local resolution of the domains and a global approach to allow global conformations of the domains relative to one another to be observed (Extended Data Fig. 1c). For the local approach, we used a masked refinement procedure with signal subtraction for each of the Head, Body and Foot domains[23] and a masked classification with signal subtraction followed by a masked refinement for the most flexible Arm domain[23]. Each of the four domains only makes up a third or less of the total mass of the complex. For each domain, we subtracted projections from the remaining three domains of the reconstruction in the experimental particle images using the relative orientation of each experimental image from the last auto-refinement run of all the polished particles. This resulted in four sets of new experimental particle images that only have signal from the domain of interest. For the Body, Foot and Head domains the subtracted experimental images were used in 3D auto-refinement with a soft mask for that domain, yielding 3.6, 3.7 and 4.2 Å reconstructions for the Body, Foot and Head domains, respectively (Extended Data Fig. 1b, 2a-c). The Arm domain is too small for masked refinement. Thus we performed 3D classification on the subtracted experimental images with a mask around the Arm domain and no alignments. We selected three classes with 23,760, 26,367 and 24,627 particles each with three distinct conformations for the Arm domain. Since the Arm domain is too small for accurate alignments of the particles, we refined each of these classes together with the Body domain using a new set of modified experimental particle images that included both the Arm and Body domains by the same subtraction method used previously (Extended Data Fig. 1c, 6a and 6b). Class 1 with 23,760 particles yielded a 4.6Å overall resolution for both Body and Arm domains and 6.2Å resolution for the Arm domain alone. 
Class 2 with 26,363 particles yielded a 4.5 Å overall resolution for both Body and Arm domains and 7.5 Å resolution for the Arm domain alone. Class 3 with 24,627 particles yielded a 4.4 Å overall resolution for both Body and Arm domains and 6.2 Å resolution for the Arm domain alone.
For the global classification/refinement approach, we performed 3D classification of the polished particles for the whole complex with a finer angular sampling of 1.8° and local angular search range of 10°. Two of the sub-classes of 48,945 and 36,824 particles had significantly better angular accuracies and gave 4.2 Å and 4.3 Å reconstructions, respectively, after auto-refinement with more homogeneous conformations of the Head and Foot domains. We observed distinct “Open” and “Closed” conformations for the Head and Foot domains (Extended Data Fig. 6c-e).
All reported resolutions are based on the gold-standard Fourier Shell Correlation (FSC) = 0.143 criterion[56]. FSC curves were calculated using soft spherical masks and high-resolution noise substitution was used to correct for convolution effects of the masks on the FSC curves[57]. Prior to visualization, all maps were corrected for the modulation transfer function of the detector. Local resolution was estimated using Resmap[55].
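For readers unfamiliar with the FSC = 0.143 criterion, the sketch below shows how a Fourier Shell Correlation curve between two independently refined half-maps can be computed and where the threshold is read off. It is a simplified illustration with NumPy, not the RELION implementation used in this work: it assumes cubic maps, applies no mask, and omits the high-resolution noise substitution described above. The function names and the random test volumes are hypothetical.

```python
import numpy as np

def fsc_curve(half1, half2, pixel_size):
    """Fourier Shell Correlation between two cubic half-maps.

    Returns (spatial_freq, fsc): spatial frequency in 1/Å for each shell and
    the normalised correlation of the two maps' Fourier terms in that shell.
    """
    assert half1.shape == half2.shape and len(set(half1.shape)) == 1
    n = half1.shape[0]
    f1 = np.fft.fftn(half1)
    f2 = np.fft.fftn(half2)

    # Integer shell index (distance from the origin) of every Fourier voxel.
    k = np.fft.fftfreq(n) * n
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2 + kz**2)).astype(int)

    n_shells = n // 2                  # keep shells inside the Nyquist sphere
    keep = shell < n_shells
    idx = shell[keep]
    cross = np.bincount(idx, weights=(f1 * np.conj(f2)).real[keep],
                        minlength=n_shells)
    power1 = np.bincount(idx, weights=(np.abs(f1) ** 2)[keep],
                         minlength=n_shells)
    power2 = np.bincount(idx, weights=(np.abs(f2) ** 2)[keep],
                         minlength=n_shells)
    denom = np.sqrt(power1 * power2)
    fsc = np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)
    spatial_freq = np.arange(n_shells) / (n * pixel_size)
    return spatial_freq, fsc

def resolution_at_threshold(spatial_freq, fsc, threshold=0.143):
    """Resolution (Å) of the first shell where FSC drops below threshold."""
    below = np.where(fsc[1:] < threshold)[0]   # skip the zero-frequency shell
    if below.size == 0:
        return None                            # curve never crosses
    return 1.0 / spatial_freq[below[0] + 1]

# Hypothetical test volumes: pure random noise, 32^3 voxels at 1.43 Å/pixel.
rng = np.random.default_rng(0)
half_a = rng.normal(size=(32, 32, 32))
half_b = rng.normal(size=(32, 32, 32))
freq, fsc_identical = fsc_curve(half_a, half_a, pixel_size=1.43)
_, fsc_noise = fsc_curve(half_a, half_b, pixel_size=1.43)
# Identical maps correlate perfectly in every shell, so no crossing is found.
print(resolution_at_threshold(freq, fsc_identical))
```

Two identical maps give an FSC of 1 in every shell, whereas two independent noise volumes give values scattered around 0; real half-maps fall between these extremes, and the reported resolution is the spacing at which the curve first drops below 0.143.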
Model building
Because the maps resulting from local masked refinements have the best local resolution for each of the domains separately, they were used first for de novo model building, guided by our previous protein placements[15]. All model building was performed in COOT[58]. In our medium-resolution structure, except for Brr2 (residues 442-2163), Prp8 (residues 885-2413), Snu13 and the LSm proteins, whose yeast structures are available, all other proteins were human structures, homology models or idealized poly-Ala helices, and only double-stranded RNA helices were modelled. Recently the structure of the ferredoxin domain of yeast Prp3 (residues 335-467) became available[38], which replaced our homology model of this domain. We rebuilt and extended the yeast Brr2 (residues 442-2163), Snu13, Prp8 (residues 885-2413) and Prp3 (residues 335-467) structures, and all remaining components were built de novo, first into the masked refinement maps and then rigid-body fitted into the overall 3.7 Å map. We identified a previously unassigned density as that of Snu66 based on previous yeast two-hybrid studies[44] and its interacting proteins in our structure. The LSm proteins were rigid-body fitted into the overall map and the improved maps of the three classes from masked classification and refinement. Extended Data Table 1 summarises all modelled components of the structure. The model was refined using REFMAC 5.8 (ref. 59) with secondary structure restraints provided by PROSMART[60] and RNA base-pair and stacking restraints provided by LIBG[61]. We first performed model refinement for the Body, Foot and Head domains separately against the corresponding masked refined maps (Extended Data Table 2a). The subunits of these three refined models were rigid-body fitted into the masked all map. To resolve possible clashes at the domain interfaces, we refined this overall model against the overall map. Cross-validation of two half maps defined a REFMAC refinement weight of 0.001. The Xmipp package[62] was used to calculate the FSC of model versus map. FSC curves of model versus map were calculated for the maps of the Body, Foot and Head domains, which were used for model building and refinement of the structure, and also for the overall map (Extended Data Fig. 2d-g). Extended Data Table 2 summarises refinement statistics for the overall structure and the domain structures, together with the deposited maps and their associated coordinates.
Extended Data Table 1
Summary of model building of tri-snRNP components
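| Protein | Total residues | M.W. (Da) | Modeled | Chain name | Local map | Human/S. pombe names |
|---|---|---|---|---|---|---|
| U5 snRNP | | | | | | |
| Prp8 | 2413 | 279,299 | 110-2401 | A | 108-735: Foot; 751-2104: Body; 2147-2401: Head | 220K/Spp42 |
| Brr2 | 2163 | 246,125 | 364-2163 | B | 363-433: Body; 439-2163: Head | 200K/Brr2 |
| Snu114 | 1008 | 114,025 | 102-989 | C | Foot | 116K/Cwf10 |
| Dib1 | 143 | 16,774 | 2-137 | D | Body | 15K/Dim1 |
| SmB | 196 | 22,403 | 4-102 | b | Foot | SmB/SmB |
| SmD3 | 110 | 11,229 | 1-109 | d | Foot | SmD3/SmD3 |
| SmD1 | 146 | 16,288 | 15-108 | h | Foot | SmD1/SmD1 |
| SmD2 | 110 | 12,856 | 4-85 | i | Foot | SmD2/SmD2 |
| SmE | 94 | 10,373 | 4-92 | e | Foot | SmE/SmE |
| SmF | 96 | 9,659 | 12-83 | f | Foot | SmF/SmF |
| SmG | 77 | 8,479 | 2-76 | g | Foot | SmG/SmG |
| U5 snRNA-L | 214 | 68,847 | 4-173 | U | 4-173: Foot; 88-107: Body | |
| U4/U6 snRNP | | | | | | |
| Snu13 | 126 | 13,570 | 3-126 | K | Body | 15.5K/Snu13 |
| Prp31 | 494 | 56,305 | 43-457 | F | Body | 61K/Prp31 |
| Prp3 | 469 | 55,877 | 150-467 | G | Body | 90K/Prp3 |
| Prp4 | 465 | 52,425 | 109-465 | H | Body | 60K/Rna4 |
| SmB | 196 | 22,403 | 4-102 | k | Head | SmB/SmB |
| SmD1 | 146 | 16,288 | 1-118 | l | Head | SmD1/SmD1 |
| SmD2 | 110 | 12,856 | 15-108 | m | Head | SmD2/SmD2 |
| SmD3 | 110 | 11,229 | 4-85 | n | Head | SmD3/SmD3 |
| SmE | 94 | 10,373 | 10-92 | p | Head | SmE/SmE |
| SmF | 96 | 9,659 | 12-83 | q | Head | SmF/SmF |
| SmG | 77 | 8,479 | 2-76 | r | Head | SmG/SmG |
| Lsm2 | 95 | 11,164 | 1-90 | 2 | Arm | Lsm2/Lsm2 |
| Lsm3 | 89 | 10,020 | 3-79 | 3 | Arm | Lsm3/Lsm3 |
| Lsm4 | 172 | 20,304 | 1-90 | 4 | Arm | Lsm4/Lsm4 |
| Lsm5 | 93 | 10,415 | 4-84 | 5 | Arm | Lsm5/Lsm5 |
| Lsm6 | 86 | 9,396 | 11-84 | 6 | Arm | Lsm6/Lsm6 |
| Lsm7 | 115 | 13,010 | 26-105 | 7 | Arm | Lsm7/Lsm7 |
| Lsm8 | 109 | 12,385 | 1-67 | 8 | Arm | Lsm8/Lsm8 |
| U4 snRNA | 160 | 51,390 | 1-152 | V | 1-67: Body; 73-152: Head | |
| U6 snRNA | 112 | 36,088 | 1-112 | ! | 1-26: Foot; 26-88: Body; 108-112: Arm | |
| tri-snRNP specific | | | | | | |
| Prp6 | 899 | 104,234 | 155-898 | J | Body | 102K/Prp1 |
| Snu66 | 587 | 66,426 | 5-560 (poly-Ala) | E | Head | 110K/Snu66 |
| Prp38 | 242 | 27,957 | Not modeled | | | hPrp38/Prp38 |
| Snu23 | 194 | 22,682 | Not modeled | | | hSnu23/Snu23 |
| Spp381 | 291 | 33,764 | Not modeled | | | |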
Extended Data Table 2
Refinement, model statistics and structure/map depositions
a. Statistics of tri-snRNP structure determination
| Data collection | |
|---|---|
| EM | Titan Krios 300kV, K2 Gatan Summit |
| Pixel size (Å) | 1.43 |
| Defocus range (μm) | −0.5 to −3.5 |

| Reconstruction (RELION) | Overall | Body | Foot | Head |
|---|---|---|---|---|
| Accuracy of rotations (°) | 1.13 | 1.15 | 1.73 | 2.42 |
| Accuracy of translations (pixel) | 0.65 | 0.67 | 0.89 | 1.28 |
| Final resolution (Å) | 3.7 | 3.6 | 3.7 | 4.2 |
| Refinement (REFMAC) | | | | |
| Refinement weight | 0.001 | 0.001 | 0.001 | 0.001 |
| Resolution limits (Å) | 3.6 | 3.6 | 3.6 | 3.6 |
| Residue numbers | 9325 | 3728 | 2186 | 2922 |
| Fourier Shell Correlation | 0.75 | 0.85 | 0.82 | 0.6 |
| R-factor (%) | 29.7 | 27.8 | 28.7 | 31.5 |
| Rms bond length (Å) | 0.0078 | 0.0073 | 0.0073 | 0.011 |
| Rms bond angle (°) | 1.27 | 1.33 | 1.38 | 1.4 |
| Ramachandran plot | | | | |
| Favoured | 8066 (91.4%) | 3266 (91.9%) | 1810 (90.9%) | 2531 (89.3%) |
| Allowed | 615 (6.9%) | 237 (6.7%) | 135 (7.2%) | 238 (9.3%) |
| Outliers | 152 (1.7%) | 50 (1.4%) | 37 (1.9%) | 39 (1.4%) |
| Validation by Molprobity | | | | |
| Geometry score (percentile) | 2.52 (98th) | 2.41 (99th) | 2.79 (95th) | 2.62 (97th) |
| Clashscore (percentile) | 7.48 (97th) | 6.78 (100th) | 11.4 (97th) | 6.82 (100th) |
| Good rotamer (%) | 94.8 | 95.7 | 93.5 | 93.2 |
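The translational accuracies in Extended Data Table 2a are given in pixels. As a small worked example, the sketch below converts them to ångströms using the listed pixel size of 1.43 Å and also gives the Nyquist limit implied by that pixel size (twice the pixel size). The variable and function names are hypothetical, and the conversion is plain arithmetic rather than a step reported in the paper.

```python
# Values copied from Extended Data Table 2a.
PIXEL_SIZE_A = 1.43  # angstrom per pixel
translation_accuracy_px = {
    "Overall": 0.65,
    "Body": 0.67,
    "Foot": 0.89,
    "Head": 1.28,
}


def pixels_to_angstrom(value_px, pixel_size=PIXEL_SIZE_A):
    """Convert a length in pixels to angstrom."""
    return value_px * pixel_size


# Finest resolution that this sampling can represent.
nyquist_limit_A = 2 * PIXEL_SIZE_A

if __name__ == "__main__":
    print(f"Nyquist limit: {nyquist_limit_A:.2f} angstrom")
    for domain, accuracy_px in translation_accuracy_px.items():
        accuracy_A = pixels_to_angstrom(accuracy_px)
        print(f"{domain}: {accuracy_A:.2f} angstrom")
```

All reported map resolutions (3.6 to 4.2 Å) lie above the 2.86 Å Nyquist limit, as expected for this sampling.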
Map visualisation
Maps were visualized in Chimera[63] and all figures were prepared using either Pymol (www.pymol.org) or Chimera.
Plasmid Shuffling
Mutations were introduced into PRP8 and SNU114 genes by the dut− ung− methods[64]. The viability of Prp8 and Snu114 mutants was assessed by plasmid shuffling analysis. The Prp8 deletion strain SC261Δ8B1 (ref. 65) carrying wild type PRP8 on pRS316 (URA3, centromeric replication origin) was transformed with mutant Prp8 genes on pRS314 (TRP1, centromeric replication origin) and transformants were selected on plates lacking tryptophan. The Snu114 deletion strain YSNU114KO1 (ref. 66) carrying wild type SNU114 on pRS416 (URA3, centromeric replication origin) was transformed with mutant Snu114 genes on pRS413 (HIS3, centromeric replication origin) and transformants were selected on plates lacking histidine. Trp+ and His+ cells were transferred onto plates containing 5-fluoro-orotic acid (5-FOA), to test cell growth at 30°C after loss of the URA3-marked plasmid. Plasmids were rescued from the 5-FOA-resistant strains and sequenced to confirm the presence of the appropriate mutation, and cell growth was assessed on YEPD plates incubated at various temperatures.
Image processing procedures
a, Representative micrograph. b, Representative 2D class averages obtained from reference-free 2D classification. c, Classification and refinement procedures used in this study.
Local and overall resolutions of tri-snRNP maps
Local resolution estimation by Resmap[55] of a, the overall 3.7 Å map and b, maps of the head, body and foot domains obtained from masked refinements with signal subtraction[23]. c, Gold-standard FSC curves for the overall map and the maps of the head, body and foot domains obtained from masked refinements. Their resolutions are estimated at FSC = 0.143. d, e, f and g, FSC curves of model versus map and cross-validation of model refinement by half-maps for the Body, Foot, Head and Overall maps, respectively. The red curves show FSC between the atomic model and the half-map it was refined against (half1) and the blue curves show FSC between the atomic model and the other half-map (half2), which it was not refined against. The black curves show FSC between the atomic model and the sum map against which the model was refined.
Representative EM density for different components of the map
a, Snu114 in the Foot domain with a bound GTP (magenta). The inset shows the GTP-binding pocket. b, Brr2 in the Head domain with a bound single-stranded region of U4 snRNA. The inset shows the density in the RNA binding tunnel. c, Density for Prp8 large and RNase-like domains. The inset shows the density in the core of Prp8. d, e and f, Prp3, Prp31 and Prp6 densities, respectively, with extended polypeptides.
Secondary structure of the snRNAs in tri-snRNP
a, U4/U6 snRNA; c, U5 snRNA. The colored nucleotides with red, green and blue background were built de novo into our EM density. The region near the ACAGAGA sequence of U6 snRNA forms a stem-loop that was not predicted previously. b, d, Representative EM density for U4/U6 snRNA duplex and U5 snRNA, respectively.
Interactions of Snu114 with guanine nucleotides and the N-terminal domain of Prp8 in the S. cerevisiae U4/U6.U5 tri-snRNP and S. pombe ILS complexes
a, Conformation of the Snu114(Cwf10)-bound GDP refined in the S. pombe ILS spliceosomal complex[19,20] (red, PDB 3JB9), was overlaid on GDPs found in other guanine-nucleotide binding proteins (grey, PDB coordinates: 1DAR, 2E1R, 2WRI, 1Z0I, 5CA8, 1XTQ, 4YLG, 1SF8, 5BXQ). b, Guanine nucleotide refined as GDP in Snu114 of the S. cerevisiae U4/U6.U5 tri-snRNP (blue) is overlaid on GDPs found in the PDB coordinates as in a. c, Conformation of guanine nucleotide refined as GTP in Snu114 of the S. cerevisiae U4/U6.U5 tri-snRNP (blue) agrees well with GTP or GTP analogs in other guanine-nucleotide binding proteins (PDB code: 2BV3, 2DY1, 2J7K, 4YW9, 1ASO, 1LF0 (grey)). d, Superposition of the active site of Snu114-GTP and Cwf10-GDP. e, Superposition of the GDP-bound EF-G (2WRI), GMP-PCP bound EF-G (4JUW) and Snu114 (S. cerevisiae tri-snRNP) active sites. His218 (His78 in EF-G) positions water molecule crucial for GTP hydrolysis. f, Comparison of Prp8N-term domain, Snu114 and U5 snRNA in the S. cerevisiae U4/U6.U5 complex and S. pombe ILS complex. g, Growth of serial dilutions of yeast strains carrying wild-type Snu114, His218Arg or His218Ala Snu114 mutants at different temperatures. Cells were spotted on YPD plates and grown at 14°C for 10 days, 30°C and 37°C for 2 days. h, Growth of serial dilutions of yeast strains carrying wild-type Prp8, Tyr403Phe and Tyr403Ala mutants. Cells were spotted on YPD plates and grown at 14°C for 9 days, 30°C for 3 days. This yeast strain does not survive at 37°C and thus is not shown.
Conformational flexibility of tri-snRNP observed by classification
a, Different conformations of the Arm domain demonstrated by the unsharpened maps of the three major classes (purple, magenta and red) obtained from masked classification of the Arm domain alone followed by masked refinement with the Body and Arm domains. The Body domain was included in the refinement because the arm domain is too small for accurate alignments. b, The sharpened map of one of the three classes with Prp3 and LSm models shown. In the improved domain maps for the Arm domain, extra density for the N-terminal helix of Prp3 could be observed to extend to the LSm proteins. c, The sharpened map of the tri-snRNP and the locations of Snu66 and Prp8. d, The open and closed conformations of the Head and Foot domains of the tri-snRNP observed by global classification. The unsharpened maps for the two major classes obtained from global classification with finer angular sampling (1.8°) followed by 3D auto-refinement are shown. The open and closed states are indicated. e, Superposition of the unsharpened maps of the open (grey) and closed (yellow) states shown in d. The arrows indicate the rotations of the head and foot domains.
Brr2 helicase and its U4/U6 snRNA substrate
a, Domain structure of Brr2 helicase comprising the N-terminal domain and two helicase cassettes. Individual domains of N-terminal helicase cassette (NHC) are colour-coded. b, Extensive interactions of Brr2 with U4/U6 snRNA and Prp3. The single-stranded region of U4 snRNA extending from stem I enters the active site near the β-finger (red). c, 3′ stem of U4 snRNA interacts with the HLH domain of NHC. d, The N-terminal domain (NTD) of Brr2 interacts with a long helix of Prp3 and inserts a loop into U4/U6 Stem II. e, Snu66 has a long extended region that wraps around both helicase cassettes of Brr2.