Junhua Pan1,2, Xinlei Qian3,4,5, Simon Lattmann4, Abbas El Sahili3,4, Tiong Han Yeo3, Huan Jia3,4, Tessa Cressey6, Barbara Ludeke6, Sarah Noton6, Marian Kalocsay7, Rachel Fearns8, Julien Lescar9,10,11. 1. Division of Molecular Medicine, Boston Children's Hospital, Boston, MA, USA. pan@crystal.harvard.edu. 2. Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA, USA. pan@crystal.harvard.edu. 3. School of Biological Sciences, Nanyang Technological University, Singapore, Singapore. 4. NTU Institute of Structural Biology, Nanyang Technological University, Singapore, Singapore. 5. Antimicrobial Resistance Interdisciplinary Research Group, Singapore-MIT Alliance for Research and Technology, Singapore, Singapore. 6. Boston University School of Medicine, National Emerging Infectious Diseases Laboratories, Boston, MA, USA. 7. Laboratory of Systems Pharmacology, Department of Systems Biology, Harvard Medical School, Boston, MA, USA. 8. Boston University School of Medicine, National Emerging Infectious Diseases Laboratories, Boston, MA, USA. rfearns@bu.edu. 9. School of Biological Sciences, Nanyang Technological University, Singapore, Singapore. julien@ntu.edu.sg. 10. NTU Institute of Structural Biology, Nanyang Technological University, Singapore, Singapore. julien@ntu.edu.sg. 11. Antimicrobial Resistance Interdisciplinary Research Group, Singapore-MIT Alliance for Research and Technology, Singapore, Singapore. julien@ntu.edu.sg.
Abstract
Respiratory syncytial virus (RSV) and human metapneumovirus (HMPV) cause severe respiratory diseases in infants and elderly adults[1]. No vaccine or effective antiviral therapy currently exists to control RSV or HMPV infections. During viral genome replication and transcription, the tetrameric phosphoprotein P serves as a crucial adaptor between the ribonucleoprotein template and the L protein, which has RNA-dependent RNA polymerase (RdRp), GDP polyribonucleotidyltransferase and cap-specific methyltransferase activities[2,3]. How P interacts with L and mediates the association with the free form of N and with the ribonucleoprotein is not clear for HMPV or other major human pathogens, including the viruses that cause measles, Ebola and rabies. Here we report a cryo-electron microscopy reconstruction that shows the ring-shaped structure of the polymerase and capping domains of HMPV-L bound to a tetramer of P. The connector and methyltransferase domains of L are mobile with respect to the core. The putative priming loop that is important for the initiation of RNA synthesis is fully retracted, which leaves space in the active-site cavity for RNA elongation. P interacts extensively with the N-terminal region of L, burying more than 4,016 Å² of the molecular surface area in the interface. Two of the four helices that form the coiled-coil tetramerization domain of P, and long C-terminal extensions projecting from these two helices, wrap around the L protein in a manner similar to tentacles. The structural versatility of the four P protomers, which are largely disordered in their free state, demonstrates an example of a 'folding-upon-partner-binding' mechanism for carrying out P adaptor functions. The structure shows that P has the potential to modulate multiple functions of L and these results should accelerate the design of specific antiviral drugs.
The 13 kb non-segmented negative-strand (NS) HMPV RNA genome encodes nine proteins[1]. Following virus entry, N-RNA is released into the host-cell cytoplasm and serves as a template for the viral polymerase to produce sub-genomic capped and polyadenylated viral mRNAs, and for genome replication through an anti-genome intermediate. A cap 1 structure is formed at the 5’ end of viral mRNAs by the capping and MTase domains of L[2,3]. Structures are available for segmented NS virus polymerases such as influenza[4], which acquire a capped primer by “cap snatching”. For polymerases of the Mononegavirales order, which includes several major human pathogens, the structure of the rhabdovirus vesicular stomatitis virus (VSV) L protein was determined first[3] and a crystal structure of the MTase domain of HMPV-L was reported[5].
Co-expressing the L and P proteins (Fig. 1a, b and Extended Data Fig. 1a) from HMPV in Sf9 insect cells yielded a stoichiometric preparation suitable for functional and structural studies (Fig. 1c–e). A primer elongation assay demonstrated that LWT:P complexes had polymerase activity (Fig. 1c). LWT:P complexes were also tested for de novo RNA synthesis activity using a template consisting of a sequence from the 3’ end of the HMPV genome (“le14”). The major band detected in reactions containing LWT:P, all four NTPs and radiolabeled UTP as a tracer was 12 nt in length, consistent with position 3C being the dominant initiation site (Fig. 1d). This product was not detectable in a reaction mix containing only radiolabeled UTP and no other NTPs, or in a reaction containing a chain terminator, confirming that it was the product of de novo initiation at the promoter. LD745A:P, in which Asp745 from the conserved “GDNQ” motif C was mutated to Ala, was devoid of polymerase activity, ruling out the possibility that a contaminating polymerase was responsible for the observed activity. Thus, recombinant L:P is active for RNA polymerization (Fig. 1c–d and Extended Data Fig. 2).
Figure 1.
Overall structure of the HMPV L:P complex.
Domain organization of L (a) and P (b) outlining conserved regions and motifs. Regions of HMPV-P predicted to interact with N0, M2–1, L and RNP, and phosphorylation sites are indicated[11]. (c) Primer elongation assay. Sequences of the 18 nt RNA template and of the 5 nt primer are shown with nascent RNA in blue and radiolabel incorporation sites in red. This experiment was performed a total of four times with two different buffer conditions. (d) RdRp activity assay using the “le14” RNA template. Sequences for the 5’-triphosphorylated 25 nt and 3 nt markers and the 12 nt product are indicated. The 3’dGTP chain terminator is labeled “G”. Radiolabeled UMP is in red. Data are representative of three independent experiments. For gel source data, see Supplementary Fig. 1. (e) Overview of the HMPV-L:P cryo-EM 3D reconstruction. (f) Overview of the L:P atomic model. RdRp: cyan, capping domain: green, phosphoprotein tetramer subunits P1 magenta, P2 hot pink, P3 salmon and P4 pink. Atoms from the “GDNQ” motif in the RdRp and from the “HR” motif in the capping domain are shown as colored spheres. (g) Rotated view of the L:P atomic model.
Extended Data Fig. 1 |
Purification of HMPV L:P and structure determination using cryo-EM
a, Representative size exclusion chromatogram of the L:P complex (these experiments were repeated more than 5 times). Fractions indicated by an arrow were collected and concentrated to 0.85 mg/mL and used for cryo-EM analysis. Inset: SDS PAGE followed by Coomassie blue staining of the purified samples. Also shown: free P protein separated from L:P complex by heparin chromatography (for gel source data, see Supplementary Fig. 1). b, Raw micrograph of HMPV-L:P particles recorded in vitreous ice. Scale bar 10 nm. c, Power spectrum of the image shown in panel (b). We limited the high resolution for fitting to a spatial frequency of 1/5.0 Å and 1/2.9 Å marks the highest spacing to which CTF rings were successfully fit. d, 2D classes and “self-consistency check” for the cryo-EM 3D reconstruction. In each box over the three rows, the upper panel shows one 2D class average, whilst the lower panel shows the corresponding projection from the initial 3D model. e, Local resolution of the cryo-EM density map. Variations in local resolution are color-coded from blue (3.0 Å) to red (5.9 Å), computed with Resmap[39].
f, Fourier Shell Correlation (FSC) of the cryo-EM map as a function of the spatial frequency. The gold standard resolution is 3.7 Å based on the FSC=0.143 criterion, consistent with the model to map correlation (0.5 criterion). g, Example of the electron density map that allowed model building. The region shown is at an interface between the RdRp and capping domain. The map is shown as a gray mesh, contoured at a level of 3 σ. The atomic model is shown as sticks with residues from RdRp colored in cyan (NTD in grey) and in green for the capping domain. h, The region shown is the three-stranded β-sheet at the interface between the RdRp (cyan sticks) and the phosphoprotein (magenta sticks). The map is shown as a gray mesh, contoured at a level of 2.5 σ. We observed a nearly identical structure of the L:P complex in a reconstruction obtained by premixing the L:P complex with fully phosphorylated P, indicating that potential exchange of P affected neither the formation nor the structure of the L:P complex.
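The FSC = 0.143 criterion mentioned in panel f can be illustrated with a short sketch. This is not the authors' processing pipeline: the function name `fsc_resolution` and the linear interpolation between shells are assumptions made for illustration, and any curve passed in would have to come from the user's own half-map comparison.

```python
def fsc_resolution(freqs, fsc, threshold=0.143):
    """Return the resolution (in Å) at which an FSC curve first drops below
    `threshold`, using linear interpolation between neighbouring shells.

    freqs: spatial frequencies in 1/Å, in increasing order (hypothetical input).
    fsc:   FSC values for the same shells.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(freqs)):
        above, below = fsc[i - 1], fsc[i]
        if above >= threshold > below:
            # Fraction of the way from shell i-1 to shell i where FSC == threshold
            frac = (above - threshold) / (above - below)
            f_cross = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None
```

With this convention, a reported 3.7 Å resolution corresponds to the curve crossing 0.143 at a spatial frequency of about 0.27 Å⁻¹.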
Extended Data Fig. 2 |
RdRp activity assay.
a, SDS PAGE of HMPV wild type L:P, LD745A:P purified for RdRp activity assays. Proteins were purified by metal affinity, TEV cleavage of Histidine-tag followed by reverse His-tag affinity purification and size exclusion chromatography. b, Analysis of the 3′ extension activity of HMPV polymerase using the le25 RNA template. Reactions were performed with rNTPs (0.5 mM each of rUTP, rGTP and rCTP), 20 μM rATP and 20 nM [α-32P] rATP. When a 3’-modified le25 (le25[SpC3], three-carbon spacer group linked to the 3’ extremity) was used as a template, synthesis of products greater than 25 nt was greatly reduced compared to le25. When only [α-32P]rATP and no other rNTP was supplied, only a product with size greater than 25 nt was observed. This result shows that the L:P complex was capable of modifying the 3’ terminus of the template, in addition to engaging in de novo initiation at the promoter. The radiolabeled RNA products were visualized by phosphorimaging. Data are representative of three independent experiments. For gel source data, see Supplementary Fig. 1.
We determined the structure of the L:P complex to a resolution of 3.7 Å using cryo-EM (Fig. 1e, Supplementary Video 1, Extended Data Fig. 1). The final 3D reconstruction (Extended Data Fig. 3) allowed us to build an atomic model for residues 8 to 1380 of L and for the bound P tetramer (Fig. 1f–g, Extended Data Fig. 3–5). Residues 1381 to the C-terminus are not visible, suggesting flexibility of the peripheral appendage comprising CR-VI+[5]. A comparison of HMPV-L with VSV-L[3] returns an r.m.s.d. of 2.8 Å, illustrating the preservation of the core L structure during Mononegavirales evolution (Supplementary Fig. 2). This similarity extends to the N-terminal subdomain (NTD) of VSV-L, which is essential for viral transcription (Extended Data Fig. 6)[6]. The RdRp folds into the canonical fingers-palm-thumb subdomains observed in many RNA polymerase structures (Fig. 2a)[3]. Nucleoside inhibitors have been developed targeting RSV and HMPV RdRp activity[7]. The 5’ triphosphate metabolites of ALS-8176/ALS8112 inhibit HMPV and RSV in non-human primates with similar potency and resistant RSV mutants were sequenced[7]. We mapped the locations of RSV-L resistance mutations[7] to helix α29 of the present HMPV-L structure (Extended Data Fig. 7).
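For readers unfamiliar with the r.m.s.d. values quoted when comparing structures, the quantity can be sketched as follows. This is a minimal illustration that assumes the two sets of equivalent Cα coordinates have already been optimally superimposed and paired by a structure-alignment program; the function name `rmsd` and the input layout are hypothetical, and this is not the software used in the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. Å) between
    two equally long lists of (x, y, z) tuples for paired atoms.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

The value depends on which atom pairs are included, which is why each r.m.s.d. in the text is quoted together with the number of residues or atoms compared.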
Extended Data Fig. 3 |
Flow-chart depicting structure determination using cryo-EM.
Please see methods sections for details.
Extended Data Fig. 5 |
Topology of the L:P complex.
Topological depiction of the secondary structure elements of L and P. Helices are depicted as tubes and strands as arrows. The color code is the same as in Fig. 1. The RdRp domain and its subdomains and the capping domain are colored as in Figs. 1 and 2: NTD in grey, finger in blue, palm in red, thumb in dark green and CAP in green. The four subunits of the phosphoprotein P1, P2, P3 and P4 are colored as in Fig. 1. Secondary structure boundaries are indicated.
Extended Data Fig. 6 |
View of the N-terminal domain (NTD).
NTD is displayed as grey ribbons (following Fig. 2a colors), with evolutionary conserved residues clustered near the rNTP entry tunnel playing a role in transcription, represented as sticks and labeled. Colored in lighter grey is the equivalent region of the VSV_L superimposed to HMPV_L.
Figure 2.
Structure of the HMPV L protein.
(a) The RdRp viewed in its “front” orientation. Fingers subdomain: blue; palm: red and thumb: green, NTD: grey. RdRp motifs A-G are shown. (b) The capping domain of L (green ribbons) of HMPV-L. A superposition of HMPV-L with VSV-L[3] (grey ribbons) highlights the difference between conformations of the putative priming loops: putative priming loop of HMPV-L (including Thr1192 of the GxxT motif): red and of VSV-L (1157–1173): gold. P: magenta. (c) RNA capping motifs A’-E’[2]. (d) Overall view of HMPV-L with VSV-L superimposed, highlighting the mobile appendage (displayed in the VSV-L orientation, but disordered in the present structure) and rNTP (S) entry, template (T) and nascent (U) RNA tunnels. (e) Impact of substitutions in aromatic and proline residues in the putative priming loop of RSV-L on RNA production from the +1 and +3 sites. The mean and standard error of three independent experiments are indicated. (f) Sequence conservation in the putative priming loop of L. The four residues that were mutated (panel e) are indicated by arrows. A proline residue (P1186 in HMPV-L) important for an early stage of RNA synthesis is conserved in Pneumoviridae, Paramyxoviridae and Filoviridae but not in Rhabdoviridae.
Extended Data Fig. 7 |
Model for an elongation complex stalled by the addition of ALS-8176 5’-triphosphate.
ALS-8176 5’-triphosphate is a nucleoside triphosphate analog against RSV and HMPV currently in phase 2 clinical trials. The 744GDNQ catalytic motif and positions (A789V, L795I and I796V) of which mutations conferred resistance (identified by passaging RSV) are mapped onto the HMPV-L structure (respectively corresponding to A723, V729 and V730) and displayed as sticks. These conservative mutations probably affect inhibitor binding by inducing a slight repositioning of the helix, due to altered hydrophobic contacts with neighboring helices. Please refer also to the sequence alignment displayed in Supplementary Fig. 2. Protein is colored according to Fig. 2a, and the template and nascent RNA strands according to Fig. 3a.
As in VSV-L, the capping domain of HMPV-L (Fig. 2b), containing the “HR” catalytic motif[2], has a large interface with the RdRp, accounting for the rigidity of the “doughnut”. Arg1264 is exposed to the solvent, due to the mobility of the C-terminal appendage. A feature shared by RdRps that do not need a polynucleotide primer to initiate polymerization is a “priming loop” (Fig. 2c, d) that supports the initiating rNTP[8]. In the influenza polymerase[4], a proline at the tip of a β-hairpin of PB1 plays a key role for replication initiation. In rhabdoviruses (VSV and rabies virus), the corresponding loop is in the capping domain and contains residues important both for terminal de novo initiation and for mRNA capping[10], with a Trp residue in the capping domain essential for de novo initiation[10], indicating it is the priming residue. In sequence alignments, a Tyr residue lies at a similar position in HMPV-L (Tyr1201) and RSV-L (Tyr1276), which could provide base-stacking interactions required for priming (Supplementary Fig. 2). We tested this possibility using the RSV polymerase, which initiates at positions 1 and 3 on its promoter. A Tyr1276Ala substitution did not hamper RSV initiation from either site, but Ala substitution of either Pro1261 or Trp1262, corresponding to Pro1186 and Trp1187 in HMPV-L in motif B’ of the capping domain (Fig. 2c), inhibited an early stage of RNA synthesis within the promoter region (Fig. 2e, f), suggesting that either this Pro or Trp residue might be responsible for priming de novo RNA synthesis. However, we cannot exclude the possibility that the N-terminal region of the loop plays a structural role not directly related to HMPV de novo initiation.
In VSV-L, the priming loop largely occludes the central RNA binding cavity (Fig. 2d), precluding the formation of a long dsRNA duplex and indicating a pre-initiation conformation[3]. The putative priming loop in our structure of HMPV-L retracts completely into a cavity in the capping domain, leaving ample space for several base-pairs of a primer-template dsRNA intermediate (Fig. 2b, d). Modeling an elongation complex in the HMPV L tunnel shows that an RNA duplex can indeed be accommodated in the central cavity (Fig. 3a–d).
Figure 3.
Model for RNA elongation by L:P from Pneumoviridae.
(a) Model for HMPV L:P elongation complex. Template strand: firebrick, nascent strand: orange. Residues from L (Lys307, Arg313, Arg788) and from the C-terminus of subunit P1 (Lys at positions 224, 227, 229, 243, 250, 254 and 256 and Arg241) form a positively-charged arch attracting rNTPs to the rNTP entry tunnel. (b) Cut-out view of L showing electrostatic surfaces (blue: positive and red: negative) and paths followed by the template and nascent RNA strands. (c) Tunnels and interior cavities of L depicted as a white surface enclosed by the L:P complex (ribbons). Cut open are various tunnel openings leading to the exterior. (d) Magnified view of the tunnel that traverses the L protein. Functional motifs A-G and catalytic residues 744GDNQ are indicated.
In the preinitiation state of VSV-L, the connector domain (CD) and MTase domain have a defined orientation with respect to the core, being locked in place by the addition of peptide 35–106 from the VSV-P protein[3]. In HMPV-L, one linker region located approximately at residues 1381–1407 just upstream of the CD and another at residues 1596–1598 before the CR-VI+ domain (Fig. 1a) likely account for their flexibility. Mobility of the C-terminal domain, which has nucleoside 5’-triphosphatase, nucleoside-2’-O- and guanine-N7 MTase activities[5], might be necessary to approach the 5’ end of nascent mRNA for cap addition. Thus, it is possible that N-terminal residues of HMPV-P (which are disordered here) are a latch that keeps the appendage in a fixed orientation for initiation and releases it during RNA elongation.
Clear density corresponding to a coiled coil of four helices that constitute the P oligomerization region[11] allowed tracing of residues 169–266 and 171–236 for the two proximal P subunits P1 and P2 in close contact with L, and of residues 168–219 and 169–231 for the two distal P subunits P3 and P4 (Fig. 1f, g and Fig. 4a). Much of P appears to fold upon binding to L, as shown by the presence of several extended arms typical of large molecular assemblies such as ribosomes or viral particles (Fig. 1, Fig. 4b and Supplementary Video 2). A large molecular surface area (Extended Data Fig. 4) becomes buried in the interaction, mostly involving residues from P1 and P2, while P3 and P4 provide few interactions (Fig. 4a). Previous structural studies mapped the tetramerization and L-binding regions of P to the four parallel α-helices (spanning residues 171–193 of P) that could be visualized using X-ray crystallography[11]. In contrast, the N-terminal M2–1 binding regions of P[12], and the C-terminal regions of P that bind to the RNP were found to be “intrinsically disordered” in the absence of their respective binding partners[11]. While the tetramerization region of P appears as a rigid unit largely unaltered upon binding L, a disorder-to-order transition in the C-terminal regions of the two subunits proximal to L takes place upon binding: residues 194 to 213 of P1 fold into a β-hairpin, such that a three-stranded antiparallel β-sheet is formed with residues 383–391 of the fingers subdomain of L (Fig. 4c). This β-hairpin is followed by a long tentacle-like structure composed of three consecutive α-helices that encircle the rear aperture, through which rNTPs gain access to the active site chamber (Fig. 3a). Residues at the C-terminus of the adjacent coiled-coil-forming helix (from P2) project in the opposite direction towards the capping domain (Fig. 4d) with residues 199–205 and 207–213 making two short α-helices and 215–236 forming an extended helical arm (Fig. 4a, d). Only residues Glu181 and Lys182 from P3 make contact with Lys422 and Leu424 from the fingers subdomain of L and a single contact is seen between L and P4 (Fig. 4d). Finally, the helix-forming residues 215–236 of P2, together with two long and mobile helices (presumably belonging to residues 201–217 and 205–231 of P3 and P4, respectively), form a three-helix bundle largely exposed to solvent. In agreement with the present observations, earlier studies suggested that residues downstream of the coiled-coil region tend to form helices and constitute a “molecular recognition element”[11].
Figure 4.
The P homo-tetramer and L:P interactions.
(a) Interactions between residues from P1 (magenta), P2 (hot pink), P3 (salmon), P4 (pink) of P and the RdRp (cyan). (b) The electron density map (RdRp: cyan, capping domain: green and P: magenta). (c) A β-hairpin of P1 forms an antiparallel β-sheet with residues 383–391 of the fingers subdomain of L (cyan). Polar contacts between P1 and L are indicated by dashes. (d) Interactions between the coiled-coil of P and other parts of P2, P3 and P4 with L. (e) Model for RNA replication by L:P. Genomic viral RNA, extruded from three adjacent N units from the RNP, inserts as a loop of ~27 nts into the RNA entry tunnel and back through the template exit tunnel. Genomic RNA displacement is achieved by three PCTD of phosphoprotein subunits P2, P3 and P4 (while P1 plays a structural role). Concomitantly, four PNTD helices, flexibly linked to the tetramerization domain, bring four new free N0 subunits towards the nascent RNA exit tunnel for encapsidation.
Extended Data Fig. 4 |
Phosphoprotein tetramer in complex with L.
a, The L protein (cyan) is represented as a molecular surface and the tetrameric P protein subunits are represented as ribbons, following the color codes in Figs. 1–3 (P1 in magenta, P2 in hot pink, P3 in salmon and P4 in pink). b, Structures adopted by the four individual P subunits bound to L, colored as a blue to red “rainbow” from the N- to the C- terminal ends. Secondary structures boundaries are noted for each subunit. c and d, Superposition of the tetramerization helices in the context of the L:P complex and the free P protein. Structures are represented as colored ribbons with the free phosphoprotein coiled-coil (PDB access code 4BXT) colored in gray and the four P subunits reported in this work colored according to Fig. 1 (P1 in magenta; P2, hot pink; P3, salmon and P4, pink). The r.m.s.d. of the superimposition is 1.13 Å over 88 α-carbon atoms. e, View of the complex where L and P have been pulled apart to display electrostatic surfaces. f, Overall view of the L:P complex with P shown as ribbons and L as electrostatic surface. The P tetramer consists of subunits P1(magenta), P2 (hotpink), P3 (salmon), and P4 (pink).
Phosphorylation by cellular kinases regulates the transcription modulation activity of VSV-P[13]. Analysis of recombinant HMPV-P using tandem mass-spectrometry (Extended Data Fig. 8) showed that P residues targeted by phosphorylation are largely solvent-exposed and changes in phosphorylation should only affect contacts of P with the RNP and M2–1 protein, not complex formation with L (Fig. 4a).
Extended Data Fig. 8 |
MS2 spectrum of the Ser148 P phosphopeptide.
(a) One MS2 spectrum used for identification of the phosphorylated P peptide 142DALDLLS#DNEEEDAESSILTFEER is displayed. Tandem mass spectrum (top) and deviation (bottom) allowed detection of phosphorylation (symbol #) at site Ser148. Peptides fragmented from the N-terminus (b-fragments) and C-terminus (y-fragments) are colored in blue and red, respectively. (b) y and b ion series m/z identified in the spectrum (a) and their deviation from theoretical m/z are displayed in the Table. The present pattern of phosphorylation agrees with observations showing that phosphorylation of the peptide comprising residues 100–120 (ref 44) of RSV-P, in particular phosphorylation of Thr108 (ref 45) corresponding to Ser148 of HMPV-P (Extended Data Fig. 9), controls its interaction with the M2–1 protein.
In the Mononegavirales, the genome forms a helical nucleocapsid, with RNA bound in a crevice between the two N protein subdomains[14]. P acts as a chaperone by preventing non-specific aggregation of the N protein in the presence of non-viral RNA, preserving a pool of monomeric RNA-free N protein (N0) for RNP assembly[15]. The P-N0 interaction involves an α-helix from HMPV-P (residues 12–28) that inserts into a hydrophobic groove in the CTD of the N0 protein[15]. Here the entire regions N-terminal to the coiled coil of the four P subunits, including the N-terminal α-helix (PNTD), are disordered. The tetrameric coiled-coil appears to be anchored to the L protein at a location such that these helices, at the tips of flexible polar linkers spanning residues 29–135, can bind and deliver N0 to the RNA exit tunnel for encapsidating nascent RNA (Fig. 4e). The amino-acid sequence 29–135 of P (Extended Data Fig. 9) bears physico-chemical properties expected of a linker that evolved to ensure the spatial search of nascent RNA exiting L, by N0 bound to PNTD. In the present structure, the C-terminal regions of P are in non-equivalent positions (Fig. 1f, g). These segments have been proposed to dislodge RNA from the RNP complex and feed it into the template tunnel (Fig. 4e)[15]. Their structural asymmetry probably reflects different roles for the C-terminal domains of each subunit: while the C-terminal end of subunit P2 approaches the RNA template entry tunnel (Fig. 3c) and is well placed to bind N-RNA, the corresponding end of P1 is too distant to play a similar role (Fig. 3a). Moreover, the N-terminus of L (residues 8–24) wraps around the template entry tunnel (Fig. 3c). Thus, in addition to the C-terminal helix of P2, the NTD of L could help displace N to allow naked RNA to feed into the template tunnel (Fig. 3c).
In summary, the tetrameric phosphoprotein binds to the monomeric L and fulfills its various adaptor functions by adopting an asymmetric structure (Fig. 4e) through (i) “conformational switching”, whereby each of the four P subunits folds into a different structure, depending on its position in the molecular assembly, and (ii) spatially confining structurally important N- and C-terminal helical elements (PNTD and PCTD) to the free and RNA-bound N protein respectively, through long flexible regions tethered to the rigidly anchored coiled-coil (Fig. 4e).
Extended Data Fig. 9 |
Structure-based sequence alignment of the phosphoprotein from HMPV (labeled HMPV-A, strain CAN97-83) and other known pneumoviral P proteins:
HMPV-B, human metapneumovirus subgroup B; HRSV-A and B, human respiratory syncytial virus subgroup A and B respectively; BRSV, bovine respiratory syncytial virus; PVM, pneumonia virus of mice; AMPV-A and AMPV-C, avian metapneumovirus subgroup A and C respectively. Sequences accession codes for the alignment HMPV-A: AAQ67693.1 (used in this work), HMPV-B: AAQ67684.1, HRSV-A: AAX23990.1, HRSV-B: AAR14262.1, BRSV: AAL49395.1, PVM: AAW79177.1, AMPV-A: AAT68644.1 and AMPV-C: AAT86110.1.
The secondary structure of HMPV_P subunit P1 (this work) is displayed above the alignment. Phosphorylation sites are highlighted in brown. Positively-charged residues of HMPV_P are shaded in blue, negatively charged residues in magenta and hydrophobic residues 29 to 135 in yellow. The conserved region containing hydrophobic residues critical for L:P interactions is highlighted in green. Structural alignment of P from HMPV and RSV[16] showed similar overall tetramer organization. However, differences are observed in subunit P1 with an r.m.s.d. of 2.24 Å over 82 residues. Although P is in general more mobile with weaker densities and higher B factors compared to L, the region following the beta-hairpin (residues 175–215 in HMPV) does adopt a slightly different conformation compared to RSV P1. Subunit P3 has an r.m.s.d. of 1.94 Å over 45 residues due to a slightly tilted C-terminal helix compared to RSV. Subunit P2 is most similar with an r.m.s.d. of 0.92 Å over 56 residues. Subunit P4 has an r.m.s.d. of 1.33 Å over 47 residues. The eight residues of HRSV-P that are crucial for interacting with HRSV-L, and whose substitutions impair viral replication, are shaded in dark green (data from reference 16). With the exception of Asn189 (HRSV-P) where a deletion is present in HMPV-P, these residues are conserved in HMPV-P and other known pneumoviral P proteins.
While this manuscript was under revision, the structure of RSV L:P was reported[16]. RSV and HMPV L and P proteins have only 48% and 36% overall amino acid identity respectively, but the structures are similar (r.m.s.d. of 1.34 Å over 1501 residues). Moreover, the putative RSV-L priming loop retracts to a similar position as seen here for HMPV-L[16]. In RSV, hydrophobic residues critical for L:P interactions (L216, L223 and L227 of RSV-P) have been shown to be critical for polymerase function[17]. The corresponding region, residues 251–265, of subunit P1 of the HMPV phosphoprotein contains several residues that contact L (Fig. 4c). The overall conservation of interactions between L and P within the family Pneumoviridae suggests the possibility of broad-spectrum inhibitors disrupting the L:P interface[16]. We note that dimeric P proteins from rabies virus, which interact with the C-terminal region of L[18], clearly differ from tetrameric P proteins such as that of HMPV, which bind to the N-terminal part of L. Whether the mode of interactions between L and P observed here for HMPV and for RSV[16] is also found in more distant Mononegavirales viral families such as the Filoviridae must await further structural studies.
Methods
Cloning and expression
HMPV L and P genes (from hMPV isolate CAN97–83, Genbank: AY297749.1)[19], codon-optimized for expression in Spodoptera frugiperda 9 (Sf9) cells, were chemically synthesized and cloned into modified pFastBacDual (Thermo Fisher Scientific) by Bio Basic Inc. (Singapore). HMPV L (protein sequence Genbank: AAQ67700.1) was inserted downstream of a Strep-tag-II followed by a TEV cleavage sequence. Strep-TEV-HMPV L was inserted downstream of the polyhedrin promoter on pFastBacDual. HMPV P (protein sequence Genbank: AAQ67693.1) was inserted downstream of an 8× His-tag sequence, followed by a TEV cleavage sequence. 8×His-TEV-HMPV P was inserted downstream of the p10 promoter on pFastBacDual. We mutated pFastBacDual_HMPV_L:P by plasmid PCR to generate pFastBacDual_HMPV_LD745A:P using forward primer 5’-GACGTGTCCAAGCCTGTGAAGCTGTC and reverse primer 5’-GATGGACTGGTTCGCGCCGTTGAGC. The mutation was confirmed by sequencing. Generation and isolation of bacmids of HMPV_L:P and HMPV_LD745A:P followed the Bac-to-Bac expression system instruction manual (Invitrogen). Viral stocks generated from purified bacmids were amplified and used for protein expression.
Protein purification
The Sf9 cell pellet was resuspended in 1/10 of the culture volume in cold HMPV L:P lysis buffer (20 mM Na Hepes, pH 7.4, 300 mM NaCl, 15 mM imidazole, 1 mM MgCl2, 0.5 mM TCEP) and lysed by sonication. Cell debris was pelleted by centrifugation at 50,000 g for 45 min at 4 °C. Supernatant was incubated with 3 mL pre-equilibrated nickel-NTA beads (Thermo Fisher) per liter of cell pellet for 2 hours at 4 °C. The beads were collected by centrifugation at 500 g for 3 min at 4 °C. The pelleted beads were subjected to washes at 15 mM, 25 mM and 50 mM imidazole with 40, 30 and 20 bed volumes, respectively. Washed beads were transferred into a gravity flow column for elution. The HMPV L:P complex was eluted in HMPV L:P His-trap elution buffer containing 20 mM Na Hepes, pH 7.4, 300 mM NaCl, 500 mM imidazole, 1 mM MgCl2, 0.5 mM TCEP, through gravity flow at 4 °C. Eluted fractions were analyzed using SDS PAGE and stained with InstantBlue (Expedeon) to visualize the HMPV L and P proteins (Extended data Fig. 1a). Fractions containing the HMPV L:P complex were pooled with the addition of 1/10 (mg/ml) of TEV and subjected to dialysis against HMPV L:P lysis buffer with no imidazole at 4 °C overnight. Cleaved HMPV L:P protein was incubated with 1 mL nickel-NTA beads at 4 °C for 1 h. Flow-through and wash fractions containing L:P were pooled and subjected to heparin chromatography: a 5 mL HiTrap Heparin (GE Healthcare) column was equilibrated with 20 mM Na Hepes, pH 7.4, 300 mM NaCl, 1 mM MgCl2, 0.5 mM TCEP, and the protein sample was loaded, washed with the same buffer, and eluted using a buffer containing 20 mM Na Hepes, pH 7.4, 1 mM MgCl2, 1 M NaCl, 0.5 mM TCEP. Fractions containing pure L:P were pooled, concentrated and loaded onto a Superose 6 Increase 10/300 column (GE Healthcare) pre-equilibrated with 20 mM Na Hepes, pH 7.4, 300 mM NaCl, 1 mM MgCl2, 0.5 mM TCEP. Fractions collected were subjected to SDS PAGE and InstantBlue staining.
Fractions containing L:P complex were pooled and concentrated using Vivaspin15R with a 100 kDa cutoff concentrator (Sartorius) to a concentration of 0.85 mg/mL for cryo-EM. The final yield for the L:P complex used for cryo EM was 0.045 mg/L. During the heparin chromatographic step, a large excess of P was observed to flow through the heparin column. This fraction was further purified by size exclusion using a superdex 200 10/300 column (GE healthcare) pre-equilibrated with 20 mM Na Hepes, pH 7.4, 300 mM NaCl, 1 mM MgCl2, 0.5 mM TCEP and the pure P was finally concentrated in a Vivaspin15R concentrator (10 kDa cut-off). The yield of free P was 0.23 mg/L. In order to obtain protein for the polymerization assay, both the wild-type L:P and LD745A:P mutant were purified following the same procedure, but without the heparin chromatographic step. The L protein in the L:P preparations was quantified by separation on SDS-PAGE and detection using colloidal blue staining. The molar concentration of L:P was determined with respect to L.
RdRp activity assays
PAGE-purified RNA oligoribonucleotides representing residues 1–14 (Fig. 1d) or 1–25 (Extended data Fig. 2b) of the le sequence, purchased from Sigma, were used as templates for in vitro RNA synthesis. Purified recombinant HMPV L:P at a concentration of 13 nM was incubated with 1 μM RNA oligonucleotide in a total reaction volume of 20 μL in vitro transcription buffer (50 mM Tris-HCl, pH 7.5, 40 mM KCl, 5.0 mM MgCl2, 0.2 U/μL murine RNase inhibitor, 0.01% (v/v) Triton X-100 and 1 mM DTT). The reactions were pre-equilibrated on ice for 30 min. Transcription reactions were initiated by the addition of rNTPs (1 mM each of rATP, rGTP and rCTP, 20 μM rUTP) and 20 nM [α-32P] rUTP (3,000 Ci/mmol, Perkin Elmer), or as indicated in the legends, and incubated at 30 °C for 3 hours. Reactions were terminated by addition of an equal volume of deionized formamide. After addition of proteinase K, the reaction mixtures were further incubated at 37 °C for 20 min. The reaction products were heat-denatured for 10 min at 95 °C and ¼ of the reaction volume was resolved by denaturing urea PAGE (20% acrylamide, 0.5× TBE). Gels were subsequently dried onto a positively charged Hybond N+ nylon membrane (GE Healthcare) and imaged using a phosphorimager. Data are representative of three independent experiments.
The length of the RNA products of the RdRp activity assay was determined by comparison with 3-nt and 25-nt long 5’-triphospho-oligo-RNAs synthesised in vitro using T7 RNA polymerase and [α-32P] rATP incorporation. Briefly, annealed complementary DNA oligonucleotides representing a class II T7 phage promoter followed by the anticipated 25-nt Tr RNA product sequence were incubated with T7 RNA polymerase (New England Biolabs) and [α-32P] rATP according to the manufacturer’s instructions. To generate the 3-nt long 5’-triphospho-oligo-RNA marker (5’-pppACG), the reaction was performed under the same experimental conditions except that rGTP was substituted with the 3ʹ-dGTP chain terminator.
To characterize the 3′ extension activity of HMPV polymerase on the RNA template, the le25 template was labelled with E. coli poly(A) polymerase (New England Biolabs) in the presence of [α-32P] rATP and an excess of the chain-terminating analogue cordycepin 3′-triphosphate (3′-dATP). A quantity of 1 μM RNA was incubated (37 °C, 60 min) with 0.25 U/μL E. coli poly(A) polymerase in reaction buffer (50 mM Tris-HCl (pH 7.9), 250 mM NaCl, 10 mM MgCl2) supplemented with 20 μM 3ʹ-dATP and 100 nM [α-32P] rATP (3,000 Ci/mmol, Perkin Elmer).
Primer elongation reactions contained 2 μM template (containing a 3´ phosphate group), 20 μM primer (containing a 5´-hydroxyl group) and 20 μM each of rATP, rCTP, rGTP and rUTP, with 60 nM [α-32P] rGTP tracer in a buffer containing 50 mM Tris-HCl, pH 7.4, 8 mM MgCl2, 5 mM DTT and 10% glycerol. Primer elongation was initiated by the addition of HMPV L:P to a concentration of 20 nM in a total reaction volume of 25 μL. Reactions were allowed to proceed for 2 h at 30 °C. After heat inactivation for 3 min at 90 °C, reactions were heat-denatured for 5 min at 95 °C in an equal volume of 10 mM EDTA in deionized formamide with 0.02% each of bromophenol blue and xylene cyanol dyes. A 7 μL aliquot of the 50 μL stopped reaction mix was subjected to denaturing urea PAGE (20% acrylamide, 7 M urea, 1× TBE). The gel was dried onto 3MM chromatography paper and the radiolabelled products were detected by phosphorimager analysis.
Cryo Electron Microscopy
We glow-discharged Quantifoil R 1.2/1.3 holey carbon 200-mesh copper grids (Quantifoil Micro Tools, Großlöbichau, Germany) at 20 mA for 45 seconds. To vitrify the sample, we applied 3.5 μL of the HMPV L:P complex (both diluted to 0.8 mg/mL) to each grid, and plunge-froze in liquid ethane using a Vitrobot Mark IV (FEI) after a 4.0-second blotting (offset -2) with filter papers pre-saturated for 15 minutes under 100% humidity. We screened the sample homogeneity and cryo preparation in an Oxford-style CT3500 cryo-specimen holder (Gatan Inc., Pleasanton, CA) on a liquid-nitrogen-cooled Tecnai F20 electron microscope (FEI, Lausanne, Switzerland), operated at 200 kV and equipped with a K2 Summit direct detector (Gatan Inc., Pleasanton, CA). We screened for thin and clean ice with maximum distribution of non-overlapping protein particles. Sample homogeneity was judged by visual observation and by 2D classification (Extended data Fig. 1). We optimized and prepared two sets of cryo-grids, one with the purified L:P sample and the other with purified L:P supplemented with the free P sample at a 1:0.4 molar ratio.
We automated data acquisition using SerialEM[20] on a K2 Summit detector (Gatan Inc., Pleasanton, CA) mounted on a Tecnai Polara electron microscope (FEI, Lausanne, Switzerland) operated at 300 kV and at a nominal magnification of 31,000×. The calibrated physical pixel size was 1.24 Å, corresponding to a calibrated magnification of 40,323× on the camera sensor plane. We confirmed beam parallelization by collecting a few movies on carbon after coma-free and objective astigmatism alignments and prior to the actual data collection; the Thon rings of carbon images consistently extended beyond 2.9–2.7 Å and were of excellent stigmation (Extended data Fig. 1). For all datasets, we recorded 40 frames/movie in super-resolution mode (0.62 Å/pixel) using a dose rate of 9.2 electrons/physical pixel/second (1.24 Å/pixel), and a total exposure of 48 electrons/Å2. We recorded a total of 7,438 and 1,100 dose-fractionated images for the L:P sample and the L:P supplemented with P sample, respectively.
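The acquisition numbers quoted above are linked by simple arithmetic, which the following sketch makes explicit. It is an illustration only, not part of the acquisition software. The 5 μm detector pixel pitch is an assumption that is not stated in the text; it is chosen because it reproduces the calibrated magnification listed in Extended Data Table 1.

```python
# Values quoted in the text.
PIXEL_SIZE_A = 1.24        # calibrated physical pixel size (Å)
DOSE_RATE_E_PX_S = 9.2     # electrons per physical pixel per second
TOTAL_DOSE_E_A2 = 48.0     # total exposure (electrons per Å^2)
N_FRAMES = 40              # frames per movie

# Assumption (not stated in the text): physical pixel pitch of the detector.
SENSOR_PIXEL_UM = 5.0


def calibrated_magnification(sensor_pixel_um, pixel_size_a):
    """Magnification = detector pixel pitch / specimen pixel size (1 µm = 1e4 Å)."""
    return sensor_pixel_um * 1e4 / pixel_size_a


def dose_rate_per_a2(dose_rate_e_px_s, pixel_size_a):
    """Convert electrons/pixel/s into electrons/Å^2/s."""
    return dose_rate_e_px_s / pixel_size_a ** 2


magnification = calibrated_magnification(SENSOR_PIXEL_UM, PIXEL_SIZE_A)
rate_e_a2_s = dose_rate_per_a2(DOSE_RATE_E_PX_S, PIXEL_SIZE_A)
exposure_s = TOTAL_DOSE_E_A2 / rate_e_a2_s        # implied exposure time per movie
dose_per_frame = TOTAL_DOSE_E_A2 / N_FRAMES       # electrons per Å^2 per frame

print(f"calibrated magnification: {magnification:.2f}x")
print(f"dose rate: {rate_e_a2_s:.2f} e/A^2/s, exposure: {exposure_s:.2f} s")
print(f"dose per frame: {dose_per_frame:.2f} e/A^2")
```

Under these assumptions the implied exposure time is about 8 s per movie, or roughly 0.2 s and 1.2 electrons/Å2 per frame; these derived values are not reported in the text.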
Image processing
We aligned the super-resolution movie frames using MotionCor2[21] with dose filtering and 5 by 5 patched sub-frame alignment; the frames were binned over 2 by 2 super-resolution “pixels” in Fourier space, yielding the physical pixel size of 1.24 Å for the saved micrographs. We binned the motion-corrected micrographs by a factor of 4 (4.96 Å/pixel) and low-pass filtered them to 15 Å for visual examination. We discarded those with thick ice, abnormal defocus, or large carbon areas. For particle picking and 2D classification, we down-sampled the micrographs to 2.48 Å/pixel in Fourier space using resample_mp.exe[22] to facilitate computation. We manually picked particles for 10 representative micrographs using e2boxer.py[23] and used these boxes to train the auto-picking model in crYOLO[24]. We then used the trained model to automatically pick particles from all micrographs with the crYOLO Phosaurus network. We visually reviewed the results and manually improved the automatically picked boxes for the 10 micrographs, while incorporating missed representative micrographs for the next round of training. We repeated the crYOLO training/autopicking procedures until all particles were accurately picked and well centered in all micrographs. The final L:P data contained a total of 5,381,943 particles, while that of L:P supplemented with P contained 606,410 particles.
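Binning "in Fourier space", as used above, amounts to cropping the central part of an image's Fourier transform and transforming back. The sketch below illustrates the idea with NumPy; it is a minimal example and not the implementation used by MotionCor2 or resample_mp.exe. The function name `fourier_crop` is hypothetical.

```python
import numpy as np


def fourier_crop(image, factor):
    """Downsample a 2D image by an integer factor by cropping its Fourier transform."""
    ny, nx = image.shape
    if ny % factor or nx % factor:
        raise ValueError("image dimensions must be divisible by the binning factor")
    oy, ox = ny // factor, nx // factor
    # Centre the zero-frequency term, then keep only the low-frequency block.
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    y0 = ny // 2 - oy // 2
    x0 = nx // 2 - ox // 2
    cropped = spectrum[y0:y0 + oy, x0:x0 + ox]
    # Back to real space; divide by factor^2 so the mean intensity is preserved.
    out = np.fft.ifft2(np.fft.ifftshift(cropped)).real
    return out / factor ** 2


# Pixel sizes from the text: 0.62 Å super-resolution pixels binned 2 by 2.
super_res_pixel = 0.62
binned_pixel = super_res_pixel * 2
print(f"binned pixel size: {binned_pixel:.2f} A")
```

Cropping in Fourier space removes frequencies beyond the new Nyquist limit instead of averaging neighbouring pixels, which avoids the aliasing and the additional attenuation introduced by real-space binning.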
We determined the defocus values and defocus angles of the unbinned (1.24 Å/pixel) micrographs using ctffind4[25]. The Thon ring fitting resolution (reported by ctffind4) of most selected micrographs consistently fell into the 2.9–3.3 Å range (Extended data Fig. 1), indicating good data quality. We performed 2D classification in Relion[26] until all high-quality 2D class images became apparent (within 50 iterations). We only selected the best 2D classes judged by visual examination and by the resolution and other statistics reported by Relion, retaining 880,724 particles for 3D analysis.
To speed up computation, we started the initial 3D analysis with the data downsampled to 2.48 Å/pixel. We generated the ab initio model (3 classes) using Relion, and selected the best class as a reference model for further data analysis. We performed one round of mode 3 alignment for the 2D class images against the 3D initial model in frealign (version 9.11)[22], with CTF correction turned off. We then calculated projection matching using frealign (version 9.11)[22] to cross-validate the 3D model and to confirm the 2D class selection as a self-consistency test. We performed 3D classification directly with 6 classes in Relion until convergence (within 50 iterations). Two classes were clearly of high quality and superimposed well, with a cross-correlation coefficient higher than 99%. We combined these two classes, yielding a total of 437,611 particles. The handedness of the maps was confirmed to be correct as judged by well-defined α-helices. From this point on, we scaled the alignment parameters to 1.24 Å/pixel and used the full-resolution data for further analysis. Relion auto-refinement of these particles produced a map of 4.0-Å resolution. We further improved the resolution to 3.9 Å after refining the per-particle motion in Relion and to 3.8 Å after beam tilt refinement by treating all particles as one group. Refining the defocus of individual particles did not yield obvious improvement. We confirmed convergence of motion correction by another round of per-particle motion refinement, which yielded no apparent improvement. Nonetheless, the quality of this map at 3.8-Å resolution was excellent, as judged by main-chain continuity and side-chain densities (Extended data Fig. 1g, h). We further classified the 437,611 particles into six classes and discarded four junk classes, yielding a total of 302,346 particles.
We auto-refined the two good classes individually or combined and again they overlapped nearly perfectly. We then gave the data more weights by adjusting the T number in Relion and later performed masked alignment, while correcting FSC for effects of the solvent mask. We visually examined the results and found out that a T number of 4 produced the best map without apparent signs of over-fitting. This combined with another round of per-particle motion refinement, beam tilt refinement, and per particle defocus refinement eventually yielded a map at 3.7 Å. In an effort to improve the density associated with P, we generated masks for P and L using Chimera segger[27] and performed multi-body refinement using Relion, with the orientation of P relative to L fixed. We also subtracted the L signal from the 2D images, excised P from the 2D images, and performed 3D classification without alignment. Neither approach improved the P density, perhaps due to insufficient signal considering the small size of P. The final map was at 3.7 Å resolution and was sharpened with a B factor value of −188.25 Å2. The procedure is summarized in Extended Data Fig. 3.
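The resolution estimates quoted in this section rest on the Fourier shell correlation (FSC) between two independently refined half-maps. As a minimal sketch of that calculation (not Relion's implementation, and without the solvent-mask correction mentioned above), the FSC in each shell is the normalised real part of the cross-spectrum. The function names and the cubic-box requirement are assumptions made for this illustration.

```python
import numpy as np


def fsc_curve(half1, half2):
    """Fourier shell correlation between two cubic maps of identical size."""
    n = half1.shape[0]
    f1 = np.fft.fftn(half1)
    f2 = np.fft.fftn(half2)
    # Integer frequency index along each axis, then radial shell index per voxel.
    freq = np.fft.fftfreq(n) * n
    kz, ky, kx = np.meshgrid(freq, freq, freq, indexing="ij")
    shell = np.rint(np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)).astype(int)
    fsc = np.zeros(n // 2)
    for s in range(n // 2):
        sel = shell == s
        num = np.real(np.sum(f1[sel] * np.conj(f2[sel])))
        den = np.sqrt(np.sum(np.abs(f1[sel]) ** 2) * np.sum(np.abs(f2[sel]) ** 2))
        fsc[s] = num / den if den > 0 else 0.0
    return fsc


def resolution_at_threshold(fsc, pixel_size, box, threshold=0.143):
    """Resolution (Å) of the first shell whose FSC falls below the threshold."""
    below = np.where(fsc[1:] < threshold)[0]
    if below.size == 0:
        return 2.0 * pixel_size          # never drops: report the Nyquist limit
    shell_index = below[0] + 1           # +1 because shell 0 was skipped
    return box * pixel_size / shell_index
```

With the 1.24 Å pixel size used here, a curve that stays above 0.143 out to Nyquist would report 2.48 Å; the 3.7 Å value in the text corresponds to the FSC crossing the threshold at a lower spatial frequency.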
We followed the same data processing paradigm with minor adjustments for the dataset of the L:P preparation supplemented with P. The classification results were consistent with the L:P dataset, leaving 273,574 and 65,398 particles for the good classes after 2D classification and a first round of 3D classification, respectively. The final map after a second round of 3D classification was at 4.1-Å resolution from 29,376 particles and was nearly identical to the map from the L:P dataset. Hence, we mainly used the map from the L:P dataset for map interpretation, except for two less well-defined C-terminal helices of two P subunits (P3 and P4).
Model building and refinement
We built the atomic model using the graphics programs O[28] and Coot[29]. First, we fitted the RdRp and capping enzymatic domains of the VSV-L model (PDB access code: 5A22) into the map as well as the coiled-coil domain of P (PDB access code: 4BXT) and then used these as a guide for finding main-chain connectivity. Local resolution analysis indicated that the majority of the core portion of the complex falls within a resolution range of 3.0–3.6 Å, with excellent main-chain connectivity and side-chain densities throughout the molecules (Extended Data Fig. 1g, h). The majority of the electron density showed excellent continuity, giving confidence for tracing the α-carbon chain. Moreover, in most regions, clear electron density for side chains allowed us to unambiguously assign the sequence register. We only observed weak and disordered extra density accounting for a portion of the CD and CRVI+ domains in maps sharpened with lower B factors and/or low-pass filtered with lower spatial frequency, suggesting that these domains are highly flexible in the complexes we examined. The appendage of HMPV-L might become fixed with respect to the ring-shaped core domain upon binding to a peptide encompassing residues 35–106 of the VSV-P used in the VSV-L study. Indeed, a recent study of the rabies virus L protein in complex with the counterpart of this P peptide demonstrates such contacts between P and L (Josh Horwitz and Stephen C. Harrison, personal communication). Although there were some stretches of weaker densities for the P protein, long side chains of arginine and lysine residues ensured the correct polarity and register assignment of the sequence. We further confirmed that the sequence was in register by comparing the model with the coiled-coil interface of the known structure of the phosphoprotein tetramerization region (PDB code: 4BXT).
Overall, we traced the entire NTD subdomain and RdRp domain spanning residues 8–892, except for mobile residues 1–7 (predicted to be disordered using the PrDOS prediction server http://prdos.hgc.jp/cgi-bin/top.cgi), 607–625 (603–625 are predicted to be disordered), the complete capping enzyme domain (residues 893–1380) and the four P subunits (residues 169–266, 171–236, 168–219, and 169–193/204–231 of subunits P1, P2, P3 and P4, respectively). Residues 194–203 of subunit P4 are disordered. While the sequences could be assigned unambiguously for P1 and P2 residues as well as for the coiled coil region of P3 and P4, the sequence assignment for the C-terminal ends of P3 and P4 that are predicted to form helices is less certain. These segments might become fully ordered only in the presence of the ribonucleoprotein (see main text). Model building was interspersed with refinement using the real space refinement option in the package PHENIX[30] (rigid_body, minimization_global, local grid searching and ADP) for the coordinates and B factors. We applied Ramachandran, rotamer and secondary structure restraints throughout the refinement, in addition to the standard geometry and B factor restraints. Over 40 rounds of manual rebuilding and phenix real space refinement were carried out until we had high confidence of the sequence register, density fitting, model geometry and local chemical environments. We analyzed the final model with MolProbity[31] and statistics are given in Extended Data Table 1. We deposited the refined coordinates to wwPDB (PDB access code: 6U5O) and the electron density map to EMDB (EMDB access code: EMD-20651).
Extended Data Table 1 | Cryo-EM data collection, structure refinement and model statistics.
HMPV L:P complex (EMDB-20651) (PDB 6U5O)
Data collection and processing
Magnification (nominal/calibrated): 31,000/40,322.58
Voltage (kV): 300
Electron exposure (e-/Å2): 48
Defocus range (μm): 1.5–3.0
Pixel size (Å): 1.24
Symmetry imposed: C1
Initial particle images (no.): 5,381,943
Final particle images (no.): 302,346
Map resolution (Å): 3.7
FSC threshold: 0.143
Map resolution range (Å): ∞–3.7
Refinement (a)
Initial model used (PDB code): -
Model resolution (Å): 3.7
FSC threshold: 0.5
Model resolution range (Å): ∞–3.7
Map sharpening B factor (Å2): −188.25
Model composition
Non-hydrogen atoms: 12,977
Protein residues: 1,623
Ligands: -
B factors (Å2)
Protein: 15.62 (5.27–52.43)
Ligand: -
R.m.s. deviations
Bond lengths (Å): 0.004
Bond angles (°): 0.717
Validation (b)
MolProbity score (c): 1.31 (98th percentile)
Clashscore: 2.13
Poor rotamers (%): 0.00
Ramachandran plot
Favored (%): 100.00
Allowed (%): 0.00
Disallowed (%): 0.00
(a) Refinement statistics were obtained with the program Phenix (phenix.real_space_refinement) (ref 30) except where otherwise noted.
(b) The MolProbity score was obtained from the MolProbity online server (100th percentile is the best among structures within the resolution specified) and the present structure is in the 98th percentile (N = 27,675; 0–99 Å).
(c) The assignment of Pro-1041 as a cis-proline is supported by electron density showing a hydrogen bond between the carbonyl oxygen atom of Thr-1040 and the N atom of Gln-1353.
Modeling of the polymerase: RNA elongation complex
We fitted a rotavirus in situ elongation complex (PDB ID 6OJ3) to the HMPV L:P complex by superimposing the CA atoms of the 5 functional polymerase motifs in the program O[28] and refined the alignment using the O command lsq_improve. The relatively long RNA allowed us to easily identify the NTP entry, template entry, template exit, and transcript exit tunnels in the HMPV L structure. Modeling using elongation complexes of other polymerases, such as influenza virus PB1 (PDB code: 6QCT), HCV NS5B (PDB code: 4WTG) and the reovirus λ3 elongation complex (PDB code: 1N35), confirmed the results.
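Superimposing matched CA atoms, as done here in O, is a least-squares problem. The sketch below uses the Kabsch algorithm with NumPy as a generic illustration; it is not the algorithm or code of the O commands named above, and the function name `kabsch_superpose` is hypothetical. It assumes two equally long arrays of matched coordinates.

```python
import numpy as np


def kabsch_superpose(mobile, target):
    """Least-squares rotation and translation mapping `mobile` onto `target`.

    Both inputs are (N, 3) arrays of matched atom coordinates.
    Returns (rotation, translation, rmsd).
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_centroid = mobile.mean(axis=0)
    target_centroid = target.mean(axis=0)
    p = mobile - mobile_centroid
    q = target - target_centroid
    # Covariance matrix and its singular value decomposition.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = target_centroid - rotation @ mobile_centroid
    fitted = mobile @ rotation.T + translation
    rmsd = np.sqrt(np.mean(np.sum((fitted - target) ** 2, axis=1)))
    return rotation, translation, rmsd
```

The same calculation underlies the r.m.s.d. values quoted for superimposed structures, once the set of matched atoms has been chosen.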
Tunnel identification
We identified the tunnels in the structure using VOIDOO, MAMA and MAPMAN of the Uppsala Software Factory[34]. We tested values for the probe radius in the range of 1.4–2.0 Å, and 1.8 Å yielded the most accurate tunnels. We generated pseudo-atoms to define the range of the calculation and displayed the tunnel along with the models in PyMOL (https://pymol.org). We performed object-specific clipping of the tunnel mask using PovRay (http://www.povray.org) to display its interior along with the modeled RNA.
Tandem Mass spectrometry
We performed tandem mass spectrometry data collection and analysis (Extended Data Fig. 8) for both the purified L:P complex and free P following a standard protocol[35]. Briefly, we digested 2 μg of each sample overnight at 37 °C with trypsin (1:100 enzyme:sample molar ratio). Digests were carried out in 200 mM EPPS pH 8.5 in the presence of 2% acetonitrile (v/v) with trypsin (Promega #V5111). After peptide purification by reversed-phase C18 chromatography, 1 μg of peptides was separated with a 4-hour acetonitrile gradient by HPLC prior to injection. Data were collected using an Orbitrap Lumos mass spectrometer (Thermo Fisher Scientific) coupled to a Proxeon EASY-nLC 1200 liquid chromatography (LC) system (Thermo Fisher Scientific). The 75 μm inner diameter capillary column used was packed with C18 resin (Accucore 2.6 μm). The scan sequence started with collection of an MS1 spectrum (Orbitrap analysis; resolution 120,000; mass range 400–1400 Th). MS2 analysis followed collision-induced dissociation (CID, CE=35) with a maximum ion injection time of 250 ms and an isolation window of 1.2 Da. Neutral loss of phosphate was taken into account by using a multi-stage activation (MSA) method with a neutral loss of 97.9763 Da. Peptides were searched with a SEQUEST-based algorithm against an insect database containing the L and P protein sequences as well as common contaminants. Searches were performed with a target-decoy database strategy and a false discovery rate (FDR) of 1% set for peptide-spectrum matches, followed by linear discriminant analysis (LDA) filtering and a final collapsed protein-level FDR of 1%. We dynamically searched for oxidized methionine (+15.9949146221 Da) and phospho-serine, -threonine and -tyrosine (+79.9663304104 Da). Phosphorylation site localization used a modified Anova score (ModScore). We repeated the experiments twice and the results were consistent.
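The target-decoy strategy mentioned above estimates the FDR at a given score cutoff from the number of decoy hits that pass it. The following sketch shows that idea for peptide-spectrum matches only; it is a simplified illustration, not the SEQUEST-based pipeline, and it omits the LDA filtering and protein-level collapsing described in the text. The function name and the example scores are hypothetical.

```python
def score_threshold_for_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    `psms` is a list of (score, is_decoy) pairs; higher scores are better.
    The FDR at a cutoff is estimated as decoys / targets among accepted matches.
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda item: item[0], reverse=True)
    targets = 0
    decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff


# Hypothetical scores: 50 high-scoring targets, one decoy, then 10 weaker targets.
example_psms = (
    [(100 - i, False) for i in range(50)]
    + [(49.5, True)]
    + [(40 - i, False) for i in range(10)]
)
strict_cutoff = score_threshold_for_fdr(example_psms, max_fdr=0.01)
loose_cutoff = score_threshold_for_fdr(example_psms, max_fdr=0.05)
```

In this made-up example the single decoy limits the 1% cutoff to the 50 top-scoring targets, whereas a 5% limit would admit all 60.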
RSV priming loop experiments
Mutations in codon-optimized RSV L (strain A2) were generated in a fragment of the L ORF subcloned into a pGEM T easy vector (Promega). The W1262A substitution was generated by site-directed mutagenesis by Genscript; the P1261A, P1274A and Y1276A substitutions were generated using Q5 site-directed mutagenesis (NEB). The P1261A substitution was generated using forward primer 5′-ACCTACTAAGGCCTGGGTCGG-3′ and reverse primer 5′-CCCCTCTCGCCCCTGGTC-3′. The P1274A substitution was generated using forward primer 5′-CCAGGAAAAAAAGACCATGGCCGTCTACAACAGG-3′ and reverse primer 5′-GTGGAGGAGCCGACCCAGGGCTTAGTAGG-3′. The Y1276A substitution was generated using forward primer 5′-CATGCCCGTCGCCAACAGGCAG-3′ and reverse primer 5′-GTCTTTTTTTCCTGGGTG-3′. The mutated ORF fragments were sequenced and then substituted into the codon-optimized L ORF contained in a T7 expression vector (pTM1) by restriction digest and ligation. All plasmids were sequenced to confirm the presence of mutations, and to ensure that no other changes were introduced during PCR-amplification steps. BSR-T7 cells, a BHK-derived cell line that constitutively expresses T7 polymerase[46] (provided by Dr Karl Conzelmann), in six-well dishes were transfected with pTM1 plasmids expressing the following proteins: RSV (A2 strain) N (400 ng/well), P (200 ng/well), M2–1 (100 ng/well), and L (or L mutant; 100 ng/well), and a plasmid expressing a minigenome RNA containing nt 1–36 of the trailer (tr) promoter at its 3´ end (200 ng/well). The minigenome was limited to synthesizing only the first strand of replication by the lack of a promoter-specific sequence at the 3´ end of the replicative intermediate. The cells were transfected using Lipofectamine 2000 (Invitrogen) according to the manufacturer’s instructions.
At 40–48 h post transfection, cells were harvested and RNA was purified using Trizol, according to the manufacturer’s instructions, except that the RNA was subjected to a further round of purification by extraction with phenol-chloroform, followed by ethanol precipitation. RNA products initiated within the tr promoter were analyzed by primer extension. RNA representing one-tenth of a well of cells was subjected to reverse transcription at 37 °C using the Sensiscript RT kit (Qiagen) and a radiolabeled primer (5´-TACGAGATATTAGTTTTTGAGAC-3´). One half of the primer extension reaction was subjected to electrophoresis in 8% polyacrylamide gels containing 7 M urea in 1× TBE. Radiolabeled oligonucleotides corresponding in sequence to cDNA representing initiation from positions +1 and +3 of the tr promoter were used as markers (+1, 5´-TACGAGATATTAGTTTTTGAGACTTTTTTTCTCGT-3´; +3, 5´-TACGAGATATTAGTTTTTGAGACTTTTTTTCTC-3´). The same RNA samples, RNA representing one-tenth of a well of cells, were also subjected to denaturing gel electrophoresis on a 1.5% formaldehyde-agarose gel and transferred to nitrocellulose by northern blotting. The input minigenome template was detected using a radiolabeled strand-specific riboprobe. Primer extension products were visualized by autoradiography, and the primer extension products and input minigenome were quantified using phosphorimager analysis. The levels of input minigenome were used to normalize for transfection efficiency. The data were normalized to the mean of the two WT samples included in each experiment.
With the caveat that the sample sizes were an n of 3, statistical analysis using ANOVA showed that the levels of +1 and +3 RNA products for the P1261A and W1262A L variants were significantly different than those from wt L (p values of 0.002 and 0.003, respectively for +1; 0.004 for both variants for +3), whereas the levels of RNA produced from P1274A and Y1276A L variants were not statistically different (p values of 0.0385 and 0.549, respectively for +1; 0.227 and 0.518, respectively for +3).
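The two-step normalization described above (first to the input minigenome, then to the mean of the WT samples in the same experiment) can be sketched as follows. This is an illustration of the arithmetic only; the function name, sample labels and band intensities are hypothetical and are not data from this study.

```python
def normalize_to_wt(product_signal, input_signal, wt_samples):
    """Normalize primer-extension signals to input template and to the WT mean.

    product_signal, input_signal: dicts mapping sample name -> band intensity.
    wt_samples: names of the wild-type samples included in the experiment.
    """
    # Step 1: correct each sample for the amount of input minigenome.
    ratios = {name: product_signal[name] / input_signal[name] for name in product_signal}
    # Step 2: express every ratio relative to the mean of the WT samples.
    wt_mean = sum(ratios[name] for name in wt_samples) / len(wt_samples)
    return {name: ratio / wt_mean for name, ratio in ratios.items()}


# Hypothetical band intensities (arbitrary units), for illustration only.
product = {"WT_1": 1000.0, "WT_2": 1200.0, "P1261A": 300.0}
minigenome = {"WT_1": 500.0, "WT_2": 500.0, "P1261A": 600.0}
normalized = normalize_to_wt(product, minigenome, ["WT_1", "WT_2"])
```

By construction the WT samples average to 1, so each mutant value reads directly as a fraction of WT activity within its experiment.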
Figure preparation
We prepared the figures and videos using PyMOL (Schrödinger LLC), Chimera[27], python matplotlib, gnuplot, CCP4 TopDraw[36], and PovRay (http://www.povray.org). Multiple sequence alignment was performed with Clustal Omega[37] (https://www.ebi.ac.uk/Tools/msa/clustalo/) and the alignment displayed with ESPript[38].
Data availability
Structure coordinates are available from the RCSB Protein Data Bank (PDB) under accession code 6U5O and the electron density map from the Electron Microscopy Data Bank (EMDB) under accession code EMD-20651. All other data generated or analyzed in this work are available from the corresponding authors upon reasonable request.
Purification of HMPV L:P and structure determination using cryo-EM
a, Representative size exclusion chromatogram of the L:P complex (these experiments were repeated more than 5 times). Fractions indicated by an arrow were collected and concentrated to 0.85 mg/mL and used for cryo-EM analysis. Inset: SDS PAGE followed by Coomassie blue staining of the purified samples. Also shown: free P protein separated from the L:P complex by heparin chromatography (for gel source data, see Supplementary Fig. 1). b, Raw micrograph of HMPV-L:P particles recorded in vitreous ice. Scale bar 10 nm. c, Power spectrum of the image shown in panel (b). We limited the high resolution for fitting to a spatial frequency of 1/5.0 Å, and 1/2.9 Å marks the highest spacing to which CTF rings were successfully fit. d, 2D classes and “self-consistency check” for the cryo-EM 3D reconstruction. In each box over the three rows, the upper panel shows one 2D class average, whilst the lower panel shows the corresponding projection from the initial 3D model. e, Local resolution of the cryo-EM density map. Variations in local resolution are color-coded from blue (3.0 Å) to red (5.9 Å), computed with ResMap[39].
f, Fourier Shell Correlation (FSC) of the cryo-EM map as a function of the spatial frequency. The gold standard resolution is 3.7 Å based on the FSC=0.143 criterion, consistent with the model to map correlation (0.5 criterion). g, Example of the electron density map that allowed model building. The region shown is at an interface between the RdRp and capping domain. The map is shown as a gray mesh, contoured at a level of 3 σ. The atomic model is shown as sticks with residues from RdRp colored in cyan (NTD in grey) and in green for the capping domain. h, The region shown is the three-stranded β-sheet at the interface between the RdRp (cyan sticks) and the phosphoprotein (magenta sticks). The map is shown as a gray mesh, contoured at a level of 2.5 σ. We observed a nearly identical structure of the L:P complex in a reconstruction obtained by premixing the L:P complex with fully phosphorylated P, indicating that potential exchange of P affected neither the formation nor the structure of the L:P complex.
RdRp activity assay.
a, SDS PAGE of HMPV wild-type L:P and LD745A:P purified for RdRp activity assays. Proteins were purified by metal affinity, TEV cleavage of the histidine tag followed by reverse His-tag affinity purification, and size exclusion chromatography. b, Analysis of the 3′ extension activity of HMPV polymerase using the le25 RNA template. Reactions were performed with rNTPs (0.5 mM each of rUTP, rGTP and rCTP), 20 μM rATP and 20 nM [α-32P] rATP. When a 3’-modified le25 (le25[SpC3], three-carbon spacer group linked to the 3’ extremity) was used as a template, synthesis of products greater than 25 nt was greatly reduced compared to le25. When only [α-32P] rATP and no other rNTP was supplied, only a product with size greater than 25 nt was observed. This result shows that the L:P complex was capable of modifying the 3’ terminus of the template, in addition to engaging in de novo initiation at the promoter. The radiolabeled RNA products were visualized by phosphorimaging. Data are representative of three independent experiments. For gel source data, see Supplementary Fig. 1.
Flow-chart depicting structure determination using cryo-EM.
Please see the Methods section for details.
Phosphoprotein tetramer in complex with L.
a, The L protein (cyan) is represented as a molecular surface and the tetrameric P protein subunits are represented as ribbons, following the color codes in Figs. 1–3 (P1 in magenta, P2 in hot pink, P3 in salmon and P4 in pink). b, Structures adopted by the four individual P subunits bound to L, colored as a blue to red “rainbow” from the N- to the C-terminal ends. Secondary structure boundaries are noted for each subunit. c and d, Superposition of the tetramerization helices in the context of the L:P complex and the free P protein. Structures are represented as colored ribbons with the free phosphoprotein coiled-coil (PDB access code 4BXT) colored in gray and the four P subunits reported in this work colored according to Fig. 1 (P1 in magenta; P2, hot pink; P3, salmon and P4, pink). The r.m.s.d. of the superimposition is 1.13 Å over 88 α-carbon atoms. e, View of the complex where L and P have been pulled apart to display electrostatic surfaces. f, Overall view of the L:P complex with P shown as ribbons and L as electrostatic surface. The P tetramer consists of subunits P1 (magenta), P2 (hot pink), P3 (salmon) and P4 (pink).
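For readers unfamiliar with the r.m.s.d. quoted for panels c and d, the sketch below shows how such a value is computed from two sets of paired α-carbon coordinates that have already been superimposed. The function name and the toy coordinates are assumptions for illustration; the optimal superposition step itself, and the software used for it in this work, are not reproduced here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. Å)
    between two equally long lists of (x, y, z) tuples, paired by index.
    The coordinate sets are assumed to be superimposed already."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example (made-up coordinates): every atom is displaced by 1 Å along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(set_a, set_b))
```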
Topology of the L:P complex.
Topological depiction of the secondary structure elements of L and P. Helices are depicted as tubes and strands as arrows. The color code is the same as in Fig. 1. The RdRp domain and its subdomains and the capping domain are colored as in Figs. 1 and 2: NTD in grey, finger in blue, palm in red, thumb in dark green and CAP in green. The four subunits of the phosphoprotein P1, P2, P3 and P4 are colored as in Fig. 1. Secondary structure boundaries are indicated.
View of the N-terminal domain (NTD).
The NTD is displayed as grey ribbons (following Fig. 2a colors), with evolutionarily conserved residues clustered near the rNTP entry tunnel and playing a role in transcription represented as sticks and labeled. Colored in lighter grey is the equivalent region of VSV_L superimposed onto HMPV_L.
Model for an elongation complex stalled by the addition of ALS-8176 5’-triphosphate.
ALS-8176 5’-triphosphate is a nucleoside triphosphate analog against RSV and HMPV currently in phase 2 clinical trials. The 744GDNQ catalytic motif and the positions at which mutations conferred resistance (A789V, L795I and I796V, identified by passaging RSV) are mapped onto the HMPV-L structure (corresponding to A723, V729 and V730, respectively) and displayed as sticks. These conservative mutations probably affect inhibitor binding by inducing a slight repositioning of the helix, due to altered hydrophobic contacts with neighboring helices. Please refer also to the sequence alignment displayed in Supplementary Fig. 2. Protein is colored according to Fig. 2a, and the template and nascent RNA strands according to Fig. 3a.
MS2 spectrum of the Ser148 P phosphopeptide.
(a) One MS2 spectrum used for identification of the phosphorylated P peptide 142DALDLLS#DNEEEDAESSILTFEER is displayed. Tandem mass spectrum (top) and deviation (bottom) allowed detection of phosphorylation (symbol #) at site Ser148. Peptides fragmented from the N-terminus (b-fragments) and C-terminus (y-fragments) are colored in blue and red, respectively. (b) The m/z values of the y and b ion series identified in the spectrum in (a) and their deviations from theoretical m/z are displayed in the table. The present pattern of phosphorylation agrees with observations showing that phosphorylation of the peptide comprising residues 100–120 (ref 44) of RSV-P, in particular phosphorylation of Thr108 (ref 45), which corresponds to Ser148 of HMPV-P (Extended Data Fig. 9), controls its interaction with the M2–1 protein.
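To make the b and y ion series in panel (b) concrete, the following sketch computes theoretical m/z values for singly charged, monoisotopic b and y fragments of the peptide, treating `#` as a phosphate group (+79.96633 Da) on the preceding residue. It assumes no other modifications or labels and a charge state of 1+, which may not match the acquisition and search settings used in this work; the function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PHOSPHO = 79.96633   # HPO3 added by phosphorylation
PROTON = 1.007276
WATER = 18.010565


def residue_masses(peptide):
    """Convert a sequence such as 'DALS#D' into per-residue masses,
    adding PHOSPHO to the residue that precedes each '#'."""
    masses = []
    for ch in peptide:
        if ch == "#":
            if not masses:
                raise ValueError("'#' must follow a residue")
            masses[-1] += PHOSPHO
        else:
            masses.append(MONO[ch])
    return masses


def by_ions(peptide):
    """Return (b, y): lists of singly charged m/z values b1..b(n-1), y1..y(n-1)."""
    masses = residue_masses(peptide)
    b, running = [], 0.0
    for m in masses[:-1]:            # N-terminal fragments
        running += m
        b.append(running + PROTON)
    y, running = [], 0.0
    for m in reversed(masses[1:]):   # C-terminal fragments
        running += m
        y.append(running + WATER + PROTON)
    return b, y


b_series, y_series = by_ions("DALDLLS#DNEEEDAESSILTFEER")
for i, (b, y) in enumerate(zip(b_series, y_series), start=1):
    print(f"b{i}: {b:.4f}   y{i}: {y:.4f}")
```

In this scheme the phosphorylation site shows up as a step of 87.03203 + 79.96633 Da between consecutive fragments spanning Ser148 (b6 to b7 for this peptide), rather than the 87.03203 Da expected for an unmodified serine.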
Structure-based sequence alignment of the phosphoprotein from HMPV (labeled HMPV-A, strain CAN97-83) and other known pneumoviral P proteins:
HMPV-B, human metapneumovirus subgroup B; HRSV-A and B, human respiratory syncytial virus subgroup A and B respectively; BRSV, bovine respiratory syncytial virus; PVM, pneumonia virus of mice; AMPV-A and AMPV-C, avian metapneumovirus subgroup A and C respectively. Sequence accession codes for the alignment are HMPV-A: AAQ67693.1 (used in this work), HMPV-B: AAQ67684.1, HRSV-A: AAX23990.1, HRSV-B: AAR14262.1, BRSV: AAL49395.1, PVM: AAW79177.1, AMPV-A: AAT68644.1 and AMPV-C: AAT86110.1.
The secondary structure of HMPV_P subunit P1 (this work) is displayed above the alignment. Phosphorylation sites are highlighted in brown. Positively charged residues of HMPV_P are shaded in blue, negatively charged residues in magenta and hydrophobic residues 29 to 135 in yellow. The conserved region containing hydrophobic residues critical for L:P interactions is highlighted in green. Structural alignment of P from HMPV and RSV (ref 16) showed similar overall tetramer organization. However, differences are observed in subunit P1, with an r.m.s.d. of 2.24 Å over 82 residues. Although P is in general more mobile, with weaker densities and higher B factors compared to L, the region following the beta-hairpin (residues 175–215 in HMPV) does adopt a slightly different conformation compared to RSV P1. Subunit P3 has an r.m.s.d. of 1.94 Å over 45 residues due to a slightly tilted C-terminal helix compared to RSV. Subunit P2 is most similar, with an r.m.s.d. of 0.92 Å over 56 residues. Subunit P4 has an r.m.s.d. of 1.33 Å over 47 residues. The eight residues of HRSV-P that are crucial for interacting with HRSV-L and whose substitutions impair viral replication are shaded in dark green (data from reference 16). With the exception of Asn189 (HRSV-P), where a deletion is present in HMPV-P, these residues are conserved in HMPV-P and other known pneumoviral P proteins.
Cryo-EM data collection, structure refinement and model statistics.
Refinement statistics were obtained with the program Phenix (phenix.real_space_refinement) (ref 30) except where otherwise noted.
Molprobity score was obtained from the Molprobity online server (100th percentile is the best among structures within the resolution specified) and the present structure is in the 98th percentile (N=27675, 0–99 Å).
The assignment of Pro-1041 as a cis-Proline is supported by electron density showing a hydrogen bond between the carbonyl oxygen atom of Thr-1040 and the N atom of Gln-1353.