Secondary structure of a conserved domain in an intron of influenza A M1 mRNA.

Tian Jiang¹, Scott D Kennedy, Walter N Moss, Elzbieta Kierzek, Douglas H Turner.

Abstract

Influenza A virus utilizes RNA throughout infection. Little is known, however, about the roles of RNA structures. A previous bioinformatics survey predicted multiple regions of influenza A virus that are likely to generate evolutionarily conserved and stable RNA structures. One predicted conserved structure is in the pre-mRNA coding for essential proteins, M1 and M2. This structure starts 79 nucleotides downstream of the M2 mRNA 5' splice site. Here, a combination of biochemical structural mapping, mutagenesis, and NMR confirms the predicted three-way multibranch structure of this RNA. Imino proton NMR spectra reveal no change in secondary structure when 80 mM KCl is supplemented with 4 mM MgCl2. Optical melting curves in 1 M NaCl and in 100 mM KCl with 10 mM MgCl2 are very similar, with melting temperatures ∼14 °C higher than that for 100 mM KCl alone. These results provide a firm basis for designing experiments and potential therapeutics to test for function in cell culture.

Substances:
RNA, Messenger
RNA, Viral

Year: 2014 PMID: 25026548 PMCID: PMC4139153 DOI: 10.1021/bi500611j

Source DB: PubMed Journal: Biochemistry ISSN: 0006-2960 Impact factor: 3.162

Influenza A virus is a member of the Orthomyxoviridae family of enveloped viruses with segmented, single-stranded, negative-sense RNA genomes. Each year, influenza A infects an estimated three to five million people worldwide, killing up to 500 000 people.[1] Moreover, influenza pandemics have occurred several times in the past 100 years. For example, the 1918–1920 “Spanish Flu” claimed more than 50 million lives.[2] Few diseases have had a greater impact than influenza on public health and global economic output. In 2012–2013, the United States had an unusually bad flu season, with overall vaccine effectiveness about 47 and 67% against influenza A and B, respectively.[3] Currently, few drugs treat influenza; neuraminidase inhibitors are essentially the only therapeutics available.[4] Moreover, sporadic cases of drug-resistant influenza viruses have been detected worldwide.[5−7] The once widely used adamantanes are now mostly ineffective toward currently circulating influenza (H3N2).[8] Thus, it is important to develop novel antiviral treatments as well as more effective vaccines.[9,10] RNA structure plays key roles in many viruses, including influenza. For example, a panhandle/corkscrew structure in influenza genomic viral (v)RNA required for RNA transcription, replication, and packaging is formed by annealing the 5′ and 3′ ends of influenza vRNAs.[11] Beyond this vRNA, knowledge of influenza virus RNA structures is limited. vRNA is coated with viral nucleoprotein (NP) much of the time, which may melt secondary structure.[12] At various stages of infection, however, regions of vRNA are free of NP and may form functional RNA structures. Influenza positive-sense RNAs are predicted to contain extensive conserved and stable secondary structure. 
In particular, segments 7 and 8, both of which are alternatively spliced, are enriched in predicted structures.[13] A recent survey of predicted structure in influenza B and C discovered evidence of conserved structures in coding RNAs from spliced segments.[14] Interestingly, in all three viral species (influenza A, B, and C), predicted conserved structures occur at or near splice sites, suggesting common strategies for the regulation of splicing. Knowledge of influenza virus RNA structure can inform experiments to reveal function, enrich understanding of molecular mechanisms underlying the viral life cycle, and facilitate development of new therapeutics. Segment 7 of influenza A encodes M1 protein and is alternatively spliced to produce M2, M3, and occasionally M4 mRNA (Figure 1A).[15,16] M1 and M2 proteins are essential in the viral life cycle. M1 (252 amino acids) is the most abundant influenza protein. It is the matrix protein that connects vRNPs to each other and to the viral envelope. It also determines the directionality of vRNP transport.[17] M2 (97 amino acids) is a transmembrane ion channel protein that permits the flow of protons from the endosome into the virion interior to facilitate viral uncoating.[18,19] Temporal control of splicing is required for generating various mRNA isoforms that must be present at differing abundances over the course of infection.[20] Furthermore, the alternative splicing of segment 7 is complex, and except for M2, the products of spliced mRNAs are not well characterized. 
For example, the M3 mRNA 5′ splice site more closely fits the consensus 5′ splice site motif than the M2 mRNA 5′ splice site, the latter of which has C rather than G at the 3′ end of the 5′ exon; this finding is surprising because M2 is essential for viral replication, while the M3 protein has yet to be observed.[20] Additionally, some viral strains have an M4 mRNA 5′ splice site.[16] Normally, M4 mRNA is not translated, but when the M2 mRNA 5′ splice site is disrupted, M4 mRNA can produce M42 protein, which can functionally replace M2 to support efficient replication in tissue culture cells and exhibit pathogenicity in an animal host.[21] M2, M3, and M4 mRNA share the same 3′ splice site. Multiple factors, including viral polymerase complex, NS1 protein, and cellular splicing factor SF2/ASF, have been implicated in helping to regulate alternative splicing of segment 7.[20,22,23]

Figure 1

Location and predicted structures of the wild-type RNA. (A) Diagrammatic summary of mRNA splice variants. Gray boxes represent coding regions. Bold lines represent noncoding regions. Thin lines represent introns. M4 mRNA has two different open reading frames.[21] It has been observed in only a few viral strains.[16] The M4 mRNA 5′ splice site, GAG/GUUCUC (nucleotides 118–126, with “/” marking the splice site), is not present in the consensus sequence. (B) Predicted multibranch model[13] and hairpin model. Base pairs are color-annotated with information from base pair counts from an alignment of all available unique sequences (see Supporting Information). Conserved nucleotides are shown in blue (see color annotation key).

A 63 nucleotide fragment in segment 7 containing the 3′ splice site, key residues of SF2/ASF binding site, and the polypyrimidine tract required for splicing exists in an equilibrium between a pseudoknot and a hairpin structure.[24] The conformational switch places each splicing element into different structural contexts. Interestingly, a similar pseudoknot/hairpin conformational switch was described at the 3′ splice site of segment 8.[25] In addition to the structure at the 3′ splice site of segment 7, a region 79 nucleotides downstream from the 5′ splice site for M2 mRNA (nucleotides 105–192) was predicted to have an especially stable and conserved secondary structure (Figure 1).[13] This sequence and predicted structure are conserved within influenza A viruses that infect human, swine, and avian species. A multibranch model was predicted[13] for the consensus sequence of all wild-type sequences available in the National Center for Biotechnology Information (NCBI) influenza virus resource.[26] A possible alternative hairpin structure was also hypothesized for this region, with equally conserved base pairing and similar predicted free energy. 
This putative alternative hairpin for segment 7 is attractive: a region 51 nucleotides downstream from the 5′ splice site of segment 8 folds into a hairpin conformation in solution.[27] Here, data from nondenaturing polyacrylamide gel electrophoresis, enzymatic/chemical mapping, isoenergetic microarray mapping, and NMR indicate that only the multibranch structure forms in solution.

Experimental Methods

Preparing the Wild-Type and Mutant RNAs

The consensus sequence[13] for the intron region of M1 mRNA was deduced from all available unique sequences from the NCBI Influenza Virus Resource.[26] This sequence was also found in wild-type sequences of human, swine, and avian influenza viruses (GenBank accession nos. CY147367, GQ404614, CY037901, CY021382, CY009453, and M63525). The RNA was transcribed using the AmpliScribe T7 high yield transcription kit (Epicentre) from a double-stranded DNA oligonucleotide template. The antisense sequence is 5′-GAACACAAATCCTAAAATCCCCTTAGTCAGAGGTGACAGGATTGGTCTTGTCTTTAGCCATTCCATGAGAGCCTCAAGATCTGTGTTC TATAGTGAGTCGTATTAGAATTC-3′. The italic letters are complementary to the T7 promoter sequence. DNA templates were removed using DNase I after incubating the reactions at 37 °C for 4 h following Epicentre’s protocol. RNA was purified using denaturing 8% PAGE and, when required, 5′-end-labeled with [γ32P]-ATP. Two mutants were made and labeled in the same way.

Denaturing and Nondenaturing PAGE

Nondenaturing 6% PAGE was run for all full-length constructs. The gel was prepared and run with 1× THEM buffer (66 mM HEPES, 34 mM Tris, 0.1 mM disodium EDTA, and 10 mM MgCl2, pH 7.4).[28] End-labeled RNAs were renatured by heating at 90 °C for 2 min and then slow cooling to 37 °C, at which point the RNAs were incubated for 20 min in 10 mM Tris (pH 7.0), 100 mM KCl, and different concentrations of MgCl2. Electrophoresis was conducted at 15 W for 12 h at 4 °C. Dried gels were analyzed by exposing to a phosphorimager screen. Denaturing 6% PAGE (7.5 M urea) was run for the same RNAs prior to end-labeling. The gel was prepared and run with 1× TBE buffer. RNAs were denatured by heating at 90 °C for 2 min in gel loading buffer II (Ambion) before loading. The gel was stained with 1× SYBR Green II (Invitrogen).

Enzymatic and Chemical Mapping

RNAs used in all mapping experiments were folded in 10 mM Tris (pH 7.0), 100 mM KCl, and 10 mM MgCl2, as described for native gel analysis. All mapping experiments were carried out at room temperature. Enzymatic mapping reactions were adapted from the manufacturer’s protocol[29] and carried out on 5′-end-labeled RNAs. Optimal enzyme concentration was determined by titration. The digestion reactions were stopped by ethanol precipitation. DMS, CMCT, and SHAPE mapping reactions were adapted from published protocols[30,31] and carried out on unlabeled RNAs. Reactions were stopped by ethanol precipitation. Modifications were read out by primer extension using an LNA-modified primer (5′-GAALCALCALAALTCLCTALAAA-3′) complementary to nucleotides 176–192. The primer was synthesized according to published methods,[32,33] 5′-end-labeled with [γ32P]-ATP, and purified by denaturing 8% PAGE. DEPC reactions were run as described by Moss et al.[24] End-labeled RNAs were incubated with 0.69 mM DEPC, followed by NaBH4 reduction and aniline cleavage. Reactions were stopped by ethanol precipitation. Digestion and modification products were fractionated on a denaturing 8% polyacrylamide gel. All gels were dried, exposed to a phosphorscreen, and imaged with a Bio-Rad personal molecular imager. The intensities of the product bands were quantified using ImageJ,[34] and bands were classified as strong or medium when their integrated intensities were ≥2/3 or ≥1/3 of the strongest integrated intensity, respectively.
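The strong/medium cutoffs described above can be illustrated with a minimal sketch. The function name `classify_bands`, the label "below threshold", and the intensity values are hypothetical and chosen only for illustration; the cutoffs (≥2/3 and ≥1/3 of the strongest integrated intensity) are the ones stated in the text.

```python
def classify_bands(intensities):
    """Label each band relative to the strongest band in the same lane."""
    strongest = max(intensities.values())
    calls = {}
    for nucleotide, value in intensities.items():
        # Multiplying through by 3 avoids rounding issues at the 1/3 and 2/3 cutoffs.
        if 3 * value >= 2 * strongest:
            calls[nucleotide] = "strong"
        elif 3 * value >= strongest:
            calls[nucleotide] = "medium"
        else:
            calls[nucleotide] = "below threshold"  # label is an assumption
    return calls


# Hypothetical integrated intensities (arbitrary units), not measured data.
example_lane = {"A142": 900.0, "C143": 650.0, "A166": 310.0, "U133": 40.0}
calls = classify_bands(example_lane)
for nucleotide, call in calls.items():
    print(nucleotide, call)
```

Because the classification is relative to the strongest band, it is only meaningful within a single lane or probe; comparing calls across probes would need a separate normalization that the text does not describe.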

Isoenergetic Microarray Mapping

Microarray probing was conducted on the wild-type RNA with isoenergetic pentamer and hexamer 2′-O-methyl oligonucleotide probes with LNA and 2,6-diaminopurine substitutions.[35] Universal microarrays with 861 probes divided into two microarray slides were used. Negative internal controls were U, UUUUU, and spotted buffer. Microarrays were printed at the European Center of Bioinformatics and Genomics in Poznan, Poland. Radioactively labeled RNA was folded in 10 mM Tris-HCl (pH 7.0), 300 (or 100) mM KCl, and 10 mM MgCl2, as described above. RNA in folding buffer was incubated with the microarray at 4 °C for 18 h to allow hybridization. Then, buffer with RNA was poured out, and slides were washed in the same buffer for 1 min at 0 °C and dried by centrifugation. Hybridization was visualized by exposure to a phosphorimager screen, and quantitative analysis was performed with ArrayGaugeV2.1. Possible alternative binding sites were predicted using RNA–RNA thermodynamics[36] with the RNAstructure program.[37]

NMR Sample Preparation

The 61 nucleotide sequence corresponding to the multibranch structure was transcribed from a double-stranded DNA oligonucleotide template with antisense sequence 5′-TmGmGATCCCCTTAGTCAGAGGTGACAGGATTGGTCTTGTCTTTAGCCATTCCATGAGAGCCC TATAGTGAGTCGTATTAGAATTC-3′. The italic letters are complementary to the T7 promoter sequence. The last two nucleotides of the 5′ end were modified with C2′-methoxyls (annotated by “m”) to reduce nontemplated nucleotide addition at the 3′ end by the T7 polymerase.[38] Transcription was carried out in 40 mM Tris (pH 7.0), 10 mM spermidine, 0.01% Triton, 10 mM MgCl2, 40 mM DTT, 4 mM each NTP, 1 U/mL inorganic pyrophosphatase, with 5 μM DNA template, and 120 μg/mL recombinant T7 RNA polymerase in 20 mL reaction volume. Recombinant T7 RNA polymerase was expressed and purified from BL21 cells.[39] The reaction was stopped after incubating 4 h at 37 °C upon adding 800 μL of 0.5 M EDTA. RNA was purified by denaturing 8% PAGE and dialyzed into 20 mM KH2PO4 (pH 6.5), 80 mM KCl, 0.05 mM EDTA using Millipore Amicon Ultra-15 centrifugal filter units (MWCO 3 kDa).
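As a quick consistency check, the transcript expected from the antisense template can be derived by reverse-complementing the transcribed portion. This is a minimal sketch under two assumptions: the part of the template before the space is taken to be the transcribed region (the italics marking the promoter complement did not survive extraction), and the 2′-O-methyl "m" markers are dropped because they do not change base identity. The function and variable names are illustrative.

```python
# DNA template base -> RNA base incorporated opposite it.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcript_from_antisense(template_5to3):
    """Return the RNA (5'->3') transcribed from an antisense DNA template (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(template_5to3))


# Transcribed part of the template from the text, with the "m" markers removed.
TRANSCRIBED_TEMPLATE = (
    "TGGATCCCCTTAGTCAGAGGTGACAGGATTGGTCTTGTCTTTAGCCATTCCATGAGAGCCC"
)

rna = transcript_from_antisense(TRANSCRIBED_TEMPLATE)
print(len(rna))
print(rna)
```

Under these assumptions the derived transcript is 61 nucleotides long, matching the stated construct length, and it starts with G and ends with A.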

NMR Experimental Conditions

The RNA was renatured by heating at 90 °C for 2 min and slow cooling to 37 °C. Then, it was put in a Shigemi NMR tube with 10% D2O. NMR spectra were acquired on a Varian Inova 600 MHz spectrometer. One-dimensional imino proton spectra were recorded at temperatures ranging from 0 to 25 °C. Two-dimensional NOESY spectra were recorded at 0, 5, 12, 20, and 25 °C with mixing time ranging from 100 to 250 ms, processed with NMRpipe,[40] and analyzed with Sparky.[41] Subsequently, up to 4 mM final concentration MgCl2 was added in NMR buffer. RNA was renatured by incubating at 37 °C for 20 min after adding MgCl2. 2D-NOESY spectra were then measured under the conditions described above.

Optical Melting Curves

Absorbance versus temperature melting curves were measured at 280 nm with a heating rate of 0.5 °C min⁻¹ in (A) 20 mM sodium cacodylate (pH 7.0), 0.5 mM EDTA, and 100 mM KCl; (B) 20 mM sodium cacodylate (pH 7.0), 0.5 mM EDTA, 100 mM KCl, and 10 mM MgCl2; (C) 20 mM sodium cacodylate (pH 7.0), 0.5 mM EDTA, 300 mM KCl, and 10 mM MgCl2; and (D) 20 mM sodium cacodylate (pH 7.0), 0.5 mM EDTA, and 1 M NaCl on a Beckman Coulter DU 640 spectrophotometer. Melting curves were normalized and analyzed with MeltWin 3.5.[42]
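For readers unfamiliar with melting-curve analysis, the sketch below shows one simple way to normalize an absorbance curve and estimate a melting temperature as the point of steepest rise. It is not the MeltWin procedure used in this work; the derivative-based estimate, the synthetic sigmoidal curves, and the midpoints of 55 and 69 °C are all assumptions chosen purely for illustration.

```python
import math


def normalize(absorbance):
    """Rescale absorbance values to the range 0..1."""
    lowest, highest = min(absorbance), max(absorbance)
    return [(a - lowest) / (highest - lowest) for a in absorbance]


def estimate_tm(temps, absorbance):
    """Return the temperature where the normalized curve rises most steeply."""
    fraction = normalize(absorbance)
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        # Central finite difference approximates d(fraction)/dT.
        slope = (fraction[i + 1] - fraction[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


def synthetic_curve(temps, midpoint, width=3.0):
    """Hypothetical sigmoidal absorbance curve, used only as test input."""
    return [0.50 + 0.10 / (1.0 + math.exp(-(t - midpoint) / width)) for t in temps]


temps = [20.0 + 0.5 * i for i in range(141)]  # 20 to 90 °C in 0.5 °C steps
tm_low = estimate_tm(temps, synthetic_curve(temps, 55.0))
tm_high = estimate_tm(temps, synthetic_curve(temps, 69.0))
print(tm_low, tm_high, tm_high - tm_low)
```

Real curves are noisier than these synthetic ones, so a smoothing step or a model fit would normally precede any derivative-based estimate.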

Results

Native Gel Electrophoresis and Enzymatic/Chemical Mapping

The two potential secondary structures of M1 mRNA are identical at the basal stem, but nucleotides between positions 127 and 170 are folded differently. Comparative sequence analysis does not favor one structure over the other (Figure 1B and Table S1). To distinguish between the structures, native PAGE and enzymatic/chemical mapping experiments were performed on in vitro transcribed wild-type RNA and on mutants designed to stabilize the multibranch (MBmutant) or the hairpin (HPmutant) conformation. Figure 2 shows native PAGE of the wild-type RNA and mutants. All RNAs ran as a single band, indicating that the wild-type RNA and mutants each fold into a single conformation. The mobility of the wild-type RNA is the same as that of MBmutant but slower than that of HPmutant. The three constructs are of the same size (see denaturing gel in Figure 2B). This suggests that the wild-type RNA folds into the multibranch conformation.

Figure 2

Native gel of the wild-type RNA and mutants. (A) The native gel. RNAs were folded in 10 mM Tris (pH 7.0) and 100 mM KCl with increasing MgCl2 concentration (0, 10, and 50 mM). (B) Denaturing gel of wild-type and mutant RNAs.

Figure 3 shows enzymatic/chemical mapping results for the wild-type RNA. Enzymatic mapping[43] used RNase A (cleaves after unpaired pyrimidines), RNase T1 (cleaves after unpaired G), RNase I (cleaves after single-stranded residues),[44] and RNase V1 (cleaves after double-stranded or structured regions). Chemical mapping[43] used DMS (modifies N1 of A and N3 of C when unpaired), CMCT (modifies N3 of U and N1 of G when unpaired), and DEPC (modifies an exposed N7 of A). SHAPE mapping[45] was used to identify flexible regions. The results agree with the predicted multibranch structure but not with the hairpin model. In particular, A142 and C143 were heavily hit by single-strand sensitive probes, which is inconsistent with the hairpin model. Also, U133, C148, C149, C154, and G165, which are single-stranded in the hairpin model but paired in the multibranch model, were not hit by any single-strand sensitive probes.

Figure 3

Mapping results of the wild-type RNA. (A) The multibranch model. (B) The hairpin model. The color annotation key is given at the top. For single-strand sensitive nucleases and small molecule probes, filled and open shapes indicate strong and medium hits, respectively. For RNase V1, bold italic letters and regular letters indicate strong and medium hits, respectively. For microarray probing, solid and dashed boxes indicate the center of strong and medium binding sites, respectively. The predicted free energies of folding, ΔG37°,[54] for the multibranch and hairpin models are −29.1 and −25.5 kcal/mol, respectively.

Mapping results of two mutants provide additional evidence for the multibranch structure. MBmutant was designed to stabilize the multibranch conformation (and forbid the hairpin) by changing the A132–A151 pair into a CG pair (Figure 4). This mutation results in a predicted free energy, ΔG37°, of −35.3 kcal/mol for the multibranch conformation, a gain of −6.2 kcal/mol in stability compared to that of the multibranch structure of the wild-type RNA, without a change in predicted ΔG37° for the hairpin (see captions to Figures 3 and 4). Thus, the equilibrium constant of the mutated sequence for folding to the multibranch conformation is predicted to be 8 × 10⁶-fold more favorable than to the hairpin. As shown in Figure 4, the mapping results of MBmutant are very similar to those of the wild-type RNA. There were significant differences only in P2: in the wild-type RNA, nucleotides A131, A132, A150, and A151 were hit by several single-strand sensitive probes (Figure 3), whereas in MBmutant, this region was unreactive to single-strand sensitive probes, as expected for the multibranch conformation but not the hairpin conformation. Thus, the enzymatic/chemical mapping results for MBmutant are consistent with the multibranch model but not the hairpin model.

Figure 4

Mapping results of MBmutant. (A) The multibranch model. (B) The hairpin model. Color annotation is the same as in Figure 3. Mutations are indicated by arrows. The predicted free energies of folding, ΔG37°, for the multibranch and hairpin models are −35.3 and −25.5 kcal/mol, respectively.

HPmutant was designed to fold into the hairpin conformation by changing the CCAA tetraloop in P3′ into GCAA, predicted to be more stable by −0.4 kcal/mol at 37 °C. Also, three mismatches along the stem were mutated to canonical base pairs by A132 → G, A145 → G, and U162 → A changes (Figure 5). These mutations make a very stable hairpin conformation with −40.6 kcal/mol predicted free energy at 37 °C, making the equilibrium constant for folding to the hairpin predicted to be 8 × 10¹⁰-fold more favorable than to the multibranch conformation (see captions to Figures 3 and 5). The enzymatic/chemical mapping results for this mutant are consistent with the hairpin model but not the multibranch model. As shown in Figure 5, G141, A142, and C143 were not hit by single-strand sensitive probes in HPmutant as they were in the wild-type RNA (Figure 3). Additionally, nucleotides that are expected to be single-stranded in the hairpin model of both HPmutant and the wild-type RNA, including nucleotides 148 and 165, were only hit by single-strand sensitive probes in HPmutant. These observations suggest that the wild-type RNA does not fold into the hairpin model.
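The fold preferences quoted for the two mutants follow from the difference in predicted folding free energies through K_ratio = exp(−ΔΔG/RT). The sketch below reproduces that arithmetic; it assumes R = 1.987 × 10⁻³ kcal mol⁻¹ K⁻¹ and T = 310.15 K (37 °C), and the function name `fold_preference` is illustrative.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 310.15   # 37 °C


def fold_preference(dg_favored, dg_alternative, temperature=T_KELVIN):
    """Ratio of folding equilibrium constants for two competing conformations."""
    ddg = dg_favored - dg_alternative  # kcal/mol, negative if the first fold is favored
    return math.exp(-ddg / (R_KCAL * temperature))


# MBmutant: multibranch (-35.3) versus hairpin (-25.5), from the Figure 4 caption.
mb_ratio = fold_preference(-35.3, -25.5)
# HPmutant: hairpin (-40.6) versus multibranch (-25.1), from the Figure 5 caption.
hp_ratio = fold_preference(-40.6, -25.1)
print(f"MBmutant: {mb_ratio:.1e}  HPmutant: {hp_ratio:.1e}")
```

Because the ratio depends exponentially on ΔΔG, an uncertainty of about 1 kcal/mol in the predicted energies changes the ratio roughly fivefold, so these numbers are best read as orders of magnitude.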

Figure 5

Mapping results of HPmutant. (A) The multibranch model. (B) The hairpin model. Color annotation is the same as in Figure 3. Mutations are indicated by arrows. The predicted free energies of folding, ΔG37°, for the multibranch and hairpin models are −25.1 and −40.6 kcal/mol, respectively.

Isoenergetic Microarray Probing of the Wild-Type RNA

A complementary method for probing RNA structure, isoenergetic microarray mapping,[35] was also applied. The isoenergetic microarray uses pentamer and hexamer 2′-O-methyl RNA probes modified by inclusion of LNA and 2,6-diaminopurine to provide similar free energies of binding to unfolded complementary RNA, independent of sequence, and to stabilize the binding compared with unmodified probes.[46,47] Thus, probe binding interrogates primarily RNA folding rather than differences in thermodynamic binding affinity. The average predicted free energy of binding for the modified library to complementary sites on the wild-type RNA is −9.2 ± 0.8 kcal/mol at 37 °C, compared with −2.4 ± 1.6 kcal/mol for an unmodified DNA library. The stability enhancement averages 60 000-fold, and just as important, the sequence dependence of free energies, compared to unmodified probes, is reduced; this greatly simplifies interpretation of binding results. The isoenergetic microarray binding results on the wild-type RNA are consistent with enzymatic/chemical mapping results and also support the multibranch structure (Figures 3 and S1 and Table S2). Five unambiguous strong binding sites were revealed: C143, C163, G165, A166, and G182 (center nucleotides of probe binding sites). Except for G182, most nucleotides at binding sites were also hit by single-strand sensitive probes, confirming their accessibility in the structure. Strong binding sites with possible alternative binding sites include nucleotides C126, A142, and A147, which also react with single-strand sensitive probes. Interestingly, when microarray probing was done in 100 mM KCl/10 mM MgCl2 instead of 300 mM KCl/10 mM MgCl2, no detectable binding was observed. This suggests that the wild-type RNA folds into a very compact and stable structure at 100 mM KCl/10 mM MgCl2. Evidently, the 300 mM KCl makes the binding sites in the RNA more accessible. 
This increased accessibility of RNA induced by higher KCl concentration was also observed in a conserved RNA structure in the NS gene.[27]
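The two claims about the modified probe library (stronger binding and weaker sequence dependence) can be put in numbers with the same Boltzmann relation. This sketch uses the mean and spread values quoted above, assumes R = 1.987 × 10⁻³ kcal mol⁻¹ K⁻¹ and T = 310.15 K, and treats each quoted spread as a one-standard-deviation shift in ΔG, which is an interpretation rather than something stated in the text.

```python
import math

RT = 1.987e-3 * 310.15  # kcal/mol at 37 °C


def fold_change(delta_g_difference):
    """Fold change in binding constant for a given free energy difference (kcal/mol)."""
    return math.exp(abs(delta_g_difference) / RT)


# Mean predicted binding free energies from the text (kcal/mol).
mean_modified, spread_modified = -9.2, 0.8
mean_unmodified, spread_unmodified = -2.4, 1.6

enhancement = fold_change(mean_modified - mean_unmodified)
variation_modified = fold_change(spread_modified)
variation_unmodified = fold_change(spread_unmodified)

print(f"average enhancement: {enhancement:.0f}-fold")
print(f"one-spread variation, modified: {variation_modified:.1f}-fold")
print(f"one-spread variation, unmodified: {variation_unmodified:.1f}-fold")
```

The first number corresponds to the roughly 60 000-fold enhancement cited above. The other two show why halving the spread in ΔG matters: the affinity variation between probes shrinks by more than a factor of three.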

NMR Spectra Are Consistent with the Multibranch Structure

The consensus sequence of the wild-type RNA was cut after G121–U176, and two GC pairs plus a 3′ dangling A were added after G121–U176 to stabilize P1a (Figure 6A). NMR spectra were taken for this 61 nucleotide RNA in 20 mM KH2PO4 (pH 6.5), 80 mM KCl, and 0.05 mM EDTA. Figure 6B shows the imino region of a 2D-NOESY spectrum acquired at 12 °C, with 100 ms mixing time. Base pair types of the imino resonances were identified on the basis of their proton chemical shifts as well as typical NOE cross peaks to amino and nonexchangeable protons. More definitive peak assignments would require isotope labeling, which is beyond the scope of this work.

Figure 6

2D-NOESY spectrum of the multibranch structure. (A) The 61 nucleotide construct of the multibranch structure. (B) The imino region of a 2D-NOESY spectrum recorded at 12 °C, with 100 ms mixing time. RNA was folded in 20 mM KH2PO4 (pH 6.5), 80 mM KCl, and 0.05 mM EDTA. Base pairs involved in observed helical walks are shown by colored dots (red = GC, blue = AU, green = GU). Four helical walks are shown in different colors, which correspond to the colors of base pairs in panel A. The inset shows the region between 11.8 and 12.8 ppm with increased contour level to show the imino cross peaks of G129 to G130 and G135 to G146. The imino cross peak between G130 and U152 is not apparent from this plot but can be seen with contour level lowered. The evidence of connection between G130–C153 and A131–U152 is shown in Figure S2.

NOE cross peaks are formed between imino protons in adjacent base pairs. Four helical walks consistent with the multibranch structure were observed for this RNA. No NOE connections to G119–C178, U123–G174, or U137–A145 were observed. These pairs are adjacent to loops or the end of the structure. In P1a, the imino cross peak between G121 and U176 appeared to be split, suggesting conformational exchange in this helix. The signal for G171 was weaker than that of G173, consistent with G171–C126 closing the multibranch junction. In P2, the imino cross peak between U152 and G130 was weak but could be observed with the contour level lowered. A131H2 and G130H1 gave a clear NOE peak (Figure S2), confirming the connection between G130–C153 and A131–U152. The imino cross peak between U133 and G134 was broad. In P3, no NOE connections were seen, which is consistent with A166 being hit by single-strand sensitive probes (Figure 3). P3 may be a dynamic region. Adding up to 4 mM MgCl2 in the NMR buffer did not significantly affect the spectrum (Figure S3). Some NOE peaks appeared weaker, possibly because MgCl2 made the multibranch structure more rigid.

Optical Melting Experiments Reveal Similar Melting Profiles in 1 M NaCl and in 10 mM MgCl2

Figure 7 shows melting profiles of the wild-type RNA in 100 mM KCl with/without 10 mM MgCl2, in 300 mM KCl with 10 mM MgCl2, and in 1 M NaCl without MgCl2. Addition of 10 mM Mg2+ to 100 mM KCl increased the melting temperature by ∼14 °C. Increasing the K+ concentration in the presence of Mg2+ had little effect. The melting profile of wild-type RNA in 1 M NaCl is similar to that in 100 (or 300) mM KCl with 10 mM MgCl2. Similar agreement was found for melting of a cyclized group I intron.[48] Chemical mapping of the cyclized group I intron[48] and of the specificity domain of RNase P RNA[49] in 1 M NaCl and in 10 mM MgCl2 has also shown good agreement. Evidently, 1 M NaCl is a reasonable approximation for buffers containing Mg2+. This supports the common practice of using thermodynamic parameters measured in 1 M NaCl[50−52] to predict RNA secondary structures in the presence of Mg2+.[53−55] For the RNA sequence studied here, free energy minimization with 1 M NaCl parameters and no experimental restraints predicts all the base pairs shown in Figure 3A when slippage of ±1 nucleotide is allowed.

Figure 7

Optical melting curves for the wild-type RNA. The absorbance was measured at 280 nm in 20 mM sodium cacodylate (pH 7.0), 0.5 mM EDTA, and 100 mM KCl (blue line); 20 mM sodium cacodylate (pH 7.0), 0.5 mM EDTA, 100 mM KCl, and 10 mM MgCl2 (black line); 20 mM sodium cacodylate (pH 7.0), 0.5 mM EDTA, 300 mM KCl, and 10 mM MgCl2 (green line); and 20 mM sodium cacodylate (pH 7.0), 0.5 mM EDTA, and 1 M NaCl (red line).

Discussion

Functional Implications of the Multibranch Structure

On the basis of predicted thermodynamics, sequence comparison, and bioinformatics, the multibranch structure was predicted to be one of the most thermodynamically stable and conserved structures in M1 mRNA.[13] The experimental data reported here confirms these predictions. This agreement contrasts with a similar 68 nucleotide region that is 51 nucleotides downstream from the 5′ splice site of segment 8. In that case, the bioinformatics approach predicted a multibranch model, but chemical and isoenergetic mapping revealed a hairpin structure in solution.[27] The multibranch structure in segment 7 is typically only 79 nucleotides downstream from the 5′ splice site for M2 mRNA (Figure 1). Intriguingly, in all of the spliced segments in influenza A, B, and C, unusually stable and conserved structures were predicted at or near the splice sites,[13,14,25] which indicates that these structures may play common roles in the regulation of influenza splicing. The average canonical base pair conservation of this multibranch structure is 93.6% (Figure 1 and Table S1). In particular, the hairpin loops of P2 and P3 have most nucleotides with greater than 97% conservation. Although the base pairs in P1 and P1a are less conserved than those in P2 and P3, roughly one-third of the mutations retain canonical pairing. When mutations lead to noncanonical pairs, about one-third are CA pairs (Table S1). CA pairs are isosteric with GU wobble pairs, can have a high pKa,[56] and preserve the A form RNA helix.[57] In general, variations in sequences and structures of mRNA are highly restricted in coding areas with cis-acting functions, presumably because of the necessity to maintain both protein and RNA structures.[58] The special conservation of this multibranch structure suggests that this motif is functionally important. 
mRNA secondary structures can play roles in the modulation of splicing, for example, by hiding or revealing splice sites and other regulatory elements[59] or by modifying the spatial distance between cis-acting elements.[60] Splicing can also be regulated via protein-induced RNA conformational switching[61,62] or small molecule binding.[63,64] Many alternative splicing processes are associated with RNA secondary structure formation in pre-mRNA.[65−68] Previous studies have postulated roles for RNA secondary structure in the regulation of splicing in influenza and other viruses.[69,70] While not present in the consensus sequence studied here, the basal helix of the multibranch structure in many influenza A strains contains the M4 mRNA 5′ splice site, GAG/GUUCUC (nucleotides 118–126), where the slash represents the splice site.[16] Additionally, there is a putative human intronic splicing enhancing sequence,[71] GGGGAUU (nucleotides 171–177), within this structure. In both cases, the regulatory sequences are embedded in secondary structure, where they would be expected to be less accessible to splicing factors.[68,72] Interestingly, at the 3′ splice site of the M2 mRNA, a pseudoknot is formed, which also buries splicing regulatory sequences in structure.[24,73] Thus, likely inhibitory cis-regulatory structures are at or near both splice sites of the M2 mRNA. Such structures could facilitate temporal control of M2 production, which occurs late in infection.[20] Mutagenesis studies carried out on segment 7 of influenza A virus in the Shih group[74] are supportive of the proposed role for the multibranch structure in modulating alternative splicing of segment 7. The sequence used in their study contains the M4 mRNA 5′ splice site GAG/GUUCUC (nucleotides 118–126). When G120 was mutated to A, the virus was still viable, but the amounts of alternatively spliced products were changed, and the viral growth rate was attenuated. 
Similar results were observed when G121 was mutated to A. In contrast, when G121 was mutated to C or U, the virus could not be rescued. These results are consistent with the structure shown in Figure 3A, where G120 → A and G121 → A mutations maintain base pairing, but G121 → C/U disrupts base pairing. The mutation results thus also support the proposal that this multibranch conformation is important for regulating alternative splicing of segment 7. Changing this RNA structure may affect the alternative splicing process or even diminish viral viability. This has significance for designing attenuated live-virus vaccines.[75] The structure presented in Figure 3A provides a basis for further tests of functions for this motif.
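The pairing argument for the G121 mutations can be checked mechanically. The sketch below tests whether each substitution at position 121 still forms a canonical (Watson–Crick or GU wobble) pair with U176, its partner in the multibranch model according to the NMR results. The helper name `is_canonical` is illustrative, and the check ignores stacking and other context effects.

```python
# Watson-Crick and GU wobble pairs, listed in both orientations.
CANONICAL_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def is_canonical(base_5p, base_3p):
    """Return True if the two bases can form a canonical RNA base pair."""
    return (base_5p, base_3p) in CANONICAL_PAIRS


PARTNER_OF_121 = "U"  # U176, paired with G121 in the multibranch model

pairing = {base: is_canonical(base, PARTNER_OF_121) for base in "GACU"}
for base, paired in pairing.items():
    status = "maintains" if paired else "disrupts"
    print(f"position 121 = {base}: {status} pairing with U176")
```

The wild-type G and the viable G121 → A mutant both pair with U176, whereas C and U at position 121 do not, which mirrors the rescue results described above.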

Potential Tertiary Interactions and Dynamics in the Multibranch Structure

Three-way multibranch loops are common RNA secondary structural motifs, which are important in organizing tertiary interactions in large molecules.[76] They have been extensively studied in Varkud satellite ribozymes,[77] the hepatitis C virus internal ribosome entry site,[78−80] hammerhead ribozymes,[81,82] the P4–P6 domain of a group I intron,[83] and so forth. Comparison of the secondary structure in Figure 3A with 3D structures of other three-way multibranch loops suggests possible tertiary interactions. Three-way multibranch loops usually fold by coaxial stacking of two of the three arms.[84,85] It is possible that small rearrangements in the multibranch structure (Figure 3A) would allow P1a to stack with P2 and connect with P3 by a U168 A169 A170 triloop closed by C167–G171. Both A residues in the triloop could form base triples with adjacent G129–C154 and G130–C153 pairs (Figure S4). CG/A base triples are sterically possible.[86] Such a long-range interaction was observed in RNase P and rRNAs[87−89] and is characteristic of UAA triloops.[90] Extensive enzymatic and chemical mapping data are consistent with potential tertiary interactions and dynamics in this multibranch structure. For instance, the 3′ side of P3 was extensively hit by single-strand sensitive probes and bound by oligonucleotides (Figure 3A). Moreover, no NOESY helical walks were detected for P3 (Figures 6 and S3). These observations indicate that this helix is relatively flexible. It agrees with the potential 3D stacking, in which G156–C167 does not form, P3 becomes a relatively weak helix, and A166 becomes reactive to single-strand sensitive probes (Figure S4). The other potentially flexible region of the multibranch structure is P1a, where most nucleotides on one side of the stem were hit by both RNases V1 and I (Figure 3). Also, no evidence for U123–G174 was observed, and more than one conformation was seen for G121–U176 in NOESY spectra (Figures 6 and S3). 
P1a may form a weak and dynamic helix by shifting the pairing between CUCUC (nucleotides 122–126) and GGGGA (nucleotides 171–175), thus placing the putative intronic splicing enhancer GGGGAUU in different structural contexts. This dynamic region could serve as a good target for protein, small molecule, or oligonucleotide binding, as illustrated by medium microarray binding on both sides of P1a (Figure 3A). More sophisticated NMR techniques can be used to measure RNA dynamics,[91] as exemplified by some studies conducted on riboswitches.[92,93] For instance, 1H/15N-heteronuclear exchange-sensitive NMR allows the detection of structural changes occurring within the time frame of 15N-longitudinal relaxation.[94] In P2, A131, A132, A150, and A151 were hit by DEPC only when the RNA was folded without Mg2+. None reacted with NMIA to a detectable level, but A132 and A151 reacted with DMS in the presence and absence of Mg2+ (Figure 3A). High reactivities with DMS and negligible reactivities with NMIA of adenosines suggest that the Hoogsteen edges of the adenosines are buried and the nucleobases are stacked on both faces.[95] It is possible that A132 and A151 pair in a sheared configuration (trans Hoogsteen/Sugar-edge), with two hydrogen bonds forming from the two amino protons of one adenosine to N3 and O2′ of the other adenosine, respectively.[83,96] In this way, the N1 atoms of A132 and A151 are available to be modified by DMS, but the ribose groups may not sample a conformation necessary for reaction with NMIA. In the absence of Mg2+, A132 and A151 can be accessed by DEPC, possibly because of a lack of tertiary interactions and relatively weak stacking with adjacent AU pairs. In the presence of Mg2+, however, long-range contacts with other nucleotides may occur at the position of the sheared AA pair, thus protecting N7 of the A residues in this region. 
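The register shifting proposed for P1a can be illustrated with a short enumeration. The following is a minimal sketch, not the authors' method: it takes only the two strand sequences and their numbering from the text (CUCUC, nucleotides 122–126; GGGGA, nucleotides 171–175) and counts, for each antiparallel register, how many positions could form an AU, GC, or GU pair. The function names and the simple pair-counting criterion are assumptions made for illustration; stacking and loop energies are not considered.

```python
# Illustrative sketch only. Sequences and numbering come from the text;
# scoring by simple pair counting is an assumption, not a thermodynamic model.

# Watson-Crick and GU wobble pairs, written as (5' strand nt, 3' strand nt).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

# Map nucleotide number -> base for each strand of P1a.
FIVE_PRIME = {122 + i: nt for i, nt in enumerate("CUCUC")}   # 122-126
THREE_PRIME = {171 + i: nt for i, nt in enumerate("GGGGA")}  # 171-175


def register_pairs(total):
    """Return the pairs (i, j) with i + j == total that can form AU, GC, or GU.

    In an antiparallel helix, i + j is constant along the stem, so each
    value of `total` corresponds to one pairing register.
    """
    pairs = []
    for i, nt_i in sorted(FIVE_PRIME.items()):
        nt_j = THREE_PRIME.get(total - i)
        if nt_j is not None and (nt_i, nt_j) in PAIRS:
            pairs.append((f"{nt_i}{i}", f"{nt_j}{total - i}"))
    return pairs


def all_registers():
    """Enumerate every possible register between the two strands."""
    lo = min(FIVE_PRIME) + min(THREE_PRIME)
    hi = max(FIVE_PRIME) + max(THREE_PRIME)
    return {total: register_pairs(total) for total in range(lo, hi + 1)}


if __name__ == "__main__":
    for total, pairs in all_registers().items():
        if len(pairs) >= 3:
            print(total, pairs)
```

Under these assumptions, three adjacent registers (i + j = 296, 297, and 298) each allow four pairs. The register i + j = 297 contains the U123–G174 pair mentioned above. This is consistent with the idea that the strands can slip by one nucleotide in either direction without losing pairing, although it says nothing about the relative stability of the alternatives.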
The A131H2 to U152H3 and the U133H3 to A150H2 cross peaks are both weak in NOESY spectra (Figures 6, S2, and S3), suggesting that the imino protons of the four residues are exchanging with water. It is possible that the two adenosines in the sheared AA pair are rapidly exchanging positions.[97,98] An imino cross peak between G146 and G135 was observed in NOESY spectra (Figures 6 and S3). A147 may form a base triple with C136–G146 by interacting with its minor groove. Three hydrogen bonds may form to stabilize this interaction, including from N7 of A to H22 of G and from two amino protons of A to O2 and O2′ of C, respectively.[99] This kind of CG/A triple has been observed in rRNAs and a group I intron.[100−103]

Implications for Therapeutics against Influenza

Isoenergetic microarrays identified oligonucleotide binding sites in this multibranch conformation. Most of them are in flexible regions of the structure. Microarray probes can also bind to regions that are not hit by single-strand sensitive probes,[49,104] as is apparent for nucleotide G182 (Figures 3 and S1 and Table S2). G182 is flanked by eight non-GC pairs on its 5′ side and four on its 3′ side. Evidently, the four modifications of probe 182 allow it to invade the P1 helix. Following strand invasion, U114 and U117 may further enhance hybrid stability by forming base triple interactions with A183 and A180, respectively. Thus, microarrays can reveal binding sites not apparent from secondary structure prediction or from enzymatic/chemical mapping. Several approaches have been developed to target RNA secondary structure for therapeutics, including RNAi,[105] antisense RNA,[106] and aptamers.[107] A growing number of RNA structures that can be used as therapeutic targets in major human diseases have been found in recent years.[108,109] Influenza A is a significant public health threat, with evolved resistance to current vaccines and treatments. The potential biological function and sequence/structure conservation of the multibranch structure identified in this work make this region an attractive new therapeutic target. Probe binding sites found in the microarray study are therefore promising for use in a chemical genetics approach to test for function and for designing oligonucleotides as potential therapeutics against influenza A infection. The loops identified may also be targetable with small molecules.[110]

