
Standardization of RNA chemical mapping experiments.

Wipapat Kladwang, Thomas H Mann, Alex Becka, Siqi Tian, Hanjoo Kim, Sungroh Yoon, Rhiju Das.

Abstract

Chemical mapping experiments offer powerful information about RNA structure but currently involve ad hoc assumptions in data processing. We show that simple dilutions, referencing standards (GAGUA hairpins), and HiTRACE/MAPseeker analysis allow rigorous overmodification correction, background subtraction, and normalization for electrophoretic data and a ligation bias correction needed for accurate deep sequencing data. Comparisons across six noncoding RNAs stringently test the proposed standardization of dimethyl sulfate (DMS), 2'-OH acylation (SHAPE), and carbodiimide measurements. Identification of new signatures for extrahelical bulges and DMS "hot spot" pockets (including tRNA A58, methylated in vivo) illustrates the utility and necessity of standardization for quantitative RNA mapping.

Year:  2014        PMID: 24766159      PMCID: PMC4033625          DOI: 10.1021/bi5003426

Source DB:  PubMed          Journal:  Biochemistry        ISSN: 0006-2960            Impact factor:   3.162


Structure mapping, also known as footprinting, provides a rapid means for probing nucleic acid conformation at single-nucleotide resolution. New modification chemistries, higher-throughput readouts, multidimensional expansions, error analysis, and resources for sharing data are advancing the approach.[1] Despite powerful insights from separate data sets, ad hoc choices in data processing have precluded robust comparison of chemical reactivities across RNAs and readouts.[2−7] For example, “hot spots” that might signal specific noncanonical features[6,7] in one RNA cannot be confidently established in other RNAs without universal reactivity scales, analogous to problems in nuclear magnetic resonance chemical shift analysis prior to the adoption of referencing samples.[8]
In principle, establishing reactivities should be unambiguous. Modification fractions r of nucleotides i can be directly computed from the numbers of “raw” observed products F by
$$r_i = \frac{F_i}{F_0 + F_1 + \cdots + F_i} \qquad (1)$$
(derivation in the Supporting Information). While F0, the number of “full-length” products without chemical modification, is visible for RNA domains of up to 500 nucleotides, accurate quantitation is typically precluded by detector saturation of this strong band in electrophoresis data or by ligation biases in deep sequencing data. Our lab’s previous likelihood framework for F0 depended on a priori reactivity distributions that were approximate.[2] Aviran et al. explored setting F0 to zero when it could not be measured,[5] a poor assumption under typical “single-hit” conditions. Karabiber et al. proposed equalizing reactivities observed in the 5′ half versus the 3′ half of the data,[3,4] a generally inaccurate approximation. Several recent studies have not applied eq 1.[9] Further complicating cross-experiment comparisons are differences in whether eq 1 is applied to no-modifier control samples, in sequence alignment tools, in error estimation, and in normalization procedures,[2,3,5] as well as a lack of validation protocols.
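As an illustration of the overmodification correction, the following minimal sketch assumes that eq 1 takes the form r_i = F_i / (F_0 + F_1 + ... + F_i), with F_0 the full-length product and sites numbered outward from it. The function name and the example counts are hypothetical and are not taken from HiTRACE or MAPseeker.

```python
def modification_fractions(counts):
    """Apply the assumed form of eq 1 to raw product counts.

    counts[0] is F_0 (full-length product); counts[i] is F_i for site i.
    Returns [r_1, r_2, ...], where r_i = F_i / (F_0 + F_1 + ... + F_i).
    """
    fractions = []
    running_total = counts[0]
    for f_i in counts[1:]:
        running_total += f_i
        # Guard against an empty denominator (no products observed so far).
        fractions.append(f_i / running_total if running_total > 0 else 0.0)
    return fractions


# Hypothetical raw counts: F_0 = 700, then three sites.
raw_counts = [700.0, 100.0, 0.0, 200.0]
reactivities = modification_fractions(raw_counts)

# Setting F_0 to zero forces the first site's fraction to 1,
# whatever its true reactivity.
no_full_length = modification_fractions([0.0] + raw_counts[1:])
```

The second call shows why an unmeasured F_0 cannot simply be dropped: every denominator shrinks, and the sites closest to the full-length product are distorted most.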
To address these issues, we implemented two straightforward standardization strategies: (1) dilution comparisons to mitigate saturation and (2) use of universal internal controls (Figure 1A,B). To illustrate, Figure 1C gives capillary electrophoresis (CE) data of primer extension products for the P4–P6 domain of the Tetrahymena ribozyme probed with dimethyl sulfate (DMS) to methylate exposed N1/N3 atoms of A/C nucleotides.[10] The saturated peak shape for the fully extended product is apparent; 10-fold dilution of the same sample gave a weaker signal-to-noise ratio overall but an unsaturated, Gaussian shape for the F0 peak (Figure 1D; further dilutions verified the lack of saturation). Automated scaling of these dilution data allowed unbiased measurement of F0 (Figure 1E,F). Application of eq 1, background subtraction, and normalization (see below) gave the reactivity profile in Figure 1F. The final results agreed within error with averaged data collected by different experimenters (Figure 1F and Methods and Figure 1 of the Supporting Information). Further, as expected (but not assumed), DMS reactivities at G and U nucleotides were within error of zero. Tests comparing data from 8-fold variations of DMS and reagents 1-cyclohexyl(2-morpholinoethyl)carbodiimide metho-p-toluene sulfonate (CMCT, modifying G/U)[10] and 1-methyl-7-N-isatoic anhydride (1M7, modifying 2′-OH; SHAPE[3,4]) further confirmed this standardization (Figure 2 of the Supporting Information).
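The dilution comparison can be sketched as follows. This is not the HiTRACE scaling routine; it assumes a simple least-squares scale factor (fit through the origin) between diluted and undiluted peak areas at unsaturated positions, and a user-chosen saturation level. All names and numbers are hypothetical.

```python
def estimate_full_length(undiluted, diluted, saturation_level):
    """Estimate the unsaturated F_0 from a diluted run.

    Index 0 of each list is the full-length peak; it is never used for the
    fit because it is assumed to be saturated in the undiluted run.
    Returns (scale, estimated_f0).
    """
    pairs = [
        (u, d)
        for u, d in zip(undiluted[1:], diluted[1:])
        if u < saturation_level  # keep only peaks the detector resolved
    ]
    denominator = sum(d * d for _, d in pairs)
    if denominator == 0:
        raise ValueError("no unsaturated peaks available for scaling")
    # Least-squares solution of undiluted ~ scale * diluted.
    scale = sum(u * d for u, d in pairs) / denominator
    return scale, scale * diluted[0]


# Hypothetical peak areas: the undiluted F_0 is clipped at 1000.
undiluted_areas = [1000.0, 50.0, 20.0, 80.0]
diluted_areas = [400.0, 5.0, 2.0, 8.0]
scale, full_length = estimate_full_length(
    undiluted_areas, diluted_areas, saturation_level=1000.0
)
```

The estimated F_0 can then replace the clipped value before eq 1 is applied to the undiluted profile, which retains the better signal-to-noise ratio at the weaker peaks.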
Figure 1

Proposed steps to standardize chemical mapping experiments (red and blue text) read out by (A) capillary electrophoresis and (B) deep sequencing (MAP-seq). CE profiles for the P4–P6–2HP RNA probed with DMS at (C) standard dilution and (D) 10-fold dilution. (E) Automated scaling matches diluted sample data to undiluted data. (F) Final reactivity profile (black), validated by data taken at 4-fold lower DMS concentrations (green, nearly indistinguishable) and equality at GAGUA referencing hairpins (red). MAP-seq data for P4–P6 RNA without (F) and with (G) ligation bias correction determined from internal referencing. (H) Overlay of CE and MAP-seq data; errors are standard deviations of replicates (Figure 1 of the Supporting Information).

Independent validation of this procedure came from incorporating “reference” hairpins in 5′ and 3′ flanking cassettes.[3,4] GAGUA hairpin loops (Figure 2A) give strong signals for DMS (at the A’s), CMCT (at the bulge U), and 1M7 (all five residues). “Raw” F counts were 5-fold lower at the 5′ GAGUA than at the 3′ GAGUA (red bars in Figure 1E), because reverse transcriptases encountered stops between those segments (“overmodification”, also called attenuation or signal decay). The equality of the final GAGUA reactivities r confirmed accurate overmodification correction and background subtraction of these data (red bars in Figure 1F) and supported use of the GAGUA data as normalization standards.
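Background subtraction and reference normalization can be sketched as below. The exact procedure is given in the Supporting Information and is not reproduced here; this sketch assumes that the no-modifier control is subtracted site by site and that the profile is then divided by the mean reactivity at the reference (GAGUA) positions, so that the references average 1.0. Function names, indices, and values are hypothetical.

```python
def subtract_background(modified, control):
    """Site-by-site difference between modified and no-modifier profiles."""
    return [m - c for m, c in zip(modified, control)]


def normalize_to_reference(reactivities, reference_sites):
    """Scale a profile so the mean over reference_sites equals 1.0."""
    reference_mean = sum(reactivities[i] for i in reference_sites) / len(
        reference_sites
    )
    if reference_mean <= 0:
        raise ValueError("reference reactivity must be positive")
    return [r / reference_mean for r in reactivities]


# Hypothetical modification fractions after the overmodification correction.
modified = [0.03, 0.11, 0.02, 0.06, 0.11]
control = [0.01, 0.01, 0.01, 0.01, 0.01]
reference_sites = [1, 4]  # one 5' and one 3' reference position (assumed)

normalized = normalize_to_reference(
    subtract_background(modified, control), reference_sites
)
# A ratio near 1 between the two references supports the correction.
reference_ratio = normalized[1] / normalized[4]
```

Because the references sit at both ends of the construct, their ratio doubles as a check: a value far from 1 would indicate that the overmodification correction or the background subtraction has failed.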
Figure 2

Three-dimensional environments associated with high chemical reactivity to Watson–Crick edge modifiers [DMS for A/C and CMCT for G/U (base color)] and/or 2′-OH acylation [1M7 (backbone color)]. (A) GAGUA hairpin sets the normalization scale for DMS (A2 and A5), CMCT (U4), and 1M7 (all nucleotides). (B) L6b from the P4–P6 domain. (C) Interdomain linker from the glycine riboswitch. (D) Bulge in the ligand binding pocket of the adenine riboswitch. (E–G) Pockets promoting high adenosine N1 reactivity and low 2′-OH reactivity in tRNA (N1-methyl shown) (E) and the P4–P6 domain (F and G). Hot spot nucleotides are labeled in panels B–G. Protein Data Bank entries are listed in Table 1 of the Supporting Information.

An alternative readout, MAP-seq (multiplexed accessibility probing), follows nucleic acid modification and primer extension with ligation of an Illumina adapter and deep sequencing, without bias-introducing polymerase chain reaction amplification (Methods of the Supporting Information).[11] We previously observed (through CE) that ligation yields were systematically low for full-length cDNA products. This effect led to underestimation of F0 and to an apparent discordance between the 5′ and 3′ GAGUA references (red bars, Figure 1G). Nevertheless, the requirement of equality at these sequences allowed automated estimation of a ligation bias correction factor [0.18 in this case (Methods of the Supporting Information)]. Despite involving rather different protocols, the CE and MAP-seq results then agreed within errors estimated from replicates (Figure 1H, and see below).
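The equality requirement fixes the full-length count algebraically. The sketch below is not the MAPseeker implementation. It assumes one reference site near each end, assumes that reactivities follow r_i = F_i / (F_0 + F_1 + ... + F_i), and defines the correction factor as the ratio of observed to inferred full-length counts; that definition is an assumption. Setting r_a = r_b for the 5′ site a and the 3′ site b and solving for F_0 gives F_0 = (F_a S_b − F_b S_a) / (F_b − F_a), where S_i = F_1 + ... + F_i.

```python
def infer_full_length(counts, five_prime_ref, three_prime_ref):
    """Solve for the F_0 that equalizes reactivity at two reference sites.

    counts[0] is the (biased) observed full-length count and is ignored;
    counts[i] is the stop count at site i, numbered from the 5' end.
    """
    f_a = counts[five_prime_ref]
    f_b = counts[three_prime_ref]
    if f_a == f_b:
        raise ValueError("equal reference counts do not constrain F_0")
    s_a = sum(counts[1:five_prime_ref + 1])   # F_1 + ... + F_a
    s_b = sum(counts[1:three_prime_ref + 1])  # F_1 + ... + F_b
    return (f_a * s_b - f_b * s_a) / (f_b - f_a)


# Hypothetical counts: true F_0 of 320 observed as 80 after biased ligation.
observed = [80.0, 80.0, 400.0, 200.0]
inferred_f0 = infer_full_length(observed, five_prime_ref=1, three_prime_ref=3)
correction_factor = observed[0] / inferred_f0
```

In this toy example both reference sites have a true modification fraction of 0.2, and the inferred F_0 recovers it at each site. With real data, replicate noise at the references propagates into the factor, so averaging over all reference positions would be a natural extension.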
To comprehensively test the standardization protocol, we took measurements with DMS, CMCT, and 1M7, using both CE and MAP-seq protocols on several structured RNAs, including ligand-bound riboswitches and rRNA domains (Figures 3–8 of the Supporting Information).[2,10] In the MAP-seq experiment, data for the P4–P6–2HP domain established the ligation bias correction factor and normalization for the coloaded RNAs. The agreement within error between reactivities at GAGUA reference hairpins across all constructs and general agreement between CE and MAP-seq data sets confirmed the accuracy of the proposed standardization (Figure 1 of the Supporting Information). No length bias was detected for MAP-seq, but a residual sequence bias was seen in reactive purine-rich segments; these mostly occurred in flanking sequences outside the structured RNA domains (Figures 3–8 of the Supporting Information). In both CE and MAP-seq data, normalization to GAGUA references exposed limitations of prior heuristics that normalize based on high percentile values within each RNA (or in 5′ and 3′ halves);[2−4,9,10] these values in fact vary by >2-fold across the different RNAs.
The standardization procedures allowed the identification of 33 hot spot nucleotides, defined here as those giving DMS, CMCT, or 1M7 reactivity of >1.5, well above control values (1.0) established by GAGUA references (Table 2 of the Supporting Information). First, in agreement with conventional use of these data to infer secondary structure,[10] all 16 cases of high DMS/CMCT/1M7 reactivities observed within stretches of more than two residues corresponded to apical loops (Figure 2B) or unpaired “linkers” (Figure 2C). Second, three isolated adenosines with high 1M7 but low DMS reactivity were stacked on one face, a structural feature previously requiring differential SHAPE measurements for identification.[6] Third, all seven isolated highly CMCT/1M7-reactive uridines and two highly 1M7-reactive adenosines were extrahelical bulges[7] (Figure 2D), a powerful signature for guiding or validating tertiary structure modeling.[12] Most intriguing were five adenosines with DMS reactivities of >1.5 but negligible 1M7 reactivity (Figure 2E–G). Each of these adenosines showed Hoogsteen edge burial and nucleobase stacking on both faces; such burial information should be useful in tertiary structure modeling. The most DMS-reactive nucleotide, A58 in Saccharomyces cerevisiae tRNA(Phe) (Figure 2E), is also methylated at the N1 position in vivo.[13] The pocket around DMS hot spot nucleotides may thus be under selection for electronegativity to enhance enzymatic reaction or hydrogen bonding to partners. As further examples, A198 and A207 (Figure 2F,G) in the isolated P4–P6 domain are buried, but N1 atoms are available for contacts in the full Tetrahymena ribozyme or recognition by protein partners. These signatures could not be identified unambiguously in prior work because of uncertain data scaling.
The inclusion of dilution samples and referencing hairpins allows standardization, validation, and deeper analysis of structure mapping experiments at negligible additional cost. For CE studies, obtaining the necessary data simply involves diluting the prepared samples into running buffer and repeating electrophoresis and HiTRACE/HiTRACE-Web analysis[14] (Figure 1A). Inclusion of GAGUA hairpins was used here to test the overmodification correction and normalize CE data but was only strictly necessary in MAP-seq experiments. In fact, just a single construct with flanking reference hairpins needs to be doped into the MAP-seq RNA pool; standardization is then automated via MAPseeker analysis[11] (Figure 1B). The general adoption of simple standardization steps, and their extension to very long transcripts and to other solution conditions and modifiers, should help RNA structure mapping data become more accurate and more transferable between molecules and experiments.
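Once reactivities are on the GAGUA-referenced scale, hot spot calling reduces to a threshold test. The sketch below uses the >1.5 cutoff stated in the text and separates stretches of more than two consecutive hot residues from shorter or isolated ones. It handles a single reactivity profile, whereas the analysis in the text combines DMS, CMCT, and 1M7 data; the function name and example values are hypothetical.

```python
HOT_SPOT_THRESHOLD = 1.5  # relative to GAGUA reference reactivity of 1.0


def hot_spot_runs(reactivities, threshold=HOT_SPOT_THRESHOLD):
    """Return (start, end) index pairs of consecutive sites above threshold."""
    runs = []
    start = None
    for index, value in enumerate(reactivities):
        if value > threshold:
            if start is None:
                start = index  # a new run begins here
        elif start is not None:
            runs.append((start, index - 1))
            start = None
    if start is not None:  # close a run that reaches the last site
        runs.append((start, len(reactivities) - 1))
    return runs


# Hypothetical normalized reactivities for seven nucleotides.
profile = [0.1, 1.8, 0.2, 1.6, 2.0, 1.7, 0.3]
runs = hot_spot_runs(profile)

# Stretches of more than two residues versus shorter or isolated hot spots.
stretches = [run for run in runs if run[1] - run[0] + 1 > 2]
isolated = [run for run in runs if run[1] - run[0] + 1 <= 2]
```

The two groups would then be examined separately: in the data described above, the long stretches corresponded to loops and linkers, whereas the isolated hot spots carried the bulge and pocket signatures.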
  21 in total

1.  Modified constructs of the tRNA TPsiC domain to probe substrate conformational requirements of m(1)A(58) and m(5)U(54) tRNA methyltransferases.

Authors:  R Sengupta; S Vainauskas; C Yarian; E Sochacka; A Malkiewicz; R H Guenther; K M Koshlap; P F Agris
Journal:  Nucleic Acids Res       Date:  2000-03-15       Impact factor: 16.971

2.  A mutate-and-map strategy for inferring base pairs in structured nucleic acids: proof of concept on a DNA/RNA helix.

Authors:  Wipapat Kladwang; Rhiju Das
Journal:  Biochemistry       Date:  2010-09-07       Impact factor: 3.162

3.  RNA structure analysis at single nucleotide resolution by selective 2'-hydroxyl acylation and primer extension (SHAPE).

Authors:  Edward J Merino; Kevin A Wilkinson; Jennifer L Coughlan; Kevin M Weeks
Journal:  J Am Chem Soc       Date:  2005-03-30       Impact factor: 15.419

4.  In-line probing analysis of riboswitches.

Authors:  Elizabeth E Regulski; Ronald R Breaker
Journal:  Methods Mol Biol       Date:  2008

5.  Sharing and archiving nucleic acid structure mapping data.

Authors:  Philippe Rocca-Serra; Stanislav Bellaousov; Amanda Birmingham; Chunxia Chen; Pablo Cordero; Rhiju Das; Lauren Davis-Neulander; Caia D S Duncan; Matthew Halvorsen; Rob Knight; Neocles B Leontis; David H Mathews; Justin Ritz; Jesse Stombaugh; Kevin M Weeks; Craig L Zirbel; Alain Laederach
Journal:  RNA       Date:  2011-05-24       Impact factor: 4.942

6.  HiTRACE: high-throughput robust analysis for capillary electrophoresis.

Authors:  Sungroh Yoon; Jinkyu Kim; Justine Hum; Hanjoo Kim; Seunghyun Park; Wipapat Kladwang; Rhiju Das
Journal:  Bioinformatics       Date:  2011-05-10       Impact factor: 6.937

7.  Genome-wide measurement of RNA secondary structure in yeast.

Authors:  Michael Kertesz; Yue Wan; Elad Mazor; John L Rinn; Robert C Nutter; Howard Y Chang; Eran Segal
Journal:  Nature       Date:  2010-09-02       Impact factor: 49.962

8.  Chemical probes for higher-order structure in RNA.

Authors:  D A Peattie; W Gilbert
Journal:  Proc Natl Acad Sci U S A       Date:  1980-08       Impact factor: 11.205

9.  Fingerprinting noncanonical and tertiary RNA structures by differential SHAPE reactivity.

Authors:  Kady-Ann Steen; Greggory M Rice; Kevin M Weeks
Journal:  J Am Chem Soc       Date:  2012-08-01       Impact factor: 15.419

10.  Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo.

Authors:  Silvi Rouskin; Meghan Zubradt; Stefan Washietl; Manolis Kellis; Jonathan S Weissman
Journal:  Nature       Date:  2013-12-15       Impact factor: 49.962
