Steven Busan, Kevin M Weeks.
Abstract
Mutational profiling (MaP) enables detection of sites of chemical modification in RNA as sequence changes during reverse transcription (RT), subsequently read out by massively parallel sequencing. We introduce ShapeMapper 2, which integrates careful handling of all classes of adduct-induced sequence changes, sequence variant correction, basecall quality filters, and quality-control warnings to now identify RNA adduct sites as accurately as achieved by careful manual analysis of electrophoresis data, the prior highest-accuracy standard. MaP and ShapeMapper 2 provide a robust, experimentally concise, and accurate approach for reading out nucleic acid chemical probing experiments.
Keywords: 1M6; 1M7; 5NIA; NAI; NMIA; RING; RNA structure modeling; SHAPE; correlated chemical probing; dimethyl sulfate; mutational profiling; single molecule
Year: 2017 PMID: 29114018 PMCID: PMC5769742 DOI: 10.1261/rna.061945.117
Source DB: PubMed Journal: RNA ISSN: 1355-8382 Impact factor: 4.942
FIGURE 1. MaP experiment and analysis overview. (A) Quantification of chemical probing reactivities by MaP, based on massively parallel sequencing. (B) Algorithmic steps implemented in ShapeMapper. (C) Types of observed mutations and their frequencies in MaP-based analysis of E. coli 16S and 23S ribosomal RNA data sets collected previously under protein-free conditions using the 1M7 SHAPE reagent (Deigan et al. 2009; Siegfried et al. 2014). (D) Examples of simple and complex mutations detected in reads from the E. coli rRNA data set.
FIGURE 2. Importance of counting all mutation types. Receiver operating characteristic (ROC) curves for SHAPE-MaP reactivity profiles calculated using all mutation types or only certain types from the E. coli ribosomal RNA data set. SHAPE reactivity profiles were evaluated against reported Watson–Crick base-pairing interactions identified from crystal structures (Bernier et al. 2014). True positive rate: fraction of unpaired nucleotides with SHAPE reactivity above a given threshold; false positive rate: fraction of paired nucleotides with SHAPE reactivity above a given threshold. True positive and false positive rates were evaluated at all possible SHAPE reactivity thresholds from the lowest value in the data set to the highest. Inserts are far less frequent than other mutation types (see Fig. 1C), which accounts for low recovery of base-pairing information when analyzed alone.
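The true positive and false positive rate definitions in the Figure 2 caption can be sketched in a few lines of Python. This is an illustrative sketch, not ShapeMapper code: the function names `roc_points` and `auc` are hypothetical, the threshold comparison is assumed to be inclusive (`>=`) so that the lowest threshold recovers every nucleotide, nucleotides without reactivity data are assumed to have been removed beforehand, and the area under the curve is computed with the trapezoid rule.

```python
from typing import List, Sequence, Tuple


def roc_points(reactivity: Sequence[float],
               paired: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points.

    A "positive" call is a nucleotide whose reactivity is at or above the
    threshold. Unpaired nucleotides are the true class, paired nucleotides
    the false class, following the caption's definitions.
    """
    n_unpaired = sum(1 for p in paired if not p)
    n_paired = sum(1 for p in paired if p)
    if n_unpaired == 0 or n_paired == 0:
        raise ValueError("need both paired and unpaired nucleotides")

    points = [(0.0, 0.0)]  # threshold above the highest reactivity
    # Evaluate every distinct reactivity value as a threshold, high to low.
    for threshold in sorted(set(reactivity), reverse=True):
        tp = sum(1 for r, p in zip(reactivity, paired)
                 if r >= threshold and not p)
        fp = sum(1 for r, p in zip(reactivity, paired)
                 if r >= threshold and p)
        points.append((fp / n_paired, tp / n_unpaired))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoid rule."""
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


if __name__ == "__main__":
    # Toy data: two reactive unpaired and two unreactive paired nucleotides.
    toy_reactivity = [0.9, 0.8, 0.1, 0.05]
    toy_paired = [False, False, True, True]
    print(auc(roc_points(toy_reactivity, toy_paired)))
```

On real data, the `paired` labels would come from the reference structure and `reactivity` from the normalized profile; both inputs here are toy values chosen only to exercise the functions.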
FIGURE 3. Recovery of base-pairing information by the MaP experiment analyzed by ShapeMapper. (A) ROC curves for both subunits of the E. coli ribosome comparing accuracy of ShapeMapper 2 with ShapeMapper 1 and the prior high-accuracy standard, analysis by capillary electrophoresis. SHAPE-MaP data analyzed by ShapeMapper 2 gives both a higher true positive rate and a lower false positive rate than both ShapeMapper 1 and manually curated electrophoresis SHAPE for any given reactivity threshold, reflected by a statistically significant increase in area under the curve (inset). Capillary electrophoresis data were collected previously (Deigan et al. 2009; Siegfried et al. 2014). Shaded area shows the 95% confidence interval for the true positive rate calculated with 2000 bootstrap samples at 400 evenly spaced false positive rates. Error bars in the inset show a 95% confidence interval for the area under the curve calculated with 2000 bootstrap samples. (*) P = 0.05, (****) P ≤ 5 × 10⁻¹². (B) SHAPE reactivity data obtained by manually curated electrophoresis (left) and by SHAPE-MaP (right) superimposed on a representative region of the E. coli large ribosomal subunit RNA domain II. (C) Structure modeling accuracy using Superfold (Smola et al. 2015b) and RNAstructure (Reuter and Mathews 2010) with no SHAPE data or with SHAPE data from electrophoresis or MaP readouts as soft constraints, as previously described (Deigan et al. 2009). Models were evaluated against nonconflicting canonical Watson–Crick base-pairing interactions identified from crystal structures (Bernier et al. 2014), allowing base pairs offset by up to one nucleotide in either direction, and excluding base pairs separated by more than 600 nt in primary sequence. Sensitivity: the fraction of base pairs in the reference structure present in a modeled structure. PPV: positive predictive value, the fraction of modeled pairs present in the reference structure. These sensitivity and PPV values underestimate the likely true values by ≥5%, because regions where experimental SHAPE data are inconsistent with the reference structure have not been excluded (see Deigan et al. 2009).
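The sensitivity and PPV definitions in the Figure 3 caption can be illustrated with a short Python sketch. Several details are assumptions rather than the authors' procedure: the function name `sensitivity_ppv` and its helpers are hypothetical, "offset by up to one nucleotide in either direction" is interpreted as shifting one partner of the pair by one position (i±1 or j±1), and the 600-nt span filter is applied to both the reference and the modeled pairs.

```python
from typing import Iterable, Set, Tuple

Pair = Tuple[int, int]


def _normalize(pairs: Iterable[Pair], max_span: int) -> Set[Pair]:
    """Order each pair as (5' index, 3' index) and drop long-range pairs."""
    kept = set()
    for i, j in pairs:
        lo, hi = min(i, j), max(i, j)
        if hi - lo <= max_span:
            kept.add((lo, hi))
    return kept


def _matches(pair: Pair, others: Set[Pair]) -> bool:
    """True if the pair, or a one-nucleotide slip of it, is in `others`."""
    i, j = pair
    candidates = [(i, j), (i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return any(c in others for c in candidates)


def sensitivity_ppv(reference: Iterable[Pair],
                    modeled: Iterable[Pair],
                    max_span: int = 600) -> Tuple[float, float]:
    """Return (sensitivity, PPV) of a modeled structure against a reference.

    Sensitivity: fraction of reference pairs found in the model.
    PPV: fraction of modeled pairs found in the reference.
    """
    ref = _normalize(reference, max_span)
    mod = _normalize(modeled, max_span)
    if not ref or not mod:
        raise ValueError("both structures need at least one retained pair")
    sens = sum(_matches(p, mod) for p in ref) / len(ref)
    ppv = sum(_matches(p, ref) for p in mod) / len(mod)
    return sens, ppv


if __name__ == "__main__":
    # Toy structures; (10, 700) spans more than 600 nt and is ignored.
    reference_pairs = [(1, 20), (2, 19), (3, 18), (10, 700)]
    modeled_pairs = [(1, 20), (2, 18), (5, 15)]
    print(sensitivity_ppv(reference_pairs, modeled_pairs))
```

The pair lists above are toy values. Under this interpretation a single modeled pair can satisfy more than one reference pair, which is one reason a stricter matching rule might be preferred in practice.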