FoldAtlas: a repository for genome-wide RNA structure probing data.

Matthew Norris1, Chun Kit Kwok2, Jitender Cheema1, Matthew Hartley1, Richard J Morris1, Sharon Aviran3, Yiliang Ding1.   

Abstract

Most RNA molecules form internal base pairs, leading to a folded secondary structure. Some of these structures have been demonstrated to be functionally significant. High-throughput RNA structure chemical probing methods generate millions of sequencing reads to provide structural constraints for RNA secondary structure prediction. At present, processed data from these experiments are difficult to access without computational expertise. Here we present FoldAtlas, a web interface for accessing raw and processed structural data across thousands of transcripts. FoldAtlas allows a researcher to easily locate, view, and retrieve probing data for a given RNA molecule. We also provide in silico and in vivo secondary structure predictions for comparison, visualized in the browser as circle plots and topology diagrams. Data currently integrated into FoldAtlas are from a new high-depth Structure-seq data analysis in Arabidopsis thaliana, released with this work.
AVAILABILITY AND IMPLEMENTATION: The FoldAtlas website can be accessed at www.foldatlas.com. Source code is freely available at github.com/mnori/foldatlas under the MIT license. Raw read data are available under the NCBI SRA accession SRP066985. CONTACT: yiliang.ding@jic.ac.uk or matthew.norris@jic.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press.

Year:  2016        PMID: 27663500      PMCID: PMC5254078          DOI: 10.1093/bioinformatics/btw611

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

RNA structure plays an important role in all steps of gene expression and regulation (Mortimer ; Sharp, 2009). Earlier studies inferred the secondary structures of individual RNA sequences using low throughput in vitro probing or in silico prediction approaches. More recently, genome-wide in vivo structure probing methods have emerged, allowing structures to be determined across the transcriptomes of living cells (Ding ; Rouskin ; Spitale ; Talkish ; Tang ). Chemical probing methods can be used to determine RNA secondary structure in living cells (Kwok ; McGinnis and Weeks, 2014; Spitale ). These methods include dimethyl sulfate (DMS) probing and selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE). In DMS probing, the N1 position of adenine and the N3 position of cytosine are methylated when the base is not involved in Watson–Crick base pairing. In SHAPE, all unpaired bases are modified. Chemically modified bases lead to stalling of reverse transcriptase. With reverse transcription, PCR, deep sequencing and normalization, reactivities can be assigned to individual RNA sequence positions. These reactivities describe the extent of exposure of a nucleotide to solution, and can be exploited as pseudo-free energy constraints for RNA secondary structure prediction. At present, raw and processed reactivity data are hard to access without computational expertise. Here we introduce FoldAtlas, a repository and web interface for accessing genome-scale RNA structure probing data. We also provide visualization of data-constrained RNA structures across the genome. The data included with FoldAtlas are from a new high-depth Structure-seq DMS analysis in Arabidopsis thaliana, covering over 11 000 transcripts.
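The idea of turning reactivities into pseudo-free energy terms can be illustrated with the log-linear form commonly used in SHAPE-directed folding, in which a slope and an intercept map each reactivity to an energy bonus or penalty. The sketch below is only an illustration under that assumption: the function name is hypothetical, the default slope and intercept are the values quoted in the Results for the Fold program, and treating missing reactivities as contributing zero energy is a simplification made here, not a documented behaviour of any tool.

```python
import math


def pseudo_free_energy(reactivity, slope=1.8, intercept=-0.6):
    """Map a normalized reactivity to a pseudo-free energy in kcal/mol.

    Uses the log-linear form slope * ln(reactivity + 1) + intercept.
    `reactivity` may be None for positions without data (assumption:
    such positions contribute no pseudo-energy in this sketch).
    """
    if reactivity is None:
        return 0.0
    if reactivity < 0:
        raise ValueError("reactivity must be non-negative")
    return slope * math.log(reactivity + 1.0) + intercept


# Hypothetical reactivities for five nucleotides; None marks a position
# with no data (for example a G or U in a DMS experiment).
example_reactivities = [0.0, 0.2, None, 1.5, 3.0]
energies = [pseudo_free_energy(r) for r in example_reactivities]
```

With these parameters an unreactive nucleotide receives a small favourable term (the intercept), while highly reactive nucleotides are penalized for pairing, which is how the probing data steer the predicted fold.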

2 Results

FoldAtlas allows a researcher without computational expertise to select a transcript of interest and retrieve its corresponding raw and processed structure probing data, along with pre-generated RNA structure predictions. FoldAtlas is the first tool that provides this functionality across the genome. When a transcript is selected and loaded, the d3nome genome browser (Fig. 1A), released with this work, displays the splicing configuration of the selected transcript, along with alternative splice isoforms where relevant. An overview of the normalized chemical reactivities (Fig. 1B) is also shown, which can be expanded to show detailed nucleotide-resolution chemical reactivities. The reactivities are generated as described in the Supplementary Results section of the Supplementary Material. Tab-delimited text files containing normalized chemical reactivities are available for download. We also provide corresponding raw read termination counts from three independent biological replicates, allowing the significance of structure probing data to be estimated by assigning errors to reactivities.
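As a rough illustration of how replicate counts could be used to attach an error estimate to each position, the following sketch parses tab-delimited text and reports a per-position mean and sample standard deviation. The column layout (position followed by one count per replicate), the function name and the example values are all assumptions made for this example; they do not describe the actual FoldAtlas download format.

```python
import csv
import io
import statistics


def summarize_replicates(tsv_text):
    """Return {position: (mean, sample_sd)} from tab-delimited text.

    Assumed (hypothetical) layout: the first column is the sequence
    position and each remaining column is one replicate's count.
    """
    summary = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        position = int(row[0])
        counts = [float(value) for value in row[1:]]
        mean = statistics.mean(counts)
        # A standard deviation needs at least two replicates.
        sd = statistics.stdev(counts) if len(counts) > 1 else 0.0
        summary[position] = (mean, sd)
    return summary


# Made-up counts for two positions across three replicates.
example = "1\t10\t12\t14\n2\t0\t0\t3\n"
stats = summarize_replicates(example)
```

Positions where the standard deviation is large relative to the mean would be candidates for lower confidence when the reactivities are interpreted.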
Fig. 1

(A) d3nome genome browser, indicating splice isoforms. (B) Normalized reactivities. The sequence position is shown on the x axis, whilst the y axis provides the normalized reactivity value. (C) Principal components analysis, indicating structural similarity. Each dot represents a single RNA structure prediction, with red dots indicating the lowest free energies. Structures with similar base pair configurations are plotted in close proximity to each other. (D) Circle plot describing a single RNA structure prediction. Sequence positions are indicated around the edge of the plot, with lines between positions indicating base pairs. Green lines indicate high base pair probability. (E) RNA fold prediction diagram. Bases with high reactivities are in red, whilst green indicates little or no reactivity.

For each transcript, we include the 20 lowest free energy unconstrained in silico and data-constrained in vivo structures generated by using the Fold program, from version 5.7 of the RNAstructure package (Reuter and Mathews, 2010), with default slope and intercept parameters of 1.8 and -0.6 kcal/mol respectively. The structure prediction set includes the MFE structure alongside suboptimal low free energy structures. Differences and similarities between these structures are visualized using a Principal Components Analysis (PCA) view (Fig. 1C). PCA plots were generated using a previously described method (Halvorsen ). Each structure can also be visualized using both circle plots (Fig. 1D) and structure diagrams (Fig. 1E) generated using the ViennaRNA package (Hofacker, 2013; Kerpedjiev , Lorenz ). The corresponding MFE structures can be downloaded as tab-delimited text files. The FoldAtlas chemical reactivity data are from a DMS chemical modification experiment in Arabidopsis thaliana. These data were generated by using a previously established Structure-seq method (Ding , 2015), but with two rounds of poly-A selection to enrich the proportion of mRNA.
Detailed analysis of this experiment is provided in the Supplementary Results section of the Supplementary Material.
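The notion that structures with similar base pair configurations sit close together can be made concrete by comparing the sets of base pairs in two predictions. The sketch below is not the PCA method cited above; it only shows, under the assumption that structures are available in dot-bracket notation, how each structure can be reduced to a set of paired positions and how a simple base-pair distance follows from that. The function names are hypothetical.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) paired positions (0-based) in a
    dot-bracket string, where '(' and ')' mark paired bases."""
    stack = []
    pairs = set()
    for index, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.add((stack.pop(), index))
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


def base_pair_distance(structure_a, structure_b):
    """Count the base pairs present in exactly one of two structures."""
    return len(base_pairs(structure_a) ^ base_pairs(structure_b))


# Three toy structures of equal length.
hairpin = "((..))"
shorter_stem = "(....)"
unfolded = "......"
d_close = base_pair_distance(hairpin, shorter_stem)
d_far = base_pair_distance(hairpin, unfolded)
```

A PCA view of the kind shown in Fig. 1C would typically start from a comparable per-structure encoding of base pairs and then project the ensemble onto its leading components.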

3 Conclusions and future work

FoldAtlas provides convenient access to in vivo RNA structure probing data across thousands of transcripts. The current release, 1.1, includes data from a high-depth genome-scale probing experiment in Arabidopsis thaliana. To predict structure for a transcript, we generated up to 20 secondary structures using the RNAstructure Fold tool, and visualized the structure ensemble using PCA plots. In this work, our preference to use RNAstructure is due to the ability to specify experimental constraints, and is consistent with the approach taken in our earlier work (Ding ). In future versions of FoldAtlas, we plan to also provide options to visualize structure predictions made using other methods, including SeqFold (Ouyang ), and Vienna RNAfold, which now allows experimental constraints (Lorenz , 2015). We are also considering including SHAPE probing data, in vitro data, reactivities calculated using alternative normalization methods, data from other organisms, and data from other studies.
References

1.  Genome-wide profiling of in vivo RNA structure at single-nucleotide resolution using structure-seq.

Authors:  Yiliang Ding; Chun Kit Kwok; Yin Tang; Philip C Bevilacqua; Sarah M Assmann
Journal:  Nat Protoc       Date:  2015-06-18       Impact factor: 13.491

2.  StructureFold: genome-wide RNA secondary structure mapping and reconstruction in vivo.

Authors:  Yin Tang; Emil Bouvier; Chun Kit Kwok; Yiliang Ding; Anton Nekrutenko; Philip C Bevilacqua; Sarah M Assmann
Journal:  Bioinformatics       Date:  2015-04-16       Impact factor: 6.937

3.  Ribosome RNA assembly intermediates visualized in living cells.

Authors:  Jennifer L McGinnis; Kevin M Weeks
Journal:  Biochemistry       Date:  2014-05-12       Impact factor: 3.162

4.  RNAstructure: software for RNA secondary structure prediction and analysis.

Authors:  Jessica S Reuter; David H Mathews
Journal:  BMC Bioinformatics       Date:  2010-03-15       Impact factor: 3.169

5.  Disease-associated mutations that alter the RNA structural ensemble.

Authors:  Matthew Halvorsen; Joshua S Martin; Sam Broadaway; Alain Laederach
Journal:  PLoS Genet       Date:  2010-08-19       Impact factor: 5.917

6.  SeqFold: genome-scale reconstruction of RNA secondary structure integrating high-throughput sequencing data.

Authors:  Zhengqing Ouyang; Michael P Snyder; Howard Y Chang
Journal:  Genome Res       Date:  2012-10-11       Impact factor: 9.043

7.  ViennaRNA Package 2.0.

Authors:  Ronny Lorenz; Stephan H Bernhart; Christian Höner Zu Siederdissen; Hakim Tafer; Christoph Flamm; Peter F Stadler; Ivo L Hofacker
Journal:  Algorithms Mol Biol       Date:  2011-11-24       Impact factor: 1.405

8.  Forna (force-directed RNA): Simple and effective online RNA secondary structure diagrams.

Authors:  Peter Kerpedjiev; Stefan Hammer; Ivo L Hofacker
Journal:  Bioinformatics       Date:  2015-06-22       Impact factor: 6.937

9.  SHAPE directed RNA folding.

Authors:  Ronny Lorenz; Dominik Luntzer; Ivo L Hofacker; Peter F Stadler; Michael T Wolfinger
Journal:  Bioinformatics       Date:  2015-09-09       Impact factor: 6.937

10.  Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo.

Authors:  Silvi Rouskin; Meghan Zubradt; Stefan Washietl; Manolis Kellis; Jonathan S Weissman
Journal:  Nature       Date:  2013-12-15       Impact factor: 49.962
