Isoform-level ribosome occupancy estimation guided by transcript abundance with Ribomap.

Hao Wang, Joel McManus, Carl Kingsford.

Abstract

Ribosome profiling is a recently developed high-throughput sequencing technique that captures approximately 30 bp long ribosome-protected mRNA fragments during translation. Because of alternative splicing and repetitive sequences, a ribosome-protected read may map to many places in the transcriptome, leading to discarded or arbitrary mappings when standard approaches are used. We present a technique and software that address this problem by assigning reads to potential origins in proportion to estimated transcript abundance. This yields a more accurate estimate of ribosome profiles than naïve mapping.

AVAILABILITY AND IMPLEMENTATION: Ribomap is available as open source at http://www.cs.cmu.edu/~ckingsf/software/ribomap

CONTACT: carlk@cs.cmu.edu

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press.

Year:  2016        PMID: 27153676      PMCID: PMC4908323          DOI: 10.1093/bioinformatics/btw085

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Ribosome profiling (ribo-seq) provides snapshots of the positions of translating ribosomes by sequencing ribosome-protected fragments (Ingolia et al., 2012). The distribution of ribo-seq footprints along a transcript, called the ribosome profile, can be used to analyze translational regulation and discover alternative initiation (Gao et al.), alternative translation and frameshifting (Michel et al.), and may eventually lead to a better understanding of the regulation of cell growth, the progression of aging (Kuersten et al.) and the development of diseases (Hsieh et al.; Thoreen et al.). Different environmental conditions such as stress or starvation alter the ribosome profile patterns (Ingolia et al.; Gerashchenko et al.), indicating possible changes in translational regulation.

In higher eukaryotes, alternative transcription initiation, pre-mRNA splicing, and 3′ end formation result in the production of multiple isoforms for most genes. The resulting isoforms can have dramatically different effects on mRNA stability (Lareau et al.) and translation regulation (Sterne-Weiler et al.). However, to date ribosome profiling analyses have been conducted at the gene level rather than the isoform level, using either a single ‘representative’ isoform (e.g. Guo et al.) or exon union profiles (e.g. Olshen et al.). The lack of isoform-level analysis of ribo-seq data is partially due to the absence of the necessary bioinformatic tools. Here, we present a conceptual framework and software (Ribomap) to quantify isoform-level ribosome profiles. By accounting for multi-mapping sequence reads using RNA-seq estimates of isoform abundance, Ribomap produces accurate isoform-specific ribosome profiles.

The challenge in estimating isoform ribosome profiles is that a short ribo-seq read may map to many different transcripts. Ambiguous mappings are not rare in ribo-seq data and can be caused by either repetitive sequences along the genome or alternative splicing (Ingolia, 2014).
For example, in the human HeLa cell ribo-seq data (GSM546920, Guo et al.), among all mapped reads (about 50% of all reads), only 14% can be uniquely mapped to a single location of a single mRNA isoform, 22% can be mapped to multiple regions on the reference genome due to repetitive sequences, and 64% can be mapped to multiple mRNAs due to alternative splicing. Ribomap deals with both types of ambiguous mappings, and therefore does not discard multi-mapped reads, resulting in more of the data being used. In this example, the mapping rate of Ribomap is 50% compared to 7% if only uniquely mapped reads are used.

Estimation of mRNA isoform abundance from RNA-seq has also had to deal with ambiguous mappings (Jiang and Wong, 2009; Mortazavi et al.; Pachter, 2011). However, unlike in RNA-seq, coverage in ribo-seq is highly non-uniform regardless of sequencing bias, since ribosomes move along mRNAs at non-uniform rates, and it is in fact the non-uniformities that are of interest (Ingolia, 2014). Further, ambiguous mappings are much worse for ribo-seq data since the read length cannot exceed the ribosome size (approximately 30 bp), while paired-end and longer reads can be generated from RNA-seq experiments to reduce the problem of ambiguous mappings. Methods developed for transcript abundance are therefore not applicable to assigning ribo-seq reads.

By observing that ambiguous mappings are mainly caused by multiple isoforms (Supplementary Fig. S2), Ribomap assigns ribo-seq reads to locations using estimated transcript abundance of the candidate locations. On synthetic data, our approach yields a more precise estimation of ribosome profiles compared with a pure mapping-based approach. Further, the ribosome abundance derived using our method correlates better with the transcript abundance on real ribo-seq data.
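As a quick check on the HeLa mapping figures quoted in this section, the 7% unique-mapping rate follows from the two percentages given in the text (about 50% of all reads map, and 14% of those map uniquely). The three categories of mapped reads also sum to the whole mapped set:

$$0.50 \times 0.14 = 0.07, \qquad 14\% + 22\% + 64\% = 100\%.$$

Keeping the multi-mapped reads therefore raises the usable fraction of all reads from roughly 7% to roughly 50%.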

2 Approach

Ribomap works in 3 stages (Fig. 1; see also Supplementary Material):
Fig. 1. Ribomap pipeline for estimating ribosome profiles

1. Transcript abundance estimation. Since RNA-seq experiments should always be performed in parallel with ribo-seq (Ingolia, 2014), the abundance $\alpha_t$ per base of each transcript $t$ can be estimated from the RNA-seq data using Sailfish (Patro et al.), an ultra-fast mRNA isoform quantification package. Ribomap also accepts transcript abundance estimations from cufflinks (Trapnell et al.) and eXpress (Roberts and Pachter, 2013).

2. Mapping ribo-seq reads to the reference transcriptome. We obtain all the transcript-location pairs $L_r$ where the read sequence $r$ matches the transcript sequence by aligning the entire set of ribo-seq reads $R$ to the transcriptome with STAR (Dobin et al.).

3. Ribosome profile estimation. Let $c_r$ be the number of ribo-seq reads with sequence $r$. Ribomap sets the number of footprints $c_{rti}$ with sequence $r$ that originate from a specific location $i$ on transcript $t$ to be proportional to the transcript abundance $\alpha_t$ of transcript $t$:

$$c_{rti} = c_r \cdot \frac{\alpha_t}{\sum_{(t', i') \in L_r} \alpha_{t'}},$$

where the denominator is the total transcript abundance with a sequence matching $r$. The total number of reads $c_{ti}$ that are assigned to transcript $t$, location $i$, is then

$$c_{ti} = \sum_{r \in R} c_{rti}.$$

The $c_{ti}$ give the profiles for each transcript. The sum is needed here because sequencing errors can cause multiple read sequences to map to the same transcript location, so the final estimated ribosome count for a transcript location is the sum of the estimated counts over all matched read sequences.
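The assignment rule in the third stage can be illustrated with a minimal Python sketch. This is not Ribomap's actual implementation. The function name `assign_footprints`, the dictionary-based inputs and the toy transcripts `tx1` and `tx2` are all assumptions made for illustration. The inputs stand in for the read counts, the candidate transcript-location pairs of each read sequence, and the per-transcript abundances.

```python
from collections import defaultdict


def assign_footprints(read_counts, candidate_locations, abundance):
    """Split each read sequence's count across its candidate
    (transcript, position) pairs in proportion to transcript abundance.

    read_counts:         {read_sequence: number of reads with that sequence}
    candidate_locations: {read_sequence: [(transcript, position), ...]}
    abundance:           {transcript: estimated abundance}
    Returns {(transcript, position): estimated footprint count}.
    """
    profiles = defaultdict(float)
    for read, count in read_counts.items():
        locations = candidate_locations.get(read, [])
        # Denominator: total abundance over all candidate locations of this read.
        total = sum(abundance.get(t, 0.0) for t, _ in locations)
        if total == 0:
            # Assumption of this sketch: reads whose candidates all have
            # zero abundance are left unassigned.
            continue
        for t, i in locations:
            # Accumulate over read sequences that hit the same location.
            profiles[(t, i)] += count * abundance.get(t, 0.0) / total
    return dict(profiles)


# Toy data (hypothetical): "ACGT" matches two isoforms, "ACGA" matches one.
read_counts = {"ACGT": 10, "ACGA": 2}
candidate_locations = {
    "ACGT": [("tx1", 5), ("tx2", 7)],
    "ACGA": [("tx1", 5)],
}
abundance = {"tx1": 3.0, "tx2": 1.0}

profiles = assign_footprints(read_counts, candidate_locations, abundance)
for location, value in sorted(profiles.items()):
    print(location, value)
```

In this toy case the ten `ACGT` reads are split 3:1 between `tx1` and `tx2`, and the two `ACGA` reads add to the same `tx1` location, so the total number of reads is preserved by the assignment.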

3 Results and discussion

To evaluate the performance of Ribomap, we synthetically generated ribo-seq reads with known ground truth profiles using transcript abundances estimated from the GSM546921 RNA-seq data (Guo et al.) and a dynamic range of initiation rates. Ribosome occupancy probabilities for locations on a given transcript were simulated using the ribosome flow model (Reuveni et al.). Errors were added to the reads using a Poisson process with a rate of 0.5%, which was estimated from the ribo-seq data GSM546920 (Guo et al.). For comparison, we also tested a naïve approach, called ‘Star prime’, that maps each read to a single candidate location. More details are in the Supplementary Material.

The Pearson correlation coefficients between Ribomap’s ribosome profiles and the ground truth are significantly higher than those of Star prime (Fig. 2): 81% of our profiles have a higher Pearson correlation (Mann–Whitney U test) and 68% have a smaller root mean square error (Mann–Whitney U test). This suggests that Ribomap recovers the ribosome profiles more accurately than the standard mapping procedure applied to isoforms.
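The per-transcript comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' evaluation code. The profiles and the names `truth`, `estimate_a` and `estimate_b` are made up, the profiles are compared without normalization, and returning a correlation of 0 for a completely flat profile is a convention chosen here. None of these details are confirmed by the text.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.
    Returns 0.0 if either profile is constant (a convention of this sketch)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rmse(x, y):
    """Root mean square error between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical ground-truth and estimated profiles for two transcripts.
truth = {"tx1": [1.0, 4.0, 2.0, 0.0], "tx2": [0.0, 3.0, 3.0, 1.0]}
estimate_a = {"tx1": [2.0, 8.0, 4.0, 0.0], "tx2": [0.0, 3.0, 2.0, 1.0]}
estimate_b = {"tx1": [0.0, 0.0, 0.0, 0.0], "tx2": [3.0, 0.0, 1.0, 3.0]}

# Fraction of transcripts on which method A correlates better with the truth.
better = sum(
    pearson(estimate_a[t], truth[t]) > pearson(estimate_b[t], truth[t])
    for t in truth
)
fraction_better = better / len(truth)
print(fraction_better)
```

The same loop with `rmse` and a `<` comparison gives the fraction of transcripts with a smaller error. A significance test such as the Mann–Whitney U test would then be applied to the two sets of per-transcript scores.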
Fig. 2. Histogram of the Pearson correlation between the footprint assignments and the ground truth profiles. Ribomap has a significantly higher Pearson correlation (median: 0.83) than Star prime (median: 0.28). The spike at 0 for Star prime is due to STAR not assigning footprints to transcripts that are estimated to be present

The good correlation between the ground truth profile and the estimated profile also leads to a good estimation of the total ribosome load on a transcript. Ribomap’s ribosome load estimates on non-synthetic ribo-seq data (GSM546920, Guo et al.) correlate well with the estimated transcript abundance (Pearson r = 0.71). We do not expect a perfect correlation because of isoform-specific translational regulation. By contrast, the ribosome loads from the pure mapping-based approach of Star prime do not correlate as well (r = 0.28).

Through two lines of evidence, on real and synthetic ribo-seq data, we show that Ribomap produces useful, high-quality ribosome profiles along individual isoforms. It can serve as a useful first step for downstream analysis of translational regulation from ribo-seq data.
References (10 of 20 shown)

1.  Mapping and quantifying mammalian transcriptomes by RNA-Seq.

Authors:  Ali Mortazavi; Brian A Williams; Kenneth McCue; Lorian Schaeffer; Barbara Wold
Journal:  Nat Methods       Date:  2008-05-30       Impact factor: 28.547

2.  Assessing gene-level translational control from ribosome profiling.

Authors:  Adam B Olshen; Andrew C Hsieh; Craig R Stumpf; Richard A Olshen; Davide Ruggero; Barry S Taylor
Journal:  Bioinformatics       Date:  2013-09-18       Impact factor: 6.937

3.  Mammalian microRNAs predominantly act to decrease target mRNA levels.

Authors:  Huili Guo; Nicholas T Ingolia; Jonathan S Weissman; David P Bartel
Journal:  Nature       Date:  2010-08-12       Impact factor: 49.962

4.  Genome-wide ribosome profiling reveals complex translational regulation in response to oxidative stress.

Authors:  Maxim V Gerashchenko; Alexei V Lobanov; Vadim N Gladyshev
Journal:  Proc Natl Acad Sci U S A       Date:  2012-10-08       Impact factor: 11.205

5.  Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling.

Authors:  Nicholas T Ingolia; Sina Ghaemmaghami; John R S Newman; Jonathan S Weissman
Journal:  Science       Date:  2009-02-12       Impact factor: 47.728

6.  Observation of dually decoded regions of the human genome using ribosome profiling data.

Authors:  Audrey M Michel; Kingshuk Roy Choudhury; Andrew E Firth; Nicholas T Ingolia; John F Atkins; Pavel V Baranov
Journal:  Genome Res       Date:  2012-05-16       Impact factor: 9.043

7.  Genome-scale analysis of translation elongation with a ribosome flow model.

Authors:  Shlomi Reuveni; Isaac Meilijson; Martin Kupiec; Eytan Ruppin; Tamir Tuller
Journal:  PLoS Comput Biol       Date:  2011-09-01       Impact factor: 4.475

8.  Streaming fragment assignment for real-time analysis of sequencing experiments.

Authors:  Adam Roberts; Lior Pachter
Journal:  Nat Methods       Date:  2012-11-18       Impact factor: 28.547

9.  The translational landscape of mTOR signalling steers cancer initiation and metastasis.

Authors:  Andrew C Hsieh; Yi Liu; Merritt P Edlind; Nicholas T Ingolia; Matthew R Janes; Annie Sher; Evan Y Shi; Craig R Stumpf; Carly Christensen; Michael J Bonham; Shunyou Wang; Pingda Ren; Michael Martin; Katti Jessen; Morris E Feldman; Jonathan S Weissman; Kevan M Shokat; Christian Rommel; Davide Ruggero
Journal:  Nature       Date:  2012-02-22       Impact factor: 69.504

10.  Quantitative profiling of initiating ribosomes in vivo.

Authors:  Xiangwei Gao; Ji Wan; Botao Liu; Ming Ma; Ben Shen; Shu-Bing Qian
Journal:  Nat Methods       Date:  2014-12-08       Impact factor: 28.547
