Peng Zhang, Dandan He, Yi Xu, Jiakai Hou, Bih-Fang Pan, Yunfei Wang, Tao Liu, Christel M Davis, Erik A Ehli, Lin Tan, Feng Zhou, Jian Hu, Yonghao Yu, Xi Chen, Tuan M Nguyen, Jeffrey M Rosen, David H Hawke, Zhe Ji, Yiwen Chen.
Abstract
Translation is principally regulated at the initiation stage. The development of the translation initiation (TI) sequencing (TI-seq) technique has enabled the global mapping of TIs and revealed unanticipated complex translational landscapes in metazoans. Despite the wide adoption of TI-seq, there is no computational tool currently available for analyzing TI-seq data. To fill this gap, we develop a comprehensive toolkit named Ribo-TISH, which allows for detecting and quantitatively comparing TIs across conditions from TI-seq data. Ribo-TISH can also predict novel open reading frames (ORFs) from regular ribosome profiling (rRibo-seq) data and outperforms several established methods in both computational efficiency and prediction accuracy. Applied to published TI-seq/rRibo-seq data sets, Ribo-TISH uncovers a novel signature of elevated mitochondrial translation during amino-acid deprivation and predicts novel ORFs in 5′UTRs, long noncoding RNAs, and introns. These successful applications demonstrate the power of Ribo-TISH in extracting biological insights from TI-seq/rRibo-seq data.
Year: 2017 PMID: 29170441 PMCID: PMC5701008 DOI: 10.1038/s41467-017-01981-8
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 A schematic overview of Ribo-TISH. Ribo-TISH starts with quality control of the aligned sequencing data, proceeds to identification and differential analysis of translation initiations from TI-seq/QTI-seq data, and ends with prediction of actively translated ORFs from rRibo-seq data
Fig. 2 Quality control of TI-seq and rRibo-seq data. Quality control with Ribo-TISH for two TI-seq data sets generated using (a) LTM or (b) Harr, and (c) one rRibo-seq data set generated using CHX. Upper panel: length distribution of RPFs uniquely mapped to annotated protein-coding regions. Lower panel: different quality profiles/metrics for RPFs uniquely mapped to annotated protein-coding regions. The data corresponding to the first, second and third reading frames are colored in pink, light green and sky blue, respectively. Each row shows the RPFs with the indicated length. Column 1: distribution of RPF 5′ ends across three reading frames in all annotated codons, showing the fraction of RPF counts from the dominant reading frame (f_d). Column 2: distribution of RPF 5′ end counts near annotated TISs, showing the estimated P-site offset and the ratio (f_t) between the RPF counts at the annotated TISs and the sum of the RPF counts near the annotated TISs (from −1 to +1 relative to the annotated TISs) after P-site offset correction. Column 3: distribution of RPF 5′ end counts near annotated stop codons. Column 4: RPF count profile throughout protein-coding regions across three reading frames, showing the TIS enrichment score for TI-seq data
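The two summary metrics named in the Fig. 2 caption (the dominant-frame fraction and the ratio of counts at the annotated TIS to counts in the −1 to +1 window) can be illustrated with a minimal sketch. This is not Ribo-TISH code: the function names, the input layout, and the reading of the −1 to +1 window as single-nucleotide offsets are assumptions made for illustration, and positions are assumed to be already corrected for the P-site offset.

```python
from collections import Counter
import math


def dominant_frame_fraction(positions):
    """Return (dominant frame, fraction of counts in that frame).

    `positions` holds P-site-corrected RPF positions relative to the first
    base of the annotated CDS (hypothetical input layout).
    """
    frame_counts = Counter(pos % 3 for pos in positions)
    total = sum(frame_counts.values())
    if total == 0:
        raise ValueError("no RPF positions supplied")
    frame, count = max(frame_counts.items(), key=lambda item: item[1])
    return frame, count / total


def tis_ratio(counts_by_offset):
    """Ratio of RPF counts at the annotated TIS (offset 0) to the counts
    summed over offsets -1, 0 and +1 (assumed to be nucleotide offsets)."""
    window = sum(counts_by_offset.get(offset, 0) for offset in (-1, 0, 1))
    if window == 0:
        return math.nan
    return counts_by_offset.get(0, 0) / window


frame, f_d = dominant_frame_fraction([0, 3, 6, 1, 2, 9])
f_t = tis_ratio({-1: 2, 0: 6, 1: 2})
```

Values of both metrics close to 1 would indicate strong three-nucleotide periodicity and sharp signal at annotated start sites, respectively.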
Fig. 3 Modeling background distribution of TI-seq data. (a) An illustration of the typical TI-seq RPF count profile across a hypothetical protein-coding transcript. The RPF counts at the first base of CDS in-frame codons (the pink bars inside the ORF), excluding AUG or near-cognate start codons, from annotated PCGs, were used to model the TI-seq data background. TI-seq RPF count profile for the major isoform of (b) GAPDH and (c) UBTD1. (d) Fitting of different distributions, including Poisson, zero-inflated Poisson (ZIP), negative binomial (NB), and zero-inflated negative binomial (ZINB), to the observed background RPF count distribution. (e) NB distribution parameters (r and p) estimated from different TI-seq expression groups. (f) The use of different NB background distributions for transcripts/genes with different TI-seq signal density improved TIS identification compared to the use of a single/global NB background distribution
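As a rough illustration of the background model in Fig. 3, the sketch below fits NB parameters r and p to a set of background counts and computes an upper-tail probability for an observed count. The caption does not say how the parameters were estimated, so the method-of-moments fit used here is an assumption, as are the function names and the toy counts.

```python
import math
from statistics import mean, pvariance


def fit_nb_moments(counts):
    """Method-of-moments NB fit (an assumed estimator, not necessarily the
    one used by Ribo-TISH). Returns (r, p) with mean = r * (1 - p) / p."""
    m = mean(counts)
    v = pvariance(counts)
    if v <= m:
        raise ValueError("variance must exceed mean for an NB moment fit")
    p = m / v
    r = m * m / (v - m)
    return r, p


def nb_pmf(k, r, p):
    """NB probability of observing exactly k counts, computed in log space."""
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def nb_upper_tail(k, r, p):
    """P(X >= k) under the fitted background, i.e. a one-sided p-value."""
    return max(0.0, 1.0 - sum(nb_pmf(i, r, p) for i in range(k)))


# Toy background counts at in-frame, non-start codons (made-up numbers).
background = [0, 0, 0, 1, 0, 2, 0, 0, 5, 0, 1, 0]
r, p = fit_nb_moments(background)
p_value_for_20_reads = nb_upper_tail(20, r, p)
```

Fitting separate (r, p) pairs per expression group, as panel e describes, would amount to calling `fit_nb_moments` once per group of transcripts.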
Fig. 4 Genome-wide identification and differential analysis of TIs. (a) rRibo-seq and LTM-based TI-seq RPF count profiles in the HEK293 cell line for TUBA1B, suggesting that a uORF is translated in a different reading frame (pink) from the annotated one (sky blue). (b) The normalized QTI-seq RPF count profiles under normal condition and amino-acid deprivation for the longest isoform of C1QBP. The top enriched (c) biological processes and (d) cellular components, based on the GO enrichment analysis of the genes with significantly elevated translation initiation efficiency under amino-acid deprivation. The normalized QTI-seq RPF and RNA-seq count profiles under normal condition and amino-acid deprivation for genes encoding mitochondrial ribosomal proteins (e) MRPL27 and (f) MRPS14
Fig. 5 Evaluating the performance of different methods in ORF prediction. (a) An illustration of how the frame test was performed to predict ORFs from rRibo-seq data by Ribo-TISH. (b) ROC curves across six strategies of ORF prediction implemented in Ribo-TISH, RiboTaper, and ORF-RATER. An RPKM value of 1 was used as a cutoff to define actively translating genes for positive and negative sets. (c) The short ORFs (<100 aa) of CCDS in Ensembl or (d) the experimentally validated uORFs curated by uORFdb were used as a positive set (RPKM ≥ 1) for the ROC analyses across seven strategies implemented in Ribo-TISH, RiboTaper, ORF-RATER, and riboHMM. ROC curves across six strategies implemented in Ribo-TISH, RiboTaper, and ORF-RATER when the annotated ORFs of CCDS in Ensembl, with RPKM (e) between 0.1 and 0.5 or (f) between 0.5 and 1, are used as a positive set, respectively
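The RPKM cutoffs quoted in the Fig. 5 caption can be made concrete with the standard RPKM formula (reads per kilobase of feature per million mapped reads). The sketch below is illustrative only: the function names are hypothetical, and whether each bin boundary is inclusive is an assumption, since the caption gives only the ranges.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


def expression_bin(value):
    """Assign an RPKM value to the ranges named in the caption.

    Boundary handling (lower bound inclusive) is an assumption.
    """
    if value >= 1.0:
        return "RPKM >= 1"
    if value >= 0.5:
        return "0.5 to 1"
    if value >= 0.1:
        return "0.1 to 0.5"
    return "below 0.1"


# Example: 500 reads on a 1,000-nt ORF in a library of 10 million mapped reads.
example_rpkm = rpkm(500, 1000, 10_000_000)
example_bin = expression_bin(example_rpkm)
```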
Fig. 6 Experimental validations of the computationally predicted smORFs. (a) rRibo-seq and LTM-based TI-seq RPF count profiles in the HEK293 cell line for a predicted uORF in the 5′UTR of EIF5. (b) FLAG-tagged uORF within the context of the host mRNA was ectopically expressed and translation of the predicted polypeptide was detected by western blot with an anti-FLAG antibody. β-actin protein was used as an internal control for western blot analysis. (c) rRibo-seq and LTM-based TI-seq RPF count profiles for a predicted ORF encoded by lncRNA DANCR. (d) FLAG-tagged smORF within the context of the host lncRNA was ectopically expressed and translation of the predicted polypeptide was detected by western blot with an anti-FLAG antibody. (e) rRibo-seq and LTM-based TI-seq RPF count profiles for a predicted ORF encoded by an intron of BLOC1S3. (f) FLAG-tagged intronic smORF within the context of the host mRNA was ectopically expressed and translation of the predicted polypeptide was detected by western blot with an anti-FLAG antibody