Wei Wang, Zhi Wei, Hongzhe Li
Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102 and Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
Abstract
MOTIVATION: Next-generation RNA sequencing offers an opportunity to investigate the transcriptome at an unprecedented scale. Recent studies have revealed widespread alternative polyadenylation (polyA) in eukaryotes, leading to various mRNA isoforms that differ in their 3' untranslated regions (3'UTR), through which the stability, localization and translation of mRNA can be regulated. However, very few, if any, methods and tools are available for directly analyzing this special alternative RNA processing event. Conventional methods rely on annotation of polyA sites; yet such knowledge remains incomplete, and identification of polyA sites is still challenging. The goal of this article is to develop methods for detecting 3'UTR switching without any prior knowledge of polyA annotations.
RESULTS: We propose a change-point model based on a likelihood ratio test for detecting 3'UTR switching. We develop a directional testing procedure for identifying dramatic shortening or lengthening events in the 3'UTR, while controlling the mixed directional false discovery rate at a nominal level. To our knowledge, this is the first approach to analyze 3'UTR switching directly without relying on any polyA annotations. Simulation studies and applications to two real datasets reveal that our proposed method is powerful, accurate and feasible for the analysis of next-generation RNA sequencing data.
CONCLUSIONS: The proposed method will fill a void among alternative RNA processing analysis tools for transcriptome studies. It can help to obtain additional insights from RNA sequencing data by understanding gene regulation mechanisms through the analysis of 3'UTR switching.
AVAILABILITY AND IMPLEMENTATION: The software is implemented in Java and can be freely downloaded from http://utr.sourceforge.net/.
CONTACT: zhiwei@njit.edu or hongzhe@mail.med.upenn.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
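The abstract does not spell out the change-point model, so the following is only a generic illustration of the idea of a likelihood ratio change-point scan, not the authors' method or their Java implementation. It assumes two per-position read-count vectors over one 3'UTR (one per condition) and a binomial model for the share of reads coming from the first condition; the function names `binom_loglik` and `scan_change_point` are hypothetical.

```python
import math


def binom_loglik(successes, total):
    """Maximised binomial log-likelihood with constant terms dropped."""
    # At p = 0 or p = 1 (or with no reads) the maximised value is exactly 0.
    if total == 0 or successes == 0 or successes == total:
        return 0.0
    p = successes / total
    return successes * math.log(p) + (total - successes) * math.log(1 - p)


def scan_change_point(counts_a, counts_b):
    """Scan every split k and return (best_k, best_statistic).

    Null model (assumed): one proportion of condition-A reads along the
    whole region. Alternative: one proportion for positions [0, k) and
    another for positions [k, n).
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("count vectors must have the same length")
    n = len(counts_a)
    total_a, total_b = sum(counts_a), sum(counts_b)
    null_ll = binom_loglik(total_a, total_a + total_b)

    best_k, best_stat = None, 0.0
    left_a = left_b = 0
    for k in range(1, n):
        # Running totals for the segment to the left of the split.
        left_a += counts_a[k - 1]
        left_b += counts_b[k - 1]
        right_a, right_b = total_a - left_a, total_b - left_b
        alt_ll = (binom_loglik(left_a, left_a + left_b)
                  + binom_loglik(right_a, right_a + right_b))
        stat = 2.0 * (alt_ll - null_ll)  # likelihood ratio statistic
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


if __name__ == "__main__":
    # Toy data: condition B loses coverage in the second half of the region.
    a = [10] * 5 + [10] * 5
    b = [10] * 5 + [1] * 5
    print(scan_change_point(a, b))
```

In this sketch, comparing the condition-A proportion on each side of the selected split would indicate the direction of the change. Because the statistic is maximised over all splits, a plain chi-square reference distribution is not appropriate; calibration and the false discovery rate control described in the abstract are left out here.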