A deep recurrent neural network discovers complex biological rules to decipher RNA protein-coding potential.

Steven T Hill, Rachael Kuintzle, Amy Teegarden, Erich Merrill, Padideh Danaee, David A Hendrix.

Abstract

The current deluge of newly identified RNA transcripts presents a singular opportunity for improved assessment of coding potential, a cornerstone of genome annotation, and for machine-driven discovery of biological knowledge. While traditional, feature-based methods for RNA classification are limited by current scientific knowledge, deep learning methods can independently discover complex biological rules in the data de novo. We trained a gated recurrent neural network (RNN) on human messenger RNA (mRNA) and long noncoding RNA (lncRNA) sequences. Our model, mRNA RNN (mRNN), surpasses state-of-the-art methods at predicting protein-coding potential despite being trained with less data and with no prior concept of what features define mRNAs. To understand what mRNN learned, we probed the network and uncovered several context-sensitive codons highly predictive of coding potential. Our results suggest that gated RNNs can learn complex and long-range patterns in full-length human transcripts, making them ideal for performing a wide range of difficult classification tasks and, most importantly, for harvesting new biological insights from the rising flood of sequencing data.
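As an illustration of the kind of model the abstract describes, the following is a minimal sketch of a gated recurrent unit (GRU) reading a one-hot encoded RNA sequence one nucleotide at a time and emitting a coding-potential score. It is not the authors' implementation: the weights are random and untrained, and the hidden size, the function names (`one_hot`, `gru_step`, `coding_score`), and the single-layer architecture are all assumptions made for this example. The gate equations follow one common GRU convention.

```python
import math
import random

NUCLEOTIDES = "ACGU"  # assumed alphabet; a DNA-style T is mapped to U below


def one_hot(base):
    """Encode one nucleotide as a 4-element vector (all zeros for unknown bases such as N)."""
    base = "U" if base == "T" else base
    return [1.0 if base == n else 0.0 for n in NUCLEOTIDES]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, x, U, h, b):
    """Compute W @ x + U @ h + b using plain Python lists."""
    return [
        sum(w * xi for w, xi in zip(W[i], x))
        + sum(u * hi for u, hi in zip(U[i], h))
        + b[i]
        for i in range(len(b))
    ]


def init_params(input_size, hidden_size, seed=0):
    """Random, untrained parameters (hypothetical sizes, for illustration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in ("z", "r", "n"):
        params["W" + gate] = mat(hidden_size, input_size)
        params["U" + gate] = mat(hidden_size, hidden_size)
        params["b" + gate] = [0.0] * hidden_size
    params["w_out"] = [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
    params["b_out"] = 0.0
    return params


def gru_step(p, x, h):
    """One GRU update: the gates decide how much of the old state to keep or overwrite."""
    z = [sigmoid(v) for v in affine(p["Wz"], x, p["Uz"], h, p["bz"])]  # update gate
    r = [sigmoid(v) for v in affine(p["Wr"], x, p["Ur"], h, p["br"])]  # reset gate
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(v) for v in affine(p["Wn"], x, p["Un"], rh, p["bn"])]  # candidate state
    return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def coding_score(p, sequence):
    """Run the GRU over the full transcript and squash the final state to a score in (0, 1)."""
    h = [0.0] * len(p["w_out"])
    for base in sequence.upper():
        h = gru_step(p, one_hot(base), h)
    logit = sum(w * hi for w, hi in zip(p["w_out"], h)) + p["b_out"]
    return sigmoid(logit)


params = init_params(input_size=4, hidden_size=8)
score = coding_score(params, "AUGGCCAAGUAA")  # toy sequence, not real data
print(f"untrained score: {score:.3f}")
```

Because the hidden state is carried across every position, a trained model of this form can in principle relate a codon to context far upstream, which is the long-range behavior the abstract attributes to gated RNNs. With untrained weights the printed score carries no biological meaning.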

Year: 2018; PMID: 29986088; PMCID: PMC6144860; DOI: 10.1093/nar/gky567

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971

