
Genome annotation across species using deep convolutional neural networks.

Ghazaleh Khodabandelou, Etienne Routhier, Julien Mozziconacci.

Abstract

The application of deep neural networks is a rapidly expanding field that now reaches many disciplines, including genomics. In particular, convolutional neural networks have been used to identify the functional role of short genomic sequences. These approaches rely on gathering large sets of sequences with a known functional role, extracted from whole-genome annotations. These sets are then split into learning, test and validation sets in order to train the networks. While the resulting networks perform well on validation sets, they often perform poorly when applied to whole genomes, in which the ratio of positive to negative examples can be very different from that in the training set. Here we address this issue by assessing the genome-wide performance of networks trained on sets with different ratios of positive to negative examples. As a case study, we use sequences encompassing gene starts from the RefGene database as positive examples and random genomic sequences as negative examples. We then demonstrate that models trained on data from one organism can be used to predict gene-start sites in a related species, provided the training set is one that yields good genome-wide performance. This cross-species application of convolutional neural networks provides a new way to annotate any genome from existing high-quality annotations in a related reference species. It also provides a way to determine whether the sequence motifs recognised by chromatin-associated proteins in different species are conserved.

© 2020 Khodabandelou et al.
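The class-imbalance issue described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`one_hot`, `build_training_set`, `expected_precision`), the `neg_per_pos` parameter and the example rates are hypothetical and chosen only for illustration. The sketch assumes a classifier whose true-positive and false-positive rates stay fixed, and shows how its precision changes with the number of negatives per positive.

```python
import random

# Hypothetical helper names; none of them come from the paper.
_BASES = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0), "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}


def one_hot(seq):
    """Encode a DNA string as rows over (A, C, G, T); unknown bases become all zeros."""
    return [_BASES.get(base, (0, 0, 0, 0)) for base in seq.upper()]


def build_training_set(positives, negatives, neg_per_pos, seed=0):
    """Return shuffled (sequence, label) pairs with `neg_per_pos` negatives per positive."""
    n_neg = len(positives) * neg_per_pos
    if n_neg > len(negatives):
        raise ValueError("not enough negative sequences for the requested ratio")
    rng = random.Random(seed)
    sampled = rng.sample(negatives, n_neg)  # sample negatives without replacement
    data = [(s, 1) for s in positives] + [(s, 0) for s in sampled]
    rng.shuffle(data)
    return data


def expected_precision(tpr, fpr, neg_per_pos):
    """Precision of a classifier with fixed TPR and FPR at a given negative:positive ratio."""
    true_pos = tpr                 # expected true positives per positive example
    false_pos = fpr * neg_per_pos  # expected false positives per positive example
    total = true_pos + false_pos
    return true_pos / total if total > 0 else 0.0
```

Under these assumptions, a classifier with a true-positive rate of 0.9 and a false-positive rate of 0.01 has a precision of about 0.99 on a balanced set (`neg_per_pos=1`) but only about 0.08 when there are 1000 negatives per positive, which is one way to see why good validation scores need not carry over to a whole genome.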


Keywords:  DNA motifs; Deep learning; Genome annotation; Promoters; Sequence evolution; Transcription start sites; Unbalanced datasets

Year:  2020        PMID: 33816929      PMCID: PMC7924482          DOI: 10.7717/peerj-cs.278

Source DB:  PubMed          Journal:  PeerJ Comput Sci        ISSN: 2376-5992


  2 in total

1.  ReFeaFi: Genome-wide prediction of regulatory elements driving transcription initiation.

Authors:  Ramzan Umarov; Yu Li; Takahiro Arakawa; Satoshi Takizawa; Xin Gao; Erik Arner
Journal:  PLoS Comput Biol       Date:  2021-09-07       Impact factor: 4.475

2.  Genomics enters the deep learning era.

Authors:  Etienne Routhier; Julien Mozziconacci
Journal:  PeerJ       Date:  2022-06-24       Impact factor: 3.061

