
Detecting operons in bacterial genomes via visual representation learning.

Rida Assaf, Fangfang Xia, Rick Stevens.

Abstract

Contiguous genes in prokaryotes are often arranged into operons. Detecting operons plays a critical role in inferring gene functionality and regulatory networks. Human experts annotate operons by visually inspecting gene neighborhoods across pileups of related genomes. These visual representations capture the intergenic distance, strand direction, gene size, functional relatedness, and gene neighborhood conservation, which are the most prominent operon features mentioned in the literature. By studying these features, an expert can then decide whether a genomic region is part of an operon. We propose a deep learning-based method named Operon Hunter that uses visual representations of genomic fragments to make operon predictions. Transfer learning and data augmentation make it possible to leverage powerful neural networks trained on image datasets by re-training them on a more limited dataset of extensively validated operons. Our method outperforms the previously reported state-of-the-art tools, especially in predicting full operons and their boundaries accurately. Furthermore, our approach makes it possible to visually identify the features influencing the network's decisions, so that they can subsequently be cross-checked by human experts.
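Two of the features named in the abstract, intergenic distance and strand direction, can be illustrated with a small sketch. This is not Operon Hunter's pipeline, which works on rendered images and a neural network; it only shows what these two quantities mean for adjacent gene pairs. The `Gene` record, the `adjacent_pair_features` helper, the 1-based inclusive coordinate convention, and the example coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates assumed 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def adjacent_pair_features(genes: List[Gene]) -> List[Tuple[str, str, int, bool]]:
    """Return (left, right, intergenic_distance, same_strand) per adjacent pair.

    The distance counts the bases strictly between the two genes, so it is
    negative when the genes overlap.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    features = []
    for left, right in zip(ordered, ordered[1:]):
        distance = right.start - left.end - 1
        features.append((left.name, right.name, distance, left.strand == right.strand))
    return features


# Made-up coordinates, not taken from any real genome.
genes = [
    Gene("geneA", 100, 400, "+"),
    Gene("geneB", 410, 900, "+"),
    Gene("geneC", 1500, 2000, "-"),
]
pairs = adjacent_pair_features(genes)
for left, right, distance, same_strand in pairs:
    print(f"{left}-{right}: distance={distance}, same_strand={same_strand}")
```

In this toy input, the first pair is close together and on the same strand, while the second pair is far apart and on opposite strands. The remaining features listed in the abstract (gene size, functional relatedness, neighborhood conservation) would need annotation and comparative data that this sketch does not model.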


Year:  2021        PMID: 33483546      PMCID: PMC7822928          DOI: 10.1038/s41598-021-81169-9

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


