Balrog: A universal protein model for prokaryotic gene prediction.

Markus J Sommer, Steven L Salzberg.

Abstract

Low-cost, high-throughput sequencing has led to an enormous increase in the number of sequenced microbial genomes, with well over 100,000 genomes in public archives today. Automatic genome annotation tools are integral to understanding these organisms, yet older gene finding methods must be retrained on each new genome. We have developed a universal model of prokaryotic genes by fitting a temporal convolutional network to amino-acid sequences from a large, diverse set of microbial genomes. We incorporated the new model into a gene finding system, Balrog (Bacterial Annotation by Learned Representation Of Genes), which does not require genome-specific training and which matches or outperforms other state-of-the-art gene finding tools. Balrog is freely available under the MIT license at https://github.com/salzberg-lab/Balrog.

Year:  2021        PMID: 33635857      PMCID: PMC7946324          DOI: 10.1371/journal.pcbi.1008727

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


  6 in total

1.  RiboReport - benchmarking tools for ribosome profiling-based identification of open reading frames in bacteria.

Authors:  Rick Gelhausen; Teresa Müller; Sarah L Svensson; Omer S Alkhnbashi; Cynthia M Sharma; Florian Eggenhofer; Rolf Backofen
Journal:  Brief Bioinform       Date:  2022-03-10       Impact factor: 11.622

2.  Unifying the known and unknown microbial coding sequence space.

Authors:  Chiara Vanni; Matthew S Schechter; Silvia G Acinas; Albert Barberán; Pier Luigi Buttigieg; Emilio O Casamayor; Tom O Delmont; Carlos M Duarte; A Murat Eren; Robert D Finn; Renzo Kottmann; Alex Mitchell; Pablo Sánchez; Kimmo Siren; Martin Steinegger; Frank Oliver Gloeckner; Antonio Fernàndez-Guerra
Journal:  Elife       Date:  2022-03-31       Impact factor: 8.713

Review 3.  A review of computational tools for generating metagenome-assembled genomes from metagenomic sequencing data.

Authors:  Chao Yang; Debajyoti Chowdhury; Zhenmiao Zhang; William K Cheung; Aiping Lu; Zhaoxiang Bian; Lu Zhang
Journal:  Comput Struct Biotechnol J       Date:  2021-11-23       Impact factor: 7.271

4.  Evaluating Plant Gene Models Using Machine Learning.

Authors:  Shriprabha R Upadhyaya; Philipp E Bayer; Cassandria G Tay Fernandez; Jakob Petereit; Jacqueline Batley; Mohammed Bennamoun; Farid Boussaid; David Edwards
Journal:  Plants (Basel)       Date:  2022-06-20

Review 5.  Metagenomics: a path to understanding the gut microbiome.

Authors:  Sandi Yen; Jethro S Johnson
Journal:  Mamm Genome       Date:  2021-07-14       Impact factor: 2.957

6.  No one tool to rule them all: Prokaryotic gene prediction tool annotations are highly dependent on the organism of study.

Authors:  Nicholas J Dimonaco; Wayne Aubrey; Kim Kenobi; Amanda Clare; Christopher J Creevey
Journal:  Bioinformatics       Date:  2021-12-07       Impact factor: 6.937

