Expanded microbial genome coverage and improved protein family annotation in the COG database.

Michael Y Galperin, Kira S Makarova, Yuri I Wolf, Eugene V Koonin.

Abstract

Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign tentative functions to them. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on complete microbial genomes, which allowed reliable assignment of orthologs and paralogs for most genes; (ii) an orthology-based approach, which used the function(s) of the characterized member(s) of the protein family (COG) to assign function(s) to the entire set of carefully identified orthologs and to describe the range of potential functions when there was more than one; and (iii) careful manual curation of the annotation of the COGs, aimed at detailed prediction of the biological function(s) for each COG while avoiding annotation errors and overprediction. Here we present an update of the COGs, the first since 2003, with a comprehensive revision of the COG annotations and an expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level. This re-analysis of the COGs shows that the original COG assignments had an error rate below 0.5% and allows an assessment of the progress in functional genomics in the past 12 years. During this time, functions of many previously uncharacterized COGs have been elucidated and tentative functional assignments of many COGs have been validated, either by targeted experiments or through the use of high-throughput methods. A particularly important development is the assignment of functions to several widespread, conserved proteins, many of which turned out to participate in translation, in particular rRNA maturation and tRNA modification. The new version of the COGs is expected to become an important tool for microbial genomics.

Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by US Government employees and is in the public domain in the US.

Year: 2014   PMID: 25428365   PMCID: PMC4383993   DOI: 10.1093/nar/gku1223

Source DB: PubMed   Journal: Nucleic Acids Res   ISSN: 0305-1048   Impact factor: 16.971


