Genome-wide enzyme annotation with precision control: catalytic families (CatFam) databases.

Chenggang Yu, Nela Zavaljevski, Valmik Desai, Jaques Reifman.

Abstract

In this article, we present a new method termed CatFam (Catalytic Families) to automatically infer the functions of catalytic proteins, which account for 20-40% of all proteins in living organisms and play a critical role in a variety of biological processes. CatFam is a sequence-based method that generates sequence profiles to represent and infer protein catalytic functions. CatFam generates profiles through a stepwise procedure that carefully controls profile quality and employs nonenzymes as negative samples to establish profile-specific thresholds associated with a predefined nominal false-positive rate (FPR) of predictions. The adjustable FPR allows for fine precision control of each profile and enables the generation of profile databases that meet different needs: function annotation with high precision and hypothesis generation with moderate precision but better recall. Multiple tests of CatFam databases (generated with distinct nominal FPRs) against enzyme and nonenzyme datasets show that the method's predictions have consistently high precision and recall. For example, a 1% FPR database predicts protein catalytic functions for a dataset of enzymes and nonenzymes with 98.6% precision and 95.0% recall. Comparisons of CatFam databases against other established profile-based methods for the functional annotation of 13 bacterial genomes indicate that CatFam consistently achieves higher precision and (in most cases) higher recall, and that (on average) CatFam provides 21.9% additional catalytic functions not inferred by the other similarly reliable methods. These results strongly suggest that the proposed method provides a valuable contribution to the automated prediction of protein catalytic functions. The CatFam databases and the database search program are freely available at http://www.bhsai.org/downloads/catfam.tar.gz. Copyright 2008 Wiley-Liss, Inc.
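The abstract does not spell out how a profile-specific threshold is derived from a nominal FPR, so the following is only a minimal sketch of one common way to do it, not the CatFam procedure itself. It assumes each profile assigns a numeric score to every sequence, that a sequence is called positive when its score exceeds the threshold, and that the nonenzyme (negative) scores are available as a list. The function names and all score values are hypothetical.

```python
import math


def threshold_for_fpr(negative_scores, nominal_fpr):
    """Pick a cutoff so that at most nominal_fpr of the negatives score above it.

    Assumption: a sequence is predicted positive when score > threshold.
    """
    if not negative_scores:
        raise ValueError("need at least one negative score")
    if not 0.0 <= nominal_fpr < 1.0:
        raise ValueError("nominal_fpr must be in [0, 1)")
    ranked = sorted(negative_scores, reverse=True)
    # Largest number of false positives tolerated at this nominal rate.
    allowed = math.floor(nominal_fpr * len(ranked))
    # At most `allowed` negatives can score strictly above ranked[allowed].
    return ranked[allowed]


def precision_recall(positive_scores, negative_scores, threshold):
    """Compute precision and recall for the rule score > threshold."""
    tp = sum(1 for s in positive_scores if s > threshold)
    fp = sum(1 for s in negative_scores if s > threshold)
    fn = len(positive_scores) - tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up scores for a single hypothetical profile.
nonenzyme_scores = [1, 2, 3, 4, 5, 6, 7, 8]
enzyme_scores = [5, 7, 9, 10]

cutoff = threshold_for_fpr(nonenzyme_scores, nominal_fpr=0.25)
precision, recall = precision_recall(enzyme_scores, nonenzyme_scores, cutoff)
```

Under these assumptions, lowering the nominal FPR raises the cutoff, which trades recall for precision. This mirrors the distinction the abstract draws between a high-precision database for annotation and a moderate-precision, higher-recall database for hypothesis generation.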

Year:  2009        PMID: 18636476     DOI: 10.1002/prot.22167

Source DB:  PubMed          Journal:  Proteins        ISSN: 0887-3585


Cited by (34 in total):

1.  Draft genome sequence of Bacteroides faecis MAJ27T, a strain isolated from human feces.

Authors:  Min-Soo Kim; Tae Woong Whon; Seong Woon Roh; Na-Ri Shin; Jin-Woo Bae
Journal:  J Bacteriol       Date:  2011-12       Impact factor: 3.490

2.  Draft genome sequence of Dietzia alimentaria 72T, belonging to the family Dietziaceae, isolated from a traditional Korean food.

Authors:  Jandi Kim; Seong Woon Roh; Jin-Woo Bae
Journal:  J Bacteriol       Date:  2011-12       Impact factor: 3.490

3.  Genome sequence of Brachybacterium squillarum M-6-3(T), isolated from salt-fermented seafood.

Authors:  Seong-Kyu Park; Seong Woon Roh; Tae Woong Whon; Jin-Woo Bae
Journal:  J Bacteriol       Date:  2011-11       Impact factor: 3.490

4.  Genome sequence of Lentibacillus jeotgali Grbi(T), isolated from traditional Korean salt-fermented seafood.

Authors:  Mi-Ja Jung; Seong Woon Roh; Min-Soo Kim; Tae Woong Whon; Jin-Woo Bae
Journal:  J Bacteriol       Date:  2011-11       Impact factor: 3.490

5.  Prediction and experimental validation of enzyme substrate specificity in protein structures.

Authors:  Shivas R Amin; Serkan Erdin; R Matthew Ward; Rhonald C Lua; Olivier Lichtarge
Journal:  Proc Natl Acad Sci U S A       Date:  2013-10-21       Impact factor: 11.205

6.  Genome sequence analysis of potential probiotic strain Leuconostoc lactis EFEL005 isolated from kimchi.

Authors:  Jin Seok Moon; Hye Sun Choi; So Yeon Shin; Sol Ji Noh; Che Ok Jeon; Nam Soo Han
Journal:  J Microbiol       Date:  2015-05-03       Impact factor: 3.422

7.  Deep learning enables high-quality and high-throughput prediction of enzyme commission numbers.

Authors:  Jae Yong Ryu; Hyun Uk Kim; Sang Yup Lee
Journal:  Proc Natl Acad Sci U S A       Date:  2019-06-20       Impact factor: 11.205

8.  Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants.

Authors:  Pascal Schläpfer; Peifen Zhang; Chuan Wang; Taehyong Kim; Michael Banf; Lee Chae; Kate Dreher; Arvind K Chavali; Ricardo Nilo-Poyanco; Thomas Bernard; Daniel Kahn; Seung Y Rhee
Journal:  Plant Physiol       Date:  2017-02-22       Impact factor: 8.340

9.  EFICAz2.5: application of a high-precision enzyme function predictor to 396 proteomes.

Authors:  Narendra Kumar; Jeffrey Skolnick
Journal:  Bioinformatics       Date:  2012-08-24       Impact factor: 6.937

10.  PSPP: a protein structure prediction pipeline for computing clusters.

Authors:  Michael S Lee; Rajkumar Bondugula; Valmik Desai; Nela Zavaljevski; In-Chul Yeh; Anders Wallqvist; Jaques Reifman
Journal:  PLoS One       Date:  2009-07-16       Impact factor: 3.240
