
Gene function prediction based on genomic context clustering and discriminative learning: an application to bacteriophages.

Jason Li, Saman K Halgamuge, Christopher I Kells, Sen-Lin Tang.

Abstract

BACKGROUND: Existing methods for whole-genome comparisons require prior knowledge of related species and provide little automation in the function prediction process. Bacteriophage genomes are one example that these methods cannot easily analyze. This work addresses these shortcomings and aims to provide an automated system for predicting gene function.
RESULTS: We have developed a novel system called SynFPS to perform gene function prediction over completed genomes. The prediction system is initialized by clustering a large collection of weakly related genomes into groups based on their resemblance in gene distribution. From each individual group, data are then extracted and used to train a Support Vector Machine that makes gene function predictions. Experiments were conducted with 9 different gene functions over 296 bacteriophage genomes. Cross validation results gave an average prediction accuracy of ~80%, which is comparable to other genomic-context based prediction methods. Functional predictions are also made on 3 uncharacterized genes and 12 genes that cannot be identified by sequence alignment. The software is publicly available at http://www.synteny.net/.
CONCLUSION: The proposed system employs genomic context to predict gene function and detect gene correspondence in whole-genome comparisons. Although our experimental focus is on bacteriophages, the method may be extended to other microbial genomes as they share a number of similar characteristics with phage genomes such as gene order conservation.
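
The two stages described in RESULTS (grouping genomes by gene distribution, then training a classifier per group on genomic context) can be illustrated with a small sketch. This is not the SynFPS implementation: the abstract does not state the similarity measure, the clustering algorithm, the context features, or the SVM settings, so everything here is an assumption made for illustration. Genomes are modelled as ordered lists of function labels, similarity is cosine similarity over label counts, clustering is a greedy single pass, context is the label of the immediate left and right neighbour, and the classifier is a bias-free linear SVM fitted by sub-gradient descent on the hinge loss. All function names and the toy genomes are hypothetical.

```python
from collections import Counter
from math import sqrt

def gene_distribution(genome):
    """Count each known function label in a genome (an ordered list of labels)."""
    return Counter(label for label in genome if label is not None)

def cosine(a, b):
    """Cosine similarity between two label-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster_genomes(genomes, threshold=0.5):
    """Greedy clustering (assumed): a genome joins the first group whose seed it resembles."""
    groups = []  # lists of genome indices; the first index in each list is the seed
    for i, genome in enumerate(genomes):
        for group in groups:
            seed = genomes[group[0]]
            if cosine(gene_distribution(genome), gene_distribution(seed)) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups

def context_features(genome, i, vocab):
    """One-hot encode the labels of the genes immediately left and right of gene i."""
    left = genome[i - 1] if i > 0 else None
    right = genome[i + 1] if i + 1 < len(genome) else None
    return [float(left == v) for v in vocab] + [float(right == v) for v in vocab]

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=50):
    """Linear SVM without a bias term, fitted by sub-gradient descent on the
    L2-regularised hinge loss, visiting samples in a fixed order."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * sum(wi * xi for wi, xi in zip(w, x))
            w = [wi * (1.0 - eta * lam) for wi in w]  # regulariser step
            if margin < 1.0:  # hinge-loss step, only for margin violations
                w = [wi + eta * label * xi for wi, xi in zip(w, x)]
    return w

def train_group_classifier(genomes, group, target):
    """Train one binary classifier ("is this gene `target`?") from one genome group."""
    members = [genomes[i] for i in group]
    vocab = sorted({label for g in members for label in g if label is not None})
    X, y = [], []
    for g in members:
        for i, label in enumerate(g):
            if label is not None:  # only characterised genes are training examples
                X.append(context_features(g, i, vocab))
                y.append(1 if label == target else -1)
    w = train_linear_svm(X, y)

    def predict(genome, i):
        score = sum(wi * xi for wi, xi in zip(w, context_features(genome, i, vocab)))
        return score > 0

    return predict

# Toy data (hypothetical): two tailed-phage-like genomes and two unrelated ones.
genomes = [
    ["terminase", "portal", "capsid", "tail", "lysin"],
    ["terminase", "portal", "capsid", "tail", "tail", "lysin"],
    ["integrase", "repressor", "polymerase"],
    ["integrase", "polymerase", "repressor", "helicase"],
]
groups = cluster_genomes(genomes)
is_portal = train_group_classifier(genomes, groups[0], "portal")
query = ["terminase", None, "capsid", "tail", "lysin"]  # gene 1 is uncharacterised
print(groups, is_portal(query, 1), is_portal(query, 3))
```

Training one classifier per group, rather than one over all genomes, mirrors the abstract's point that the collection is only weakly related: neighbour patterns learned from one group need not hold in another. A real system would likely use a library SVM and richer context features than the two adjacent labels used here.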

Year:  2007        PMID: 17570149      PMCID: PMC1892085          DOI: 10.1186/1471-2105-8-S4-S6

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


  5 in total

1.  Genome classification by gene distribution: an overlapping subspace clustering approach.

Authors:  Jason Li; Saman K Halgamuge; Sen-Lin Tang
Journal:  BMC Evol Biol       Date:  2008-04-23       Impact factor: 3.260

2.  The 2006 automated function prediction meeting.

Authors:  Ana P C Rodrigues; Barry J Grant; Adam Godzik; Iddo Friedberg
Journal:  BMC Bioinformatics       Date:  2007-05-22       Impact factor: 3.169

3.  An Ensemble Method to Distinguish Bacteriophage Virion from Non-Virion Proteins Based on Protein Sequence Characteristics.

Authors:  Lina Zhang; Chengjin Zhang; Rui Gao; Runtao Yang
Journal:  Int J Mol Sci       Date:  2015-09-09       Impact factor: 5.923

4.  Investigating Evolutionary Dynamics of RHA1 Operons.

Authors:  Yong Chen; Dandan Geng; Kristina Ehrhardt; Shaoqiang Zhang
Journal:  Evol Bioinform Online       Date:  2016-06-28       Impact factor: 1.625

5.  Identification of Bacteriophage Virion Proteins Using Multinomial Naïve Bayes with g-Gap Feature Tree.

Authors:  Yanyuan Pan; Hui Gao; Hao Lin; Zhen Liu; Lixia Tang; Songtao Li
Journal:  Int J Mol Sci       Date:  2018-06-15       Impact factor: 5.923
