Genome classification by gene distribution: an overlapping subspace clustering approach.

Jason Li, Saman K Halgamuge, Sen-Lin Tang.

Abstract

BACKGROUND: Genomes of lower organisms show extensive horizontal gene transfer, which complicates the study of their evolution. Bacteriophage genomes are a typical example. One recent approach to this problem is the unsupervised clustering of genomes based on gene order and genome position, which helps to reveal species relationships that may not be apparent from traditional phylogenetic methods.
RESULTS: We propose the use of an overlapping subspace clustering algorithm for such genome classification problems. The advantage of subspace clustering over traditional clustering is that it can associate clusters with gene arrangement patterns, preserving genomic information in the clusters produced. Additionally, overlapping capability is desirable for the discovery of multiple conserved patterns within a single genome, such as those acquired from different species via horizontal gene transfer. The proposed method involves a novel strategy to vectorize genomes based on their gene distribution. A number of existing subspace clustering and biclustering algorithms were evaluated to identify the best framework upon which to develop our algorithm; we extended a generic subspace clustering algorithm called HARP to incorporate overlapping capability. The proposed algorithm was assessed on and applied to bacteriophage genomes. The phage grouping results are consistent overall with the Phage Proteomic Tree and show common genomic characteristics among the TP901-like, Sfi21-like and sk1-like phage groups. Among 441 phage genomes, we identified four significantly conserved distribution patterns structured by the terminase, portal, integrase, holin and lysin genes. We also observed a subgroup of Sfi21-like phages with a distinctive divergent genome organization and identified nine new members of the Sfi21-like genus: Staphylococcus 71, phiPVL108, Listeria A118, 2389, Lactobacillus phi AT3, A2, Clostridium phi3626, Geobacillus GBSV1, and Listeria monocytogenes PSA.
CONCLUSION: The method described in this paper can assist evolutionary study by objectively classifying genomes based on their similarity in gene order, gene content and gene positions. The method is suitable for genomes with high genetic exchange and varied conserved gene arrangements, as demonstrated by our application to phages.
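
The abstract does not spell out how genomes are vectorized or how HARP was extended, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each genome is represented by the relative positions of a few marker genes (the five gene functions named above are used as hypothetical dimensions) and treats two genomes as sharing a pattern on whichever subset of dimensions their positions agree within a tolerance. The function names, the tolerance `tol`, and all coordinates are invented for illustration.

```python
import math

# Hypothetical dimensions: the five gene functions are named in the abstract,
# but using them as vector coordinates is an assumption of this sketch.
MARKERS = ["terminase", "portal", "integrase", "holin", "lysin"]

def vectorize(genome_length, gene_starts):
    """Map a genome to relative gene positions in [0, 1]; NaN if a gene is absent."""
    return [
        gene_starts[g] / genome_length if g in gene_starts else math.nan
        for g in MARKERS
    ]

def shared_subspace(u, v, tol=0.05):
    """Return the marker genes whose relative positions agree within tol."""
    return [
        g for g, a, b in zip(MARKERS, u, v)
        if not (math.isnan(a) or math.isnan(b)) and abs(a - b) <= tol
    ]

# Toy genomes with made-up coordinates (not real phage data).
phage_a = vectorize(40000, {"terminase": 2000, "portal": 4000,
                            "holin": 30000, "lysin": 31000})
phage_b = vectorize(50000, {"terminase": 2500, "portal": 5200,
                            "integrase": 20000, "lysin": 45000})

print(shared_subspace(phage_a, phage_b))  # ['terminase', 'portal']
```

Because agreement is checked per dimension, a genome can match one partner on its terminase and portal positions and another partner on its holin and lysin positions, which is the kind of overlapping, subspace-specific membership the abstract motivates.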

Year: 2008   PMID: 18430250   PMCID: PMC2383906   DOI: 10.1186/1471-2148-8-116

Source DB: PubMed   Journal: BMC Evol Biol   ISSN: 1471-2148   Impact factor: 3.260


