Identifying gene clusters within localized regions in multiple genomes.

Qingwu Yang, Gangman Yi, Fenghui Zhang, Michael R Thon, Sing-Hoi Sze.

Abstract

An important strategy for studying genome evolution is to investigate the clustering of orthologous genes among multiple genomes. The most popular approaches require that the distance between adjacent genes in a cluster be small. We investigate a different formulation based on constraining the overall size of a cluster, and we develop statistical significance estimates that allow direct comparison of clusters of different sizes. We first consider a restricted version, which requires that orthologous genes be strictly ordered within each cluster, and show that it can be solved in polynomial time. We then develop practical exact algorithms for the unrestricted problem, which allows paralogous genes within a genome and clusters that may not appear in every genome, under a general model in which a gene is allowed to appear in more than one orthologous group. We show that our algorithm can identify biologically relevant gene clusters on four bacterial genomes: Bacillus subtilis, Streptococcus pyogenes, Streptococcus pneumoniae, and Clostridium acetobutylicum. We also show that our algorithm can identify significantly more functionally enriched gene clusters than previous algorithms on four yeast genomes: Saccharomyces cerevisiae, Saccharomyces paradoxus, Saccharomyces mikatae, and Saccharomyces bayanus. A software program (GCFinder) and a list of gene clusters found on the bacterial and the yeast genomes are available at http://faculty.cse.tamu.edu/shsze/gcfinder.
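The size-constrained formulation can be illustrated with a minimal sketch. This is not the GCFinder algorithm and omits the statistical significance estimates. It assumes that each genome is an ordered list of genes, that each gene is represented by the set of orthologous groups it belongs to, and that cluster size is measured in number of genes. The names `find_windows`, `genome_a`, and `groups` are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, List, Sequence, Set, Tuple

# Assumption: a genome is an ordered sequence of genes, and each gene is
# given as the set of orthologous group ids it belongs to (possibly empty,
# possibly more than one group).
Genome = Sequence[FrozenSet[str]]


def find_windows(genome: Genome, groups: Set[str], max_size: int) -> List[Tuple[int, int]]:
    """Return (start, end) gene indices, end inclusive, of windows spanning
    at most max_size genes that contain at least one gene from every group.

    For each start index only the shortest qualifying window is reported.
    This is a brute-force scan, not an optimized or exact clustering method.
    """
    hits = []
    for start in range(len(genome)):
        seen: Set[str] = set()
        for end in range(start, min(start + max_size, len(genome))):
            seen |= genome[end] & groups  # groups of interest found so far
            if seen == groups:
                hits.append((start, end))
                break
    return hits


# Toy genome: five genes; the third gene belongs to two orthologous groups.
genome_a = [
    frozenset({"g1"}),
    frozenset(),
    frozenset({"g2", "g3"}),
    frozenset({"g4"}),
    frozenset({"g1"}),
]
groups = {"g1", "g2", "g3"}
print(find_windows(genome_a, groups, max_size=3))
```

Because the constraint is on the total window size rather than on the gap between adjacent genes, a cluster with one large internal gap can still qualify as long as the whole window stays within `max_size`.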

Year: 2010   PMID: 20500020   DOI: 10.1089/cmb.2009.0116

Source DB: PubMed   Journal: J Comput Biol   ISSN: 1066-5277   Impact factor: 1.479


  1 in total

1.  Identification of conserved gene clusters in multiple genomes based on synteny and homology.

Authors: Anasua Sarkar; Hayssam Soueidan; Macha Nikolski
Journal: BMC Bioinformatics   Date: 2011-10-05   Impact factor: 3.169
