
Ortholog clustering on a multipartite graph.

Akshay Vashist, Casimir A Kulikowski, Ilya Muchnik.

Abstract

We present a method for automatically extracting groups of orthologous genes from a large set of genomes by a new clustering algorithm on a weighted multipartite graph. The method assigns a score to an arbitrary subset of genes from multiple genomes to assess the orthologous relationships between genes in the subset. This score is computed using sequence similarities between the member genes and the phylogenetic relationship between the corresponding genomes. An ortholog cluster is found as the subset with the highest score, so ortholog clustering is formulated as a combinatorial optimization problem. The algorithm for finding an ortholog cluster runs in time O(|E| + |V| log |V|), where V and E are the sets of vertices and edges, respectively, in the graph. However, if we discretize the similarity scores into a constant number of bins, the runtime improves to O(|E| + |V|). The proposed method was applied to seven complete eukaryote genomes on which the manually curated database of eukaryotic ortholog clusters, KOG, is constructed. A comparison of our results with the manually curated ortholog clusters shows that our clusters are well correlated with the existing clusters.
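The abstract does not define the subset score, so the following is only an illustrative sketch and not the authors' published procedure. It assumes a hypothetical score: the minimum, over genes in the subset, of the summed similarity to subset members from other genomes. The phylogenetic weighting mentioned in the abstract is left out. Under that assumption, and with non-negative similarities, the highest-scoring subset can be found by repeatedly removing the weakest gene and remembering the best intermediate subset. The function name and input layout are assumptions made for this example.

```python
import heapq
from collections import defaultdict


def best_min_linkage_subset(genome_of, edges):
    """Greedy peeling on a weighted multipartite graph (illustrative only).

    genome_of: dict mapping gene -> genome label (the part it belongs to).
    edges: iterable of (gene_u, gene_v, similarity), similarity >= 0.
    Returns (subset, score), where score is the ASSUMED objective:
    the minimum summed cross-genome similarity of any gene in the subset.
    """
    adj = defaultdict(dict)
    for u, v, w in edges:
        if genome_of[u] == genome_of[v]:
            continue  # multipartite: ignore pairs within one genome
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w

    # Linkage of each gene to the current subset (initially all genes).
    linkage = {g: sum(adj[g].values()) for g in genome_of}
    heap = [(value, g) for g, value in linkage.items()]
    heapq.heapify(heap)

    alive = set(genome_of)
    removal_order = []
    best_score, best_index = float("-inf"), 0

    while heap:
        value, g = heapq.heappop(heap)
        if g not in alive or value != linkage[g]:
            continue  # stale heap entry left over from an earlier update
        # value is the score of the subset that is still alive right now.
        if value > best_score:
            best_score, best_index = value, len(removal_order)
        alive.remove(g)
        removal_order.append(g)
        for neighbor, w in adj[g].items():
            if neighbor in alive:
                linkage[neighbor] -= w
                heapq.heappush(heap, (linkage[neighbor], neighbor))

    # Genes removed at or after the best step form the best subset.
    return set(removal_order[best_index:]), best_score


genome_of = {"a1": "A", "b1": "B", "c1": "C", "a2": "A", "b2": "B"}
edges = [
    ("a1", "b1", 5.0), ("a1", "c1", 4.0), ("b1", "c1", 6.0),
    ("a2", "b2", 1.0), ("a2", "b1", 0.5),
]
cluster, score = best_min_linkage_subset(genome_of, edges)
print(sorted(cluster), score)
```

This version uses a binary heap with lazy deletion, so it runs in O((|E| + |V|) log |V|) rather than the O(|E| + |V| log |V|) bound quoted above, which would need a priority queue with constant-time decrease-key. If similarities are first rounded to a constant number of bins, as the abstract suggests, the priority queue can in principle be replaced by an array of buckets.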


Year:  2007        PMID: 17277410     DOI: 10.1109/TCBB.2007.1004

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  3 in total

1.  Computational methods for Gene Orthology inference.

Authors:  David M Kristensen; Yuri I Wolf; Arcady R Mushegian; Eugene V Koonin
Journal:  Brief Bioinform       Date:  2011-06-19       Impact factor: 11.622

2.  A low-polynomial algorithm for assembling clusters of orthologous groups from intergenomic symmetric best matches.

Authors:  David M Kristensen; Lavanya Kannan; Michael K Coleman; Yuri I Wolf; Alexander Sorokin; Eugene V Koonin; Arcady Mushegian
Journal:  Bioinformatics       Date:  2010-05-02       Impact factor: 6.937

3.  MultiMSOAR 2.0: an accurate tool to identify ortholog groups among multiple genomes.

Authors:  Guanqun Shi; Meng-Chih Peng; Tao Jiang
Journal:  PLoS One       Date:  2011-06-21       Impact factor: 3.240

