A graph-based clustering method for a large set of sequences using a graph partitioning algorithm.

H Kawaji¹, Y Yamaguchi, H Matsuda, A Hashimoto.

Abstract

A graph-based clustering method is proposed to cluster protein sequences into families; it automatically improves on the clusters produced by the conventional single linkage clustering method. Our approach formulates the sequence clustering problem as a kind of graph partitioning problem on a weighted linkage graph, in which vertices correspond to sequences and edges connect pairs of sequences whose similarity is higher than a given threshold, each edge being weighted by that similarity. The effectiveness of our method is shown by comparison with InterPro families for all mouse proteins in SWISS-PROT. The resulting clusters match InterPro families much better than those of the single linkage clustering method: 77% of proteins in InterPro families are classified into appropriate clusters.
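
The weighted linkage graph described in the abstract, and the single linkage baseline it is compared against, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the sequence identifiers, and the similarity scores are all hypothetical, and the graph partitioning step itself is left out because the abstract does not specify which partitioning algorithm is used. Single linkage clustering is taken here to be the connected components of the thresholded graph.

```python
def build_linkage_graph(similarities, threshold):
    """Build a weighted, undirected linkage graph.

    Vertices are sequences; an edge joins two sequences only when their
    similarity is higher than `threshold`, and the edge weight is that
    similarity. (Names and data layout are assumptions for illustration.)
    """
    graph = {}
    for (a, b), score in similarities.items():
        graph.setdefault(a, {})  # keep isolated sequences as vertices
        graph.setdefault(b, {})
        if score > threshold:
            graph[a][b] = score
            graph[b][a] = score
    return graph


def single_linkage_clusters(graph):
    """Return the connected components of the linkage graph.

    This corresponds to single linkage clustering at the chosen threshold.
    """
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            vertex = stack.pop()
            component.add(vertex)
            for neighbour in graph[vertex]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(component)
    return clusters


# Hypothetical pairwise similarity scores between five sequences.
similarities = {
    ("P1", "P2"): 0.90,
    ("P2", "P3"): 0.80,
    ("P3", "P4"): 0.35,  # one weak link between two otherwise separate groups
    ("P4", "P5"): 0.85,
    ("P1", "P5"): 0.10,
}

graph = build_linkage_graph(similarities, threshold=0.3)
clusters = single_linkage_clusters(graph)
```

In this made-up example the single weak edge between P3 and P4 is enough for single linkage to merge all five sequences into one cluster. Splitting such a component along its low-weight edges is the kind of improvement a partitioning step on the weighted graph is meant to provide.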

Substances: Proteins

Year: 2001 PMID: 11791228

Source DB: PubMed Journal: Genome Inform ISSN: 0919-9454

Cited

2 in total

1. Visualizing sequence similarity of protein families.

Authors: Vamsi Veeramachaneni; Wojciech Makałowski
Journal: Genome Res Date: 2004-05-12 Impact factor: 9.043

2. Medical record linkage in health information systems by approximate string matching and clustering.

Authors: Erik A Sauleau; Jean-Philippe Paumier; Antoine Buemi
Journal: BMC Med Inform Decis Mak Date: 2005-10-11 Impact factor: 2.796
