Using affinity propagation combined post-processing to cluster protein sequences.

Abstract

Protein databases are growing rapidly, so clustering protein sequences based only on sequence information is becoming increasingly important. In this paper, we analyze the limitations of the affinity propagation (AP) algorithm when clustering a randomly generated dataset. We then propose a post-processing method to improve the AP algorithm. This method uses the median of the input similarities as the shared preference value, and then applies a post-processing phase that combines merging and reassignment strategies to the results of the AP algorithm. We tested our method extensively and compared its performance with five other methods on several datasets from the COG (Clusters of Orthologous Groups of proteins) database, SCOP, and the G-protein family. The number of clusters obtained for a given set of proteins approximates the correct number of clusters in that set. Moreover, in our experiments, the quality of the clusters as quantified by F-measure was better than that of the other methods (on average, 9% better than BlastClust, 33% better than TribeMCL, 34% better than CLUSS, 59% better than Spectral clustering, and 41% better than AP).
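
The preference choice described in the abstract can be illustrated with a minimal sketch. It assumes that "the median of the input similarities" means the median of the off-diagonal entries of the similarity matrix, and it runs a plain textbook affinity propagation loop on toy 1-D points with negative squared distance as the similarity. The function names, the toy data, the damping value, and the iteration count are all illustrative assumptions, not taken from the paper. The merging and reassignment post-processing is not specified in the abstract, so it is not sketched here.

```python
from statistics import median

def median_preference(S):
    """Median of the off-diagonal similarities (assumed reading of the abstract)."""
    n = len(S)
    return median(S[i][j] for i in range(n) for j in range(n) if i != j)

def affinity_propagation(S, damping=0.9, iterations=500):
    """Plain affinity propagation; S[k][k] must already hold the preference.

    Returns, for each point, the index of the exemplar it selects.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        # Responsibility: how well k suits i compared with i's best rival.
        for i in range(n):
            for k in range(n):
                rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # Availability: accumulated positive support for k as an exemplar.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            support = sum(pos) - pos[k]  # support from all points except k
            for i in range(n):
                if i == k:
                    target = support
                else:
                    target = min(0.0, R[k][k] + support - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * target
    # Each point picks the exemplar that maximizes a(i, k) + r(i, k).
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]

# Toy data (hypothetical): two well-separated groups on a line.
points = [0.0, 1.0, 2.0, 6.0, 7.0, 8.0]
S = [[-(x - y) ** 2 for y in points] for x in points]

# Shared preference: the same median value on every diagonal entry.
pref = median_preference(S)
for i in range(len(S)):
    S[i][i] = pref

labels = affinity_propagation(S)
print("preference:", pref)
print("exemplar chosen by each point:", labels)
```

For protein sequences, the similarity matrix would instead be filled from pairwise sequence comparison scores; the rest of the sketch is unchanged under that assumption.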

Entities: Gene

Substances: Proteins

Year: 2010 PMID: 19594428 DOI: 10.2174/092986610791190255

Source DB: PubMed Journal: Protein Pept Lett ISSN: 0929-8665 Impact factor: 1.890

Cited

1 in total

1. TBC: a clustering algorithm based on prokaryotic taxonomy.

Authors: Jae-Hak Lee; Hana Yi; Yoon-Seong Jeon; Sungho Won; Jongsik Chun
Journal: J Microbiol Date: 2012-04-27 Impact factor: 3.422