Clustering 100,000 protein structure decoys in minutes.

Shuai Cheng Li, Dongbo Bu, Ming Li.

Abstract

Ab initio protein structure prediction methods first generate large sets of candidate structural conformations (called decoys) and then select the most representative decoys through clustering. Classical clustering methods are inefficient because they compute pairwise distances, and they become infeasible when the number of decoys is large. In addition, existing clustering approaches depend on an arbitrarily chosen distance threshold for proteins within a cluster: a small threshold leads to many small clusters, while a large threshold merges several independent clusters into one. In this paper, we propose an efficient clustering method that quickly estimates cluster centroids and efficiently prunes rotation spaces. The number of clusters is detected automatically by information distance criteria. The method is implemented in a freely downloadable package named ONION. Experimental results on benchmark data sets suggest that ONION is 14 times faster than existing tools and that, compared to SPICKER's selections, ONION obtains better selections for 31 targets and worse selections for 19 targets. On an average PC, ONION can cluster 100,000 decoys in around 12 minutes.
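The two problems the abstract attributes to classical clustering (quadratic pairwise cost and sensitivity to the distance threshold) can be illustrated with a minimal sketch. This is not ONION's or SPICKER's algorithm: it is a generic single-linkage threshold clustering, and the names `rmsd`, `threshold_clusters`, and `make_decoy`, the toy coordinates, and the assumption that decoys are already superimposed (no rotation search) are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """Plain coordinate RMSD between two equal-length lists of (x, y, z) points.

    Assumption: the decoys are already superimposed, so no optimal rotation
    is searched here.
    """
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))


def threshold_clusters(decoys, cutoff):
    """Single-linkage clustering: decoys within `cutoff` of each other are merged.

    Every one of the n*(n-1)/2 pairs is compared, which is the quadratic cost
    that makes this approach impractical for very large decoy sets.
    Returns (clusters as lists of decoy indices, number of comparisons).
    """
    parent = list(range(len(decoys)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    comparisons = 0
    for i, j in combinations(range(len(decoys)), 2):
        comparisons += 1
        if rmsd(decoys[i], decoys[j]) <= cutoff:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(decoys)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values()), comparisons


def make_decoy(shift):
    """Toy three-atom 'decoy' translated along x by `shift` (illustrative data)."""
    return [(shift + k, 0.0, 0.0) for k in range(3)]


decoys = [make_decoy(s) for s in (0.0, 0.5, 3.0, 3.5, 10.0)]

for cutoff in (0.1, 1.0, 3.0):
    clusters, comparisons = threshold_clusters(decoys, cutoff)
    print(cutoff, clusters, comparisons)
```

On this toy data the smallest cutoff leaves every decoy in its own cluster, while the largest merges two otherwise separate groups, which is the threshold arbitrariness described above. The comparison count also grows as n(n-1)/2: for 100,000 decoys that is 4,999,950,000 pairwise distances, before any cost of structural superposition is included.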

Year:  2012        PMID: 22025764     DOI: 10.1109/TCBB.2011.142

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  3 in total

1.  Fast algorithm for population-based protein structural model analysis.

Authors:  Jingfen Zhang; Dong Xu
Journal:  Proteomics       Date:  2013-01-03       Impact factor: 3.984

2.  Ranking near-native candidate protein structures via random forest classification.

Authors:  Hongjie Wu; Hongmei Huang; Weizhong Lu; Qiming Fu; Yijie Ding; Jing Qiu; Haiou Li
Journal:  BMC Bioinformatics       Date:  2019-12-24       Impact factor: 3.169

3.  Reducing Ensembles of Protein Tertiary Structures Generated De Novo via Clustering.

Authors:  Ahmed Bin Zaman; Parastoo Kamranfar; Carlotta Domeniconi; Amarda Shehu
Journal:  Molecules       Date:  2020-05-09       Impact factor: 4.411

