Fast algorithm for population-based protein structural model analysis.

Abstract

De novo protein structure prediction often generates a large population of candidate structures (models) and then selects near-native models through clustering. Existing structural model clustering methods are time-consuming because they calculate pairwise distances between models. In this paper, we present a novel method for fast model clustering without loss of clustering accuracy. Instead of the commonly used pairwise root mean square deviation (RMSD) and TM-score values, we propose two new distance measures, Dscore1 and Dscore2, based on comparison of protein distance matrices, to describe the difference and the similarity among models, respectively. The analysis indicates that the correlation between Dscore1 and RMSD and the correlation between Dscore2 and TM-score are both high. Compared with existing methods, whose calculation time is quadratic in the number of models, our Dscore1-based clustering achieves linear time complexity while obtaining almost the same accuracy for near-native model selection. By using Dscore2 to select cluster representatives, we can further improve the quality of the representatives with little increase in computing time. In addition, for large sets of models (~500 k), we can provide a fast data visualization based on the Dscore distribution in seconds to minutes. Our method has been implemented in a package named MUFOLD-CL, available at http://mufold.org/clustering.php.
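
The abstract does not give the formulas for Dscore1 or Dscore2, so the following is only a minimal sketch of the general idea it describes, not the MUFOLD-CL implementation. It assumes a simple root mean square difference between intra-model distance matrices (a dRMSD-like quantity) as a stand-in for a distance-matrix-based measure, and it assumes that scoring every model against the population's mean distance matrix is one plausible way to avoid pairwise comparison. The function names `distance_matrix`, `matrix_difference`, and `rank_by_mean_matrix` are hypothetical.

```python
import math
from itertools import combinations


def distance_matrix(coords):
    """Flattened upper triangle of residue-residue distances for one model.

    `coords` is a list of (x, y, z) tuples, e.g. one C-alpha atom per residue.
    """
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def matrix_difference(dm_a, dm_b):
    """Root mean square difference between two flattened distance matrices.

    Assumption: this dRMSD-like value stands in for a distance-matrix-based
    measure; it is NOT the paper's Dscore1 or Dscore2.
    """
    if len(dm_a) != len(dm_b):
        raise ValueError("models must have the same number of residues")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(dm_a, dm_b)) / len(dm_a))


def rank_by_mean_matrix(models):
    """Score each model against the mean distance matrix of the population.

    One pass builds the mean and one pass scores the models, so the cost
    grows linearly with the number of models instead of quadratically.
    Returns (model indices sorted from most to least central, raw scores).
    """
    mats = [distance_matrix(m) for m in models]
    n = len(mats)
    mean = [sum(column) / n for column in zip(*mats)]
    scores = [matrix_difference(dm, mean) for dm in mats]
    order = sorted(range(n), key=scores.__getitem__)
    return order, scores


# Toy data: model_b is model_a rotated and translated; model_c is stretched.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
model_b = [(5.0, 5.0, 5.0), (5.0, 6.0, 5.0), (5.0, 7.0, 5.0)]
model_c = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]

order, scores = rank_by_mean_matrix([model_a, model_b, model_c])
```

Because distance matrices are unchanged by rotation and translation, `model_a` and `model_b` compare as identical without any structural superposition, which is what makes a per-model score against a shared reference possible in this sketch.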

Substances: Proteins

Year: 2013 PMID: 23184517 PMCID: PMC3641909 DOI: 10.1002/pmic.201200334

Source DB: PubMed Journal: Proteomics ISSN: 1615-9853 Impact factor: 3.984

24 in total

1. Completeness of NOEs in protein structure: a statistical analysis of NMR.

Authors: J F Doreleijers; M L Raves; T Rullmann; R Kaptein
Journal: J Biomol NMR Date: 1999-06 Impact factor: 2.835

2. MaxSub: an automated measure for the assessment of protein structure prediction quality.

Authors: N Siew; A Elofsson; L Rychlewski; D Fischer
Journal: Bioinformatics Date: 2000-09 Impact factor: 6.937

3. Protein structure prediction and structural genomics.

Authors: D Baker; A Sali
Journal: Science Date: 2001-10-05 Impact factor: 47.728

4. Protein structure similarities. (Review)

Authors: P Koehl
Journal: Curr Opin Struct Biol Date: 2001-06 Impact factor: 6.809

5. A new family of global protein shape descriptors.

Authors: Peter Røgen; Henrik Bohr
Journal: Math Biosci Date: 2003-04 Impact factor: 2.144

6. Fully automated ab initio protein structure prediction using I-SITES, HMMSTR and ROSETTA.

Authors: Christopher Bystroff; Yu Shao
Journal: Bioinformatics Date: 2002 Impact factor: 6.937

7. LGA: A method for finding 3D similarities in protein structures.

Authors: Adam Zemla
Journal: Nucleic Acids Res Date: 2003-07-01 Impact factor: 16.971

8. SPICKER: a clustering approach to identify near-native protein folds.

Authors: Yang Zhang; Jeffrey Skolnick
Journal: J Comput Chem Date: 2004-04-30 Impact factor: 3.376

9. Scoring function for automated assessment of protein structure template quality.

Authors: Yang Zhang; Jeffrey Skolnick
Journal: Proteins Date: 2004-12-01

10. SCUD: fast structure clustering of decoys using reference state to remove overall rotation.

Authors: Hongzhi Li; Yaoqi Zhou
Journal: J Comput Chem Date: 2005-08 Impact factor: 3.376

9 in total

1. Massive integration of diverse protein quality assessment methods to improve template based modeling in CASP11.

Authors: Renzhi Cao; Debswapna Bhattacharya; Badri Adhikari; Jilong Li; Jianlin Cheng
Journal: Proteins Date: 2015-09-29

2. Large-scale model quality assessment for improving protein tertiary structure prediction.

Authors: Renzhi Cao; Debswapna Bhattacharya; Badri Adhikari; Jilong Li; Jianlin Cheng
Journal: Bioinformatics Date: 2015-06-15 Impact factor: 6.937

3. Identify High-Quality Protein Structural Models by Enhanced K-Means.

Authors: Hongjie Wu; Haiou Li; Min Jiang; Cheng Chen; Qiang Lv; Chuang Wu
Journal: Biomed Res Int Date: 2017-03-22 Impact factor: 3.411

4. Unsupervised and Supervised Learning over the Energy Landscape for Protein Decoy Selection.

Authors: Nasrin Akhter; Gopinath Chennupati; Kazi Lutful Kabir; Hristo Djidjev; Amarda Shehu
Journal: Biomolecules Date: 2019-10-14

5. Ranking near-native candidate protein structures via random forest classification.

Authors: Hongjie Wu; Hongmei Huang; Weizhong Lu; Qiming Fu; Yijie Ding; Jing Qiu; Haiou Li
Journal: BMC Bioinformatics Date: 2019-12-24 Impact factor: 3.169

6. Decoy selection for protein structure prediction via extreme gradient boosting and ranking.

Authors: Nasrin Akhter; Gopinath Chennupati; Hristo Djidjev; Amarda Shehu
Journal: BMC Bioinformatics Date: 2020-12-09 Impact factor: 3.169

7. Estimation of model accuracy by a unique set of features and tree-based regressor.

Authors: Mor Bitton; Chen Keasar
Journal: Sci Rep Date: 2022-08-18 Impact factor: 4.996

8. UniCon3D: de novo protein structure prediction using united-residue conformational search via stepwise, probabilistic sampling.

Authors: Debswapna Bhattacharya; Renzhi Cao; Jianlin Cheng
Journal: Bioinformatics Date: 2016-06-03 Impact factor: 6.937

9. QDeep: distance-based protein model quality estimation by residue-level ensemble error classifications using stacked deep residual neural networks.

Authors: Md Hossain Shuvo; Sutanu Bhattacharya; Debswapna Bhattacharya
Journal: Bioinformatics Date: 2020-07-01 Impact factor: 6.937
