Quality assessment of modeled protein structure using physicochemical properties.

Prashant Singh Rana, Harish Sharma, Mahua Bhattacharya, Anupam Shukla.

Abstract

Physicochemical properties of proteins guide the assessment of protein structure quality and have therefore been used extensively to distinguish native or native-like structures from other predicted structures. In this work, we explore nine machine learning methods with six physicochemical properties to predict the Root Mean Square Deviation (RMSD), Template Modeling score (TM-score), and Global Distance Test score (GDT_TS-score) of a modeled protein structure in the absence of its true native state. The six physicochemical properties are total surface area, Euclidean distance (ED), total empirical energy, secondary structure penalty (SS), sequence length (SL), and pair number (PN). The data set comprises 95,091 modeled structures of 4,896 native targets. A real-coded Self-adaptive Differential Evolution algorithm (SaDE) is used to determine feature importance. K-fold cross-validation is used to measure the robustness of the best predictive method. Extensive experiments show that the Random Forest method outperforms the other machine learning methods. This approach makes quality prediction faster and less expensive. On the testing data set, the predictions of RMSD, TM-score, and GDT_TS-score achieve Root Mean Square Error (RMSE) values of 1.20, 0.06, and 0.06; correlation scores of 0.96, 0.92, and 0.91; R² values of 0.92, 0.85, and 0.84; and accuracies of 78.82%, 86.56%, and 87.37% (each with ± 0.1 err), respectively. The data set used in the study is available as a supplement at http://bit.ly/RF-PCP-DataSets.
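The four evaluation measures named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The function names and sample scores are hypothetical. The Pearson coefficient is assumed for "correlation", and "accuracy (with ± 0.1 err)" is assumed to mean the percentage of predictions that fall within ± 0.1 of the true value. The abstract does not define either precisely.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error between true and predicted scores."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of 'correlation')."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def accuracy_within(y_true, y_pred, tol=0.1):
    """Percentage of predictions within +/- tol of the true value.

    Assumption: this is one plausible reading of 'accuracy (with +/- 0.1 err)'.
    """
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return 100.0 * hits / len(y_true)


# Hypothetical TM-scores for four modeled structures (illustrative only).
true_tm = [0.50, 0.60, 0.70, 0.80]
pred_tm = [0.55, 0.58, 0.85, 0.80]

print("RMSE:", rmse(true_tm, pred_tm))
print("Correlation:", pearson_r(true_tm, pred_tm))
print("R^2:", r_squared(true_tm, pred_tm))
print("Accuracy within 0.1 (%):", accuracy_within(true_tm, pred_tm))
```

Under this reading, a prediction counts as correct only if its absolute error is at most the tolerance, so the accuracy figure depends directly on the chosen tolerance and on the scale of the score being predicted.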

Keywords:  Physicochemical properties of protein; SaDE; feature importance; machine learning; protein structure prediction; random forest

Year: 2014    PMID: 25524475    DOI: 10.1142/S0219720015500055

Source DB: PubMed    Journal: J Bioinform Comput Biol    ISSN: 0219-7200    Impact factor: 1.122


  1 in total

1.  Activity assessment of small drug molecules in estrogen receptor using multilevel prediction model.

Authors:  Vishan Kumar Gupta; Prashant Singh Rana
Journal:  IET Syst Biol       Date:  2019-06       Impact factor: 1.615
