Predicting protein solubility by the general form of Chou's pseudo amino acid composition: approached from chaos game representation and fractal dimension.

Xiao-Hui Niu, Xue-Hai Hu, Feng Shi, Jing-Bo Xia.

Abstract

Obtaining soluble proteins in sufficient concentrations is a major obstacle in many experimental studies. Predicting whether targets in large-scale proteomics projects will be soluble is an important scientific problem that has not yet been satisfactorily resolved. Chaos game representation (CGR) can expose patterns hidden in protein sequences and can visually reveal previously unknown structure. Fractal dimensions are good tools for measuring the size of complex, highly irregular geometric objects. In this paper, we convert each protein sequence into a high-dimensional vector using the CGR algorithm and fractal dimension, and then predict protein solubility from these fractal features together with Chou's pseudo amino acid composition features, using a support vector machine (SVM). We extract and study six groups of features computed directly from the primary sequence, and evaluate each group by 10-fold cross-validation. In the comparison, the 445-dimensional feature group gives the best results, with an average accuracy of 0.8741 and an average Matthews correlation coefficient (MCC) of 0.7358. The resulting predictor is also compared with existing methods and shows significant improvement.
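
To make the CGR and fractal-dimension steps concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not say how residues are mapped to the plane or which fractal dimension is used, so the sketch assumes a regular 20-gon with one vertex per standard amino acid, a contraction ratio of 0.5, and a box-counting estimate; the names `cgr_points` and `box_counting_dimension` are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: each standard residue gets one vertex of a regular 20-gon
# inscribed in the unit circle. The paper may use a different mapping.
VERTICES = {
    aa: (math.cos(2 * math.pi * i / 20), math.sin(2 * math.pi * i / 20))
    for i, aa in enumerate(AMINO_ACIDS)
}


def cgr_points(sequence, ratio=0.5):
    """Return the chaos-game trajectory of a protein sequence.

    Starting at the origin, each residue moves the current point a
    fraction `ratio` of the way toward that residue's vertex.
    """
    x, y = 0.0, 0.0
    points = []
    for aa in sequence.upper():
        if aa not in VERTICES:
            continue  # skip non-standard residue codes such as X or B
        vx, vy = VERTICES[aa]
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points


def box_counting_dimension(points, grid_sizes=(2, 4, 8, 16, 32)):
    """Estimate the box-counting dimension of points in [-1, 1] x [-1, 1].

    For each k, the square is split into k x k boxes and the occupied
    boxes N(k) are counted. The estimate is the least-squares slope of
    log N(k) against log k.
    """
    if not points:
        raise ValueError("at least one point is required")
    log_k, log_n = [], []
    for k in grid_sizes:
        occupied = set()
        for px, py in points:
            col = min(int((px + 1.0) / 2.0 * k), k - 1)
            row = min(int((py + 1.0) / 2.0 * k), k - 1)
            occupied.add((col, row))
        log_k.append(math.log(k))
        log_n.append(math.log(len(occupied)))
    mean_k = sum(log_k) / len(log_k)
    mean_n = sum(log_n) / len(log_n)
    num = sum((a - mean_k) * (b - mean_n) for a, b in zip(log_k, log_n))
    den = sum((a - mean_k) ** 2 for a in log_k)
    return num / den
```

Calling `box_counting_dimension(cgr_points(seq))` yields one scalar per sequence. In a pipeline like the one the abstract describes, such values would be concatenated with pseudo amino acid composition features before training an SVM; how the 445 dimensions are composed is not stated in the abstract.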

Year:  2012        PMID: 22486614     DOI: 10.2174/092986612802084492

Source DB:  PubMed          Journal:  Protein Pept Lett        ISSN: 0929-8665            Impact factor:   1.890


  6 in total

1.  Some illuminating remarks on molecular genetics and genomics as well as drug development. (Review)

Authors:  Kuo-Chen Chou
Journal:  Mol Genet Genomics       Date:  2020-01-01       Impact factor: 3.291

2.  A multilabel model based on Chou's pseudo-amino acid composition for identifying membrane proteins with both single and multiple functional types.

Authors:  Chao Huang; Jing-Qi Yuan
Journal:  J Membr Biol       Date:  2013-04-02       Impact factor: 1.843

3.  FEPS: A Tool for Feature Extraction from Protein Sequence.

Authors:  Hamid Ismail; Clarence White; Hussam Al-Barakati; Robert H Newman; Dukka B Kc
Journal:  Methods Mol Biol       Date:  2022

4.  iSNO-PseAAC: predict cysteine S-nitrosylation sites in proteins by incorporating position specific amino acid propensity into pseudo amino acid composition.

Authors:  Yan Xu; Jun Ding; Ling-Yun Wu; Kuo-Chen Chou
Journal:  PLoS One       Date:  2013-02-07       Impact factor: 3.240

5.  PseAAC-General: fast building various modes of general form of Chou's pseudo-amino acid composition for large-scale protein datasets.

Authors:  Pufeng Du; Shuwang Gu; Yasen Jiao
Journal:  Int J Mol Sci       Date:  2014-02-26       Impact factor: 5.923

6.  Understanding the underlying mechanism of HA-subtyping in the level of physic-chemical characteristics of protein.

Authors:  Mansour Ebrahimi; Parisa Aghagolzadeh; Narges Shamabadi; Ahmad Tahmasebi; Mohammed Alsharifi; David L Adelson; Farhid Hemmatzadeh; Esmaeil Ebrahimie
Journal:  PLoS One       Date:  2014-05-08       Impact factor: 3.240

