Optimal subset selection of primary sequence features using the genetic algorithm for thermophilic proteins identification.

LiQiang Wang, CuiFeng Li.

Abstract

A genetic algorithm (GA) coupled with multiple linear regression (MLR) was used to extract useful features from amino acids and g-gap dipeptides for distinguishing between thermophilic and non-thermophilic proteins. The method was trained on a benchmark dataset of 915 thermophilic and 793 non-thermophilic proteins. Using nine amino acids, 38 0-gap dipeptides and 29 1-gap dipeptides, it reached an overall accuracy of 95.4% in a jackknife test. The accuracy as a function of protein size ranged from 85.8% to 96.9%. The overall accuracies of three independent tests were 93%, 93.4% and 91.8%. These results suggest that the GA-MLR approach described herein should be a powerful method for selecting features that describe thermostable machines and an aid in the design of more stable proteins.
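The g-gap dipeptide features mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the common definition in which a g-gap dipeptide is a pair of residues separated by g intervening positions, and it normalizes counts to frequencies. The abstract does not state the exact normalization, so the function name `g_gap_dipeptide_composition` and the handling of non-standard residues are assumptions for illustration, not the authors' implementation.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence, g):
    """Return the frequency of each of the 400 g-gap dipeptides.

    A g-gap dipeptide pairs residue i with residue i + g + 1, so g = 0
    gives adjacent dipeptides and g = 1 skips one residue in between.
    Assumption: pairs containing non-standard residues are ignored and
    frequencies are normalized by the number of counted pairs.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    step = g + 1
    total = 0
    for i in range(len(sequence) - step):
        pair = sequence[i] + sequence[i + step]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short (or no standard residues): all-zero vector.
        return {p: 0.0 for p in pairs}
    return {p: c / total for p, c in counts.items()}


if __name__ == "__main__":
    toy = "ACAC"  # toy sequence, not taken from the benchmark dataset
    zero_gap = g_gap_dipeptide_composition(toy, 0)
    one_gap = g_gap_dipeptide_composition(toy, 1)
    print(zero_gap["AC"], zero_gap["CA"])
    print(one_gap["AA"], one_gap["CC"])
```

Under these assumptions, each protein yields 400 features per gap value. A feature selector such as the GA-MLR procedure described above would then search for a small informative subset of them, for example the 38 0-gap and 29 1-gap dipeptides reported in the abstract.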

Year:  2014        PMID: 24930111     DOI: 10.1007/s10529-014-1577-3

Source DB:  PubMed          Journal:  Biotechnol Lett        ISSN: 0141-5492            Impact factor:   2.461


  4 in total

1.  Empirical comparison and analysis of machine learning-based predictors for predicting and analyzing of thermophilic proteins. (Review)

Authors:  Phasit Charoenkwan; Nalini Schaduangrat; Md Mehedi Hasan; Mohammad Ali Moni; Pietro Lió; Watshara Shoombuatong
Journal:  EXCLI J       Date:  2022-03-02       Impact factor: 4.022

2.  The optimization of Marasmius androsaceus submerged fermentation conditions in five-liter fermentor.

Authors:  Fanxin Meng; Gaoyang Xing; Yutong Li; Jia Song; Yanzhen Wang; Qingfan Meng; Jiahui Lu; Yulin Zhou; Yan Liu; Di Wang; Lirong Teng
Journal:  Saudi J Biol Sci       Date:  2015-06-27       Impact factor: 4.219

3.  A novel sequence-based predictor for identifying and characterizing thermophilic proteins using estimated propensity scores of dipeptides.

Authors:  Phasit Charoenkwan; Warot Chotpatiwetchkul; Vannajan Sanghiran Lee; Chanin Nantasenamat; Watshara Shoombuatong
Journal:  Sci Rep       Date:  2021-12-10       Impact factor: 4.379

4.  Discrimination of Thermophilic Proteins and Non-thermophilic Proteins Using Feature Dimension Reduction.

Authors:  Zifan Guo; Pingping Wang; Zhendong Liu; Yuming Zhao
Journal:  Front Bioeng Biotechnol       Date:  2020-10-22