Chi-Wei Chen, Kai-Po Chang, Cheng-Wei Ho, Hsung-Pin Chang, Yen-Wei Chu.
Abstract
Thermostability is a protein property that impacts many types of studies, including protein activity enhancement, protein structure determination, and drug development. However, most computational tools designed to predict protein thermostability require tertiary structure data as input. The few tools that are dependent only on the primary structure of a protein to predict its thermostability have one or more of the following problems: a slow execution speed, an inability to make large-scale mutation predictions, and the absence of temperature and pH as input parameters. Therefore, we developed a computational tool, named KStable, that is sequence-based, computationally rapid, and includes temperature and pH values to predict changes in the thermostability of a protein upon the introduction of a mutation at a single site. KStable was trained using basis features and minimal redundancy-maximal relevance (mRMR) features, and 58 classifiers were subsequently tested. To find the representative features, a regular-mRMR method was developed. When KStable was evaluated with an independent test set, it achieved an accuracy of 0.708.
Keywords: feature selection; hill-climbing algorithm; machine learning; protein thermostability; single-site mutations
Year: 2018 PMID: 33266711 PMCID: PMC7512587 DOI: 10.3390/e20120988
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Workflow schematic for the construction of KStable. mRMR, minimal redundancy–maximal relevance.
The performance of the best classifiers in S2605.
| Category | Classifier | Accuracy | MCC |
|---|---|---|---|
| function | LIBSVM | 0.781 | 0.436 |
| lazy | KStar | 0.817 | 0.547 |
| meta | RotationForest | 0.805 | 0.513 |
| rules | PART ¹ | 0.788 | 0.478 |
| trees | RandomForest | 0.784 | 0.474 |
| trees | RandomTree | 0.774 | 0.459 |

¹ PART: partial decision trees.
Performance of the hill-climbing procedure when regular-mRMR (minimal redundancy–maximal relevance) features were combined with the basis features.
| Feature Combination | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|
| Basis | 0.641 | 0.906 | 0.818 | 0.578 |
| Basis + M324 | 0.656 | 0.887 | 0.810 | 0.562 |
| Basis + V651 | 0.677 | 0.894 | 0.822 | 0.590 |
| Basis + V651 + V453 | 0.687 | 0.888 | 0.821 | 0.590 |
| Basis + V651 + V453 + M149 | 0.708 | 0.887 | 0.827 | 0.607 |
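
The table above reads as a greedy forward search: start from the basis features and keep adding one candidate feature at a time. A minimal sketch of such a hill-climbing loop follows. The acceptance rule (add the single best candidate only if it strictly improves the score), the `score` callable, and the toy feature names and gains are all assumptions made for illustration; the criterion the authors actually optimized is not stated here.

```python
from typing import Callable, List, Sequence, Tuple


def hill_climb(
    basis: Sequence[str],
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
) -> Tuple[List[str], float, List[Tuple[Tuple[str, ...], float]]]:
    """Greedy forward selection: add the best candidate while the score improves."""
    selected = list(basis)
    best = score(selected)
    remaining = list(candidates)
    history = [(tuple(selected), best)]
    while remaining:
        # Score every one-feature extension of the current set.
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # local optimum: no single addition helps
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
        history.append((tuple(selected), best))
    return selected, best, history


# Toy scoring function (hypothetical): each feature adds a fixed gain.
GAINS = {"f1": 0.03, "f2": -0.01, "f3": 0.02}


def toy_score(features: List[str]) -> float:
    return 0.5 + sum(GAINS.get(f, 0.0) for f in features)


selected, best, history = hill_climb(["basis"], ["f1", "f2", "f3"], toy_score)
print(selected)  # ['basis', 'f1', 'f3']
```

In practice `score` would wrap a cross-validated classifier evaluation, which is why the search stops at the first step that fails to improve rather than exploring all subsets.
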
Comparison of feature-selection methods.
| Method | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|
| Regular-mRMR | 0.708 | 0.887 | 0.827 | 0.607 |
| InfoGain | 0.683 | 0.896 | 0.825 | 0.597 |
| ChiSquared | 0.692 | 0.890 | 0.824 | 0.598 |
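
For orientation, the standard mRMR idea can be sketched as a greedy ranking that rewards mutual information with the class label and penalizes mutual information with features already chosen. The code below implements that textbook difference criterion for discrete features; it is not the authors' regular-mRMR variant, whose details are not given here, and the feature names and toy data are hypothetical.

```python
from collections import Counter
from math import log2
from typing import Dict, List, Sequence


def mutual_information(x: Sequence[int], y: Sequence[int]) -> float:
    """Mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b])) for (a, b), c in pxy.items()
    )


def mrmr_rank(features: Dict[str, Sequence[int]], target: Sequence[int], k: int) -> List[str]:
    """Greedily pick k features maximizing relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, target) for name, col in features.items()}
    selected: List[str] = []

    def criterion(name: str) -> float:
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while len(selected) < min(k, len(features)):
        pool = [name for name in features if name not in selected]
        selected.append(max(pool, key=criterion))
    return selected


# Toy data (hypothetical): "a_noisy" is nearly a copy of "a"; "b" is weak but independent of "a".
target = [0, 0, 0, 1, 1, 1, 1, 1]
features = {
    "a":       [0, 0, 0, 0, 1, 1, 1, 1],
    "a_noisy": [0, 0, 0, 0, 1, 1, 1, 0],
    "b":       [0, 1, 0, 1, 0, 1, 0, 1],
}
print(mrmr_rank(features, target, 2))  # ['a', 'b']
```

A ranking by relevance alone would pick `a` and `a_noisy`; the redundancy penalty is what promotes the weaker but complementary `b`.
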
Comparison of prediction results with the use of S438.
| Method | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|
| Sequence-based methods | | | | |
| KStable | 0.411 | 0.856 | 0.708 | 0.298 |
| EASE-MM | 0.658 | 0.757 | 0.724 | 0.402 |
| I-Mutant2.0_seq | 0.185 | 0.918 | 0.674 | 0.151 |
| INPS_seq | 0.260 | 0.901 | 0.687 | 0.211 |
| iPTREE-STAB | 0.233 | 0.949 | 0.710 | 0.271 |
| MUpro | 0.267 | 0.901 | 0.689 | 0.218 |
| Structure-based methods | | | | |
| AUTO-MUTE SVM | 0.075 | 0.969 | 0.671 | 0.101 |
| AUTO-MUTE RF | 0.137 | 0.976 | 0.696 | 0.222 |
| CUPSAT | 0.342 | 0.747 | 0.612 | 0.093 |
| DUET | 0.308 | 0.962 | 0.744 | 0.382 |
| I-Mutant2.0 | 0.233 | 0.918 | 0.689 | 0.210 |
| MAESTRO | 0.342 | 0.921 | 0.728 | 0.334 |
| mCSM | 0.212 | 0.979 | 0.724 | 0.325 |
| PoPMuSiC | 0.247 | 0.955 | 0.719 | 0.302 |
| SDM | 0.733 | 0.736 | 0.735 | 0.448 |
| SDM2 | 0.616 | 0.774 | 0.721 | 0.384 |
Comparison of performances by KStable and EASE-MM, as correlated with secondary structures and solvent accessibility.
| Structure * | Method | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|---|
| Coil | KStable | 0.444 | 0.818 | 0.636 | 0.284 |
| Coil | EASE-MM | 0.571 | 0.682 | 0.628 | 0.255 |
| Helix | KStable | 0.400 | 0.806 | 0.661 | 0.222 |
| Helix | EASE-MM | 0.750 | 0.676 | 0.702 | 0.409 |
| Strand | KStable | 0.348 | 0.924 | 0.830 | 0.308 |
| Strand | EASE-MM | 0.652 | 0.873 | 0.837 | 0.474 |
| Buried | KStable | 0.419 | 0.903 | 0.777 | 0.368 |
| Buried | EASE-MM | 0.500 | 0.835 | 0.748 | 0.339 |
| Surface | KStable | 0.315 | 0.725 | 0.514 | 0.044 |
| Surface | EASE-MM | 0.796 | 0.549 | 0.676 | 0.357 |
| Under surface | KStable | 0.567 | 0.831 | 0.747 | 0.405 |
| Under surface | EASE-MM | 0.733 | 0.708 | 0.716 | 0.414 |
* Structures defined by AUTO-MUTE.