Chao Fang1, Yajie Jia1,2, Lihong Hu1, Yinghua Lu1,3, Han Wang1,2,3.
Abstract
As an important category of proteins, alpha-helical transmembrane proteins (αTMPs) play important roles in various biological activities. Because few αTMP structures have been solved, predicting the residue contacts among the transmembrane segments of an αTMP provides a basis for inferring the protein fold, which can in turn help uncover protein functions. A few efforts have been devoted to predicting interhelical residue contacts (IHRCs) using machine learning methods based on prior knowledge of transmembrane protein structure. However, improving the prediction accuracy remains a challenge, and deep learning offers an opportunity to exploit this structural knowledge in a different way. For this purpose, we proposed IMPContact, a novel αTMP residue-residue contact prediction method in which a convolutional neural network (CNN) was applied to recognize the interhelical contacts in a TMP using its specific structural features. Four sequence-based, TMP-specific features were selected to describe a pair of residues: evolutionary covariation, predicted topology structure, residue relative position, and evolutionary conservation. An up-to-date dataset was used to train and test IMPContact; our method achieved better performance compared to peer methods. In the case studies, IHRCs in regular transmembrane helices were better predicted than those in irregular ones.
Year: 2020 PMID: 32309431 PMCID: PMC7140131 DOI: 10.1155/2020/4569037
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. The neural network construction of IMPContact.
Comparison of candidate evolutionary covariation methods.
| Methods | ACC | MCC |
|---|---|---|
| ELSC | 0.8408 | 0.3371 |
| MI | 0.9742 | 0 |
| OMES | 0.7122 | 0.1335 |
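MI in the table above presumably denotes mutual information, a common covariation score. As a rough illustration of what such a score measures, the following minimal sketch computes the plain mutual information between two columns of a multiple sequence alignment. The function name `column_mutual_information` and the toy columns are assumptions made for illustration; the source does not describe how IMPContact computes its covariation features, and practical tools usually add corrections (sequence weighting, pseudocounts, gap handling) that are omitted here.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j are equal-length strings holding the residue seen
    at each of the two positions in every aligned sequence.
    Hypothetical helper for illustration, not the paper's code.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_i)
    if n == 0:
        raise ValueError("columns must not be empty")

    count_i = Counter(col_i)               # residue counts at position i
    count_j = Counter(col_j)               # residue counts at position j
    count_ij = Counter(zip(col_i, col_j))  # joint counts of residue pairs

    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) simplifies to c * n / (count_a * count_b)
        mi += p_ab * log2(c * n / (count_i[a] * count_j[b]))
    return mi


# Toy columns (assumed data): the first pair co-varies perfectly,
# the second pair varies independently.
covarying = column_mutual_information("AAGG", "LLVV")
independent = column_mutual_information("AAGG", "LVLV")
```

In this toy case the perfectly co-varying pair scores 1 bit and the independent pair scores 0, so a higher value suggests that the two positions change together across the alignment.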
The prediction accuracy comparison of 4 classifiers.
| Methods | ACC | MCC |
|---|---|---|
| RF | 0.6337 | 0.1209 |
| SVM | 0.7451 | 0.1478 |
| NN | 0.8050 | 0.1745 |
| CNN | 0.8408 | 0.3371 |
Figure 2. Prediction model comparison on fivefold cross-validation.
Figure 3. Sequence length distributions of datasets.
Comparison with peer methods.
| Methods | ACC | Precision | Recall | MCC | Predicted samples |
|---|---|---|---|---|---|
| CCMpred | 0.9978 | — | — | — | 35 |
| PSICOV | 0.9878 | 0.0030 | 0.3842 | 0.0166 | 13 |
| IMPContact | 0.6293 | 0.0035 | 0.5920 | 0.0271 | 35 |
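The ACC, precision, recall, and MCC columns can all be derived from a confusion matrix over residue pairs. The sketch below uses the standard definitions; the function name `contact_metrics` and all counts are hypothetical and are not taken from the paper. It also illustrates why accuracy alone can mislead when contacts are rare: a predictor that labels every pair as non-contact reaches a high ACC while its MCC is 0 (by the common convention for a zero denominator). That pattern resembles the MI row in the covariation table, although the source does not state the cause of that row.

```python
from math import sqrt


def contact_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fp/tn/fn are counts of residue pairs; a 'positive' is a
    predicted contact. Precision or recall is None when undefined.
    Hypothetical helper for illustration, not the paper's code.
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one residue pair is required")

    acc = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None

    # Matthews correlation coefficient; reported as 0 when any
    # marginal count is zero, which makes the denominator zero.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"acc": acc, "precision": precision, "recall": recall, "mcc": mcc}


# Assumed toy data: 1000 residue pairs, only 10 of them true contacts.
all_negative = contact_metrics(tp=0, fp=0, tn=990, fn=10)  # predicts no contacts
balanced = contact_metrics(tp=5, fp=5, tn=985, fn=5)       # finds half of them
```

With these assumed counts the all-negative predictor has the higher ACC (0.99 against 0.99 for the second predictor as well) but an MCC of 0, while the second predictor's MCC is about 0.49, which is why MCC, precision, and recall are worth reading alongside ACC on imbalanced contact data.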
Figure 4. Prediction case of 3UDC_A.
Figure 5. Prediction case of 2WSC_G.