Abstract
Intrinsically disordered proteins (IDPs) possess at least one region that lacks a single stable structure in vivo, which enables them to play an important role in a variety of biological functions. We propose a prediction method for IDPs based on convolutional neural networks (CNNs) and feature selection. A combination of sequence and evolutionary properties is used to describe the differences between disordered and ordered regions. In particular, to highlight the correlation between the target residue and its adjacent residues, multiple windows are used to preprocess the protein sequence with the selected properties. The shorter windows reflect the characteristics of the central residue, and the longer windows reflect the characteristics of its surroundings. Moreover, to preserve the specificity of the sequence and evolutionary properties, the two groups are preprocessed separately. The preprocessed properties are then combined into feature matrices that serve as the input of the constructed CNN. Our method is trained and tested on the DisProt database. The simulation results show that the proposed method predicts IDPs effectively and that its performance is competitive with IsUnstruct and ESpritz.
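The multi-window preprocessing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-residue property (`DISORDER_PROMOTING`), the window lengths, and the truncation at the sequence ends are all assumptions chosen for illustration. Each residue receives one averaged value per window length, so short windows stay close to the central residue and long windows summarize its neighborhood.

```python
from typing import List, Sequence

# Hypothetical toy property (an assumption, not taken from the paper):
# 1.0 for residues in this illustrative set, 0.0 otherwise.
DISORDER_PROMOTING = set("AEGKPQRS")


def residue_property(sequence: str) -> List[float]:
    """Map each residue to a toy numeric property value."""
    return [1.0 if aa in DISORDER_PROMOTING else 0.0 for aa in sequence]


def window_average(values: Sequence[float], window: int) -> List[float]:
    """Average `values` over a centered window of odd length `window`.

    Windows are truncated at the sequence ends, so terminal residues
    are averaged over fewer positions.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    averaged = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        averaged.append(sum(values[lo:hi]) / (hi - lo))
    return averaged


def multi_window_features(sequence: str, windows: Sequence[int]) -> List[List[float]]:
    """Return one feature row per residue, one column per window length."""
    prop = residue_property(sequence)
    columns = [window_average(prop, w) for w in windows]
    return [list(row) for row in zip(*columns)]


# Illustrative window lengths; the paper's actual settings are not assumed here.
features = multi_window_features("MKPEELLV", windows=(1, 3, 5))
```

In this sketch, `features[i]` is the feature row for residue `i`. Stacking such rows for several properties would give a matrix of the kind the abstract describes as CNN input.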
Year: 2021 PMID: 34992646 PMCID: PMC8727116 DOI: 10.1155/2021/4455604
Source DB: PubMed Journal: Comput Intell Neurosci
Datasets used in this paper.
| Datasets | Disordered regions | Ordered regions | Disordered residues | Ordered residues |
|---|---|---|---|---|
| Training set | 1,120 | 1,198 | 85,184 | 289,983 |
| Test set | 134 | 145 | 7,234 | 25,873 |
| Total | 1,254 | 1,343 | 92,418 | 315,856 |
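As a small arithmetic aid, the following sketch computes the share of disordered residues in each split from the counts in the table above. The variable and function names are illustrative only.

```python
# Residue counts copied from the dataset table above.
counts = {
    "Training set": {"disordered": 85_184, "ordered": 289_983},
    "Test set": {"disordered": 7_234, "ordered": 25_873},
}


def disordered_fraction(split: dict) -> float:
    """Fraction of residues in a split that are labeled disordered."""
    total = split["disordered"] + split["ordered"]
    return split["disordered"] / total


fractions = {name: disordered_fraction(split) for name, split in counts.items()}
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} disordered residues")
```

Ordered residues outnumber disordered ones in both splits, which is one reason class-balance-aware scores such as Sw and MCC are informative alongside sensitivity and specificity.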
Figure 1. The structure of the CNN.
Figure 2. The prediction procedure of the proposed method.
Figure 3. The trend of each evaluation parameter with different numbers of windows.
Prediction performance for different numbers of windows.
| Step length | Number of windows | Sens | Spec | Sw | MCC |
|---|---|---|---|---|---|
| 1 | 25 | 0.8032 | 0.7594 | 0.5626 | 0.4852 |
| 2 | 13 | 0.7991 | 0.7676 | 0.5667 | 0.4909 |
| 3 | 9 | 0.8153 | 0.7676 | 0.5829 | 0.5038 |
| 4 | 7 | 0.8061 | 0.7787 | 0.5848 | 0.5089 |
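The four scores reported in these tables can be computed from a residue-level confusion matrix. The sketch below assumes the standard definitions of sensitivity, specificity, and the Matthews correlation coefficient (MCC), and takes Sw = Sens + Spec - 1, which matches the reported values (for example, 0.8032 + 0.7594 - 1 = 0.5626 in the first row). The confusion counts in the usage line are hypothetical, and treating disordered residues as the positive class is an assumption.

```python
import math
from typing import Dict


def evaluate(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute Sens, Spec, Sw, and MCC from confusion-matrix counts.

    Assumes disordered residues are the positive class.
    """
    sens = tp / (tp + fn)   # fraction of disordered residues recovered
    spec = tn / (tn + fp)   # fraction of ordered residues recovered
    sw = sens + spec - 1    # weighted score, consistent with the tables
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Fall back to 0.0 when a marginal is empty and MCC is undefined.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sens": sens, "Spec": spec, "Sw": sw, "MCC": mcc}


# Hypothetical counts, for illustration only.
scores = evaluate(tp=80, fn=20, tn=150, fp=50)
```

Because Sw and MCC both account for errors on the ordered and disordered classes, they are less sensitive to class imbalance than sensitivity or specificity alone.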
Prediction performance for different numbers of windows.
| Window distance | Number of windows | Sens | Spec | Sw | MCC |
|---|---|---|---|---|---|
| 8 | 7 | 0.8061 | 0.7787 | 0.5848 | 0.5089 |
| 8 | 11 | 0.7833 | 0.8146 | 0.5979 | 0.5332 |
| 6 | 9 | 0.8153 | 0.7676 | 0.5829 | 0.5038 |
| 6 | 15 | 0.8914 | 0.7056 | 0.5970 | 0.5012 |
Prediction performance for different numbers of convolutional layers.
| Number of convolutional layers | Scale of convolutional parameters | Sens | Spec | Sw | MCC |
|---|---|---|---|---|---|
| 2 | 3 × 3 × 1 × 8, 2 × 2 × 8 × 8 | 0.7833 | 0.8146 | 0.5979 | 0.5332 |
| 3 | 3 × 3 × 1 × 8, (2 × 2 × 8 × 8) × 2 | 0.7813 | 0.7961 | 0.5774 | 0.5092 |
| 4 | 3 × 3 × 1 × 8, (2 × 2 × 8 × 8) × 3 | 0.7654 | 0.7945 | 0.5599 | 0.4946 |
| 5 | 3 × 3 × 1 × 8, (2 × 2 × 8 × 8) × 4 | 0.7513 | 0.8028 | 0.5541 | 0.4932 |
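To make the "scale" column concrete, the sketch below counts convolutional weights for each configuration in the table. It assumes that an entry such as 3 × 3 × 1 × 8 means kernel height × kernel width × input channels × output channels; that reading is an assumption, and the helper names are hypothetical.

```python
from typing import List, Tuple

KernelShape = Tuple[int, int, int, int]  # (height, width, in_channels, out_channels)


def conv_stack(num_layers: int) -> List[KernelShape]:
    """Kernel shapes for a table row: one 3x3x1x8 layer, then 2x2x8x8 layers."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    return [(3, 3, 1, 8)] + [(2, 2, 8, 8)] * (num_layers - 1)


def count_weights(shapes: List[KernelShape], include_bias: bool = False) -> int:
    """Total number of convolutional weights, optionally with one bias per filter."""
    total = 0
    for height, width, c_in, c_out in shapes:
        total += height * width * c_in * c_out
        if include_bias:
            total += c_out
    return total


for n in (2, 3, 4, 5):
    print(n, count_weights(conv_stack(n)))
```

Under this reading, each additional 2 × 2 × 8 × 8 layer adds 256 weights, so the deeper configurations in the table are larger without scoring better on Sw or MCC.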
Prediction performance of different scales of convolution kernel in conv1.
| The scale of conv1 | Sens | Spec | Sw | MCC |
|---|---|---|---|---|
| 2 × 2 | 0.7972 | 0.8090 | 0.6062 | 0.5373 |
| 3 × 3 | 0.7833 | 0.8146 | 0.5979 | 0.5332 |
| 4 × 4 | 0.7553 | 0.8414 | 0.5967 | 0.5457 |
| 5 × 5 | 0.7573 | 0.8377 | 0.5950 | 0.5423 |
Prediction performance of different scales of convolution kernel in conv2.
| The scale of conv2 | Sens | Spec | Sw | MCC |
|---|---|---|---|---|
| 2 × 2 | 0.7972 | 0.8090 | 0.6062 | 0.5373 |
| 3 × 3 | 0.7715 | 0.8291 | 0.6006 | 0.5422 |
| 4 × 4 | 0.7832 | 0.8047 | 0.5879 | 0.5210 |
| 5 × 5 | 0.8169 | 0.7734 | 0.5902 | 0.5114 |
Prediction performance for different numbers of convolution kernels in conv1.
| Number of kernels in conv1 | Sens | Spec | Sw | MCC |
|---|---|---|---|---|
| 4 | 0.7960 | 0.8321 | 0.6281 | 0.5654 |
| 8 | 0.7972 | 0.8090 | 0.6062 | 0.5373 |
| 16 | 0.7442 | 0.8349 | 0.5791 | 0.5282 |
| 32 | 0.8226 | 0.7436 | 0.5660 | 0.4838 |
Prediction performance of the proposed method compared with IsUnstruct and ESpritz.
| Methods | Sens | Spec | Sw | MCC |
|---|---|---|---|---|
| Our method | 0.7264 | 0.8301 | 0.5565 | 0.5060 |
| IsUnstruct | 0.7513 | 0.7855 | 0.5368 | 0.4711 |
| ESpritz | 0.7255 | 0.8135 | 0.5389 | 0.4840 |