Xin Jin, Lin Guo, Qian Jiang, Nan Wu, Shaowen Yao.
Abstract
Predicting protein secondary structure is a key problem in protein science. Protein secondary structure prediction (PSSP) aims to construct a function that maps an amino acid sequence to its secondary structure, so that the structure can be obtained from the sequence alone. Driven by deep learning, PSSP accuracy has improved greatly in recent years. To explore a new technique for PSSP, this study introduces the concept of an adversarial game into secondary structure prediction and proposes a prediction model based on a conditional generative adversarial network (GAN). We introduce a new multiscale convolution module and an improved channel attention (ICA) module into the generator, which generates the secondary structure, and design a discriminator that competes with the generator so that the model learns the complicated features of proteins. We also propose a PSSP method based on the proposed multiscale convolution module and ICA module. The experimental results indicate that the conditional GAN-based protein secondary structure prediction (CGAN-PSSP) model is workable and worthy of further study because of the strong feature-learning ability of adversarial learning.
Keywords: channel attention; deep learning; generative adversarial networks; neural networks; protein secondary structure prediction; protein structure prediction
Year: 2022 PMID: 35935483 PMCID: PMC9355137 DOI: 10.3389/fbioe.2022.901018
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
FIGURE 1. Construction process of the position-specific scoring matrix.
FIGURE 2. Concentrated matrix of the PSSM and one-hot form of protein sequences (Guo et al., 2020).
FIGURE 3. Block diagram of the CGAN.
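For orientation, the adversarial game behind a conditional GAN is usually written as the min-max objective below. This is the standard conditional GAN formulation, not a loss reproduced from this record; the symbols are assumptions for illustration: c is the condition (here, the sequence features), y is a real secondary structure labelling, z is an optional noise input, G is the generator, and D is the discriminator.

```latex
\min_{G}\max_{D} V(D,G) =
  \mathbb{E}_{(c,y)}\left[\log D(y \mid c)\right]
  + \mathbb{E}_{c,z}\left[\log\left(1 - D\left(G(z \mid c) \mid c\right)\right)\right]
```

The discriminator is rewarded for scoring real (c, y) pairs high and generated pairs low, while the generator is rewarded for producing labellings that the discriminator accepts.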
FIGURE 4. Schematic of the proposed CGAN-PSSP.
FIGURE 5. Generator structure of CGAN-PSSP.
Hyperparameters of the generator structure in CGAN-PSSP.
| Operation | Input | Convolution kernel size | Step | Output |
|---|---|---|---|---|
| Multiscale convolution | 700 × 42 | 11 | 1 | 700 × 256 |
| Multiscale convolution | 700 × 256 | 11 | 1 | 700 × 512 |
| Multiscale convolution | 700 × 512 | 11 | 1 | 700 × 2048 |
| Concatenation | 700 × 2048, 700 × 42 | — | — | 700 × 2090 |
| One-dimensional convolution | 700 × 2090 | 11 | 1 | 700 × 512 |
| One-dimensional convolution | 700 × 1,024 | 11 | 1 | 700 × 128 |
| One-dimensional convolution | 700 × 512 | 11 | 1 | 700 × 32 |
| One-dimensional convolution | 700 × 128 | 11 | 1 | 700 × 16 |
| One-dimensional convolution | 700 × 64 | 11 | 1 | 700 × 8 (700 × 3) |
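Two shape facts in the generator table can be checked by hand: every convolution keeps the sequence length at 700, and the concatenation row adds channel widths (2048 + 42 = 2090). The sketch below verifies both. It assumes "same" zero padding of (kernel - 1) / 2 on each side, which the table does not state; the helper names are hypothetical.

```python
def conv1d_out_len(length, kernel, stride=1, padding=0):
    """Standard output-length formula for a 1D convolution (dilation 1)."""
    return (length + 2 * padding - kernel) // stride + 1


def concat_channels(*widths):
    """Concatenating along the channel axis adds the channel widths."""
    return sum(widths)


SEQ_LEN = 700  # padded sequence length used in every row of the table
KERNEL = 11    # kernel size from the table
STEP = 1       # step (stride) from the table

# ASSUMPTION: "same" padding; the table only lists kernel size and step.
SAME_PAD = (KERNEL - 1) // 2

out_len = conv1d_out_len(SEQ_LEN, KERNEL, stride=STEP, padding=SAME_PAD)
no_pad_len = conv1d_out_len(SEQ_LEN, KERNEL, stride=STEP, padding=0)
generator_concat = concat_channels(2048, 42)

print(out_len, no_pad_len, generator_concat)
```

Without padding, each kernel-11 layer would shorten the sequence by 10 positions, so a constant length of 700 across all rows is what suggests the padding assumption.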
FIGURE 6. Schematic of the proposed multiscale convolution (MSC) module.
FIGURE 7. Schematic of the improved channel attention (ICA) module.
FIGURE 8. Operational process of the one-dimensional convolution (Guo et al., 2020).
FIGURE 9. Discriminator structure of CGAN-PSSP.
Hyperparameters of the discriminator structure for eight-state prediction.
| Operation | Input | Convolution kernel size | Step | Output |
|---|---|---|---|---|
| Concatenation | 700 × 42, 700 × 8 | — | — | 700 × 50 |
| One-dimensional convolution | 700 × 50 | 3 | 1 | 700 × 36 |
| One-dimensional convolution | 700 × 36 | 3 | 1 | 700 × 18 |
| One-dimensional convolution | 700 × 18 | 3 | 1 | 700 × 6 |
| One-dimensional convolution | 700 × 6 | 3 | 1 | 700 × 1 |
| Sigmoid | 700 × 1 | — | — | 700 × 1 |
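The discriminator table describes a small stack: the 42 input features and the 8-state prediction are concatenated to 50 channels, then four kernel-3 convolutions narrow the channels to 1 before a per-position sigmoid. As a rough sketch of how light this network is, the code below checks that the channel widths chain correctly and counts weights. It assumes ordinary 1D convolutions with a bias term, which the table does not confirm; the helper names are hypothetical.

```python
import math

KERNEL = 3
# (input channels, output channels) for each convolution row of the table
LAYERS = [(50, 36), (36, 18), (18, 6), (6, 1)]


def conv1d_params(c_in, c_out, kernel, bias=True):
    """Weight count of a plain 1D convolution (ASSUMED layer type)."""
    return kernel * c_in * c_out + (c_out if bias else 0)


def channels_chain(layers):
    """True if each layer's output width equals the next layer's input width."""
    return all(a[1] == b[0] for a, b in zip(layers, layers[1:]))


def sigmoid(x):
    """Logistic function mapping a raw score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


concat_width = 42 + 8  # sequence features plus eight-state labels
total_params = sum(conv1d_params(i, o, KERNEL) for i, o in LAYERS)

print(concat_width, channels_chain(LAYERS), total_params, sigmoid(0.0))
```

Under these assumptions the output is one score per residue position (700 × 1), so the discriminator judges each position of a labelling rather than emitting a single score per protein.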
FIGURE 10. Schematic of the structure of MCNN-PSSP. Concat, concatenation operation; MSC, multiscale convolution module; 1D Conv, one-dimensional convolution operation.
Hyperparameters of MCNN-PSSP.
| Operation | Input | Convolution kernel size | Step | Output |
|---|---|---|---|---|
| Multiscale convolution | 700 × 42 | 11 | 1 | 700 × 84 |
| Multiscale convolution | 700 × 168 | 11 | 1 | 700 × 256 |
| Concatenation | 700 × 256, 700 × 42 | — | — | 700 × 298 |
| U-net | 700 × 298 | 11 | 1 | 700 × 298 |
| Concatenation | 700 × 298, 700 × 42 | — | — | 700 × 340 |
| One-dimensional convolution | 700 × 340 | 11 | 1 | 700 × 210 |
| One-dimensional convolution | 700 × 210 | 11 | 1 | 700 × 128 |
| One-dimensional convolution | 700 × 128 | 11 | 1 | 700 × 64 |
| One-dimensional convolution | 700 × 64 | 11 | 1 | 700 × 32 |
| One-dimensional convolution | 700 × 32 | 11 | 1 | 700 × 16 |
| One-dimensional convolution | 700 × 16 | 11 | 1 | 700 × 8 (700 × 3) |
FIGURE 11. Prediction module in MCNN-PSSP.
Q8/Q3 of the proposed methods on CullPDB.
| Metric | CGAN-PSSP training set (%) | CGAN-PSSP validation set (%) | MCNN-PSSP training set (%) | MCNN-PSSP validation set (%) |
|---|---|---|---|---|
| Q8 accuracy | 86.7 | 75.1 | 87.4 | 84.1 |
| Q3 accuracy | 92.4 | 85.9 | 96.5 | 87.2 |
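Q8 and Q3 are per-residue accuracies: the percentage of residues whose predicted state matches the true state, over eight or three classes. A minimal sketch of the computation follows. The 8-to-3 reduction used here (H, G, I to helix; E, B to strand; everything else to coil) is a common convention and an assumption, since this record does not state which mapping the authors used; the toy label strings are invented for illustration.

```python
# ASSUMED 8-to-3 state reduction (a common convention, not confirmed
# by the source). "L" stands for the loop/irregular state.
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "S": "C", "T": "C", "L": "C",
}


def q_accuracy(true_labels, pred_labels):
    """Percentage of positions where the predicted state equals the true state."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, pred_labels))
    return 100.0 * correct / len(true_labels)


def to_three_state(labels):
    """Collapse an eight-state label string to three states."""
    return "".join(EIGHT_TO_THREE[s] for s in labels)


# Toy example (hypothetical labels, not data from the paper)
true8 = "HHHGEEBTSL"
pred8 = "HHGGEEETLL"

q8 = q_accuracy(true8, pred8)
q3 = q_accuracy(to_three_state(true8), to_three_state(pred8))
print(q8, q3)
```

In the toy example all three eight-state errors fall within the same three-state class, which illustrates why Q3 is never lower than Q8 for the same prediction under a fixed reduction.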
Q8 of different prediction models (-- means no testing).
| Method | CullPDB (%) | CB513 (%) | CASP10 (%) | CASP11 (%) |
|---|---|---|---|---|
| RaptorX-SS | 69.7 | 64.9 | 64.8 | 65.1 |
| GSN | 72.1 | 66.4 | -- | -- |
| DeepCNF | 75.2 | 68.3 | 71.8 | 72.3 |
| DCRNN | -- | 70.4 | 73.9 | 71.2 |
| SSREDN | 73.1 | 68.2 | -- | -- |
| CNNH_PSS | 74.0 | 70.3 | -- | -- |
| MUFOLD-SS | -- | 70.5 | 74.2 | 71.6 |
| CRRNN | -- | 71.4 | 73.8 | 71.6 |
| F1DCNN-SS | 74.1 | 70.5 | 74.9 | 71.3 |
| MCNN-PSSP | 74.2 | 70.6 | 74.9 | 71.5 |
| CGAN-PSSP | 74.0 | 70.3 | 74.6 | 71.3 |
Q3 of different prediction models (-- means no testing).
| Method | CullPDB (%) | CB513 (%) | CASP10 (%) | CASP11 (%) |
|---|---|---|---|---|
| RaptorX-SS | 81.5 | 78.3 | 78.9 | 79.1 |
| JPRED | 82.5 | 83.3 | 82.4 | 82.0 |
| DeepCNF | 85.4 | 82.3 | 84.4 | 84.7 |
| SSREDN | 84.2 | 82.9 | -- | -- |
| MUFOLD-SS | -- | 82.7 | 84.3 | 82.3 |
| CRRNN | -- | 85.3 | 86.1 | 84.2 |
| F1DCNN-SS | 86.2 | 84.5 | 87.8 | 84.7 |
| MCNN-PSSP | 86.3 | 84.7 | 87.7 | 84.8 |
| CGAN-PSSP | 86.0 | 84.3 | 87.4 | 84.8 |