Shiyang Long, Pu Tian.
Abstract
Protein secondary structure (SS) prediction is important for studying protein structure and function. Both traditional machine learning methods and deep neural networks have been applied to it, and prediction accuracy has progressed steadily toward the theoretical limit. Convolutional and recurrent neural networks are the two major types of deep learning architecture; they reach comparable prediction accuracy but require different training procedures to achieve optimal performance. We are interested in finding a novel architectural style with competitive performance and in understanding how different architectures perform under similar training procedures. We constructed a context convolutional neural network (Contextnet) and compared its performance with popular models (e.g. convolutional neural network, recurrent neural network, conditional neural fields…) under similar training procedures on a Jpred dataset. The Contextnet proved to be highly competitive. Additionally, we retrained the network on the Cullpdb dataset and compared it with the Jpred, ReportX and Spider3 servers and the MUFold-SS method; the Contextnet achieved higher Q3 accuracy on a CASP13 dataset. Training procedures were found to have a significant impact on the accuracy of the Contextnet. This journal is © The Royal Society of Chemistry.
Year: 2019 PMID: 35540205 PMCID: PMC9075825 DOI: 10.1039/c9ra05218f
Source DB: PubMed Journal: RSC Adv ISSN: 2046-2069 Impact factor: 4.036
Fig. 1 Context convolutional neural network architecture. The gray square represents the input features, blue squares represent convolution operations, green squares represent dilated convolution operations and yellow squares represent concatenation operations. The number in each square is the channel number of the corresponding convolution operation.
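To make the two less familiar operations in the figure concrete, the following is a minimal pure-Python sketch of a dilated 1D convolution and of concatenating several dilated branches per sequence position. It is an illustration only: the kernel values, dilation rates, single-channel input, and the parallel wiring of the branches are assumptions, since the caption does not specify them, and the function names `dilated_conv1d`, `receptive_field` and `context_block` are hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Same-length 1D dilated convolution (cross-correlation) with zero padding.

    Kernel taps are spaced `dilation` positions apart, so a 3-tap kernel with
    dilation 2 looks at positions i-2, i and i+2.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel size")
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for j, weight in enumerate(kernel):
            pos = i + (j - half) * dilation
            if 0 <= pos < len(signal):  # positions outside the sequence count as zero
                total += weight * signal[pos]
        out.append(total)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stride-1 dilated convolutions stacked in sequence."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def context_block(signal, kernel, dilations):
    """Apply several dilation rates to the same input and concatenate per position.

    Assumption: branches run in parallel; the real network may wire them differently.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [list(values) for values in zip(*branches)]


if __name__ == "__main__":
    toy_input = [1, 2, 3, 4, 5]          # illustrative values, not real features
    print(dilated_conv1d(toy_input, [1, 1, 1], dilation=2))
    print(context_block(toy_input, [1, 1, 1], dilations=[1, 2]))
    print(receptive_field(3, [1, 2, 4]))
```

Increasing the dilation widens the sequence context each output position sees without adding weights, which is the usual motivation for using dilated convolutions on long sequences.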
Q3 accuracy of seven different secondary structure prediction networks on the Jpred dataset
| | Accuracy (%) | Standard deviation (%) | Learning rate |
|---|---|---|---|
| Simple CNN | 82.68 | 0.35 | 0.0001 |
| BiLSTM | 83.03 | 0.20 | 0.001 |
| CNN + CRF | 82.08 | 0.32 | 0.0001 |
| Deep3I | 83.04 | 0.21 | 0.0001 |
| Double BiLSTM | 83.30 | 0.28 | 0.001 |
| CNN + BiLSTM | 83.35 | 0.13 | 0.0001 |
| Contextnet | 83.66 | 0.24 | 0.0001 |
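For readers unfamiliar with the metric, Q3 accuracy is the percentage of residues whose predicted three-state label (helix, strand or coil) matches the observed one. The sketch below shows that calculation, plus a helper that summarizes repeated training runs as a mean and standard deviation. The function names are hypothetical, and the use of the sample standard deviation and of pooling residues across proteins are assumptions; the record does not say which conventions the authors used.

```python
from statistics import mean, stdev


def q3_accuracy(predicted, observed):
    """Percentage of positions where the predicted label equals the observed label."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def pooled_q3(pairs):
    """Q3 over a dataset, pooling residues from all (predicted, observed) pairs.

    Assumption: accuracy is computed per residue over the whole set rather than
    averaged per protein.
    """
    correct = 0
    total = 0
    for predicted, observed in pairs:
        if len(predicted) != len(observed):
            raise ValueError("sequences must have the same length")
        correct += sum(p == o for p, o in zip(predicted, observed))
        total += len(observed)
    if total == 0:
        raise ValueError("dataset must contain at least one residue")
    return 100.0 * correct / total


def summarize_runs(accuracies):
    """Mean and sample standard deviation over repeated runs (needs >= 2 values)."""
    return mean(accuracies), stdev(accuracies)


if __name__ == "__main__":
    # Toy labels for illustration only: H = helix, E = strand, C = coil.
    print(round(q3_accuracy("HHHECC", "HHHEEC"), 2))
```

The same function gives Q8 accuracy when the label strings use the eight-state alphabet, because it only counts matching positions.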
Improvement of Contextnet by training tricks and the ensemble method
| | Q3 accuracy (%) | Standard deviation (%) |
|---|---|---|
| Contextnet | 83.66 | 0.24 |
| Trained with tricks | 84.14 | 0.13 |
| Ensemble of Contextnet | 84.74 | |
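One common way to build such an ensemble is to average the per-residue class probabilities of several independently trained models and then take the most probable class. The sketch below illustrates that scheme only; this record does not describe how the Contextnet ensemble was actually combined, so the averaging rule, the function name `ensemble_predict` and the toy probabilities are all assumptions.

```python
def ensemble_predict(model_probs, labels="HEC"):
    """Average class probabilities across models, then pick the top class per residue.

    model_probs: one entry per model; each entry is a list over residues of
    probability lists ordered like `labels`.
    """
    if not model_probs:
        raise ValueError("need at least one model")
    n_models = len(model_probs)
    n_residues = len(model_probs[0])
    if any(len(m) != n_residues for m in model_probs):
        raise ValueError("all models must score the same number of residues")
    prediction = []
    for r in range(n_residues):
        averaged = [
            sum(m[r][c] for m in model_probs) / n_models
            for c in range(len(labels))
        ]
        best = max(range(len(labels)), key=averaged.__getitem__)
        prediction.append(labels[best])
    return "".join(prediction)


if __name__ == "__main__":
    # Toy probabilities for two hypothetical models and two residues.
    model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
    model_b = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]
    print(ensemble_predict([model_a, model_b]))
```

Averaging probabilities rather than voting on hard labels lets a confident model outweigh an uncertain one at a given residue.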
Q3 accuracy, Q8 accuracy and SOV on the Cullpdb test set
| | Q3 accuracy (%) | Q8 accuracy (%) | SOV |
|---|---|---|---|
| Contextnet | 84.41 | 73.13 | 77.35 |
| Ensemble of Contextnet | 85.29 | 74.68 | 80.41 |
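Q8 and Q3 differ only in the label alphabet: Q8 scores the eight DSSP states, while Q3 scores a reduced three-state alphabet. The sketch below shows one widely used reduction (H, G, I to helix; E, B to strand; everything else to coil). This mapping is an assumption for illustration; other reductions exist, and the record does not state which one the authors applied. The name `reduce_q8_to_q3` is hypothetical.

```python
# Assumed eight-to-three state reduction; "-", "L" and "C" are treated as
# alternative spellings of the coil state.
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strands and bridges
    "T": "C", "S": "C",             # turns and bends
    "-": "C", "L": "C", "C": "C",   # coil / unassigned
}


def reduce_q8_to_q3(q8_labels):
    """Map a string of eight-state labels to three-state labels (H, E, C)."""
    try:
        return "".join(Q8_TO_Q3[label] for label in q8_labels)
    except KeyError as exc:
        raise ValueError(f"unknown secondary structure label: {exc.args[0]!r}") from None


if __name__ == "__main__":
    print(reduce_q8_to_q3("HGIEBTS-"))
```

Because several eight-state labels collapse into one three-state label, a Q8 error such as predicting G for H still counts as correct under Q3, which is consistent with the Q3 figures being higher than the Q8 figures.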
Q3(Q8) accuracy of the Jpred server, ReportX server, Spider3 server, MUFold-SS and the Contextnet on the culled CB513, CASP12 and CASP13 datasets
| | CB513 (%) | CASP12 (%) | CASP13 (%) |
|---|---|---|---|
| Jpred server | 80.11 | 78.51 | 80.01 |
| ReportX server | 82.34(70.30) | 80.84(69.16) | 81.13(67.71) |
| Spider3 server | 84.56 | 82.23 | 83.22 |
| MUFold-SS | 85.12(72.41) | 80.98(68.87) | 83.19(72.30) |
| Contextnet | 83.98(71.15) | 81.67(69.87) | 83.81(71.01) |
| Ensemble of Contextnet | 85.04(72.76) | 82.69(71.20) | 84.93(72.95) |
SOV score of the Jpred server, ReportX server, Spider3 server, MUFold-SS and the Contextnet on the culled CB513, CASP12 and CASP13 datasets
| | CB513 | CASP12 | CASP13 |
|---|---|---|---|
| Jpred server | 75.40 | 73.56 | 73.61 |
| ReportX server | 77.38 | 75.19 | 74.70 |
| Spider3 server | 81.19 | 75.73 | 77.32 |
| MUFold-SS | 80.60 | 73.84 | 84.27 |
| Contextnet | 76.09 | 71.86 | 74.64 |
| Ensemble of Contextnet | 78.90 | 75.20 | 78.53 |
Q3(Q8) accuracy of the Jpred server, ReportX server, Spider3 server, MUFold-SS and the Contextnet on the complete CB513, CASP12 and CASP13 datasets
| | CB513 (%) | CASP12 (%) | CASP13 (%) |
|---|---|---|---|
| Jpred server | 80.47 | 78.07 | 80.17 |
| ReportX server | 82.83(70.89) | 80.72(68.84) | 80.79(67.38) |
| Spider3 server | 84.71 | 81.99 | 82.97 |
| MUFold-SS | 85.81(73.41) | 80.94(68.88) | 83.34(71.89) |
| Contextnet | 84.32(71.96) | 81.43(69.56) | 83.62(70.73) |
| Ensemble of Contextnet | 85.28(73.77) | 82.56(70.78) | 84.81(72.32) |
SOV score of the Jpred server, ReportX server, Spider3 server, MUFold-SS and the Contextnet on the complete CB513, CASP12 and CASP13 datasets
| | CB513 | CASP12 | CASP13 |
|---|---|---|---|
| Jpred server | 76.56 | 72.54 | 73.65 |
| ReportX server | 78.36 | 74.84 | 74.12 |
| Spider3 server | 81.35 | 75.58 | 75.73 |
| MUFold-SS | 82.12 | 74.00 | 83.73 |
| Contextnet | 77.27 | 72.14 | 74.67 |
| Ensemble of Contextnet | 80.27 | 75.91 | 78.18 |