Yanbu Guo, Weihua Li, Bingyi Wang, Huiqing Liu, Dongming Zhou.
Abstract
BACKGROUND: Protein secondary structure (PSS) is critical for predicting tertiary structure, understanding protein function, and designing drugs. However, experimental techniques for determining PSS are time-consuming and expensive, so there is an urgent need for efficient computational approaches that predict PSS from sequence information alone. Moreover, the feature matrix of a protein has two dimensions: the amino-acid residue dimension and the feature vector dimension. Existing deep-learning-based methods have achieved remarkable PSS prediction performance, but they often utilize the features from the amino-acid dimension. Thus, there is still room to improve computational methods for PSS prediction.
Keywords: Asymmetric convolutional neural network; Deep learning; Long short-term memory; Protein secondary structure
Year: 2019 PMID: 31208331 PMCID: PMC6580607 DOI: 10.1186/s12859-019-2940-0
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Overview of DeepACLSTM structure
The main structures and parameters of DeepACLSTM
| Layer Type | Size | NP |
|---|---|---|
| embedding | 21 | 441 |
| Convolution1D | 1 × 42 | 1806 |
| Convolution2D | 3 × 1 | 168 |
| Dropout1 | 0.5 | 0 |
| | 400 | 706,000 |
| | 300 | 841,200 |
| | 300 | 841,200 |
| | 300 | 721,200 |
| | 300 | 721,200 |
| Dropout2 | 0.4 | 0 |
| | 600 | 600,600 |
| Softmax | 8 | 4808 |
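The NP (number of parameters) column can be sanity-checked with the standard parameter-count formulas for embedding, convolution, fully connected, and LSTM layers. The sketch below is only an illustration. The helper names are hypothetical, and the input widths (42 input channels for Convolution1D, 1764 inputs to the 400-unit layer, 400 and 300 inputs to the 300-unit layers, 1000 inputs to the 600-unit layer) are assumptions chosen because they reproduce the tabulated counts; the table itself does not state them. It also assumes that the 300-unit rows without a layer name are LSTM layers and that the 400- and 600-unit rows without a layer name are fully connected layers.

```python
def embedding_params(vocab_size, dim):
    """One dim-sized vector per vocabulary entry, no bias."""
    return vocab_size * dim


def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out


def lstm_params(n_in, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (n_in + units) + units)


# All input widths below are assumptions inferred from the NP column.
counts = {
    "embedding (21 symbols x 21 dims)": embedding_params(21, 21),
    "Convolution1D (kernel 1, 42 -> 42)": conv_params(1, 1, 42, 42),
    "Convolution2D (kernel 3 x 1, 1 -> 42)": conv_params(3, 1, 1, 42),
    "400-unit layer (assumed dense, 1764 inputs)": dense_params(1764, 400),
    "300-unit layer (assumed LSTM, 400 inputs)": lstm_params(400, 300),
    "300-unit layer (assumed LSTM, 300 inputs)": lstm_params(300, 300),
    "600-unit layer (assumed dense, 1000 inputs)": dense_params(1000, 600),
    "Softmax (600 -> 8)": dense_params(600, 8),
}

for name, n in counts.items():
    print(f"{name}: {n:,}")
```

Under these assumptions every computed value agrees with the corresponding NP entry, which suggests (but does not prove) that the inferred input widths match the architecture.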
Fig. 2 The performance of DeepACLSTM on different input features
Q8 accuracy (%) of DeepACLSTM with different numbers of LSTM units; the best values are marked in bold
| LSTM output dimension | CASP10 | CASP11 |
|---|---|---|
| 50 | 71.8 | 70.2 |
| 100 | 72.4 | 70.5 |
| 150 | 74.2 | 72.1 |
| 200 | 74.1 | 72.2 |
| 250 | 74.5 | 72.3 |
| 300 | | |
| 350 | 73.7 | 71.8 |
| 400 | 73.8 | 71.6 |
| 450 | 74.8 | 71.8 |
| 500 | 72.1 | 70.1 |
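Q8 accuracy, the metric reported in these tables, is the percentage of residues whose predicted eight-state secondary-structure label matches the reference label. A minimal sketch follows. The function name `q8_accuracy`, the DSSP-style one-letter state codes (with `L` standing for loop/coil), and the toy sequences are illustrative assumptions and are not taken from the paper.

```python
# Eight DSSP-style states; using 'L' for loop/coil is an assumption of this sketch.
Q8_STATES = set("HGIEBTSL")


def q8_accuracy(predicted, reference):
    """Return the percentage of positions where the two label strings agree."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    unknown = (set(predicted) | set(reference)) - Q8_STATES
    if unknown:
        raise ValueError(f"unknown state labels: {sorted(unknown)}")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return 100.0 * matches / len(reference)


# Toy example: one of ten residues is mispredicted (H predicted as G).
reference = "LHHHHTEEEL"
predicted = "LHHHGTEEEL"
print(q8_accuracy(predicted, reference))
```

In benchmark evaluations the matches are usually accumulated over all residues of all test proteins rather than averaged per protein; which convention the paper follows is not stated in this section.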
Q8 accuracy (%) of DeepACLSTM with different filter sizes; the best values are marked in bold
| Filter Size | CASP10 | CASP11 |
|---|---|---|
| 3 | | |
| 5 | 73.9 | 72.1 |
| 7 | 74.2 | 72.1 |
| 9 | 74.7 | 72.4 |
| 11 | 74.4 | 72.3 |
| 13 | 71.3 | 70.0 |
| 15 | 69.6 | 68.6 |
| 17 | 74.3 | 72.3 |
| 19 | 73.5 | 71.6 |
| 21 | 74.0 | 71.7 |
Q8 accuracy (%) of our method and baseline methods; the best performance is marked in bold
| Methods | CB513 | CASP10 | CASP11 |
|---|---|---|---|
| SSpro8 | 63.5 | 64.9 | 65.6 |
| CNF | 64.9 | 64.8 | 65.1 |
| DeepCNF | 68.3 | 71.8 | 72.3 |
| CBRNN | 70.2 | 74.5 | 72.5 |
| DeepACLSTM | | | |
Q8 accuracy (%) of our method and baseline methods; the best performance is marked in bold
| Methods | CB6133 | CB513 |
|---|---|---|
| GSN | 72.1 | 66.4 |
| DCRNN | 73.2 | 69.4 |
| CNNH | 74.0 | 70.3 |
| DeepACLSTM | | |
Fig. 3 The performance of DeepACLSTM with different D1 rates
Fig. 4 The performance of DeepACLSTM with different D2 rates
Q8 accuracy (%) of our method under different dropout settings
| Dropout Setting | CB513 | CASP10 | CASP11 |
|---|---|---|---|
| YD1-YD2 | | | |
| YD1-ND2 | 68.5 | 72.3 | 70.3 |
| ND1-YD2 | 69.1 | 73.3 | 71.1 |
| ND1-ND2 | 69.2 | 73.7 | 71.0 |
Fig. 5 Internal architecture of the LSTM cell
Fig. 6 Architecture of stacked BLSTM neural networks
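For reference alongside Fig. 5, a commonly used formulation of the LSTM cell is given below. This is the standard textbook form and is an assumption here: the exact variant used in DeepACLSTM (for example, with or without peephole connections) is not specified in this section. The symbols $x_t$ (input), $h_t$ (hidden state), $c_t$ (cell state), $W$, $U$, and $b$ are notation introduced for this sketch; $\sigma$ is the logistic sigmoid and $\odot$ is element-wise multiplication.

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

In a bidirectional LSTM (BLSTM) layer such as those in Fig. 6, one such cell processes the sequence from the first residue to the last and a second, independently parameterized cell processes it in the reverse direction, so that each position receives context from both sides.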