Wang Ding, Jiang Xie, Dongbo Dai, Huiran Zhang, Hao Xie, Wu Zhang.
Abstract
BACKGROUND: Despite continuing progress in X-ray crystallography and high-field NMR spectroscopy for determining three-dimensional protein structures, the number of unsolved and newly discovered sequences grows much faster than the number of determined structures. With advances in computational science, protein modeling methods may help bridge this large sequence-structure gap. A grand challenge is to predict the three-dimensional structure of a protein from its primary structure (residue sequence) alone. Predicting residue contact maps is a crucial and promising intermediate step towards final three-dimensional structure prediction. Better predictions of local and non-local contacts between residues can transform protein sequence alignment into structure alignment, which can in turn greatly improve template-based three-dimensional protein structure predictors.
Year: 2013 PMID: 23626696 PMCID: PMC3634008 DOI: 10.1371/journal.pone.0061533
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Performance of sub-networks and final cascade-network.
| Network | Separation | Seq-Len | THR | Chains | Acc | Err (Acc) | Cov | Err (Cov) |
| Sub-network 1 | 6 | 51–70 | 0.1 | 30 | 46.43 | 9.24 | 41.91 | 4.89 |
| Sub-network 2 | 7 | 71–90 | 0.6 | 40 | 44.35 | 9.89 | 36.43 | 7.64 |
| Sub-network 3 | 10 | 91–130 | 0.7 | 199 | 43.99 | 8.05 | 36.33 | 4.59 |
| Sub-network 4 | 13 | 131–190 | 0.7 | 246 | 41.38 | 8.39 | 33.95 | 5.32 |
| Sub-network 5 | 17 | 191–290 | 0.8 | 201 | 28.31 | 7.72 | 36.57 | 2.64 |
| Sub-network 6 | 21 | 291–450 | 0.9 | 87 | 31.81 | 9.90 | 34.47 | 2.53 |
| Average | | | | | | 8.87 | | 4.60 |
| CNNcon | | 51–450 | | 803 | | 8.07 | | 4.52 |
Separation: sequence separation. If the value is s, only contacts between residue pairs at least s residues apart are considered, that is, $|i - j| \geq s$.
Seq-Len: length range of the protein sequences in the training and test data sets of the corresponding sub-network.
THR: minimal prediction value above which a residue pair is predicted to be in contact.
Chains: size of the test data set for each sub-network.
Acc: prediction accuracy (%), defined in equation (1); Cov: coverage (%), defined in equation (2).
Err: standard error.
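The separation filter, the threshold, and the two scores described in these footnotes can be sketched in a few lines. This is only an illustration: equations (1) and (2) are not reproduced in this record, so the code assumes the definitions commonly used in contact prediction (accuracy as the share of predicted contacts that are true contacts, coverage as the share of true contacts that are predicted). The function name `score_contacts`, the input layout, and the use of `>=` for the threshold are hypothetical choices, and the toy values are made up.

```python
def score_contacts(pred, true_contacts, separation, threshold):
    """Return (accuracy %, coverage %) for one protein.

    pred          -- dict mapping residue pairs (i, j) to a predicted value
    true_contacts -- set of residue pairs (i, j) that are in contact
    separation    -- only pairs with |i - j| >= separation are considered
    threshold     -- pairs with predicted value >= threshold count as contacts
                     (assumed reading of "minimal prediction value")
    """
    # Predicted contacts that pass both the separation and threshold filters.
    predicted = {
        (i, j)
        for (i, j), value in pred.items()
        if abs(i - j) >= separation and value >= threshold
    }
    # True contacts restricted to the same sequence separation.
    observed = {(i, j) for (i, j) in true_contacts if abs(i - j) >= separation}

    true_positives = len(predicted & observed)
    # Assumed definitions: Acc = TP / predicted, Cov = TP / observed.
    accuracy = 100.0 * true_positives / len(predicted) if predicted else 0.0
    coverage = 100.0 * true_positives / len(observed) if observed else 0.0
    return accuracy, coverage


# Toy input (made-up values, not data from the paper).
pred = {(1, 8): 0.9, (2, 10): 0.8, (3, 12): 0.3, (4, 6): 0.95}
true_contacts = {(1, 8), (3, 12), (4, 6)}
acc, cov = score_contacts(pred, true_contacts, separation=6, threshold=0.7)
print(acc, cov)
```

In the toy input, the pair (4, 6) is dropped by the separation filter even though its predicted value is high, which shows why the Separation and THR columns both affect the reported scores.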
Table 2. Comparison results with other current methods.
| Predictor | Acc | Cov | Targets | Method |
| CNNcon | 57.86 | 34.28 | 803 | Neural network based; Using optimized thresholds. |
| NNcon | 54.50 | 35.00 | 116 | Neural network based; Top |
| PROFcon | 32.40 | 19.60 | 633 | Neural network based; Top |
| SVMcon | 37.00 | 21.00 | 48 | Support vector machine based; Top |
Acc, Cov: as in Table 1.
Targets: size of the test data sets.
CNNcon: this work.
NNcon, PROFcon, SVMcon: results are summarized from previous works [16]–[18], respectively.
Table 3. Comparison results on 64 CASP10 targets.
| Predictor | Acc | Err (Acc) | Cov | Err (Cov) |
| CNNcon | 55.48 | 17.13 | 36.89 | 4.79 |
| NNcon | 46.39 | 11.79 | 31.70 | 9.49 |
| PROFcon | 39.90 | 7.02 | 25.55 | 9.87 |
| SVMcon | 38.15 | 9.02 | 25.62 | 10.93 |
Acc, Cov, Err: as in Table 1.
Figure 1. Prediction results for all test proteins by their corresponding sub-networks.
The X axis is the length range of the tested proteins and the Y axis is prediction accuracy (%). Each point represents the prediction accuracy for one protein, as predicted by the sub-network covering its length range. The average accuracy is 34.01%; however, accuracy decreases as protein length increases.
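Assigning a protein to the sub-network for its length range amounts to a lookup in Table 1. The sketch below copies the length ranges, sequence separations, and thresholds from that table; the names `SUB_NETWORKS` and `select_sub_network` are hypothetical, and the code shows only the length-based assignment, not the networks themselves or the cascade stage.

```python
# (min length, max length, sequence separation, threshold), copied from Table 1.
SUB_NETWORKS = [
    (51, 70, 6, 0.1),
    (71, 90, 7, 0.6),
    (91, 130, 10, 0.7),
    (131, 190, 13, 0.7),
    (191, 290, 17, 0.8),
    (291, 450, 21, 0.9),
]


def select_sub_network(seq_len):
    """Return (sub-network number, separation, threshold) for a sequence length."""
    for number, (lo, hi, separation, threshold) in enumerate(SUB_NETWORKS, start=1):
        if lo <= seq_len <= hi:
            return number, separation, threshold
    # Lengths outside 51-450 are not covered by any row of Table 1.
    raise ValueError(f"sequence length {seq_len} is outside the 51-450 range")


print(select_sub_network(100))
```

A protein of 100 residues falls in the 91–130 range, so it maps to sub-network 3 with a separation of 10 and a threshold of 0.7.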
Figure 2. Prediction results for all test proteins by the final cascade-network.
The X and Y axes are the same as in Figure 1. Each point represents the prediction accuracy for one protein by the final cascade-network. The average accuracy is 57.86%. Moreover, accuracy remains steady as protein length increases.
Figure 3. Architectures of the sub-neural network (left) and the cascade-neural network (right).
Because all six sub-networks share the same architecture, only one of them is shown (left).