Jianzhu Ma, Sheng Wang.
Abstract
MOTIVATION: The solvent accessibility of protein residues is one of the driving forces of protein folding, while the contact number of protein residues limits the possible protein conformations. De novo prediction of these properties from protein sequence is important for the study of protein structure and function. Although the two properties are clearly related to each other, exploiting this dependency for prediction is challenging.
Year: 2015 PMID: 26339631 PMCID: PMC4538422 DOI: 10.1155/2015/678764
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 2. Log-odds ratio between the pair frequencies in the structure alignments and the background frequencies, with respect to the relative solvent accessibility in units of 1%. The thick black line indicates the boundaries at 10% and 40% that define the 3-label solvent accessibility: buried (B), intermediate (I), and exposed (E).
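The 10% and 40% boundaries in the caption translate directly into a labeling rule. The following is a minimal sketch of that rule and of a log-odds ratio; the function names are hypothetical, and two details are assumptions because the caption does not state them: which side of each boundary is inclusive, and the base of the logarithm (natural log is used here).

```python
import math


def rsa_to_label(rsa_percent: float) -> str:
    """Map relative solvent accessibility (in percent) to B, I, or E.

    Assumption: values below 10% are buried, values from 10% up to (but
    not including) 40% are intermediate, and the rest are exposed.
    """
    if rsa_percent < 10.0:
        return "B"
    if rsa_percent < 40.0:
        return "I"
    return "E"


def log_odds(pair_freq: float, background_freq: float) -> float:
    """Log-odds ratio of an observed pair frequency against a background.

    Assumption: natural logarithm; both frequencies must be positive.
    """
    if pair_freq <= 0.0 or background_freq <= 0.0:
        raise ValueError("frequencies must be positive")
    return math.log(pair_freq / background_freq)


labels = [rsa_to_label(v) for v in (3.0, 25.0, 72.0)]
```

A positive log-odds value means the pair occurs in the alignments more often than the background would suggest; a value of zero means no enrichment.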
Figure 1. The shared-weight multitask learning framework under the CNF (conditional neural fields) model for 3-state solvent accessibility and 15-state contact number prediction. A CNF models the relationship between input features X and labels Y through a hidden layer of neuron nodes, which apply a nonlinear transformation to X. Note that the weight W from the input features to the hidden neuron nodes is shared by all tasks, while the weight U from neuron to label and the weight T from label to label are task-specific.
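The weight sharing described in the caption can be illustrated with a small NumPy sketch. It is not the authors' implementation: the sigmoid gate, the dimensions, the random weights, and the function names are all assumptions made for illustration, and training and decoding are omitted. It only shows that the hidden layer (W) is computed once and reused, while each task scores label sequences with its own U and T.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def hidden_activations(X, W):
    """Shared nonlinear transform. X: (length, d), W: (d, G) -> (length, G)."""
    return sigmoid(X @ W)


def sequence_score(H, U, T, labels):
    """Unnormalized score of one label sequence for one task.

    H: (length, G) hidden activations, U: (G, K) neuron-to-label weights,
    T: (K, K) label-to-label weights, labels: integer array of size length.
    """
    labels = np.asarray(labels)
    node = H @ U  # (length, K) per-position label scores
    score = node[np.arange(len(labels)), labels].sum()
    score += T[labels[:-1], labels[1:]].sum()  # adjacent-label transitions
    return float(score)


rng = np.random.default_rng(0)
length, d, G = 6, 4, 5                    # hypothetical sizes
X = rng.normal(size=(length, d))          # input features
W = rng.normal(size=(d, G))               # shared by both tasks
U_acc, T_acc = rng.normal(size=(G, 3)), rng.normal(size=(3, 3))     # 3-state task
U_cn, T_cn = rng.normal(size=(G, 15)), rng.normal(size=(15, 15))    # 15-state task

H = hidden_activations(X, W)              # computed once, used by both tasks
acc_score = sequence_score(H, U_acc, T_acc, [0, 1, 2, 2, 1, 0])
cn_score = sequence_score(H, U_cn, T_cn, [3, 4, 5, 5, 4, 3])
```

Because both tasks read the same H, a gradient step on either task's loss would update W, which is how information can flow between the two predictions in a shared-weight setup.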
Precision, recall, and F1 score on different evaluation datasets for 3-state solvent accessibility prediction.
| Evaluation dataset | Precision | Recall | F1 score |
|---|---|---|---|
| †Buried overall | 0.76 | 0.78 | 0.77 |
| ‡Buried >0.9 | 0.96 | 0.31 | 0.47 |
| Buried >0.8 | 0.92 | 0.45 | 0.60 |
| Buried >0.7 | 0.88 | 0.57 | 0.69 |
| Buried >0.6 | 0.84 | 0.66 | 0.74 |
| Buried >0.5 | 0.79 | 0.74 | 0.76 |
| Buried >0.4 | 0.75 | 0.82 | 0.78 |
| Intermediate overall | 0.56 | 0.50 | 0.53 |
| Intermediate >0.9 | 1.00 | 0.0001 | 0.002 |
| Intermediate >0.8 | 0.82 | 0.006 | 0.01 |
| Intermediate >0.7 | 0.74 | 0.06 | 0.11 |
| Intermediate >0.6 | 0.67 | 0.19 | 0.30 |
| Intermediate >0.5 | 0.61 | 0.38 | 0.47 |
| Intermediate >0.4 | 0.55 | 0.61 | 0.58 |
| Exposed overall | 0.71 | 0.76 | 0.73 |
| Exposed >0.9 | 0.94 | 0.11 | 0.20 |
| Exposed >0.8 | 0.88 | 0.31 | 0.46 |
| Exposed >0.7 | 0.83 | 0.47 | 0.60 |
| Exposed >0.6 | 0.78 | 0.61 | 0.68 |
| Exposed >0.5 | 0.74 | 0.72 | 0.73 |
| Exposed >0.4 | 0.69 | 0.81 | 0.75 |
†Overall indicates the whole set of predicted labels.
‡>0.9 indicates the subset of predicted labels whose predicted probability is larger than 0.9.
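The thresholded rows can be read as follows: only predictions of the target label whose probability exceeds the cutoff are kept, and precision, recall, and F1 are computed on that subset. A minimal sketch is given here; the function name and toy data are hypothetical, and it assumes that recall is measured against all residues truly carrying the target label, which is consistent with recall falling as the cutoff rises.

```python
def prf_at_threshold(true_labels, pred_labels, pred_probs, target, threshold=0.0):
    """Precision, recall, and F1 for one label at a probability cutoff.

    A prediction is kept only if it names `target` and its probability is
    strictly greater than `threshold`. Assumption: the recall denominator
    is the number of residues whose true label is `target`.
    """
    selected = [
        i
        for i, (label, prob) in enumerate(zip(pred_labels, pred_probs))
        if label == target and prob > threshold
    ]
    true_positives = sum(1 for i in selected if true_labels[i] == target)
    n_true = sum(1 for label in true_labels if label == target)

    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / n_true if n_true else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data, not taken from the paper.
true_labels = ["B", "B", "B", "I", "E"]
pred_labels = ["B", "B", "I", "B", "E"]
pred_probs = [0.95, 0.60, 0.50, 0.92, 0.80]

overall = prf_at_threshold(true_labels, pred_labels, pred_probs, "B")
confident = prf_at_threshold(true_labels, pred_labels, pred_probs, "B", 0.9)
```

F1 is the harmonic mean of precision and recall, so the first table row can be checked by hand: 2 × 0.76 × 0.78 / (0.76 + 0.78) ≈ 0.77.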
Prediction accuracy of different feature classes and learning models for 3-state solvent accessibility.
| Features | Evolution | Structure | Amino acid | †Combined single | ‡Combined MTL |
|---|---|---|---|---|---|
| Q3 accuracy | 0.64 | 0.59 | 0.55 | 0.66 | |
†Combined single indicates that all classes of features, including evolution, structure, and amino acid, are used for training a single task model.
‡Combined MTL indicates that all classes of features are used for training a multitask learning model.
Prediction accuracy of different feature classes and learning models for 15-state contact number (with the same explanation as in Table 2).
| Features | Evolution | Structure | Amino acid | Combined single | Combined MTL |
|---|---|---|---|---|---|
| Q15 accuracy | 0.26 | 0.24 | 0.19 | 0.28 | |
Prediction accuracy at different tolerance values for 15-state contact number.
| Tolerance | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Accuracy | 0.30 | 0.63 | 0.83 | 0.93 |
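One plausible reading of "tolerance" is that a prediction counts as correct when the predicted contact number differs from the true one by at most the tolerance value. The sketch below implements that reading; the definition, the function name, and the toy data are assumptions, since this excerpt does not spell the metric out.

```python
def accuracy_with_tolerance(true_states, pred_states, tolerance=0):
    """Fraction of positions where |predicted - true| <= tolerance.

    Assumption: states are integer contact-number classes, so the absolute
    difference between two states is meaningful.
    """
    if len(true_states) != len(pred_states):
        raise ValueError("sequences must have the same length")
    if not true_states:
        raise ValueError("sequences must not be empty")
    hits = sum(
        1 for t, p in zip(true_states, pred_states) if abs(t - p) <= tolerance
    )
    return hits / len(true_states)


# Toy data, not taken from the paper.
true_states = [3, 5, 7, 10]
pred_states = [3, 6, 9, 14]
by_tolerance = {k: accuracy_with_tolerance(true_states, pred_states, k) for k in range(5)}
```

Under this definition, tolerance 0 reduces to exact-match accuracy, and accuracy can only stay the same or rise as the tolerance grows, which matches the monotone trend in the table.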
Comparison of the prediction accuracy of AcconPred with existing programs for 3-state solvent accessibility on the CASP11 dataset.
| Method | SPINE-X | SANN | ACCpro5 | AcconPred |
|---|---|---|---|---|
| Q3 accuracy | 0.57 | 0.61 | 0.58 | |
Comparison of the Pearson correlation score of AcconPred with existing programs for contact number prediction on the Yuan dataset.
| Method | Kinjo | Yuan | AcconPred |
|---|---|---|---|
| Correlation | 0.63 | 0.64 | |