Lixia Sun, Xiuzhen Hu, Shaobo Li, Zhuo Jiang, Kun Li.
Abstract
Prediction of complex super-secondary structures is a key step in the study of protein tertiary structure. The strand-loop-helix-loop-strand (βαβ) motif is an important complex super-secondary structure in proteins, and many functional and active sites occur in polypeptides of βαβ motifs. Accurate prediction of βαβ motifs is therefore important for recognizing protein tertiary structure and for studying protein function. In this study, a βαβ motif dataset was first constructed using the DSSP package, and a statistical analysis was then performed on βαβ and non-βαβ motifs. Target motifs were selected in which the length of the loop-α-loop segment varies from 10 to 26 amino acids, and the ideal fixed-length pattern comprised 32 amino acids. A Support Vector Machine algorithm was developed for predicting βαβ motifs, using sequence information together with predicted structure and function information to express the sequence features. The overall predictive accuracies of the 5-fold cross-validation and the independent test were 81.7% and 76.7%, respectively, and the corresponding Matthews correlation coefficients were 0.63 and 0.53. The results demonstrate that the proposed method is an effective approach for predicting βαβ motifs and can be used for structure and function studies of proteins.
Keywords: Combined features; Complex super-secondary structure; Structure prediction; Support Vector Machine
Year: 2015 PMID: 26858540 PMCID: PMC4705255 DOI: 10.1016/j.sjbs.2015.10.005
Source DB: PubMed Journal: Saudi J Biol Sci ISSN: 1319-562X Impact factor: 4.219
Figure 1. The distribution of the length of the loop-α-loop motif. The solid line represents βαβ motifs, and the dotted line represents non-βαβ motifs.
Lengths of the patterns in the datasets (in amino acids).
| Motifs | βαβ | Non-βαβ |
|---|---|---|
| Longest length | 65 | 66 |
| Shortest length | 15 | 12 |
| Average length | 31.6 | 29.3 |
Figure 2. Five examples of fixed-length patterns. Note: the emphasized sites are underlined; the symbol “∗” marks flanking residues that were appended at both ends of the peptide.
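The construction of a fixed-length pattern can be illustrated with a minimal sketch. It assumes that a motif shorter than the 32-residue target is extended symmetrically with flanking residues taken from the same chain, and that a placeholder character fills any positions beyond the chain ends. The function name `fixed_length_pattern`, the placeholder `X`, and the symmetric-extension rule are assumptions made for illustration; the record does not state the exact procedure, and motifs longer than the target are not handled here.

```python
def fixed_length_pattern(chain: str, start: int, end: int,
                         length: int = 32, pad: str = "X") -> str:
    """Return a window of `length` residues centred on chain[start:end].

    Hypothetical helper: flanking residues are taken from both sides of the
    motif, and `pad` fills positions that fall outside the chain.
    """
    if not (0 <= start < end <= len(chain)):
        raise ValueError("start/end must delimit a non-empty segment of the chain")
    seg_len = end - start
    if seg_len > length:
        raise ValueError("segment is longer than the target pattern length")

    extra = length - seg_len          # residues that must be appended in total
    left = extra // 2                 # residues appended before the motif
    right = extra - left              # residues appended after the motif
    win_start, win_end = start - left, end + right

    prefix = pad * max(0, -win_start)             # padding before the chain start
    suffix = pad * max(0, win_end - len(chain))   # padding after the chain end
    core = chain[max(0, win_start):min(len(chain), win_end)]
    return prefix + core + suffix


if __name__ == "__main__":
    demo_chain = "ACDEFGHIKLMNPQRSTVWY" * 3   # made-up 60-residue sequence
    print(fixed_length_pattern(demo_chain, 20, 40))
```

In this sketch a 20-residue segment receives six flanking residues on each side, so every pattern passed to the classifier has the same length.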
Figure 3. Sample of position conservation at the aligned twenty-eighth position. Note: βαβ motifs are shown in subfigure (A) and non-βαβ motifs are shown in subfigure (B). The overall height of the stack indicates position conservation, while the height of symbols within the stack indicates the relative frequency of each amino acid at that position.
Predictive results using the 5-fold cross-validation test.
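The relation between stack height and symbol height described in the caption can be sketched as follows. This assumes the commonly used sequence-logo definition, in which the stack height at a position is its information content, log2(20) minus the Shannon entropy of the observed residue frequencies, and each symbol's height is its frequency times that total. The record does not say which tool or correction the authors used, so the function name `logo_heights` and the omission of a small-sample correction are assumptions.

```python
import math
from collections import Counter


def logo_heights(column: str, alphabet_size: int = 20) -> dict[str, float]:
    """Return the height in bits of each residue symbol at one aligned position.

    `column` holds the residues observed at that position, one per sequence.
    Assumed definition: stack height = log2(alphabet_size) - Shannon entropy,
    with no small-sample correction.
    """
    if not column:
        raise ValueError("column must contain at least one residue")
    n = len(column)
    freqs = {aa: count / n for aa, count in Counter(column).items()}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    information = math.log2(alphabet_size) - entropy   # overall stack height
    return {aa: f * information for aa, f in freqs.items()}


if __name__ == "__main__":
    heights = logo_heights("GGGGAGGV")   # made-up column of eight residues
    for aa, h in sorted(heights.items(), key=lambda kv: -kv[1]):
        print(f"{aa}: {h:.2f} bits")
```

A fully conserved position yields a single symbol of height log2(20), about 4.32 bits, while a position with many equally frequent residues yields a short stack.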
| | | | | MCC | Accuracy (%) |
|---|---|---|---|---|---|
| 74.1 | 52.9 | 66.9 | 61.5 | 0.28 | 64.8 |
| 80.6 | 75.0 | 80.5 | 75.1 | 0.55 | 78.1 | |
| 83.5 | 74.8 | 81.0 | 78.0 | 0.59 | 79.7 | |
| 85.4 | 77.0 | 82.6 | 80.4 | 0.63 | 81.7 |
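The 5-fold cross-validation protocol behind these results can be sketched in a few lines. The sketch only shows how a dataset is partitioned so that every sample is used for testing exactly once; the function name `k_fold_indices`, the shuffling seed, and the round-robin fold assignment are assumptions, and the record does not describe how the authors formed their folds.

```python
import random


def k_fold_indices(n_samples: int, k: int = 5,
                   seed: int = 0) -> list[tuple[list[int], list[int]]]:
    """Split range(n_samples) into k (train, test) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k near-equal folds
    splits = []
    for i, test in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    for train, test in k_fold_indices(23):
        print(len(train), len(test))
```

A classifier would be trained on each `train` list and scored on the matching `test` list, with the five scores combined into the reported figures.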
Predictive results using the independent test.
| | | | | MCC | Accuracy |
|---|---|---|---|---|---|
| 81.6% | 70.8% | 76.8% | 76.5% | 0.53 | 76.7% |
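The overall accuracy and the Matthews correlation coefficient (MCC) reported above are both derived from the four confusion-matrix counts of a binary classifier. The sketch below uses the standard definitions; the counts in the usage example are made up for illustration and are not taken from the study.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts: 100 positive and 100 negative patterns.
    tp, tn, fp, fn = 80, 70, 30, 20
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

MCC ranges from -1 to 1 and stays informative when the two classes differ in size, which is why it is commonly reported alongside accuracy.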
Figure 4. Example of βαβ motif prediction (from the protein chain 1EDZ).