Wei-Cheng Lo, Li-Fen Wang, Yen-Yi Liu, Tian Dai, Jenn-Kang Hwang, Ping-Chiang Lyu.
Abstract
Circular permutation (CP) is a protein structural rearrangement phenomenon through which nature allows structural homologs to have different locations of termini and thus varied activities, stabilities and functional properties. It can be applied in many fields of protein research and bioengineering. The limitations of applying CP lie in its technical complexity, its high cost and the uncertain viability of the resulting protein variants. Not every position in a protein can be used to create a viable circular permutant, yet practical computational tools for evaluating the positional feasibility of CP before costly experiments are carried out are still lacking. We have previously designed a comprehensive method for predicting viable CP cleavage sites in proteins. In this work, we implement that method as an efficient and user-friendly web server named CPred (CP site predictor), which is intended to promote fundamental research and biotechnological applications of CP. CPred is accessible at http://sarst.life.nthu.edu.tw/CPred.
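To make the notion of a CP cleavage site concrete, the following is a minimal sketch of how the sequence of a circular permutant can be derived from a parent sequence. The function name `circular_permutant`, the 1-based `new_n_terminus` convention and the optional `linker` argument are illustrative assumptions, not part of CPred or its published method.

```python
def circular_permutant(sequence: str, new_n_terminus: int, linker: str = "") -> str:
    """Build the sequence of a circular permutant (illustrative helper).

    new_n_terminus is a 1-based residue number (an assumed convention):
    that residue becomes the first residue of the permutant. The original
    C- and N-termini are joined, optionally through a linker peptide.
    """
    if not 1 <= new_n_terminus <= len(sequence):
        raise ValueError("new_n_terminus must be between 1 and len(sequence)")
    if new_n_terminus == 1:
        return sequence  # no rearrangement: the termini are unchanged
    cut = new_n_terminus - 1  # convert to a 0-based slice index
    return sequence[cut:] + linker + sequence[:cut]


if __name__ == "__main__":
    toy = "ABCDEFGH"  # placeholder letters, not a real protein
    print(circular_permutant(toy, 4))               # DEFGHABC
    print(circular_permutant(toy, 4, linker="GG"))  # DEFGHGGABC
```

The sketch only rearranges the sequence; whether a given cleavage position yields a viable, folded protein is exactly the question a predictor such as CPred is meant to address.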
Year: 2012 PMID: 22693212 PMCID: PMC3394280 DOI: 10.1093/nar/gks529
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The flowchart and output of CPred. (a) CPred is a web server for predicting viable circular permutation cleavage sites, built on distributed computation techniques. After receiving the query protein data, the main program of CPred extracts feature values, executes machine learning subroutines, integrates the prediction results and delivers the final results to the output interface. The computational loads of many steps are distributed across several processors, as indicated by the radial arrow lines. Some structural features and machine learning methods require much more computational power than others; the subroutines responsible for them, represented by multicelled boxes, also use distributed techniques. (b) The output interface of CPred provides a list (lower right) and a graphic profile (lower left) of the probability scores of all residues in the input protein. The structure, along with the predicted viable CP sites, is presented as an interactive Jmol (33) object (upper left), which allows the user to change the display mode (cartoon, spacefilled, etc.) and to rotate, resize and dissect the structure. A downloadable text version of the CPred results is provided as well (upper right). The structures shown in panels (a) and (b) were rendered using PyMol (45) and Jmol (33), respectively.
Performance of CP viability prediction of CPred
| Data set | Performance measure | Closeness | CPred |
|---|---|---|---|
| Training set (Data set T + DHFR data set) | AUC | 0.753 | 0.940 |
| | Sensitivity | 0.741 | 0.889 |
| | Specificity | 0.687 | 0.898 |
| | Matthews correlation coefficient | 0.428 | 0.787 |
| nrCPDB-40 | Sensitivity | 0.622 | 0.746 |
| nrGIS-40 | Sensitivity | 0.614 | 0.719 |
a These results were obtained with 10-fold cross-validation.
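For readers less familiar with the measures in this table, sensitivity, specificity and the Matthews correlation coefficient (MCC) can all be computed from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the CPred data sets. AUC is not covered because it requires the full ranking of scores rather than a single confusion matrix.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification measures from confusion-matrix counts.

    Assumes at least one actual positive and one actual negative, so the
    sensitivity and specificity denominators are non-zero.
    """
    sensitivity = tp / (tp + fn)  # fraction of viable sites recovered
    specificity = tn / (tn + fp)  # fraction of non-viable sites rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    print(binary_metrics(tp=8, fp=1, tn=9, fn=2))
```

Unlike sensitivity and specificity, MCC combines all four counts into one value between -1 and 1, which is why it is often reported alongside them when the classes are imbalanced.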
Performance of CPred at various decision thresholds of the probability score
| Probability score | PPF | Recall | Precision |
|---|---|---|---|
| ≥0.90 | 0.06 | 0.13 | 1.00 |
| ≥0.85 | 0.16 | 0.33 | 1.00 |
| ≥0.80 | 0.26 | 0.52 | 0.99 |
| ≥0.75 | 0.33 | 0.66 | 0.96 |
| ≥0.70 | 0.39 | 0.74 | 0.92 |
| ≥0.65 | 0.43 | 0.81 | 0.92 |
| ≥0.60 | 0.49 | 0.88 | 0.87 |
| ≥0.50 | 0.54 | 0.92 | 0.82 |
| ≥0.40 | 0.61 | 0.97 | 0.77 |
| ≥0.30 | 0.69 | 1.00 | 0.70 |
| ≥0.20 | 0.78 | 1.00 | 0.62 |
| ≥0.10 | 0.90 | 1.00 | 0.54 |
| ≥0.00 | 1.00 | 1.00 | 0.48 |
a PPF: predicted positive fraction, i.e. the proportion of residues predicted as viable CP sites among all residues in the data set.
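The three columns of this table are linked: at any threshold, recall = precision × PPF / (fraction of truly viable sites in the data set), and that last fraction equals the precision at the ≥0.00 threshold (0.48 here). For example, at ≥0.50 this gives 0.82 × 0.54 / 0.48 ≈ 0.92, matching the tabulated recall. The sketch below shows how such a row could be computed from per-residue scores and labels; the function name `threshold_stats` and the toy data are hypothetical and do not reproduce the CPred data set.

```python
def threshold_stats(scores, labels, threshold):
    """Return (PPF, recall, precision) for a 'score >= threshold' decision rule.

    scores: per-residue probability scores.
    labels: 1 for a truly viable CP site, 0 otherwise (same order as scores).
    """
    predicted = [s >= threshold for s in scores]
    n_pred = sum(predicted)   # residues predicted as viable
    n_pos = sum(labels)       # residues that are truly viable
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    ppf = n_pred / len(scores)
    recall = tp / n_pos if n_pos else float("nan")
    precision = tp / n_pred if n_pred else float("nan")
    return ppf, recall, precision


if __name__ == "__main__":
    # Toy values for illustration only.
    scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.20]
    labels = [1, 1, 0, 1, 0, 0]
    for t in (0.9, 0.5, 0.0):
        ppf, recall, precision = threshold_stats(scores, labels, t)
        print(f">= {t:.2f}: PPF={ppf:.2f} recall={recall:.2f} precision={precision:.2f}")
```

Raising the threshold trades recall for precision, so a user who can afford only a few experiments would favor a high threshold, while one screening broadly would accept a lower one.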