Wenbo Wang, Zhaoyu Li, Junlin Wang, Dong Xu, Yi Shang.
Abstract
This paper presents PSICA (Protein Structural Information Conformity Analysis), a new, fast and accurate web service for protein model quality analysis. It is designed to evaluate how well a tertiary model of a given protein primary sequence conforms to the known structures of proteins with similar sequences, and to evaluate the quality of predicted protein models. PSICA implements the MUfoldQA_S method, an efficient state-of-the-art protein model quality assessment (QA) method. In CASP12, MUfoldQA_S ranked No. 1 in the protein model QA select-20 category in terms of the difference between the predicted and true GDT-TS value of each model. For a given predicted 3D model, PSICA generates (i) a predicted global GDT-TS value; (ii) an interactive comparison between the model and other known protein structures; (iii) a visualization of the predicted local quality of the model; and (iv) a JSmol rendering of the model. Additionally, PSICA implements MUfoldQA_C, a new consensus method based on MUfoldQA_S. In CASP12, MUfoldQA_C ranked No. 1 in top 1 model GDT-TS loss in the select-20 QA category and No. 2 in the average difference between the predicted and true GDT-TS value of each model in both the select-20 and best-150 QA categories. The PSICA server is freely available at http://qas.wangwb.com/~wwr34/mufoldqa/index.html.
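The abstract names "top 1 model GDT-TS loss" without defining it. The sketch below assumes the usual CASP convention: for one target, the loss is the true GDT-TS of the best model in the pool minus the true GDT-TS of the model that the QA method scores highest. The function name and the example numbers are hypothetical and are not taken from PSICA.

```python
def top1_gdt_ts_loss(models):
    """Top-1 loss for one target.

    models: list of (predicted_gdt_ts, true_gdt_ts) pairs, one per model.
    Assumed definition: best true score in the pool minus the true score
    of the model with the highest predicted score.
    """
    if not models:
        raise ValueError("need at least one model")
    # Model the QA method would select: the one with the highest prediction.
    selected = max(models, key=lambda m: m[0])
    # Best model actually present in the pool: the highest true score.
    best_true = max(true for _, true in models)
    return best_true - selected[1]


# Hypothetical pool of three models for a single target.
pool = [(62.0, 55.0), (58.0, 61.0), (40.0, 38.0)]
loss = top1_gdt_ts_loss(pool)  # selected model has true 55.0, best is 61.0
```

A loss of zero means the QA method picked the best available model; averaging this value over targets gives a per-method figure.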
Year: 2019 PMID: 31127307 PMCID: PMC6602450 DOI: 10.1093/nar/gkz402
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Example of a task receipt file in .txt format.
Figure 2. Example of the job status web page.
Figure 3. Example of the results summary page.
Figure 4. Some of the interactive visualizations in the detailed report page for each of the predicted models. (A) JSmol View of the Decoy; (B) Comparing Decoy with Different Templates; (C) Visualization of Local Quality of the Decoy (range 0–1; higher is better); (D) Decoy Distance Matrix.
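Panel (D) shows a distance matrix of the decoy. The source does not say how PSICA builds it. As a minimal sketch, assuming the matrix holds pairwise Euclidean distances between per-residue coordinates (for example Cα atoms), it could be computed as follows. The function name and the coordinates are illustrative only.

```python
import math


def distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    coords: list of (x, y, z) tuples, one per residue (assumed C-alpha atoms).
    """
    n = len(coords)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            # Fill both halves so the matrix stays symmetric.
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Three hypothetical residue positions, in angstroms.
ca_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
dm = distance_matrix(ca_coords)
```

The diagonal is zero by construction, and plotting `dm` as a heat map gives the familiar contact-map style view.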
Difference between the predicted and true GDT-TS values of models, averaged over all targets
| CASP12 Select-20 | | | | | CASP12 Best-150 | | | | |
|---|---|---|---|---|---|---|---|---|---|
| GR | CR | Group Name | GN | AD | GR | CR | Group Name | GN | AD |
| 1 | 1 | *MUfoldQA_S* | 334 | 3.602 | 1 | 1 | | 237 | 5.173 |
| 2 | 1 | **MUfoldQA_C** | 318 | 3.818 | 2 | 2 | **MUfoldQA_C** | 318 | 5.512 |
| 3 | 2 | | 034 | 5.615 | 3 | 1 | | 360 | 6.748 |
| 4 | 3 | | 237 | 5.756 | 4 | 3 | | 034 | 6.781 |
| 5 | 2 | | 201 | 5.883 | 5 | 2 | | 201 | 7.087 |
| 6 | 3 | | 360 | 6.697 | 6 | 4 | | 214 | 7.093 |
| 7 | 4 | | 214 | 6.878 | 7 | 5 | | 112 | 8.373 |
| 8 | 1 | Wang4 | 195 | 7.021 | 8 | 6 | | 219 | 8.373 |
| 9 | 5 | | 073 | 7.272 | 9 | 7 | | 223 | 8.373 |
| 10 | 2 | Wang2 | 206 | 8.021 | 10 | 3 | *MUfoldQA_S* | 334 | 8.898 |
| 11 | 3 | ProQ3_1_diso | 095 | 8.155 | 11 | 8 | | 109 | 9.210 |
| 12 | 4 | VoroMQAsr | 093 | 8.275 | 12 | 9 | | 267 | 9.664 |
| 13 | 5 | ProQ3_1 | 302 | 8.449 | 13 | 10 | | 073 | 9.710 |
| 14 | 6 | VoroMQA | 224 | 8.488 | 14 | 4 | | 072 | 9.754 |
| 15 | 6 | | 223 | 8.507 | 15 | 11 | | 411 | 9.839 |
| 16 | 7 | | 219 | 8.507 | 16 | 1 | ProQ3_1 | 302 | 10.155 |
| 17 | 8 | | 109 | 8.507 | 17 | 2 | ProQ3_1_diso | 095 | 10.159 |
| 18 | 9 | | 112 | 8.507 | 18 | 3 | ProQ3 | 213 | 11.418 |
| 19 | 10 | | 411 | 8.560 | 19 | 4 | MULTICOM-CLUSTER | 287 | 11.445 |
| 20 | 11 | | 267 | 9.107 | 20 | 5 | | 120 | 11.608 |
| More groups omitted… | | | | | More groups omitted… | | | | |
GR: Global Ranking, the ranking among all methods. CR: Categorical Ranking, the ranking within the method's QA category (single-model QA, quasi-single-model QA or multi-model QA). GN: Group Number, the identification number assigned to the group by the CASP officials. AD: Average difference between the predicted and true GDT-TS value of each model. Font style indicates the type of QA method: regular for single-model QA methods, italic for quasi-single-model QA methods, and bold for multi-model QA methods. GOAL and COFOLD_QA each submitted no more than five predictions, whereas every other group submitted at least 68; because this would make the comparison unfair, both were removed from the ranking.
Lower GDT-TS differences are better. Results for the top 20 groups are shown.
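The AD column can be illustrated with a short sketch. It assumes that the "difference" is the absolute difference between predicted and true GDT-TS, averaged first over the models of each target and then over all targets. The source states only "average over all targets", so the per-target step and the use of absolute values are assumptions, and the target IDs and scores below are made up.

```python
from statistics import mean


def average_difference(targets):
    """Average |predicted - true| GDT-TS per target, then across targets.

    targets: dict mapping a target ID to a list of
             (predicted_gdt_ts, true_gdt_ts) pairs, one pair per model.
    """
    per_target = []
    for pairs in targets.values():
        if not pairs:
            continue  # skip targets with no scored models
        per_target.append(mean(abs(pred - true) for pred, true in pairs))
    if not per_target:
        raise ValueError("no scored models")
    return mean(per_target)


# Hypothetical predictions for two targets.
example = {
    "T1": [(60.0, 55.0), (40.0, 43.0)],  # per-target mean: (5 + 3) / 2 = 4.0
    "T2": [(70.0, 68.0)],                # per-target mean: 2.0
}
ad = average_difference(example)  # (4.0 + 2.0) / 2
```

Averaging per target first keeps targets with many submitted models from dominating the score, which is one reason to prefer it over pooling all models.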
Difference between the predicted and true GDT-TS value of each model for the CASP13 20-target subset, averaged over all targets
| CASP13 | Select-20 | | Best-150 | |
|---|---|---|---|---|
| Method | AD | NT | AD | NT |
| MUfoldQA_C | 3.309 | 20 | 4.045 | 20 |
| ModFOLD7 | 4.788 | 20 | 5.645 | 20 |
| MUfoldQA_S | 4.970 | 20 | 6.368 | 20 |
| MULTICOM_CLUSTER | 5.041 | 20 | 8.235 | 20 |
| MULTICOM-NOVEL | 5.937 | 20 | 8.310 | 20 |
| ProQ3D | 7.792 | 18 | 9.388 | 20 |
| ProQ3 | 10.117 | 18 | 11.483 | 20 |
| ProQ4 | 16.206 | 20 | 14.262 | 20 |
AD: Average GDT-TS difference between predicted and true values. NT: Number of targets.
Figure 5. Comparison of execution time among MUfoldQA_S, MUfoldQA_C and several other QA methods (ModFOLD6, iFold_2, QASproCL) on the CASP12 best-150 dataset.