Joshua M Toth, Paul J DePietro, Juergen Haas, William A McLaughlin.
Abstract
MOTIVATION: Methods to assess the quality of protein structure models are needed for user applications. To aid with the selection of structure models and further inform the development of structure prediction techniques, we describe the ResiRole method for the assessment of the quality of structure models.
Year: 2021 PMID: 32780798 PMCID: PMC8058773 DOI: 10.1093/bioinformatics/btaa712
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Flow diagram of the data analysis stages for comparing structure prediction techniques regarding their capacities to have functional site predictions like those of the reference structures.
Fig. 2. Comparison of the crystal structure of the group I dockerin domain of hydrolase GDSL with structure models at the location of a calcium binding site. The function is predicted to be centered on an asparagine residue, which is shown in stick representation. The experimental structure is in white, whereas the Robetta and RaptorX models are in yellow and magenta, respectively. The calcium ion from the reference structure is shown in green.
Fig. 3. Scatter plot of all the cumulative probabilities for functional site predictions made in the reference structures versus those for the corresponding sites in the Robetta models based on the SeqFEATURE model EF_HAND_1.5.ASN.OD1. Pearson’s r is 0.1358 for the least-squares regression line. An example of a difference score calculation of 0.74 is shown for the point indicated by the arrow.
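The difference score in the Fig. 3 caption can be illustrated with a minimal sketch. It assumes the score is the absolute difference between the cumulative probability of a functional site prediction in the reference structure and that in the structure model; this definition is not stated in the captions here. The function names and the input values are hypothetical, and the inputs were chosen only so that the result matches the 0.74 in the caption.

```python
from statistics import mean


def difference_score(cp_reference: float, cp_model: float) -> float:
    """Assumed definition: absolute difference of two cumulative probabilities."""
    for value in (cp_reference, cp_model):
        if not 0.0 <= value <= 1.0:
            raise ValueError("cumulative probabilities must lie in [0, 1]")
    return abs(cp_reference - cp_model)


def average_difference_score(pairs):
    """Mean difference score over (reference, model) probability pairs."""
    return mean(difference_score(ref, mod) for ref, mod in pairs)


# Hypothetical inputs, not values read from the figure.
example = difference_score(0.95, 0.21)
print(round(example, 2))  # prints 0.74
```

Under this assumption a lower score means the model preserves the reference prediction more closely, and averaging over all sites gives one value per technique.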
Results of round-robin, head-to-head comparisons using difference scores
| Server | DS | RB | I4 | SM | M4T | NB | HB | I3 | P2 | RX | I2 | RA | PBCL | PR | SX | PB3D | PH3D | PHCL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RB | 0.1528 | | | | | | | | | | | | | | | | | |
| I4 | 0.1718 | 3.E-01 | | | | | | | | | | | | | | | | |
| | | 213 | | | | | | | | | | | | | | | | |
| SM | 0.1721 | *** | *** | | | | | | | | | | | | | | | |
| | | 1509 | 397 | | | | | | | | | | | | | | | |
| M4T | 0.1736 | *** | *** | *** | | | | | | | | | | | | | | |
| | | 659 | 166 | 1135 | | | | | | | | | | | | | | |
| NB | 0.1770 | *** | *** | *** | *** | | | | | | | | | | | | | |
| | | 1107 | 326 | 2247 | 777 | | | | | | | | | | | | | |
| HB | 0.1786 | *** | 8.E-05 | *** | *** | *** | | | | | | | | | | | | |
| | | 993 | 281 | 1834 | 580 | 1386 | | | | | | | | | | | | |
| I3 | 0.1795 | *** | *** | 3.E-01 | *** | *** | *** | | | | | | | | | | | |
| | | 436 | 349 | 699 | 308 | 545 | 491 | | | | | | | | | | | |
| P2 | 0.1799 | *** | *** | *** | *** | *** | *** | *** | | | | | | | | | | |
| | | 1230 | 268 | 2532 | 997 | 1690 | 1304 | 548 | | | | | | | | | | |
| RX | 0.1802 | *** | *** | 6.E-04 | 3.E-03 | *** | *** | 4.E-01 | *** | | | | | | | | | |
| | | 1489 | 402 | 2621 | 978 | 1919 | 1665 | 707 | 2040 | | | | | | | | | |
| I2 | 0.1862 | *** | *** | 6.E-04 | * | *** | *** | 1.E-01 | *** | *** | | | | | | | | |
| | | 948 | 197 | 1230 | 510 | 867 | 744 | 410 | 1021 | 1217 | | | | | | | | |
| RA | 0.1909 | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | | | | | | | |
| | | 642 | 229 | 1284 | 674 | 804 | 642 | 327 | 1156 | 1028 | 552 | | | | | | | |
| PBCL | 0.1920 | *** | *** | *** | 5.E-01 | *** | *** | *** | * | *** | *** | *** | | | | | | |
| | | 307 | 302 | 875 | 178 | 791 | 572 | 361 | 494 | 817 | 219 | 277 | | | | | | |
| PR | 0.1924 | *** | *** | *** | 5.E-01 | *** | *** | *** | * | *** | *** | *** | 4.E-01 | | | | | |
| | | 308 | 303 | 877 | 178 | 791 | 576 | 362 | 492 | 818 | 219 | 275 | 885 | | | | | |
| SX | 0.1954 | *** | *** | *** | *** | 2.E-01 | *** | *** | *** | *** | *** | *** | * | * | | | | |
| | | 815 | 360 | 2161 | 847 | 1454 | 979 | 666 | 1770 | 1720 | 741 | 955 | 750 | 752 | | | | |
| PB3D | 0.1968 | *** | *** | *** | *** | 3.E-01 | *** | *** | *** | *** | *** | *** | *** | *** | *** | | | |
| | | 305 | 302 | 874 | 178 | 790 | 575 | 357 | 491 | 812 | 218 | 274 | 879 | 880 | 748 | | | |
| PH3D | 0.1998 | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | * | | |
| | | 284 | 263 | 818 | 158 | 733 | 556 | 333 | 445 | 772 | 200 | 244 | 799 | 803 | 707 | 796 | | |
| PHCL | 0.2014 | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | * | |
| | | 289 | 272 | 836 | 158 | 743 | 570 | 341 | 457 | 786 | 206 | 248 | 814 | 817 | 720 | 811 | 825 | |
| PT | 0.2075 | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
| | | 973 | 357 | 2285 | 1007 | 1523 | 1065 | 631 | 2077 | 1790 | 862 | 1181 | 560 | 559 | 1839 | 557 | 516 | 527 |
Note: The overall average difference scores are displayed in the DS column next to the technique identifier. P values for each pairwise comparison are based on Mann–Whitney U tests. The symbols ***, ** and * (highlighted in yellow in the original table) indicate comparisons in which the techniques were statistically different after applying the Bonferroni-corrected P value threshold of 6.54 × 10⁻⁷, as calculated based on initial P values of .0001, .001 and .01, respectively. The number of targets in common for each pair of techniques is given in the row below each comparison.
DS, average difference score; RB, Robetta; HB, HHPredB; I4, IntFOLD4-TS; I3, IntFOLD3-TS; I2, IntFOLD2-TS; P2, Phyre2; NB, NaïveBLAST; RA, RBO Aleph; RX, RaptorX; PB3D, PRIMO-BST-3D; PBCL, PRIMO-BST-CL; PH3D, PRIMO-HHS-3D; PHCL, PRIMO-HHS-CL; PR, PRIMO; PT, Princeton-TEMPLATE; SM, SWISS-MODEL; SX, SPARKS-X.
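The Bonferroni correction mentioned in the note can be sketched as follows. The number of comparisons is an assumption: the note does not state it, but dividing the initial P value of .0001 by 153, the number of unique pairs among the 18 techniques in the table, reproduces the stated threshold of 6.54 × 10⁻⁷. The function name is hypothetical.

```python
from math import comb


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Return the per-comparison threshold alpha / n_comparisons."""
    if n_comparisons <= 0:
        raise ValueError("n_comparisons must be positive")
    return alpha / n_comparisons


# Assumption: one test per unique pair among the 18 techniques in the table.
N_TECHNIQUES = 18
n_pairs = comb(N_TECHNIQUES, 2)  # 153 unique pairs

thresholds = {
    alpha: bonferroni_threshold(alpha, n_pairs)
    for alpha in (0.0001, 0.001, 0.01)
}
for alpha, threshold in thresholds.items():
    print(f"initial P = {alpha}: corrected threshold = {threshold:.2e}")
# initial P = 0.0001: corrected threshold = 6.54e-07
# initial P = 0.001: corrected threshold = 6.54e-06
# initial P = 0.01: corrected threshold = 6.54e-05
```

Under the same assumption, the initial P values of .001 and .01 would correspond to corrected thresholds ten and one hundred times larger than the one quoted in the note.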
Results of round-robin, head-to-head comparisons using difference scores for the hard targets
| Server | DS | RB | SM | M4T | RX | P2 | I4 | HB | RA | NB | I3 | SX | PT | I2 | PH3D | PHCL | PR | PBCL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RB | 0.2229 | | | | | | | | | | | | | | | | | |
| SM | 0.2501 | *** | | | | | | | | | | | | | | | | |
| | | 254 | | | | | | | | | | | | | | | | |
| M4T | 0.2539 | *** | 6.E-04 | | | | | | | | | | | | | | | |
| | | 51 | 143 | | | | | | | | | | | | | | | |
| RX | 0.2545 | *** | *** | *** | | | | | | | | | | | | | | |
| | | 262 | 686 | 119 | | | | | | | | | | | | | | |
| P2 | 0.2597 | *** | *** | 3.E-02 | *** | | | | | | | | | | | | | |
| | | 214 | 686 | 119 | 539 | | | | | | | | | | | | | |
| I4 | 0.2604 | *** | *** | *** | *** | *** | | | | | | | | | | | | |
| | | 37 | 107 | 15 | 111 | 72 | | | | | | | | | | | | |
| HB | 0.2666 | *** | *** | *** | *** | *** | 4.E-01 | | | | | | | | | | | |
| | | 176 | 493 | 84 | 434 | 350 | 72 | | | | | | | | | | | |
| RA | 0.2669 | *** | 8.E-03 | 8.E-03 | *** | *** | 2.E-01 | *** | | | | | | | | | | |
| | | 113 | 280 | 70 | 224 | 257 | 42 | 143 | | | | | | | | | | |
| NB | 0.2775 | *** | *** | *** | *** | *** | *** | *** | *** | | | | | | | | | |
| | | 145 | 536 | 102 | 445 | 377 | 79 | 329 | 138 | | | | | | | | | |
| I3 | 0.2780 | *** | 5.E-01 | 5.E-04 | *** | *** | *** | *** | *** | *** | | | | | | | | |
| | | 80 | 188 | 27 | 198 | 157 | 100 | 131 | 82 | 125 | | | | | | | | |
| SX | 0.2795 | *** | *** | 9.E-03 | *** | *** | *** | *** | 5.E-01 | *** | 3.E-02 | | | | | | | |
| | | 127 | 606 | 107 | 483 | 505 | 93 | 283 | 203 | 355 | 183 | | | | | | | |
| PT | 0.2816 | *** | *** | *** | *** | 9.E-05 | *** | *** | *** | *** | * | *** | | | | | | |
| | | 166 | 640 | 129 | 496 | 586 | 96 | 313 | 270 | 361 | 174 | 512 | | | | | | |
| I2 | 0.2827 | *** | 3.E-03 | 3.E-01 | *** | *** | *** | ** | *** | *** | 4.E-02 | 4.E-03 | 2.E-02 | | | | | |
| | | 188 | 327 | 60 | 322 | 289 | 55 | 193 | 142 | 184 | 119 | 218 | 252 | | | | | |
| PH3D | 0.3136 | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | | | | |
| | | 38 | 239 | 23 | 228 | 120 | 73 | 163 | 46 | 197 | 95 | 202 | 145 | 56 | | | | |
| PHCL | 0.3151 | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | | | |
| | | 40 | 247 | 22 | 237 | 125 | 81 | 170 | 47 | 203 | 100 | 210 | 152 | 58 | 245 | | | |
| PR | 0.3230 | *** | *** | *** | *** | *** | *** | *** | *** | 3.E-01 | *** | *** | *** | *** | *** | *** | | |
| | | 39 | 250 | 23 | 236 | 128 | 83 | 165 | 47 | 212 | 97 | 208 | 157 | 57 | 233 | 240 | | |
| PBCL | 0.3233 | *** | *** | *** | *** | *** | *** | *** | *** | 3.E-01 | *** | *** | *** | *** | *** | *** | 5.E-01 | |
| | | 38 | 247 | 23 | 233 | 128 | 82 | 161 | 47 | 210 | 96 | 205 | 156 | 57 | 229 | 236 | 256 | |
| PB3D | 0.3423 | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** | *** |
| | | 39 | 247 | 23 | 231 | 127 | 82 | 162 | 46 | 210 | 95 | 204 | 155 | 56 | 229 | 236 | 255 | 253 |
Fig. 4. Correlations between the difference scores and other metrics for assessing structure model quality. The regression lines are provided for the scatter plots of the average difference scores for the 18 structure prediction techniques versus their corresponding average values of the other quality assessment metrics. The other metrics are TM-Score (A), GDT-TS (B), GDT-HA (C), GDC (D), lDDT (E) and lDDT-BS (F).
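The kind of comparison summarized in Fig. 4 can be sketched as a Pearson correlation and least-squares line fitted to per-technique averages. This is an illustrative sketch only: the function names are hypothetical, and the numbers are placeholders that were not taken from the paper. They were chosen to lie exactly on a line so that the output is easy to check.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return means and centered sums of squares/products for paired data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return mean_x, mean_y, sxx, syy, sxy


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired observations."""
    _, _, sxx, syy, sxy = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def least_squares_line(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = a*x + b."""
    mean_x, mean_y, sxx, _, sxy = _centered_sums(xs, ys)
    if sxx == 0:
        raise ValueError("slope is undefined when all x values are equal")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Placeholder per-technique averages (hypothetical, not from the paper).
avg_difference_score = [0.15, 0.18, 0.20]
avg_other_metric = [0.80, 0.74, 0.70]

r = pearson_r(avg_difference_score, avg_other_metric)
slope, intercept = least_squares_line(avg_difference_score, avg_other_metric)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

In practice each list would hold one average per technique, 18 values in the case of the figure, with the same technique order in both lists.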