Abstract
BACKGROUND: Predicting protein residue-residue contacts is an important 2D prediction task. It is useful for ab initio structure prediction and for understanding protein folding. In spite of steady progress over the past decade, contact prediction remains largely unsolved.
Year: 2007 PMID: 17407573 PMCID: PMC1852326 DOI: 10.1186/1471-2105-8-113
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Detailed Contact Prediction Results on 48 Test Proteins for Sequence Separation >= 6, 12, and 24 respectively.
| Protein | Len | Type | Separation >= 6 | | Separation >= 12 | | Separation >= 24 | |
| | | | Acc(corr/pred) | Cov(corr/tot) | Acc(corr/pred) | Cov(corr/tot) | Acc(corr/pred) | Cov(corr/tot) |
| 1IG5A | 75 | alpha | 0.333 (25/75) | 0.446 (25/56) | 0.240 (18/75) | 0.486 (18/37) | 0.273 (9/33) | 0.346 (9/26) |
| 1HXIA | 112 | alpha | 0.304 (34/112) | 0.270 (34/126) | 0.214 (24/112) | 0.238 (24/101) | 0.015 (1/67) | 0.018 (1/55) |
| 1SKNP | 74 | alpha | 0.093 (4/43) | 0.133 (4/30) | 0.000 (0/18) | 0.000 (0/24) | 0.000 (0/6) | 0.000 (0/20) |
| 1ELRA | 128 | alpha | 0.406 (52/128) | 0.327 (52/159) | 0.384 (33/86) | 0.264 (33/125) | 0.227 (5/22) | 0.085 (5/59) |
| 1E29A | 135 | alpha | 0.289 (39/135) | 0.193 (39/202) | 0.111 (15/135) | 0.112 (15/134) | 0.103 (7/68) | 0.071 (7/99) |
| 1CTJA | 89 | alpha | 0.157 (14/89) | 0.147 (14/95) | 0.112 (10/89) | 0.204 (10/49) | 0.090 (8/89) | 0.190 (8/42) |
| 1J75A | 57 | alpha | 0.474 (27/57) | 0.466 (27/58) | 0.250 (7/28) | 0.206 (7/34) | 0.500 (1/2) | 0.038 (1/26) |
| 1ECAA | 136 | alpha | 0.103 (14/136) | 0.156 (14/90) | 0.063 (5/79) | 0.064 (5/78) | 0.070 (3/43) | 0.041 (3/74) |
| 1FIOA | 190 | alpha | 0.143 (19/133) | 0.161 (19/118) | 0.153 (11/72) | 0.113 (11/97) | 0.140 (8/57) | 0.110 (8/73) |
| 1C75A | 71 | alpha | 0.282 (20/71) | 0.211 (20/95) | 0.099 (7/71) | 0.127 (7/55) | 0.087 (4/46) | 0.089 (4/45) |
| 1HCRA | 52 | alpha | 0.058 (3/52) | 0.231 (3/13) | 0.056 (1/18) | 0.167 (1/6) | 0.000 (0/0) | 0.000 (0/3) |
| 1QJPA | 137 | beta | 0.518 (71/137) | 0.183 (71/389) | 0.489 (67/137) | 0.215 (67/312) | 0.350 (48/137) | 0.300 (48/160) |
| 1D2SA | 170 | beta | 0.482 (82/170) | 0.180 (82/455) | 0.341 (58/170) | 0.150 (58/386) | 0.165 (28/170) | 0.096 (28/293) |
| 1CQYA | 99 | beta | 0.182 (18/99) | 0.080 (18/225) | 0.172 (17/99) | 0.094 (17/180) | 0.273 (27/99) | 0.197 (27/137) |
| 1BMGA | 98 | beta | 0.398 (39/98) | 0.177 (39/220) | 0.398 (39/98) | 0.211 (39/185) | 0.429 (42/98) | 0.323 (42/130) |
| 1MAIA | 119 | beta | 0.538 (64/119) | 0.298 (64/215) | 0.361 (43/119) | 0.250 (43/172) | 0.034 (4/119) | 0.048 (4/83) |
| 1AMXA | 150 | beta | 0.387 (58/150) | 0.162 (58/357) | 0.300 (45/150) | 0.148 (45/304) | 0.220 (33/150) | 0.141 (33/234) |
| 1G3PA | 192 | beta | 0.042 (8/192) | 0.019 (8/420) | 0.042 (8/192) | 0.023 (8/342) | 0.036 (7/192) | 0.026 (7/273) |
| 1RSYA | 135 | beta | 0.578 (78/135) | 0.259 (78/301) | 0.459 (62/135) | 0.240 (62/258) | 0.230 (31/135) | 0.177 (31/175) |
| 1WHIA | 122 | beta | 0.492 (60/122) | 0.201 (60/298) | 0.459 (56/122) | 0.226 (56/248) | 0.295 (36/122) | 0.303 (36/119) |
| 1HE7A | 107 | beta | 0.280 (30/107) | 0.183 (30/164) | 0.327 (35/107) | 0.254 (35/138) | 0.346 (37/107) | 0.394 (37/94) |
| 1MWPA | 96 | a+b | 0.365 (35/96) | 0.178 (35/197) | 0.385 (37/96) | 0.236 (37/157) | 0.292 (28/96) | 0.311 (28/90) |
| 1QGVA | 130 | a+b | 0.338 (44/130) | 0.198 (44/222) | 0.338 (44/130) | 0.221 (44/199) | 0.385 (50/130) | 0.279 (50/179) |
| 1DBUA | 152 | a+b | 0.434 (66/152) | 0.208 (66/317) | 0.276 (42/152) | 0.162 (42/260) | 0.151 (23/152) | 0.111 (23/207) |
| 1XERA | 103 | a+b | 0.466 (48/103) | 0.219 (48/219) | 0.330 (34/103) | 0.214 (34/159) | 0.204 (21/103) | 0.193 (21/109) |
| 1JSFA | 130 | a+b | 0.500 (65/130) | 0.316 (65/206) | 0.385 (50/130) | 0.345 (50/145) | 0.154 (20/130) | 0.235 (20/85) |
| 1DZOA | 120 | a+b | 0.608 (73/120) | 0.330 (73/221) | 0.500 (60/120) | 0.351 (60/171) | 0.083 (10/120) | 0.139 (10/72) |
| 1GRJA | 151 | a+b | 0.318 (48/151) | 0.209 (48/230) | 0.225 (34/151) | 0.186 (34/183) | 0.066 (10/151) | 0.084 (10/119) |
| 1MSCA | 129 | a+b | 0.620 (80/129) | 0.421 (80/190) | 0.512 (66/129) | 0.524 (66/126) | 0.225 (29/129) | 0.644 (29/45) |
| 1CEWI | 108 | a+b | 0.528 (57/108) | 0.300 (57/190) | 0.454 (49/108) | 0.310 (49/158) | 0.278 (30/108) | 0.316 (30/95) |
| 1VHHA | 157 | a+b | 0.414 (65/157) | 0.206 (65/316) | 0.338 (53/157) | 0.201 (53/264) | 0.223 (35/157) | 0.174 (35/201) |
| 1BUOA | 121 | a+b | 0.298 (36/121) | 0.300 (36/120) | 0.207 (25/121) | 0.291 (25/86) | 0.140 (17/121) | 0.309 (17/55) |
| 1G2RA | 94 | a+b | 0.340 (32/94) | 0.254 (32/126) | 0.309 (29/94) | 0.309 (29/94) | 0.234 (22/94) | 0.400 (22/55) |
| 1E9MA | 106 | a+b | 0.387 (41/106) | 0.186 (41/220) | 0.358 (38/106) | 0.200 (38/190) | 0.311 (33/106) | 0.210 (33/157) |
| 1E87A | 117 | a+b | 0.470 (55/117) | 0.239 (55/230) | 0.299 (35/117) | 0.193 (35/181) | 0.291 (34/117) | 0.227 (34/150) |
| 1H9OA | 108 | a+b | 0.630 (68/108) | 0.354 (68/192) | 0.352 (38/108) | 0.299 (38/127) | 0.148 (16/108) | 0.302 (16/53) |
| 1IDOA | 184 | a/b | 0.402 (74/184) | 0.223 (74/332) | 0.402 (74/184) | 0.250 (74/296) | 0.402 (74/184) | 0.277 (74/267) |
| 1CHDA | 198 | a/b | 0.429 (85/198) | 0.175 (85/487) | 0.384 (76/198) | 0.170 (76/447) | 0.338 (67/198) | 0.181 (67/370) |
| 1FUEA | 163 | a/b | 0.374 (61/163) | 0.185 (61/330) | 0.374 (61/163) | 0.206 (61/296) | 0.399 (65/163) | 0.251 (65/259) |
| 1CXQA | 143 | a/b | 0.448 (64/143) | 0.303 (64/211) | 0.350 (50/143) | 0.276 (50/181) | 0.091 (13/143) | 0.115 (13/113) |
| 1F4PA | 147 | a/b | 0.442 (65/147) | 0.222 (65/293) | 0.361 (53/147) | 0.205 (53/258) | 0.354 (52/147) | 0.223 (52/233) |
| 1ES8A | 196 | a/b | 0.240 (47/196) | 0.130 (47/361) | 0.153 (30/196) | 0.100 (30/300) | 0.189 (37/196) | 0.160 (37/231) |
| 1DMGA | 172 | a/b | 0.302 (52/172) | 0.176 (52/296) | 0.273 (47/172) | 0.175 (47/269) | 0.192 (33/172) | 0.155 (33/213) |
| 1A1HA | 85 | small | 0.424 (36/85) | 0.424 (36/85) | 0.129 (11/85) | 0.262 (11/42) | 0.000 (0/85) | 0.000 (0/0) |
| 9WGAB | 171 | small | 0.415 (71/171) | 0.188 (71/378) | 0.357 (61/171) | 0.268 (61/228) | 0.041 (7/171) | 0.175 (7/40) |
| 2MADL | 124 | small | 0.274 (34/124) | 0.106 (34/321) | 0.226 (28/124) | 0.106 (28/263) | 0.218 (27/124) | 0.116 (27/232) |
| 1EJGA | 46 | small | 0.261 (12/46) | 0.203 (12/59) | 0.419 (13/31) | 0.271 (13/48) | 0.458 (11/24) | 0.306 (11/36) |
| 1AAOA | 113 | coil-coil | 0.221 (25/113) | 0.397 (25/63) | 0.031 (3/97) | 0.158 (3/19) | 0.000 (0/35) | 0.000 (0/0) |
Column 1 lists the protein name (PDB code + chain id). The chain id of a single-chain protein is denoted by "A" instead of "-". Column 2 lists chain lengths, which range from 46 to 198. Column 3 lists the SCOP structure class: alpha, beta, a+b, a/b, small, and coil-coil denote six SCOP protein classes (all alpha helix, all beta sheet, alpha helix + beta sheet, alpha helix alternating with beta sheet, small protein, and coil-coil), respectively. Columns 4 and 5 report the prediction accuracy (specificity) and coverage (sensitivity) for each protein at sequence separation >= 6; the remaining column pairs report the same quantities at separation >= 12 and >= 24. Accuracy is the number of correct predictions divided by the total number of predictions. Coverage is the number of correct predictions divided by the total number of true contacts. The raw numbers of correct predictions, all predictions, and true contacts are also reported in parentheses.
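The accuracy and coverage definitions above can be illustrated with a minimal Python sketch. The function name `contact_metrics`, the representation of contacts as pairs of residue indices, and the choice to return 0.0 for an empty denominator (mirroring the "0.000 (0/0)" entries in the table) are assumptions made for illustration, not details taken from the paper.

```python
def contact_metrics(predicted, true, min_sep=6):
    """Return (accuracy, coverage, n_correct, n_pred, n_true).

    predicted, true: iterables of (i, j) residue index pairs (hypothetical format).
    Only pairs with sequence separation |i - j| >= min_sep are counted.
    """
    def keep(pairs):
        # Order each pair so that (i, j) and (j, i) count as the same contact.
        return {(min(i, j), max(i, j)) for i, j in pairs if abs(i - j) >= min_sep}

    pred = keep(predicted)
    ref = keep(true)
    correct = len(pred & ref)

    # Accuracy = correct / predicted; coverage = correct / true contacts.
    accuracy = correct / len(pred) if pred else 0.0
    coverage = correct / len(ref) if ref else 0.0
    return accuracy, coverage, correct, len(pred), len(ref)
```

Calling the function with `min_sep` set to 6, 12, and 24 would give the three column pairs reported for each protein.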
Figure 1. 3D Structure of Protein 1DZOA. Protein 1DZOA is an a+b protein. It consists of two alpha helices and two beta sheets. Beta strands 1 and 2 form a parallel beta sheet. Beta strands 3, 4, 5, and 6 form an anti-parallel beta sheet. Most non-local contacts involve the pairing interactions between beta strands and the packing interactions between helices and beta sheets. (Figure rendered using Molscript [63].)
Figure 2. Predicted and True Contact Maps of 1DZOA. The upper triangle shows the true contacts of protein 1DZOA, and the lower triangle shows the predicted contacts. The 2L (240) top-ranked contacts are selected, where L is the chain length (120). The key contacts within anti-parallel strand pairs (3,4), (4,5), and (5,6) are recalled. A few contacts within the parallel strand pair (1,2) are also predicted correctly. However, very long-range contacts between alpha helices and beta sheets are not predicted, and there are some false positives. Most false positives lie close to true contacts, so they may not be very harmful when used as distance restraints to reconstruct the protein 3D structure.
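The selection and layout described in this caption can be sketched as follows. This is an illustrative sketch only: the score dictionary, the 0-based residue indices, the `min_sep` filter, and both function names are assumptions, not the authors' implementation.

```python
def top_ranked_contacts(scores, length, factor=2, min_sep=6):
    """Pick the factor * length highest-scoring residue pairs.

    scores: dict mapping (i, j) -> predicted contact score (hypothetical format).
    """
    eligible = [
        (score, pair)
        for pair, score in scores.items()
        if abs(pair[0] - pair[1]) >= min_sep
    ]
    # Highest score first, then keep the top factor * length pairs.
    eligible.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in eligible[: factor * length]]


def combined_map(length, true_contacts, predicted_contacts):
    """Build an L x L grid: true contacts above the diagonal, predictions below."""
    grid = [[0] * length for _ in range(length)]
    for i, j in true_contacts:
        lo, hi = min(i, j), max(i, j)
        grid[lo][hi] = 1  # upper triangle (row < column)
    for i, j in predicted_contacts:
        lo, hi = min(i, j), max(i, j)
        grid[hi][lo] = 1  # lower triangle (row > column)
    return grid
```

For a 120-residue chain, `top_ranked_contacts(scores, 120)` would return at most 240 pairs, which corresponds to the 2L selection mentioned in the caption.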
Contact Prediction Results of Proteins in the Six SCOP Structure Classes.
| SCOP Class | Num | Separation >= 6 | | Separation >= 12 | | Separation >= 24 | |
| | | Accuracy | Coverage | Accuracy | Coverage | Accuracy | Coverage |
| alpha | 11 | 0.24 | 0.24 | 0.17 | 0.18 | 0.11 | 0.09 |
| beta | 10 | 0.38 | 0.17 | 0.32 | 0.17 | 0.22 | 0.17 |
| a+b | 15 | 0.45 | 0.25 | 0.35 | 0.25 | 0.21 | 0.23 |
| a/b | 7 | 0.37 | 0.19 | 0.33 | 0.19 | 0.28 | 0.20 |
| small | 4 | 0.36 | 0.18 | 0.28 | 0.19 | 0.11 | 0.15 |
| coil-coil | 1 | 0.22 | 0.40 | 0.03 | 0.16 | 0.00 | -- |
| average | 48 | 0.37 | 0.21 | 0.30 | 0.20 | 0.21 | 0.19 |
Column 1 lists the six structure classes. Column 2 lists the number of proteins in each class. The other columns report the accuracy and coverage of contact predictions in each category. The statistics are computed for sequence separation >= 6, 12, and 24, respectively. The last row reports the average performance on all 48 proteins. The accuracy of a+b and a/b is slightly higher than that of beta proteins, which is in turn higher than that of alpha proteins. The performance on small proteins (mostly alpha helical) lies between that of proteins containing beta sheets (a+b, a/b, and beta) and that of alpha helical proteins. There is only one coil-coil protein, which does not have native contacts with sequence separation >= 24.
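The source does not state how the per-class figures are aggregated. As a check, the sketch below pools the raw counts of the four small proteins at separation >= 6 (copied from the per-protein table) and divides the summed counts. Pooling is an assumption here, and the name `pooled_metrics` is hypothetical; the result agrees with the "small" row for that separation.

```python
# (correct, predicted, true) counts at separation >= 6 for the four
# "small" proteins, copied from the per-protein results table.
small_sep6 = {
    "1A1HA": (36, 85, 85),
    "9WGAB": (71, 171, 378),
    "2MADL": (34, 124, 321),
    "1EJGA": (12, 46, 59),
}


def pooled_metrics(counts):
    """Sum the raw counts over proteins, then form accuracy and coverage."""
    counts = list(counts)
    correct = sum(c for c, _, _ in counts)
    predicted = sum(p for _, p, _ in counts)
    total = sum(t for _, _, t in counts)
    return correct / predicted, correct / total


acc, cov = pooled_metrics(small_sep6.values())
print(f"{acc:.2f} {cov:.2f}")  # prints "0.36 0.18"
```

Averaging the four per-protein ratios instead would give about 0.34 and 0.23, so the pooled form is the one consistent with the table for this class.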
CASP7 Results of Inter-Residue Contact Predictions of Eight Predictors.
| | Separation >= 12 | | Separation >= 24 | |
| Method | Accuracy (%) | Coverage (%) | Accuracy (%) | Coverage (%) |
| SVMcon | 27.7 | 4.7 | 13.1 | 2.8 |
| BETApro | 35.4 | 5.1 | 19.7 | 3.2 |
| SAM-T06 | 20.7 | 3.5 | 18.5 | 3.9 |
| Distill | 26.4 | 2.9 | 13.7 | 1.4 |
| Possum | 15.0 | 2.3 | 21.4 | 2.6 |
| PROFcon | 12.1 | 2.0 | 8.1 | 1.6 |
| GPCPRED | 12.2 | 2.1 | 10.5 | 2.0 |
| GajdaPairings | 9.8 | 1.5 | 10.4 | 1.9 |
The eight contact map predictors are evaluated on the 13 de novo domains of CASP7: T0296, T0300, T0307, T0309, T0314, T0316 domain 2, T0319, T0347 domain 2, T0350, T0353, T0361, T0356 domain 1, and T0356 domain 3. The experimental structures of the targets and the domain classification can be found at the CASP7 web site. The accuracy and coverage of contact predictions are evaluated at sequence separation >= 12 and >= 24, respectively. It is worth noting that PROFcon made predictions for only 11 of the 13 domains, so its performance cannot be directly compared with that of the other methods. We include its results for completeness.