Allan Lo, Yi-Yuan Chiu, Einar Andreas Rødland, Ping-Chiang Lyu, Ting-Yi Sung, Wen-Lian Hsu.
Abstract
MOTIVATION: Helix-helix interactions play a critical role in the structure assembly, stability and function of membrane proteins. On the molecular level, the interactions are mediated by one or more residue contacts. Although previous studies focused on helix-packing patterns and sequence motifs, few of them developed methods specifically for contact prediction.
Year: 2009 PMID: 19244388 PMCID: PMC2666818 DOI: 10.1093/bioinformatics/btp114
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Overview of the TMhit methodology. The left panel describes the first level of TMhit. Contact residue prediction is performed in the following order: peptide extraction using sliding windows, feature encoding and prediction by SVM in Level 1. An ‘x’ marks each residue predicted as a contact residue in the first level. In the right panel, the second level of TMhit predicts the contact pair candidates based on the output of the first level (i.e. only the residues marked by an ‘x’ in the first level are considered). Contact pair prediction is performed in the following order: peptide pair extraction using sliding windows, feature encoding and prediction by SVM in Level 2. The final output is a contact map composed of all predicted contact pairs.
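The two extraction steps named in the caption can be sketched as follows. This is a minimal illustration only: the function names, the window size, and the `X` padding character are assumptions not taken from the source, and the feature encoding and SVM steps are omitted entirely.

```python
from itertools import product


def sliding_windows(sequence, window_size):
    """Yield (center_index, peptide) for every residue in `sequence`.

    The ends are padded with 'X' (an assumed placeholder) so that each
    residue sits at the center of a full-length window.
    """
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd so the window has a central residue")
    half = window_size // 2
    padded = "X" * half + sequence + "X" * half
    for i in range(len(sequence)):
        yield i, padded[i:i + window_size]


def level2_candidates(level1_hits_a, level1_hits_b):
    """Pair residues that Level 1 marked as contact residues on two helices.

    Only these pairs would be passed on to the Level 2 classifier.
    """
    return list(product(sorted(level1_hits_a), sorted(level1_hits_b)))


# Hypothetical example: a 5-residue helix segment and Level 1 hits on two helices.
windows = list(sliding_windows("ACDEF", 3))
pairs = level2_candidates({2, 0}, {5})
```

Restricting Level 2 to residues flagged in Level 1 is what shrinks the candidate set; the share that remains is what the tables report as the L2 input.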
Contact residue prediction accuracy of the independent test set
| Methods | Accuracy (%) | Sensitivity (%) | | L2 input (%) |
|---|---|---|---|---|
| TM | 44.8 (±6.4) | 68.9 (±3.2) | 0.23 (±0.05) | 19.2 |
| TM | 66.5 (±5.5) | 70.6 (±2.1) | 0.47 (±0.03) | 21.2 |
The standard error (SEboot) estimated by bootstrapping follows the ‘±’ sign.
L2 Input (%) denotes the remaining contact pair candidates as input for prediction in L2.
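A bootstrap standard error of the kind reported after the ‘±’ sign can be sketched as below. The source does not state the resampling unit or the number of resamples, so both (per-item scores, 1000 resamples) are assumptions, as is the function name.

```python
import random
import statistics


def bootstrap_se(values, n_resamples=1000, seed=0):
    """Estimate the standard error of the mean of `values` by bootstrapping.

    Each resample draws len(values) items with replacement; the standard
    deviation of the resampled means is the bootstrap standard error.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    n = len(values)
    means = []
    for _ in range(n_resamples):
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    return statistics.stdev(means)


# Hypothetical per-protein accuracies (percent), for illustration only.
se = bootstrap_se([40.0, 55.0, 35.0, 60.0, 45.0])
```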
Contact pair prediction accuracy of direct prediction and two-level models on the independent test set
| Methods | Contact pair prediction: accuracy (%) | | | | δ-analysis (\|δ\| = 4): accuracy (%) |
|---|---|---|---|---|---|
| Direct prediction | |||||
| TM | 9.8 (±4.0) | 47/480 | 26.4 | 1.2e-57 | 30.6 (±7.8) |
| TM | 12.7 (±4.9) | 61/480 | 34.2 | 1.6e-72 | 38.5 (±7.3) |
| Two-level model | |||||
| TM | 12.5 (±4.8) | 60/481 | 33.7 | 5.4e-80 | 34.8 (±6.5) |
| TM | 16.0 (±4.5) | 77/481 | 43.1 | 1.1e-99 | 41.2 (±7.2) |
The standard error (SE) estimated by bootstrapping follows the ‘±’ sign.
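As a consistency check, the fractions listed in the table reproduce the contact pair accuracies in the adjacent column. Reading each fraction as correct predictions over total predictions is an inference from the numbers, not something the table states; the helper name is hypothetical.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


# (fraction numerator, fraction denominator, accuracy reported in the table)
rows = [
    (47, 480, 9.8),   # direct prediction, first row
    (61, 480, 12.7),  # direct prediction, second row
    (60, 481, 12.5),  # two-level model, first row
    (77, 481, 16.0),  # two-level model, second row
]

recomputed = [percent(num, den) for num, den, _ in rows]
matches = [percent(num, den) == reported for num, den, reported in rows]
```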
Comparison of contact pair prediction accuracy with CMA methods using observed information and contact definitions in HelixCorr
| Methods | Contact pair prediction: accuracy (%) | | δ-analysis (\|δ\| = 4): accuracy (%) |
|---|---|---|---|
| TM | 31.0 (±7.0) | 2.2e-109 | 56.8 (±7.5) |
| TM | 23.6 (±7.5) | 3.2e-71 | 48.6 (±8.0) |
| McBASC McLachlan | 10.0 | 6.5e-17 | 46.0 |
| OMES KASS | 9.0 | 1.2e-14 | 50.0 |
| ELSC | 10.0 | 6.5e-17 | 41.0 |
| CONSENSUS-14 | 12.0 | 1.4e-53 | 55.0 |
| CONSENSUS-R-5 | 11.0 | 3.6e-42 | 56.0 |
The standard error (SEboot) estimated by bootstrapping follows the ‘±’ sign.
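The record does not define the δ-analysis. The sketch below assumes the definition commonly used in contact prediction, in which a predicted pair counts as correct if some observed contact lies within δ residues of it on both helices. That definition, the function name, and the example pairs are all assumptions.

```python
def delta_accuracy(predicted, observed, delta=4):
    """Percent of predicted pairs lying within `delta` residues of an observed pair.

    A predicted pair (i, j) is a hit if any observed pair (k, l) satisfies
    |i - k| <= delta and |j - l| <= delta (assumed definition).
    """
    if not predicted:
        return 0.0
    hits = sum(
        any(abs(i - k) <= delta and abs(j - l) <= delta for k, l in observed)
        for i, j in predicted
    )
    return 100.0 * hits / len(predicted)


# Hypothetical residue index pairs, for illustration only.
predicted_pairs = [(10, 50), (20, 80)]
observed_pairs = [(12, 47), (30, 60)]
relaxed = delta_accuracy(predicted_pairs, observed_pairs, delta=4)
strict = delta_accuracy(predicted_pairs, observed_pairs, delta=0)
```

With δ = 0 the measure reduces to exact-match accuracy, which is why the δ-analysis columns are consistently higher than the plain contact pair accuracies.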
Prediction performance in helix–helix interaction using TMhit on the independent test set
| Thresholds | Accuracy (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|
| 1 contact pair | 39.1 (±5.0) | 71.8 (±4.1) | 59.4 (±4.1) |
| 2 contact pairs | 45.4 (±6.1) | 51.8 (±7.1) | 77.4 (±3.2) |
| 3 contact pairs | 55.7 (±7.3) | 40.0 (±8.2) | 88.5 (±1.8) |
| 4 contact pairs | 61.4 (±6.8) | 31.8 (±6.7) | 92.7 (±1.2) |
| 5 contact pairs | 66.7 (±6.4) | 25.9 (±6.6) | 95.3 (±1.2) |
| 6 contact pairs | 76.0 (±7.2) | 22.4 (±5.4) | 97.4 (±1.1) |
| 7 contact pairs | 77.3 (±6.9) | 20.0 (±4.8) | 97.9 (±1.2) |
| 8 contact pairs | 82.4 (±8.3) | 16.5 (±4.3) | 98.7 (±1.0) |
The standard error (SE) estimated by bootstrapping follows the ‘±’ sign.
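The thresholding in this table can be sketched as follows: a helix pair is called interacting when the number of predicted contact pairs between the two helices reaches the threshold. Treating accuracy as the fraction of called interactions that are correct is an assumption, as are the function name and the example data.

```python
def interaction_metrics(predicted_counts, observed, threshold):
    """Score helix-helix interaction calls at a given contact-pair threshold.

    predicted_counts: dict mapping a helix pair to its number of predicted contact pairs.
    observed: set of helix pairs that truly interact.
    """
    tp = fp = tn = fn = 0
    for pair, count in predicted_counts.items():
        called = count >= threshold
        truth = pair in observed
        if called and truth:
            tp += 1
        elif called:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1

    def pct(num, den):
        return 100.0 * num / den if den else float("nan")

    return {
        "accuracy": pct(tp, tp + fp),      # assumed: correct calls / all calls
        "sensitivity": pct(tp, tp + fn),
        "specificity": pct(tn, tn + fp),
    }


# Hypothetical helix pairs and counts, for illustration only.
counts = {("H1", "H2"): 3, ("H1", "H3"): 1, ("H2", "H3"): 0, ("H3", "H4"): 2}
truly_interacting = {("H1", "H2"), ("H1", "H3")}
at_two = interaction_metrics(counts, truly_interacting, threshold=2)
```

Raising the threshold makes calls rarer, so sensitivity can only fall and specificity can only rise, which matches the trends in the table.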
Fig. 2. Comparison of contact pair prediction accuracy as a function of Cd by direct and two-level models on the development set using observed information (topology and RSA). Direct prediction (L2 only) is shown as a filled triangle and its accuracy as a dotted horizontal line. Two-level models are shown as filled (selected) or empty (others) circles. The regression curve was estimated from all models (smoothing parameter α = 0.8) using the LOCFIT package (Loader, 2004), and the dashed lines indicate the 95% confidence band. Inset: comparison of prediction accuracy as a function of the percentage of contact pair candidates remaining for prediction by Level 2.