Castrense Savojardo, Piero Fariselli, Pier Luigi Martelli, Rita Casadio.
Abstract
BACKGROUND: Recently, information derived from correlated mutations in proteins has regained relevance for predicting protein contacts. This is due both to new forms of mutual information analysis, which have proven better suited to highlighting direct coupling between pairs of residues in protein structures, and to the large number of protein chains now available for statistical validation. It was previously discussed that disulfide bond topology in proteins is also constrained by correlated mutations.
Year: 2013 PMID: 23368835 PMCID: PMC3548674 DOI: 10.1186/1471-2105-14-S1-S10
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Table 1. Performance on disulfide connectivity prediction obtained with correlated mutation measures
| # bonds | iCOV Pb = Rb | iCOV Qp | MIp Pb = Rb | MIp Qp | Random Pb = Rb | Random Qp |
|---|---|---|---|---|---|---|
|  | 62 | 62 | 68 | 68 | 33 | 33 |
|  | 52.6 | 42.4 | 47.8 | 37.7 | 20 | 7 |
|  | 51.8 | 26.8 | 49.4 | 29.3 | 14 | 1 |
|  | 39.5 | 16.2 | 33.5 | 13.5 | 11 | 0.1 |
|  | 51.7 | 43.7 | 49.9 | 44.5 | 23 | 15 |
# bonds: number of disulfide bonds; iCOV: sparse inverse COVariance estimation; MIp: corrected Mutual Information; Random: performance obtained by a random predictor. Here Pb = Rb because the total number of predicted bonds (which is known in this experiment) equals the total number of observed bonds. For index definitions see Performance measures.
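The Random columns can be checked with a short calculation. The sketch below assumes that the random predictor picks one complete connectivity pattern uniformly at random among all possible pairings of the 2B bonded cysteines; this definition is an assumption, not stated in the legend. Under it, a protein with B bonds has (2B - 1)!! possible patterns, each bond is correct with probability 1/(2B - 1), and the whole pattern is correct with probability 1/(2B - 1)!!. The function names are illustrative.

```python
from fractions import Fraction


def n_connectivity_patterns(n_bonds: int) -> int:
    """Number of ways to pair 2*n_bonds cysteines: (2B - 1)!! = 1 * 3 * ... * (2B - 1)."""
    count = 1
    for odd in range(1, 2 * n_bonds, 2):
        count *= odd
    return count


def random_baseline(n_bonds: int):
    """Expected (Pb, Qp) for a predictor choosing one pattern uniformly at random.

    Assumption: uniform choice over all perfect pairings of the bonded cysteines.
    """
    # Each cysteine has 2B - 1 candidate partners, only one of which is correct.
    p_bond = Fraction(1, 2 * n_bonds - 1)
    # The whole pattern is correct only if the single true pairing is drawn.
    q_protein = Fraction(1, n_connectivity_patterns(n_bonds))
    return p_bond, q_protein


for b in range(2, 6):
    pb, qp = random_baseline(b)
    print(f"{b} bonds: Pb = {100 * float(pb):.1f}%, Qp = {100 * float(qp):.1f}%")
```

For 2, 3, 4 and 5 bonds this gives Pb of about 33, 20, 14 and 11% and Qp of about 33, 7, 1 and 0.1%, which agrees with the first four rows of the Random columns.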
Table 2. Performance on disulfide connectivity prediction obtained with different SVR-based methods
| # bonds | SVR Pb = Rb | SVR Qp | SVR+iCOV Pb = Rb | SVR+iCOV Qp | SVR+MI Pb = Rb | SVR+MI Qp | SVR+MI+iCOV Pb = Rb | SVR+MI+iCOV Qp |
|---|---|---|---|---|---|---|---|---|
|  | 75 | 75 | 76 | 76 | 73 | 73 | 76 | 76 |
|  | 60 | 48 | 62.8 | 55.3 | 59.6 | 50.6 | 62.8 | 55.3 |
|  | 57 | 44 | 67.1 | 51.2 | 61 | 46.3 | 67.7 | 51.2 |
|  | 46 | 19 | 55.1 | 27 | 54.1 | 29.7 | 58.9 | 32.4 |
|  | 60 | 54 | 65.2 | 58.6 | 61.9 | 55.5 | 66.2 | 59.3 |
# bonds: number of disulfide bonds; MIp: corrected Mutual Information; iCOV: sparse inverse COVariance estimation; SVR: Support Vector Regression; and their combinations as indicated. For details see Methods. Results are evaluated on the PDBCYS dataset [12]. SVR results are taken from [12]. For index definition see Performance measures.
Table 3. Prediction without prior knowledge of the cysteine bonding state
| # bonds | DisLocate Rb | DisLocate Pb | DisLocate Qp | SVR+MI+iCOV Rb | SVR+MI+iCOV Pb | SVR+MI+iCOV Qp |
|---|---|---|---|---|---|---|
|  | 83 | 46 | 76 | 93 | 46 | 76 |
|  | 67 | 52 | 61 | 71 | 59 | 62 |
|  | 47 | 41 | 35 | 55 | 49 | 38 |
|  | 52 | 37 | 35 | 63 | 48 | 38 |
|  | 39 | 39 | 15 | 50 | 49 | 16 |
|  | 52 | 42 | 36 | 60 | 50 | 38 |
Legends are as in Table 2.
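The three indices reported in the tables can be illustrated with a small sketch. The Performance measures section is not reproduced here, so the definitions below are assumptions based on common usage in the disulfide connectivity literature and on the legends: Pb is the fraction of predicted bonds that are correct (precision), Rb is the fraction of observed bonds that are recovered (recall), and Qp is the fraction of proteins whose whole connectivity pattern is predicted correctly. The function names and the toy data are hypothetical.

```python
def bond_set(pairs):
    """Normalise bonds to a set of unordered cysteine index pairs."""
    return {frozenset(pair) for pair in pairs}


def score(predicted, observed):
    """Compute Pb, Rb and Qp over a dataset.

    predicted, observed: one list of (i, j) cysteine pairs per protein,
    with proteins listed in the same order in both arguments.
    """
    n_correct = n_predicted = n_observed = n_exact = 0
    for pred, obs in zip(predicted, observed):
        p, o = bond_set(pred), bond_set(obs)
        n_correct += len(p & o)       # bonds present in both sets
        n_predicted += len(p)
        n_observed += len(o)
        n_exact += int(p == o)        # whole pattern predicted correctly
    return {
        "Pb": n_correct / n_predicted if n_predicted else 0.0,
        "Rb": n_correct / n_observed if n_observed else 0.0,
        "Qp": n_exact / len(observed) if observed else 0.0,
    }


# Toy example: two proteins with two observed bonds each.
observed = [[(1, 2), (3, 4)], [(1, 4), (2, 3)]]
predicted = [[(2, 1), (3, 4)], [(1, 2)]]
print(score(predicted, observed))
```

When the bonding state is known, the number of predicted bonds equals the number of observed bonds and Pb = Rb. When it has to be predicted as well, as in this table, the two counts can differ and the indices are reported separately.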
Figure 1. Scoring the method at increasing number of sequences in the MSA. The accuracy per protein (Qp) of the different methods is plotted as a function of the number of protein chains in the multiple sequence alignment (MSA quality) used to derive information on correlated mutations. MIp: corrected Mutual Information; iCOV: sparse inverse COVariance estimation; SVR: Support Vector Regression; and their combinations as indicated. For details see Methods.
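To make the dependence on the alignment concrete, the sketch below computes a corrected mutual information score from the columns of an MSA. It assumes that MIp denotes mutual information with the average product correction, MIp(i, j) = MI(i, j) - MI(i, ·) MI(j, ·) / MI(·, ·), where MI(i, ·) is the mean MI of column i with all other columns and MI(·, ·) is the overall mean. It is a simplified illustration: gaps are treated as ordinary symbols, no sequence weighting is applied, and the toy alignment is invented.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(col_i, col_j):
    """Plain mutual information (in nats) between two alignment columns."""
    n = len(col_i)
    f_i, f_j = Counter(col_i), Counter(col_j)
    f_ij = Counter(zip(col_i, col_j))
    # p(a,b) * log(p(a,b) / (p(a) * p(b))), written with raw counts.
    return sum(
        (c / n) * math.log(c * n / (f_i[a] * f_j[b]))
        for (a, b), c in f_ij.items()
    )


def mip(msa):
    """Corrected MI for every column pair (i, j), i < j, of an aligned MSA."""
    columns = list(zip(*msa))
    n_cols = len(columns)
    if n_cols < 2:
        raise ValueError("need at least two alignment columns")
    mi = {
        (i, j): mutual_information(columns[i], columns[j])
        for i, j in combinations(range(n_cols), 2)
    }
    # Mean MI of each column with every other column.
    col_mean = [
        sum(mi[min(i, k), max(i, k)] for k in range(n_cols) if k != i) / (n_cols - 1)
        for i in range(n_cols)
    ]
    overall = sum(mi.values()) / len(mi)
    if overall == 0:
        return dict(mi)  # no covariation at all, nothing to correct
    return {
        (i, j): value - col_mean[i] * col_mean[j] / overall
        for (i, j), value in mi.items()
    }


# Invented toy alignment: columns 0 and 3 co-vary perfectly.
toy_msa = ["ACDC", "ACEC", "GCDW", "GCEW"]
scores = mip(toy_msa)
print(max(scores, key=scores.get))
```

With few sequences the column frequencies are poorly estimated, which is one reason such scores are expected to improve as the alignment grows.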
Figure 2. Scoring the method at increasing NEFF value. The accuracy per protein (Qp) of the different methods is plotted as a function of the NEFF value (NEFF = 1 single sequence, NEFF = 20 random) [25]. MIp: corrected Mutual Information; iCOV: sparse inverse COVariance estimation; SVR: Support Vector Regression; and their combinations as indicated. For details see Methods.
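The end points quoted in the caption are consistent with one common definition of NEFF: the exponential of the mean per-column Shannon entropy of the alignment. The sketch below assumes that definition and omits refinements such as sequence weighting or pseudocounts; the exact procedure of [25] may differ. The function name is illustrative.

```python
import math
from collections import Counter


def neff(msa):
    """exp of the mean per-column Shannon entropy (natural log) of an aligned MSA."""
    columns = list(zip(*msa))
    total_entropy = 0.0
    for col in columns:
        n = len(col)
        # Shannon entropy of the residue frequencies in this column.
        total_entropy -= sum((c / n) * math.log(c / n) for c in Counter(col).values())
    return math.exp(total_entropy / len(columns))


# A single sequence has zero entropy in every column.
print(neff(["ACDEFGHIK"]))
# One column with all 20 amino acids equally frequent.
print(neff(list("ACDEFGHIKLMNPQRSTVWY")))
```

A single sequence gives NEFF = 1, and a column in which all 20 amino acids are equally frequent gives NEFF of 20 up to floating-point rounding, matching the range stated in the caption.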