Jungkap Park, Kazuhiro Saitou.
Abstract
BACKGROUND: Multibody potentials that account for cooperative effects of molecular interactions have shown better accuracy than typical pairwise potentials. The main challenge in developing such potentials is to find relevant structural features that characterize tightly folded proteins. In addition, the side chains of residues adopt several specific, staggered conformations within protein structures, known as rotamers. Different molecular conformations result in different dipole moments and induce charge reorientations. However, until now, modeling of the rotameric state of residues had not been incorporated into the development of multibody potentials for modeling non-bonded interactions in protein structures.
Year: 2014 PMID: 25236673 PMCID: PMC4262145 DOI: 10.1186/1471-2105-15-307
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Description of the interaction between atom types i and j. In total, eight parameters are used to specify the interaction between two atoms. Here, d, θ, and ϕ are the spherical coordinates of atom j with respect to the local frame of atom i, ω is a torsional angle around d, and R_i and R_j represent the rotameric states of the two residues. The rotameric states are determined by side-chain dihedral angles.
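The geometric part of this description can be illustrated with a minimal sketch. It assumes the local frame of atom i is already available as an origin plus three orthonormal axes; how the frame is built from neighboring atoms is not stated in this caption, so the function name, its arguments, and the choice of θ as the polar angle from the local z axis and ϕ as the azimuth in the local x-y plane are all assumptions made for illustration.

```python
from math import acos, atan2, degrees, sqrt


def dot(a, b):
    """Dot product of two 3-vectors given as sequences."""
    return sum(p * q for p, q in zip(a, b))


def spherical_in_local_frame(origin, ex, ey, ez, point):
    """Return (d, theta, phi) of `point` in a local frame (hypothetical helper).

    origin     -- position of atom i
    ex, ey, ez -- orthonormal axes of atom i's local frame (assumed given)
    point      -- position of atom j
    theta is the polar angle from ez and phi the azimuth around ez, in degrees.
    """
    v = [p - o for p, o in zip(point, origin)]
    x, y, z = dot(v, ex), dot(v, ey), dot(v, ez)
    d = sqrt(x * x + y * y + z * z)
    if d == 0.0:
        raise ValueError("atoms i and j coincide; angles are undefined")
    # Clamp to [-1, 1] to guard against rounding error before acos.
    theta = degrees(acos(max(-1.0, min(1.0, z / d))))
    phi = degrees(atan2(y, x))
    return d, theta, phi
```

For example, with the standard axes as the local frame, an atom at (1, 1, 0) lies at a distance of about 1.414 with θ = 90° and ϕ = 45°.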
Figure 2. Bayesian network structure representing the conditional independence of the variables defined in the ROTAS potential. The angular parameters are assumed to be independent of each other given the distance and the rotameric states.
Figure 3. Schematic representation of residue flexibility and rotameric state. (A) Ball-and-stick representation of Asn, which has two side-chain dihedral angles. (B) Newman diagram of the three favored χ1 angles in proteins. The −60°, +60°, and 180° angles are often referred to as gauche minus (g-), gauche plus (g+), and trans (t), respectively.
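As a rough illustration of how a dihedral angle maps to one of the three states named in this caption, the sketch below assigns each angle to the nearest of the three centers (−60°, +60°, 180°) on the circle. The nearest-center rule and all names are assumptions for illustration; the actual binning used for ROTAS is not given here, and dihedrals with fewer than three states (for example those listed with 2 states in the table below) would need a different rule.

```python
# Centers of the three staggered states, labeled as in the Figure 3 caption.
ROTAMER_CENTERS = {"g-": -60.0, "g+": 60.0, "t": 180.0}


def wrap_angle(deg):
    """Map an angle in degrees to the interval (-180, 180]."""
    wrapped = (deg + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped


def rotamer_label(chi_deg):
    """Label one dihedral angle by its nearest center (assumed rule)."""
    def circular_distance(name):
        return abs(wrap_angle(chi_deg - ROTAMER_CENTERS[name]))

    # Ties at exactly 0, 120, or -120 degrees resolve to the first key listed.
    return min(ROTAMER_CENTERS, key=circular_distance)


def rotameric_state(chi_angles):
    """Combine per-dihedral labels into a tuple, e.g. ('g-', 't')."""
    return tuple(rotamer_label(chi) for chi in chi_angles)
```

With two three-state dihedrals this yields 3 × 3 = 9 combined states, which matches the count listed for residues such as ILE and LEU.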
All 167 residue-specific heavy atom types and associated side-chain dihedral angles for defining their rotameric states
| Amino acids | Dihedrals | Associated atoms | Number of rotameric states |
|---|---|---|---|
| GLY | - | C, O, N, Cα | 1 |
| ALA | - | C, O, N, Cα, Cβ | 1 |
| CYS | χ1 | C, O, N, Cα, Cβ, Sγ | 3 |
| SER | χ1 | C, O, N, Cα, Cβ, Oγ | 3 |
| THR | χ1 | C, O, N, Cα, Cβ, Oγ1, Cγ2 | 3 |
| PRO | χ1 | C, O, N, Cα, Cβ, Cγ, Cδ | 3 |
| VAL | χ1 | C, O, N, Cα, Cβ, Cγ1, Cγ2 | 3 |
| ILE | χ1, χ2 | C, O, N, Cα, Cβ, Cγ1, Cγ2, Cδ1 | 9 |
| LEU | χ1, χ2 | C, O, N, Cα, Cβ, Cγ, Cδ1, Cδ2 | 9 |
| ASP | χ1, χ2 | C, O, N, Cα, Cβ, Cγ, Oδ1, Oδ2 | 6 |
| ASN | χ1, χ2 | C, O, N, Cα, Cβ, Cγ, Oδ1, Nδ2 | 6 |
| GLU | χ1, χ2 | C, O, N, Cα, Cβ, Cγ | 9 |
| | χ2, χ3 | Cδ, Oϵ1, Oϵ2 | 6 |
| GLN | χ1, χ2 | C, O, N, Cα, Cβ | 9 |
| | χ2, χ3 | Cγ, Cδ, Oϵ1, Nϵ2 | 6 |
| MET | χ1, χ2 | C, O, N, Cα, Cβ | 9 |
| | χ2, χ3 | Cγ, Sδ, Cϵ | 9 |
| ARG | χ1, χ2 | C, O, N, Cα, Cβ | 9 |
| | χ2, χ3 | Cγ | 9 |
| | χ3, χ4 | Cδ, Nϵ, Cζ | 9 |
| | χ4 | Nη1, Nη2 | 3 |
| LYS | χ1, χ2 | C, O, N, Cα, Cβ | 9 |
| | χ2, χ3 | Cγ | 9 |
| | χ3, χ4 | Cδ, Cϵ, Nζ | 9 |
| HIS | χ1, χ2 | C, O, N, Cα, Cβ, Cγ, Nδ1, Cδ2 | 6 |
| | χ2 | Cϵ1, Nϵ2 | 2 |
| PHE | χ1 | C, O, N, Cα, Cβ | 3 |
| | χ1, χ2 | Cγ, Cδ1, Cδ2 | 6 |
| | χ2 | Cϵ1, Cϵ2, Cζ | 2 |
| TRP | χ1, χ2 | C, O, N, Cα, Cβ, Cγ, Cδ1, Cδ2 | 6 |
| | χ2 | Nϵ1, Cϵ2, Cϵ3, Cζ2, Cζ3, Cη2 | 2 |
| TYR | χ1 | C, O, N, Cα, Cβ | 3 |
| | χ1, χ2 | Cγ, Cδ1, Cδ2 | 6 |
| | χ2 | Cϵ1, Cϵ2, Cζ, Oη | 2 |
Figure 4. Distance dependence of the root mean square of ( − ) for the angular parameters. The observed probability distribution is calculated over all pairs of atom types. The thin, dashed, and dotted curves correspond to θ, ϕ, and ω, respectively.
Figure 5. Examples of the rotamer dependence of the energy terms in the ROTAS potential. (A) Disulfide bond interaction for i and j = Cys Sγ at d = 2 Å, (B) hydrogen bond interaction for i = Ser O and j = Gly N at d = 3 Å, (C) nonpolar interaction for i = Ile Cγ2 and j = Val Cγ1 at d = 5 Å, and (D) polar interaction for i = Lys Nζ and j = Asp Oδ2 at d = 7 Å.
Performance on native structure recognition
| Decoy set | Targets | dDFIRE | OPUS_PSP | RWplus | GOAP | ROTAS |
|---|---|---|---|---|---|---|
| 4state_reduced | 7 | 7 (−4.15) | 7 (−4.49) | 6 (−3.50) | 7 (−4.67) | 7 (− |
| fisa | 4 | 3 (−3.80) | 3 (−4.24) | 3 (−4.78) | 3 (−3.98) | 3 (− |
| lmds | 10 | 6 (−2.44) | 8 (− | 7 (−1.03) | 8 (−4.34) | 8 (−5.47) |
| fisa_casp3 | 5 | 4 (−4.73) | 5 (−6.33) | 4 (−5.17) | 4 (−6.65) | 4 (− |
| hg_structal | 29 | 15 (−1.25) | 18 (−2.28) | 12 (−1.70) | 20 (−2.46) | |
| ig_structal | 61 | 26 (−0.82) | 22 (−1.13) | 0 (1.11) | 44 (−1.91) | |
| ig_structal_hires | 20 | 16 (−2.00) | 15 (−1.79) | 0 (0.31) | 18 (−2.68) | 18 (− |
| lattice_ssfit | 8 | 8 (− | 8 (−6.56) | 8 (−8.77) | 8 (−7.94) | 8 (−8.90) |
| moulder | 20 | 18 (−2.74) | 19 (− | 19 (−2.84) | 19 (−3.53) | 19 (−3.76) |
| rosetta | 59 | 12 (−0.43) | 40 (−3.62) | 20 (−1.21) | 43 (−3.66) | |
| I-TASSER | 56 | 48 (−5.03) | 49 (−5.40) | | 48 (−5.81) | 49 (− |
| Amber99 | 47 | 27 (−3.42) | 20 (−2.58) | 16 (−2.38) | | 37 (− |
| CASP5-8 | 143 | 98 (−1.34) | 134 (− | 106 (−1.67) | 139 (−2.26) | |
| Total | 469 | 288 (−2.16) | 348 (−3.08) | 257 (−1.98) | 399 (−3.35) | |
Numbers outside the parentheses are the numbers of correctly recognized native structures and the ones in the parentheses are the average Z-scores of the native structures. The best scores are highlighted in bold type.
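The two quantities reported in each cell can be sketched as follows. This is a minimal illustration under a common convention, where the Z-score is the native energy minus the mean decoy energy, divided by the standard deviation of the decoy energies, and a native structure counts as recognized when its energy is strictly lower than every decoy's. Whether the authors used the population or sample standard deviation, or included the native structure in the statistics, is not stated here; the function names are hypothetical.

```python
from statistics import mean, pstdev


def native_z_score(native_energy, decoy_energies):
    """Z-score of the native energy relative to the decoy energies.

    Uses the population standard deviation of the decoys (an assumption).
    More negative values mean the native structure stands out more clearly.
    """
    sigma = pstdev(decoy_energies)
    if sigma == 0.0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean(decoy_energies)) / sigma


def native_is_rank1(native_energy, decoy_energies):
    """True if the native structure has the strictly lowest energy."""
    return all(native_energy < energy for energy in decoy_energies)
```

For a native energy of −10 and decoy energies of −4, −2, 0, 2, and 4, the decoy mean is 0 and the population standard deviation is √8, so the Z-score is about −3.54 and the native structure is ranked first.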
Figure 6. Relationship between the energy scores of ROTAS and GOAP for all native and decoy structures. Red and gray dots represent native and decoy structures, respectively.
Ability of ROTAS to recognize native structures as a function of native structure resolution
| Exp. method | Resolution (Å) | Targets | Rank1 | Z |
|---|---|---|---|---|
| NMR | - | 25 | 15 (60%) | −3.32 |
| X-ray | all | 444 | 394 (89%) | −3.82 |
| | R ≤ 1.8 | 152 | 143 (94%) | −4.91 |
| | 1.8 ≤ R < 2.2 | 171 | 153 (89%) | −3.71 |
| | 2.2 ≤ R < 2.8 | 102 | 86 (84%) | −2.78 |
| | 2.8 < R | 19 | 12 (63%) | −1.79 |
Numbers in parentheses are the percentages of targets whose native structure is ranked first (Rank1).
Performance on best model selection
| Decoy set | dDFIRE logP B1 | dDFIRE logP B10 | OPUS_PSP logP B1 | OPUS_PSP logP B10 | RWplus logP B1 | RWplus logP B10 | GOAP logP B1 | GOAP logP B10 | ROTAS logP B1 | ROTAS logP B10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 4state_reduced | −3.60 | −5.84 | −4.03 | | −2.80 | −5.70 | −4.68 | −6.04 | | −6.10 |
| fisa | −2.68 | −4.03 | −1.57 | −3.61 | −2.18 | −4.06 | | −4.34 | −2.23 | |
| lmds | −1.51 | −3.39 | −1.08 | −3.36 | −1.04 | −3.45 | | | −1.83 | −3.57 |
| fisa_casp3 | −1.42 | −3.24 | −0.81 | −3.13 | −1.19 | | | −3.33 | −1.30 | −3.78 |
| hg_structal | −2.44 | | −2.55 | −3.17 | −2.50 | −3.33 | −2.42 | −3.29 | | −3.31 |
| ig_structal | −2.06 | −3.58 | | | −2.14 | −3.56 | −2.17 | −3.69 | −1.96 | −3.67 |
| ig_structal_hires | −1.84 | −2.66 | −1.93 | | | −2.81 | −1.91 | −2.71 | −1.83 | −2.77 |
| moulder | −3.17 | −4.79 | −2.71 | −4.62 | −3.06 | −4.90 | | −5.08 | −3.72 | |
| lattice_ssfit | −1.60 | −3.68 | −1.03 | −3.53 | −1.13 | | −1.24 | −2.72 | | −3.01 |
| rosetta | −1.30 | −3.45 | | −3.18 | −1.72 | | −1.65 | −3.56 | −1.51 | −3.59 |
| I-TASSER | −1.83 | | −1.26 | −3.60 | −1.78 | −3.73 | −1.77 | −3.61 | | −3.69 |
| Amber99 | −3.64 | −5.43 | −3.03 | −4.72 | −3.48 | −4.94 | −4.09 | −5.64 | | |
| CASP5-8 | −1.89 | −2.80 | −1.36 | −2.77 | −1.88 | | | −2.80 | −1.87 | −2.80 |
| Total | −2.11 | −3.58 | −1.90 | −3.44 | −2.11 | −3.56 | | −3.60 | −2.23 | |
The best scores are highlighted in bold type.
Performance on correlation coefficients between energy score and model quality
| Decoy set | dDFIRE r | dDFIRE τ | OPUS_PSP r | OPUS_PSP τ | RWplus r | RWplus τ | GOAP r | GOAP τ | ROTAS r | ROTAS τ |
|---|---|---|---|---|---|---|---|---|---|---|
| 4state_reduced | −0.693 | −0.483 | −0.590 | −0.399 | −0.605 | −0.417 | −0.766 | −0.550 | | |
| fisa | −0.461 | −0.321 | −0.282 | −0.189 | −0.462 | −0.315 | | | −0.442 | −0.297 |
| lmds | | | −0.091 | −0.054 | −0.147 | −0.095 | −0.228 | −0.149 | −0.227 | −0.149 |
| fisa_casp3 | | | −0.090 | −0.063 | −0.236 | −0.152 | −0.161 | −0.102 | −0.182 | −0.117 |
| hg_structal | −0.796 | −0.618 | −0.752 | −0.553 | −0.806 | | −0.808 | −0.609 | | −0.602 |
| ig_structal | −0.766 | −0.308 | −0.779 | −0.340 | −0.782 | −0.277 | | | −0.836 | −0.372 |
| ig_structal_hires | −0.844 | −0.373 | −0.832 | −0.403 | −0.879 | −0.411 | | | −0.860 | −0.401 |
| lattice_ssfit | −0.068 | −0.047 | −0.050 | −0.033 | | | −0.034 | −0.025 | −0.043 | −0.029 |
| moulder | −0.832 | | −0.755 | −0.600 | −0.792 | −0.642 | −0.823 | −0.660 | | −0.665 |
| rosetta | −0.265 | −0.176 | −0.192 | −0.113 | −0.350 | | −0.330 | −0.212 | | −0.221 |
| I-TASSER | | | −0.281 | −0.195 | −0.485 | −0.290 | −0.465 | −0.276 | −0.456 | −0.271 |
| Amber99 | −0.609 | −0.339 | −0.421 | −0.201 | −0.526 | −0.313 | −0.692 | −0.355 | | |
| CASP5-8 | −0.594 | −0.488 | −0.440 | −0.354 | −0.611 | −0.501 | −0.593 | −0.490 | | |
| Total | −0.581 | −0.380 | −0.465 | −0.297 | −0.584 | −0.382 | −0.603 | −0.394 | | |
r: Pearson’s correlation coefficient.
τ: Kendall’s rank correlation coefficient.
The best scores are highlighted in bold type.
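For readers who want to reproduce this kind of measurement on their own decoy sets, the sketch below computes both coefficients between a list of energy scores and a list of model-quality values (for example TM-scores). It is a plain-Python illustration, not the authors' code: the function names are hypothetical, and the Kendall coefficient shown is the simple tau-a variant without tie correction, which may differ from the variant used for the table.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson's linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Tied pairs (sign == 0) count toward neither total.
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Because a lower energy should correspond to a higher model quality, a well-behaved potential gives values close to −1 for both coefficients, which is why the more negative entries in the table are the better ones.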
Figure 7. Examples of Pearson correlation between ROTAS energy and TM-score. (A) 1SCP_ in I-TASSER, (B) 1CAU in Moulder, (C) 1LOU in Rosetta, and (D) T0324 in CASP7. The native structures are included and represented as empty red circles at TM-score = 1.
Area under the ROC curve (AUC) for classification of near-native and non-native models
| | Targets | <\|P\|> | <\|N\|> | dDFIRE | OPUS_PSP | RWplus | GOAP | ROTAS |
|---|---|---|---|---|---|---|---|---|
| 4state_reduced | 7 | 195 | 468 | 0.86 | 0.80 | 0.81 | 0.91 | |
| fisa | 2 | 47 | 453 | 0.79 | 0.60 | | 0.79 | 0.77 |
| lmds | 2 | 60 | 439 | | 0.64 | 0.66 | 0.61 | 0.56 |
| fisa_casp3 | 2 | 20 | 1672 | | 0.58 | 0.72 | 0.68 | 0.70 |
| moulder | 19 | 151 | 169 | 0.95 | 0.93 | 0.95 | 0.95 | |
| rosetta | 27 | 50 | 50 | 0.71 | 0.66 | 0.74 | 0.75 | |
| I-TASSER | 31 | 229 | 217 | 0.79 | 0.71 | 0.77 | 0.80 | |
| Amber99 | 41 | 219 | 821 | 0.87 | 0.79 | 0.83 | 0.93 | |
| CASP5-8 | 89 | 14 | 7 | 0.82 | 0.75 | 0.84 | 0.83 | |
| Total | 220 | 105 | 245 | 0.82 | 0.75 | 0.82 | 0.84 | |
| p-value | | | | 1.02E-04 | 7.70E-27 | 1.30E-03 | 2.03E-06 | |
<|P|>: Average number of positive (near-native) models per target.
<|N|>: Average number of negative (non-native) models per target.
p-value: p-value of a paired t-test on the difference in AUC between ROTAS and the given potential.
The best scores are highlighted in bold type.
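The AUC for a single target can be computed directly from its pairwise-ranking interpretation, as in the minimal sketch below. It assumes the models have already been labeled as near-native (positive) or non-native (negative) by some quality threshold, which is not specified here, and that a lower energy means a more native-like prediction. The function name is hypothetical.

```python
def auc_lower_energy_is_positive(positive_energies, negative_energies):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A pair is ranked correctly when the near-native model has the lower
    energy; ties count as half. This equals the area under the ROC curve
    obtained by sweeping an energy threshold.
    """
    if not positive_energies or not negative_energies:
        raise ValueError("need at least one positive and one negative model")
    correct = 0.0
    for pos in positive_energies:
        for neg in negative_energies:
            if pos < neg:
                correct += 1.0
            elif pos == neg:
                correct += 0.5
    return correct / (len(positive_energies) * len(negative_energies))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of near-native from non-native models. The double loop is quadratic in the number of models, which is adequate for decoy sets of the sizes listed in the table.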
Figure 8. Influence of the cutoff distance on the performance of ROTAS and GOAP. (A) Number of first-ranked native structures, (B) average Z-score of the native structures, (C) average log P, and (D) average Pearson’s correlation coefficient between TM-score and energy score.
Comparison of different reference states in ROTAS
| Ref. state | Rank1 | Z-score | logP B1 | logP B10 | Pearson’s r | Kendall’s τ |
|---|---|---|---|---|---|---|
| DFIRE | 409 | −3.795 | −2.233 | | −0.612 | −0.396 |
| DOPE | 409 | −3.810 | −2.172 | −3.576 | −0.566 | −0.358 |
| RW | 408 | −3.818 | | −3.645 | | |
| RAPDF | 409 | | −2.185 | −3.592 | −0.578 | −0.367 |
| KBP | 409 | −3.638 | −2.276 | −3.630 | −0.609 | −0.393 |
The best scores are highlighted in bold type.
Performance of EPAD, ROTAS and ROTAS + EPAD
| | Rank1 | Z-score | logP B1 | logP B10 | Pearson’s r | Kendall’s τ |
|---|---|---|---|---|---|---|
| EPAD | 260 | −2.13 | −2.11 | −3.56 | | |
| ROTAS | 409 | −3.80 | | | −0.61 | −0.40 |
| EPAD + ROTAS | | | −2.22 | −3.61 | −0.59 | −0.38 |
The best scores are highlighted in bold type.