Shiqiao Du, Yuichi Harano, Masahiro Kinoshita, Minoru Sakurai.
Abstract
We predict protein structure using our recently developed free energy function for describing protein stability, which focuses on solvation thermodynamics. The function is combined with the most reliable current sampling methods, i.e., fragment assembly (FA) and comparative modeling (CM). The prediction is tested on 11 small proteins for which high-resolution crystal structures are available. For 8 of these proteins, similar sequences are found in the database, and the prediction is performed with CM. Fairly accurate models with an average Cα root mean square deviation (RMSD) of ∼ 2.0 Å are obtained in all of these cases. For the remaining target proteins, we perform the prediction following FA protocols. In 2 cases, we obtain predicted models with an RMSD of ∼ 3.0 Å as the best-scored structures. In the other case, the RMSD remains larger than 7 Å. For all 11 target proteins, our scoring function identifies the experimentally determined native structure as the best structure. Starting from the predicted structures, replica exchange molecular dynamics is performed to refine them further. However, we are unable to improve their RMSD toward the experimental structure. Exhaustive sampling by coarse-grained normal mode analysis around the native structures reveals that our function has a linear correlation with the RMSD for RMSDs < 3.0 Å. These results suggest that the function is quite reliable for protein structure prediction, while the sampling method remains one of the major limiting factors. Aspects through which the methodology could be further improved are discussed.
Keywords: coarse-grained normal mode analysis; fragment assembly; homology modeling; protein structure prediction; replica-exchange molecular dynamics; solvation thermodynamics
Year: 2012 PMID: 27493529 PMCID: PMC4629643 DOI: 10.2142/biophysics.8.127
Source DB: PubMed Journal: Biophysics (Nagoya-shi) ISSN: 1349-2942
Properties of the 11 proteins used to test the prediction protocol and the score
| PDB | Nres | Description | SCOP | Resolution (Å) |
|---|---|---|---|---|
| 1whz | 70 | Hypothetical protein | α+β | 1.52 |
| 1ttz | 75 | Unknown Function | α/β | 2.11 |
| 1ptf | 87 | Phosphotransferase | α+β | 1.60 |
| 2he4 | 90 | Unknown Function | not classified | 1.45 |
| 1s12 | 94 | Unknown Function | α+β | 2.00 |
| 2hd3 | 96 | Ethanolamine Utilization Protein | all β | 2.40 |
| 2ivy | 101 | Hypothetical protein | α+β | 1.40 |
| 1tr0 | 106 | Plant Protein | α+β | 1.80 |
| 3dcx | 117 | Unknown Function | all β | 2.00 |
| 2hng | 127 | Hypothetical protein | α+β | 1.63 |
| 1hka | 158 | Transferase | α+β | 1.50 |
The items listed include the PDB entry name, the total number of residues (Nres), a description of the biological function and source of the protein, the SCOP secondary structure class, and the experimentally determined resolution in Å.
Figure 1. The plot of Fsolv as a function of Cα-RMSD for the generated models. All target proteins are presented. The X-axis is the RMSD of the decoy structures from the native structure. The Y-axis is the corresponding normalized Fsolv value.
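The RMSD on the X-axis can be illustrated with a minimal sketch. It assumes the decoy and native Cα coordinates are already optimally superposed, so the fitting step used in practice is omitted here. The function name `ca_rmsd` and the three-atom coordinates are hypothetical and serve only as an illustration.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points, in the input units.

    Assumption: both structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical three-atom example: every atom is displaced by 1 Å along x.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
decoy = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
rmsd = ca_rmsd(native, decoy)
```

A uniform displacement gives an RMSD equal to that displacement, which is why real comparisons superpose the structures first.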
RMSD of model structures
| PDB | Method | Closest decoy | Fsolv selected | AMBER99 selected | Best in CASPs |
|---|---|---|---|---|---|
| 1whz | FA | 3.70 | 7.56 | 11.67 | 1.58 |
| 1ttz | FA | 2.45 | 3.14 | 4.32 | – |
| 1ptf | CM | 0.93 | 1.10 | 1.07 | – |
| 2he4* | CM | 1.46 | 1.95 | 2.45 | 0.73 |
| 1s12* | FA | 2.44 | 3.09 | 8.00 | 2.08 |
| 2hd3* | CM | 1.30 | 2.20 | 2.53 | 3.78 |
| 2ivy | CM | 2.71 | 3.35 | 3.51 | – |
| 1tr0 | CM | 1.30 | 1.82 | 1.98 | 2.08 |
| 3dcx | CM | 3.09 | 3.49 | 3.51 | – |
| 2hng* | CM | 1.23 | 2.06 | 4.64 | 5.38 |
| 1hka* | CM | 2.03 | 2.11 | 2.11 | 6.03 |
The items listed include the PDB entry name, the prediction method, the minimum RMSD found among the generated models, the RMSD of the model selected by Fsolv, the RMSD of the model selected by the AMBER99SB/GBSA energy, and the best RMSD in previous CASPs (if available). All RMSD values are in Å. An asterisk (*) marks a model generated with structural data, either fragments or templates, that had not been published at the time of the corresponding CASP round. The publication dates of the protein structures were obtained from the RCSB PDB web page (http://www.rcsb.org/pdb/home/home.do).
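As a quick check on the table, the per-method averages can be recomputed. The sketch below copies the RMSD values from the rows above and assumes, following the caption, that the fourth column is the Fsolv-selected model and the fifth is the AMBER99-selected model. The names `ROWS` and `mean_rmsd` are illustrative only.

```python
# (PDB, method, closest decoy, Fsolv selected, AMBER99 selected); RMSD in Å,
# copied from the table above.
ROWS = [
    ("1whz", "FA", 3.70, 7.56, 11.67),
    ("1ttz", "FA", 2.45, 3.14, 4.32),
    ("1ptf", "CM", 0.93, 1.10, 1.07),
    ("2he4", "CM", 1.46, 1.95, 2.45),
    ("1s12", "FA", 2.44, 3.09, 8.00),
    ("2hd3", "CM", 1.30, 2.20, 2.53),
    ("2ivy", "CM", 2.71, 3.35, 3.51),
    ("1tr0", "CM", 1.30, 1.82, 1.98),
    ("3dcx", "CM", 3.09, 3.49, 3.51),
    ("2hng", "CM", 1.23, 2.06, 4.64),
    ("1hka", "CM", 2.03, 2.11, 2.11),
]

FSOLV, AMBER = 3, 4  # column indices within each row


def mean_rmsd(rows, method, column):
    """Average one RMSD column over the targets predicted with the given method."""
    values = [row[column] for row in rows if row[1] == method]
    return sum(values) / len(values)


for method in ("CM", "FA"):
    print(
        method,
        round(mean_rmsd(ROWS, method, FSOLV), 2),
        round(mean_rmsd(ROWS, method, AMBER), 2),
    )
```

For the eight CM targets, the Fsolv-selected models average about 2.3 Å, in line with the ∼ 2.0 Å quoted in the abstract.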
Figure 2. The superposition of the native and the predicted structure for all proteins. The native and predicted structures are colored in green and purple, respectively.
Figure 3. The plot of AMBER99SB/GBSA energy for the generated models as a function of the RMSD. The X-axis is the Cα-RMSD of the decoy structures from the native structure. The Y-axis is the corresponding normalized score value. The values are normalized against the score of the native structure.
Figure 4. (a) The plot of Fsolv as a function of the Cα-RMSD for models generated by REMD starting from the best predicted model (PDB ID: 1ttz). The initial structure is colored in red. (b) The plot of Fsolv as a function of the native contact value for the models generated by REMD. To compare profiles, normalization is done with the equation (Fsolv (decoy) − Fsolv (native))/(residue length).
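The normalization in the caption above can be sketched as follows. The score values are hypothetical and in arbitrary units; only the residue length of 75 is taken from the table entry for 1ttz. The helper `native_is_best` additionally assumes that a lower Fsolv indicates a more stable structure, which is not stated in this excerpt.

```python
def normalized_fsolv(f_decoy, f_native, n_res):
    """(Fsolv(decoy) - Fsolv(native)) / residue length, as in the Figure 4 caption."""
    if n_res <= 0:
        raise ValueError("residue length must be positive")
    return (f_decoy - f_native) / n_res


def native_is_best(f_decoys, f_native):
    """True if every decoy scores above the native (assumes lower Fsolv is better)."""
    return all(f > f_native for f in f_decoys)


# Hypothetical scores in arbitrary units for a 75-residue protein.
f_native = -150.0
f_decoys = [-120.0, -135.0, -90.0]
normalized = [normalized_fsolv(f, f_native, 75) for f in f_decoys]
```

With this convention the native structure sits at zero, and a positive normalized value means the decoy scores worse than the native.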
Figure 5. The plot of Fsolv as a function of Cα-RMSD for NNMs. The axes are the same as in Figure 1.
Spearman coefficient between RMSD and Fsolv value for NNMs
| PDB | Spearman coefficient of NNMs |
|---|---|
| 1whz | 0.72 |
| 1ttz | 0.44 |
| 1ptf | 0.74 |
| 2he4 | 0.17 |
| 1s12 | 0.55 |
| 2hd3 | 0.70 |
| 2ivy | 0.16 |
| 1tr0 | 0.38 |
| 3dcx | 0.47 |
| 2hng | 0.23 |
| 1hka | 0.43 |
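For reference, a Spearman coefficient of the kind tabulated above can be computed as the Pearson correlation of ranks. The sketch below is a plain-Python illustration with hypothetical RMSD and score values. It is not the authors' script, and the function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: RMSD (Å) and normalized score for five models.
rmsd_values = [0.5, 1.0, 1.5, 2.0, 2.5]
score_values = [0.1, 0.3, 0.2, 0.6, 0.9]
rho = spearman(rmsd_values, score_values)
```

A coefficient near 1 means the score rises monotonically with RMSD, so the score ordering of the models tracks their distance from the native structure.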