Abstract
The prediction of the three-dimensional (3D) structure of proteins from the amino acid sequence has made a stunning breakthrough, reaching atomic accuracy. Using the neural network-based method AlphaFold2, 3D structures of almost the entire human proteome have been predicted and made available (https://www.alphafold.ebi.ac.uk). To gain insight into how well AlphaFold2 structures represent the conformation of proteins in solution, I here compare the AlphaFold2 structures of selected small proteins with their 3D structures determined by nuclear magnetic resonance (NMR) spectroscopy. The proteins selected are those whose 3D solution structures were determined on the basis of a very large number of distance restraints and residual dipolar couplings (RDCs) and are thus some of the best-resolved solution structures of proteins to date. The quality of the backbone conformation of the AlphaFold2 structures is assessed by fitting a large set of experimental RDCs. The analysis shows that experimental RDCs fit extremely well to the AlphaFold2 structures predicted for GB3, DinI, and ubiquitin. In the case of GB3, the accuracy of the AlphaFold2 structure even surpasses that of a 1.1 Å crystal structure. Fitting of experimental RDCs furthermore allows identification of the AlphaFold2 structures that best represent the protein's conformation in solution, as seen for the EF hands of the N-terminal domain of Ca2+-ligated calmodulin. Taken together, the analysis shows that structures predicted by AlphaFold2 can be highly representative of the solution conformation of proteins. The combination of AlphaFold2 structures with RDCs promises to be a powerful approach to study structural changes in proteins.
Keywords: AlphaFold; NMR spectroscopy; conformational dynamics; dipolar coupling
Year: 2021 PMID: 34469019 PMCID: PMC8521308 DOI: 10.1002/pro.4175
Source DB: PubMed Journal: Protein Sci ISSN: 0961-8368 Impact factor: 6.725
FIGURE 1. Comparison of structures predicted by AlphaFold2 (AF2) with experimental RDCs and RDC‐derived NMR structures. (a,b) The third IgG‐binding domain from streptococcal protein G (GB3): (a) RDC‐derived NMR structure (grey; PDB id: 1P7F), 1.1 Å X‐ray structure (blue; PDB id: 1IGD), AF2‐structure (green); (b) fit of four types of experimental RDCs (HN–N, Cα–Hα, C′–Cα, C′–N; taken from 1P7F.mr) to the AF2‐structure shown in (a). (c) DNA damage‐inducible protein I (DinI): RDC‐derived NMR structure (grey; PDB id: 1GHH), AF2‐models #1 to #5 (green, cyan, pink, yellow, orange, respectively) predicted using the Google collaborative notebook for AF2 prediction at https://colab.research.google.com/drive/1LVPSOf4L502F21RWBmYJJYYLDlOU2NTL, and AF2‐structure downloaded from the AF2 database (blue; https://www.alphafold.ebi.ac.uk). (d) Ubiquitin: RDC‐derived NMR structure (blue; PDB id: 2MJB), AF2‐model #3 (cyan), and AF2‐model #4 (green). The zoomed view shows a near‐perfect fit of AF2‐model #3 to the loop conformation in the RDC‐derived NMR structure. (e–g) Calmodulin: (e) X‐ray structure (grey; PDB id: 1CLL), and AF2‐models #1 to #5 (orange, yellow, pink, cyan, green, respectively) aligned on the C‐terminal domain (differences in the relative orientation of the N‐terminal domain are indicated by a dashed arrow‐headed line); (f) N‐terminal domain superposition of the RDC‐refined NMR structure (PDB id: 1J7O; wheat), the X‐ray structure (PDB id: 1CLL; grey) and the best‐fitting AF2‐model (magenta); (g) comparison of experimental HN–N RDCs (blue; taken from 1J7O.mr) along the sequence of calmodulin with RDCs back‐calculated from the RDC‐refined NMR structure (PDB id: 1J7O; top), the X‐ray structure (PDB id: 1CLL; middle) and the best‐fitting AF2‐model shown in (f) (bottom). The location of the four α‐helices in the N‐terminal domain of calmodulin is indicated above. Data plots were generated using http://spin.niddk.nih.gov/bax/nmrserver/dc/svd.html
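The RDC fits referred to above can be illustrated with a minimal sketch of the general idea: a five-parameter alignment tensor is fitted by least squares to the measured couplings, and the agreement is summarized by a quality factor Q. This is not the code behind the server cited in the caption. The function names (`design_matrix`, `fit_alignment_tensor`), the synthetic bond vectors, and the choice of Q as rms(observed − back-calculated)/rms(observed) are assumptions made for illustration. `numpy.linalg.lstsq` is used as an SVD-based solver, and the fitted parameters absorb the dipolar coupling constant, so a single coupling type is assumed.

```python
import numpy as np

def design_matrix(vectors):
    """Build the N x 5 matrix that maps tensor elements to RDCs.

    Each row corresponds to one internuclear (bond) vector, which is
    normalized to unit length before use.
    """
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    x, y, z = v[:, 0], v[:, 1], v[:, 2]
    # The tensor is symmetric and traceless (Szz = -Sxx - Syy), so the
    # unknowns are Sxx, Syy, Sxy, Sxz, Syz.
    return np.column_stack([x**2 - z**2, y**2 - z**2,
                            2 * x * y, 2 * x * z, 2 * y * z])

def fit_alignment_tensor(vectors, rdcs):
    """Least-squares fit of a (scaled) alignment tensor to observed RDCs.

    Returns the five fitted parameters, the back-calculated RDCs, and an
    assumed quality factor Q = rms(obs - calc) / rms(obs).
    """
    A = design_matrix(vectors)
    d = np.asarray(rdcs, dtype=float)
    params, *_ = np.linalg.lstsq(A, d, rcond=None)
    calc = A @ params
    q = np.sqrt(np.mean((d - calc) ** 2)) / np.sqrt(np.mean(d ** 2))
    return params, calc, q

# Synthetic demonstration (hypothetical data, not taken from 1P7F.mr):
rng = np.random.default_rng(0)
vectors = rng.normal(size=(40, 3))                 # made-up bond vectors
true_params = np.array([3.0, -8.0, 1.5, -2.0, 4.0])  # made-up tensor, in Hz
rdcs = design_matrix(vectors) @ true_params        # noise-free "measurements"

params, calc, q = fit_alignment_tensor(vectors, rdcs)
print("fitted parameters:", params)
print("Q factor:", q)
```

With real data, the bond vectors would be taken from the coordinates of the structure being tested (for example an AF2 model), and a lower Q would indicate better agreement between that structure and the measured RDCs.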