Nazim Bouatta, Peter Sorger, Mohammed AlQuraishi.
Abstract
The functions of most proteins result from their 3D structures, but determining their structures experimentally remains a challenge, despite steady advances in crystallography, NMR and single-particle cryoEM. Computationally predicting the structure of a protein from its primary sequence has long been a grand challenge in bioinformatics, intimately connected with understanding protein chemistry and dynamics. Recent advances in deep learning, combined with the availability of genomic data for inferring co-evolutionary patterns, provide a new approach to protein structure prediction that is complementary to longstanding physics-based approaches. The outstanding performance of AlphaFold2 in the recent Critical Assessment of protein Structure Prediction (CASP14) experiment demonstrates the remarkable power of deep learning in structure prediction. In this perspective, we focus on the key features of AlphaFold2, including its use of (i) attention mechanisms and Transformers to capture long-range dependencies, (ii) symmetry principles to facilitate reasoning over protein structures in three dimensions and (iii) end-to-end differentiability as a unifying framework for learning from protein data. The rules of protein folding are ultimately encoded in the physical principles that underpin it; we conclude by discussing the implications of having a powerful computational model for structure prediction that does not explicitly rely on those principles.
Keywords: AlphaFold2; CASP14; protein structure prediction
Year: 2021; PMID: 34342271; PMCID: PMC8329862; DOI: 10.1107/S2059798321007531
Source DB: PubMed; Journal: Acta Crystallogr D Struct Biol; ISSN: 2059-7983; Impact factor: 7.652
Figure 1. AlphaFold2 prediction of the full-length chain of human EGFR (UniProt ID: P00533), color-coded by model confidence (dark blue, highly confident; dark orange, very low confidence). Individual domains are confidently predicted, but inter-domain arrangement is not, as evidenced by long unstructured linkers with very low model confidence.