Mihaly Varadi1, Stephen Anyango1, Mandar Deshpande1, Sreenath Nair1, Cindy Natassia1, Galabina Yordanova1, David Yuan1, Oana Stroe1, Gemma Wood1, Agata Laydon2, Augustin Žídek2, Tim Green2, Kathryn Tunyasuvunakool2, Stig Petersen2, John Jumper2, Ellen Clancy2, Richard Green2, Ankur Vora2, Mira Lutfi2, Michael Figurnov2, Andrew Cowie2, Nicole Hobbs2, Pushmeet Kohli2, Gerard Kleywegt1, Ewan Birney1, Demis Hassabis2, Sameer Velankar1.
Abstract
The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions. Powered by AlphaFold v2.0 of DeepMind, it has enabled an unprecedented expansion of the structural coverage of the known protein-sequence space. AlphaFold DB provides programmatic access to and interactive visualization of predicted atomic coordinates, per-residue and pairwise model-confidence estimates and predicted aligned errors. The initial release of AlphaFold DB contains over 360,000 predicted structures across 21 model-organism proteomes, which will soon be expanded to cover most of the (over 100 million) representative sequences from the UniRef90 data set.
Year: 2022 PMID: 34791371 PMCID: PMC8728224 DOI: 10.1093/nar/gkab1061
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 19.160
Structural predictions for complete proteomes in AlphaFold DB
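| Species | Common name | Reference proteome | Predicted structures |
|---|---|---|---|
| Arabidopsis thaliana | Arabidopsis | UP000006548 | 27 434 |
| Caenorhabditis elegans | Nematode worm | UP000001940 | 19 694 |
| Candida albicans | C. albicans | UP000000559 | 5974 |
| Danio rerio | Zebrafish | UP000000437 | 24 664 |
| Dictyostelium discoideum | Dictyostelium | UP000002195 | 12 622 |
| Drosophila melanogaster | Fruit fly | UP000000803 | 13 458 |
| Escherichia coli | E. coli | UP000000625 | 4363 |
| Glycine max | Soybean | UP000008827 | 55 799 |
| Homo sapiens | Human | UP000005640 | 23 391 |
| Leishmania infantum | L. infantum | UP000008153 | 7924 |
| Methanocaldococcus jannaschii | M. jannaschii | UP000000805 | 1773 |
| Mus musculus | Mouse | UP000000589 | 21 615 |
| Mycobacterium tuberculosis | M. tuberculosis | UP000001584 | 3988 |
| Oryza sativa | Asian rice | UP000059680 | 43 649 |
| Plasmodium falciparum | P. falciparum | UP000001450 | 5187 |
| Rattus norvegicus | Rat | UP000002494 | 21 272 |
| Saccharomyces cerevisiae | Budding yeast | UP000002311 | 6040 |
| Schizosaccharomyces pombe | Fission yeast | UP000002485 | 5128 |
| Staphylococcus aureus | S. aureus | UP000008816 | 2888 |
| Trypanosoma cruzi | T. cruzi | UP000002296 | 19 036 |
| Zea mays | Maize | UP000007305 | 39 299 |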
AlphaFold DB provides free access to over 360,000 predicted structures across 21 proteomes. The data set contains proteins with sequence lengths of 16–2700 residues and excludes isoforms and sequences with unknown or non-standard amino acids.
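The inclusion criteria in the table note can be expressed as a simple sequence filter. The following is a minimal sketch, not the pipeline used to build AlphaFold DB: the function name `is_eligible` is hypothetical, the length bounds are assumed to be inclusive, and the exclusion of isoforms is not modelled because it depends on annotation rather than on the sequence itself.

```python
# Minimal sketch of the stated inclusion criteria (assumptions noted above).
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard residues
MIN_LENGTH = 16     # lower bound from the table note, assumed inclusive
MAX_LENGTH = 2700   # upper bound from the table note, assumed inclusive


def is_eligible(sequence: str) -> bool:
    """Return True if the sequence passes the length and alphabet checks."""
    seq = sequence.strip().upper()
    if not MIN_LENGTH <= len(seq) <= MAX_LENGTH:
        return False
    # Rejects unknown (X) and non-standard (e.g. U, O, B, Z) residue codes.
    return set(seq) <= STANDARD_AMINO_ACIDS
```

A sequence containing `X` or `U`, or one shorter than 16 residues, would be rejected by this check.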
Figure 1. Searching AlphaFold DB. AlphaFold DB provides a search engine to find proteins of interest based on gene or protein name, UniProt accession or organism name. The search results can be filtered if required, and clicking on a protein name leads to the relevant protein-specific entry page.
Figure 2. Meta-information and 3D visualization of the AlphaFold structure predictions. The protein-specific web pages display essential metadata for the protein of interest, such as known biological functions and cross-references to UniProt and PDBe-KB. Users can download the predicted models in PDB and mmCIF format, and an interactive molecular viewer visualizes the structure, coloured by the per-residue pLDDT confidence measure.
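Per-residue pLDDT values can also be read from a downloaded PDB-format model. The sketch below rests on assumptions not stated in this record: that the model file stores pLDDT in the B-factor field of each ATOM record, that the model contains a single chain, and that the four confidence bands used for colouring are split at 90, 70 and 50. The function names are hypothetical.

```python
from typing import Dict, Iterable


def plddt_per_residue(pdb_lines: Iterable[str]) -> Dict[int, float]:
    """Map residue number to pLDDT, read from C-alpha ATOM records.

    Uses the fixed-column PDB layout: atom name in columns 13-16,
    residue number in columns 23-26, B-factor in columns 61-66.
    """
    scores: Dict[int, float] = {}
    for line in pdb_lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def confidence_band(plddt: float) -> str:
    """Assign a pLDDT value to an assumed confidence band."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"
```

For example, `plddt_per_residue(open("model.pdb"))` (with `model.pdb` standing in for any downloaded model file) returns a dictionary that can be passed residue by residue to `confidence_band`.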
Figure 3. Visualization of Predicted Aligned Errors. Protein-specific pages contain an interactive 2D plot of the PAE values. This plot is linked to the 3D molecular viewer to help identify domains whose relative positions and orientations AlphaFold predicts with confidence. In this example (https://alphafold.ebi.ac.uk/entry/Q93074), AlphaFold has high confidence in the relative position of the domains at residues 1–500 (green) and residues 1200–1700 (blue), but not in the position of the region between residues 500 and 1200 (orange) or of the C-terminus.
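The kind of reading described in this caption can be approximated numerically by averaging the PAE over the off-diagonal block that links two residue ranges. This is an illustrative sketch only: it assumes the PAE values are already loaded as a square matrix (a list of rows, indexed from residue 1), and the function name `mean_inter_domain_pae` is hypothetical. Because PAE is not symmetric, both directions are averaged.

```python
from typing import Sequence


def mean_inter_domain_pae(
    pae: Sequence[Sequence[float]], domain_a: range, domain_b: range
) -> float:
    """Mean PAE between two residue ranges given as 1-based ranges.

    A low value suggests the relative placement of the two ranges is
    predicted with confidence; a high value suggests it is not.
    """
    values = []
    for i in domain_a:
        for j in domain_b:
            values.append(pae[i - 1][j - 1])  # error at j when aligned on i
            values.append(pae[j - 1][i - 1])  # error at i when aligned on j
    if not values:
        raise ValueError("both residue ranges must be non-empty")
    return sum(values) / len(values)
```

For the example entry, one would compare `mean_inter_domain_pae(pae, range(1, 501), range(1200, 1701))` with the same measure computed against the 500–1200 region; no numerical threshold is implied by the source.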