Maxwell I Zimmerman, Justin R Porter, Michael D Ward, Sukrit Singh, Neha Vithani, Artur Meller, Upasana L Mallimadugula, Catherine E Kuhn, Jonathan H Borowsky, Rafal P Wiewiora, Matthew F D Hurley, Aoife M Harbison, Carl A Fogarty, Joseph E Coffland, Elisa Fadda, Vincent A Voelz, John D Chodera, Gregory R Bowman.
Abstract
SARS-CoV-2 has intricate mechanisms for initiating infection, immune evasion/suppression and replication that depend on the structure and dynamics of its constituent proteins. Many protein structures have been solved, but far less is known about their relevant conformational changes. To address this challenge, over a million citizen scientists banded together through the Folding@home distributed computing project to create the first exascale computer and simulate 0.1 seconds of the viral proteome. Our adaptive sampling simulations predict dramatic opening of the apo spike complex, far beyond that seen experimentally, explaining and predicting the existence of 'cryptic' epitopes. Different spike variants modulate the probabilities of open versus closed structures, balancing receptor binding and immune evasion. We also discover dramatic conformational changes across the proteome, which reveal over 50 'cryptic' pockets that expand targeting options for the design of antivirals. All data and models are freely available online, providing a quantitative structural atlas.
Year: 2021 PMID: 34031561 PMCID: PMC8249329 DOI: 10.1038/s41557-021-00707-0
Source DB: PubMed Journal: Nat Chem ISSN: 1755-4330 Impact factor: 24.427
Figure 1: Summary of Folding@home’s computational power.
A) The growth of Folding@home (F@H) in response to COVID-19. The cumulative number of users is shown in blue and COVID-19 cases are shown in orange. B) Global distribution of Folding@home users. Each yellow dot represents a unique IP address contributing to Folding@home. C) The processing speed of Folding@home and the next 10 fastest supercomputers, in exaFLOPS.
Figure 2: Structural characterization of Spike opening and conformational masking for three Spike homologues.
A) An example structure of SARS-CoV-2 Spike protein from our simulations that is fully compatible with receptor binding, as shown by superimposing ACE2 (gray). The three chains of Spike are illustrated with a cartoon and transparent surface representation (orange, teal, and purple), and glycans are shown as sticks (green). B) Three Spike homologues have very different probabilities of adopting ACE2 binding competent conformations, likely modulating their affinities for both ACE2 and antibodies that engage the ACE2-binding interface. HCoV-NL63, SARS-CoV-1, and SARS-CoV-2 are shown in light blue, orange, and black, respectively. C) The probability distribution of Spike opening for each homologue. Opening is quantified in terms of how far the center of mass of an RBD deviates from its position in the closed (or down) state. The cryptic epitope for the antibody CR3022 (red) is only accessible to antibody binding in extremely open conformations. D) Our simulations capture exposure of cryptic epitopes that are buried in the up and down cryoEM structures. Shown is the fraction of residues within different epitopes that are exposed to a 0.5 nm radius probe for the down structure (blue), the up structure (yellow), the ensemble average from our simulations (green), and the maximum value we observe in our simulations (red). Epitopes are determined as the residues that contact the specified antibody, and are clustered by their binding location on the RBD [36].
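The opening metric in panel C (displacement of an RBD's center of mass from its closed-state position) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names are hypothetical, the coordinates are toy values, and the frame is assumed to be already aligned to the closed reference structure.

```python
import numpy as np

def center_of_mass(coords, masses):
    """Mass-weighted mean position of an (N, 3) coordinate array."""
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (coords * masses[:, None]).sum(axis=0) / masses.sum()

def opening_distance(frame_coords, closed_coords, masses):
    """Distance between the RBD center of mass in one frame and in the
    closed (down) reference. Assumes both share the same alignment."""
    shift = center_of_mass(frame_coords, masses) - center_of_mass(closed_coords, masses)
    return float(np.linalg.norm(shift))

# Toy two-atom "RBD" (hypothetical values, arbitrary length units).
closed = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
masses = np.array([1.0, 1.0])
opened = closed + np.array([0.0, 3.0, 4.0])  # rigid shift of length 5

d = opening_distance(opened, closed, masses)
print(d)
```

A histogram of this distance over all sampled states, weighted by state populations, would give a distribution of the kind shown in panel C.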
Figure 3: Effects of glycan shielding and conformational masking on the accessibility of different parts of the Spike to potential therapeutics.
A) The probability that a residue is exposed to potential therapeutics, as determined from our structural ensemble. Red indicates a high probability of being exposed and blue indicates a low probability of being exposed. B) Exposure probabilities colored on the surface of the Spike protein. Exposed patches are circled in orange. Red residues have a higher probability of being exposed, whereas blue residues have a lower probability of being exposed. Green atoms denote glycans. C) Sequence conservation score colored onto the Spike protein. A conserved patch on the protein is circled in orange. Red residues have higher conservation, whereas blue residues have lower conservation. D) The difference in the probability that each residue is exposed between the ACE2-binding competent conformations and the entire ensemble. Red residues have a higher probability of being exposed upon opening, whereas blue residues have a lower probability of being exposed.
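As a rough illustration of panels A and D, a per-residue exposure probability can be computed as a population-weighted average over conformational states, and the panel D quantity as the difference between a sub-ensemble and the full ensemble. The sketch below assumes a precomputed boolean exposure matrix and state populations; these inputs, the function names, and the toy numbers are hypothetical and do not come from the paper.

```python
import numpy as np

def exposure_probability(exposed, weights):
    """exposed: (n_states, n_residues) array of 0/1 exposure flags.
    weights: (n_states,) state populations (renormalized here)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return w @ np.asarray(exposed, dtype=float)

def exposure_shift(exposed, weights, subset_mask):
    """Exposure probability within a sub-ensemble minus that of the
    whole ensemble, per residue."""
    exposed = np.asarray(exposed, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mask = np.asarray(subset_mask, dtype=bool)
    p_all = exposure_probability(exposed, weights)
    p_sub = exposure_probability(exposed[mask], weights[mask])
    return p_sub - p_all

# Toy ensemble: 3 states, 2 residues (hypothetical values).
exposed = [[1, 0], [0, 0], [1, 1]]
weights = [0.5, 0.25, 0.25]
binding_competent = [False, False, True]  # assumed state labels

p = exposure_probability(exposed, weights)
delta = exposure_shift(exposed, weights, binding_competent)
print(p, delta)
```

Positive entries in `delta` correspond to residues that become more exposed in the selected sub-ensemble, and negative entries to residues that become less exposed.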
Figure 4: Examples of cryptic pockets and functionally-relevant dynamics.
A-B) Conformational ensemble of Mpro (monomeric) highlighting cryptic pockets near the active site (AS) and domain interface (DI). Conformational states (black circles) are projected onto the solvent accessible surface areas (SASAs) of residues surrounding either the active-site or dimerization interface. The starting structure for simulations (6Y2E) is shown as a red dot. Representative structures are depicted with cartoon and transparent surface. Domains I and II are colored cyan and domain III is colored gray. The loop of domain III, which covers the active-site residues and is seen to be highly dynamic, is colored red. C-D) The conformational ensemble from our simulations of nucleoprotein is similar to the distribution of structures seen experimentally. Conformational states are projected onto the distance and angle between the positive finger and a nearby loop. Angles were calculated between vectors that point along each red segment in panel D and distances were calculated between their centers of mass. Cluster centers are represented as black circles, the starting structure for simulations (6VYO) is shown as a red dot, and NMR structures are shown with solid blue dots. Representative structures are shown as cartoons.
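The angle and distance coordinates used in panels C-D can be sketched as follows. The caption does not say how the vector along each segment is defined, so this example assumes an end-to-end vector and uses unweighted geometric centers in place of true centers of mass; the function names and coordinates are hypothetical.

```python
import numpy as np

def segment_axis(coords):
    """Unit vector from the first to the last point of a segment
    (an assumed definition of the vector 'along' the segment)."""
    coords = np.asarray(coords, dtype=float)
    v = coords[-1] - coords[0]
    return v / np.linalg.norm(v)

def segment_angle_and_distance(seg_a, seg_b):
    """Angle (degrees) between the two segment axes and distance
    between their geometric centers (equal masses assumed)."""
    cos_theta = np.clip(np.dot(segment_axis(seg_a), segment_axis(seg_b)), -1.0, 1.0)
    angle = float(np.degrees(np.arccos(cos_theta)))
    center_a = np.asarray(seg_a, dtype=float).mean(axis=0)
    center_b = np.asarray(seg_b, dtype=float).mean(axis=0)
    return angle, float(np.linalg.norm(center_a - center_b))

# Toy segments: one along x, one along y (hypothetical coordinates).
finger = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
loop = [[0.0, 2.0, 0.0], [0.0, 3.0, 0.0]]

angle, dist = segment_angle_and_distance(finger, loop)
print(angle, dist)
```

Evaluating this pair of values for every cluster center yields a two-dimensional projection of the kind described in the caption.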
Summary of protein systems we have simulated on Folding@home, organized by viral strain.
| System name | Oligomerization | Initial structure | Residues | Atoms in system | Aggregate simulation time (μs) | Cryptic pockets discovered |
|---|---|---|---|---|---|---|
| NSP3 (Macrodomain “X”) | Monomer | 6W02 | 167 | 23907 | 10,906 | - |
| NSP3 (Papain-like protease 2, PL2pro) | Monomer | 3E9S | 306 | 97285 | 731 | 2 |
| NSP5 (main protease, Mpro, 3CLpro) | Monomer | 6Y2E | 306 | 64791 | 6,405 | 2 |
| NSP5 (main protease, Mpro, 3CLpro) | Dimer | 6Y2E | 612 | 77331 | 2,902 | 2 |
| NSP7 | Monomer | 5F22 | 79 | 20094 | 3,722 | 3 |
| NSP8 | Monomer | 2AHM | 191 | 156282 | 1,776 | 3 |
| NSP9 | Dimer | 6W4B | 226 | 49885 | 8,939 | 2 |
| NSP10 | Monomer | 6W4H | 131 | 29560 | 6,141 | 2 |
| NSP12 (polymerase) | Monomer | 6NUR | 891 | 186622 | 3,330 | 3 |
| NSP13 (helicase) | Monomer | 6JYT | 596 | 129368 | 3,407 | 3 |
| NSP14 | Monomer | 5C8S | 527 | 216380 | 2,384 | 2 |
| NSP15 | Monomer | 6VWW | 347 | 67345 | 3,674 | 4 |
| NSP15 | Hexamer | 6VWW | 2082 | 230339 | 4,270 | - |
| NSP16 | Monomer | 6W4H | 298 | 45672 | 2,382 | 5 |
| Nucleoprotein (RBD) | Monomer | 6VYO | 173 | 29125 | 9,493 | 3 |
| Nucleoprotein Dimerization Domain | Monomer | 6YUN | 118 | 34905 | 6,782 | - |
| Nucleoprotein Dimerization Domain | Dimer | 6YUN | 236 | 72733 | 1,458 | 2 |
| Spike | Trimer | 6VXX | 3363 | 442881 | 1,109 | - |
| NSP7 / NSP8 / NSP12 | Trimer complex | 6NUR | 1184 | 215694 | 1,686 | - |
| NSP10 / NSP14 | Dimer complex | 5C8S | 688 | 226672 | 689 | 3 |
| NSP10 / NSP16 | Dimer complex | 6W4H | 429 | 63752 | 3,463 | 2 |
| NSP3 (Macrodomain “X”) | Monomer | 2FAV | 172 | 33117 | 659 | - |
| NSP9 | Dimer | 1QZ8 | 226 | 49599 | 7,763 | - |
| NSP15 | Monomer | 2H85 | 345 | 67345 | 4,734 | - |
| NSP15 | Hexamer | 2H85 | 2070 | 230339 | 1,130 | - |
| Nucleoprotein RBD | Monomer | 2OFZ | 174 | 29125 | 4,088 | - |
| Nucleoprotein Dimerization Domain | Monomer | 2GIB | 370 | 34905 | 1,626 | - |
| Nucleoprotein Dimerization Domain | Dimer | 2GIB | 740 | 72733 | 4,221 | - |
| Spike | Trimer | 5X58 | 3261 | 375851 | 741 | - |
| NSP10 / NSP16 | Dimer complex | 6W4H | 425 | 69589 | 518 | - |
| IL6 | Monomer | 1ALU | 166 | 26855 | 1,593 | 2 |
| IL6-R | Monomer | 1N26 | 299 | 149764 | 196 | 5 |
| ACE2 | Monomer | 6LZG | 596 | 75787 | 664 | 2 |
| NSP13 | Monomer | 5WWP | 596 | 121134 | 719 | - |
| NSP10 / NSP16 | Dimer complex | 6W4H | 424 | 69127 | 518 | - |
| Spike | Trimer | 5SZS | 3606 | 453348 | 651 | - |
Missing residues were modeled using SWISS-MODEL.[56]
Structural model was generated from a homologous sequence using SWISS-MODEL.[56]
Missing residues were modeled using CHARMM-GUI.[57,58]