Rui Wang, Jiahui Chen, Yuta Hozumi, Changchuan Yin, Guo-Wei Wei.
Abstract
The surge of COVID-19 infections has been fueled by new SARS-CoV-2 variants, namely Alpha, Beta, Gamma, Delta, and so forth. The molecular mechanism underlying this surge is elusive due to the existence of 28 554 unique mutations, including 4 653 non-degenerate mutations on the spike protein. Understanding the molecular mechanism of SARS-CoV-2 transmission and evolution is a prerequisite to foresee the trend of emerging vaccine-breakthrough variants and to design mutation-proof vaccines and monoclonal antibodies. We integrate the genotyping of 1 489 884 SARS-CoV-2 genomes, a library of 130 human antibodies, tens of thousands of mutational data, topological data analysis, and deep learning to reveal the SARS-CoV-2 evolution mechanism and forecast emerging vaccine-breakthrough variants. We show that prevailing variants can be quantitatively explained by infectivity-strengthening and vaccine-escape (co-)mutations on the spike protein RBD due to natural selection and/or vaccination-induced evolutionary pressure. We illustrate that infectivity-strengthening mutations were the main mechanism for viral evolution, while vaccine-escape mutations become a dominating viral evolutionary mechanism among highly vaccinated populations. We demonstrate that Lambda is as infectious as Delta but is more vaccine-resistant. We analyze emerging vaccine-breakthrough comutations in highly vaccinated countries, including the United Kingdom, the United States, Denmark, and so forth. Finally, we identify sets of comutations that have a high likelihood of massive growth: [A411S, L452R, T478K], [L452R, T478K, N501Y], [V401L, L452R, T478K], [K417N, L452R, T478K], [L452R, T478K, E484K, N501Y], and [P384L, K417N, E484K, N501Y]. We predict they can escape existing vaccines. We foresee an urgent need to develop new virus-combating strategies.
Keywords: COVID-19; SARS-CoV-2; comutations; infectivity; vaccine-breakthrough; vaccine-resistant
Year: 2022 PMID: 35133792 PMCID: PMC8848511 DOI: 10.1021/acsinfecdis.1c00557
Source DB: PubMed Journal: ACS Infect Dis ISSN: 2373-8227 Impact factor: 5.084
Figure 1. Most significant RBD mutations. (a) The 3D structure of SARS-CoV-2 S protein RBD and ACE2 complex (PDB ID: 6M0J). The RBD mutations in 10 variants are marked with color. (b) Illustration of the time evolution of 455 ACE2 binding-strengthening RBD mutations (blue) and 228 ACE2 binding-weakening RBD mutations (red). The x-axis represents the date and the y-axis represents the natural log of frequency. There has been a surge in the number of infections since early 2021. (c) BFE changes of RBD complexes with ACE2 and 130 antibodies induced by 75 significant RBD mutations. A positive BFE change (blue) means the mutation strengthens the binding, while a negative BFE change (red) means the mutation weakens the binding. Most mutations, except for vaccine-resistant Y449H and Y449S, strengthen the RBD binding with ACE2. Y449S and K417N are highly disruptive to antibodies.
Top 25 Most Observed S Protein RBD Mutations^a
| mutation | worldwide count | worldwide rank | BFE change (kcal/mol) | BFE change rank | antibody disruption count | antibody disruption ratio (%) | antibody disruption rank |
|---|---|---|---|---|---|---|---|
| N501Y | 744354 | 1 | 0.5499 | 30 | 24 | 18.46 | 160 |
| L452R | 259345 | 2 | 0.5752 | 28 | 39 | 30.0 | 98 |
| T478K | 239619 | 3 | 0.9994 | 2 | 2 | 1.54 | 557 |
| E484K | 84167 | 4 | 0.0946 | 272 | 38 | 29.23 | 104 |
| K417T | 37748 | 5 | 0.0116 | 433 | 37 | 28.46 | 107 |
| S477N | 32673 | 6 | 0.0180 | 422 | 0 | 0.0 | 650 |
| N439K | 16154 | 7 | 0.1792 | 159 | 11 | 8.46 | 272 |
| K417N | 8399 | 8 | 0.1661 | 176 | 53 | 40.77 | 61 |
| F490S | 5617 | 9 | 0.4406 | 52 | 51 | 39.23 | 67 |
| S494P | 5119 | 10 | 0.0902 | 282 | 62 | 47.69 | 46 |
| N440K | 3379 | 11 | 0.6161 | 22 | 0 | 0.0 | 645 |
| E484Q | 3229 | 12 | 0.0057 | 442 | 30 | 23.08 | 130 |
| L452Q | 2858 | 13 | 0.9802 | 3 | 27 | 20.77 | 144 |
| A520S | 2727 | 14 | 0.1495 | 199 | 3 | 2.31 | 497 |
| N501T | 2054 | 15 | 0.4514 | 48 | 17 | 13.08 | 202 |
| R357K | 1973 | 16 | 0.1393 | 208 | 5 | 3.85 | 388 |
| A522S | 1959 | 17 | 0.1283 | 221 | 2 | 1.54 | 543 |
| R346K | 1686 | 18 | 0.1234 | 229 | 6 | 4.62 | 380 |
| V367F | 1395 | 19 | 0.1764 | 161 | 0 | 0.0 | 637 |
| N440S | 1361 | 20 | 0.1499 | 197 | 2 | 1.54 | 542 |
| P384L | 1155 | 21 | 0.2681 | 105 | 18 | 13.85 | 199 |
| Y449S | 1146 | 22 | –0.8112 | 632 | 85 | 65.38 | 16 |
| D427N | 1106 | 23 | –0.1133 | 558 | 1 | 0.77 | 589 |
| R346S | 1037 | 24 | 0.0374 | 386 | 20 | 15.38 | 182 |
| A475V | 891 | 25 | 0.3069 | 94 | 10 | 7.69 | 289 |
^a Here, BFE change refers to the BFE change for the S protein and human ACE2 complex induced by a single-site S protein RBD mutation. A positive mutation-induced BFE change strengthens the binding between S protein and ACE2, which results in more infectious variants. Counts of antibody disruption represent the number of antibody and S protein complexes disrupted by a specific RBD mutation. Here, an antibody and S protein complex is considered disrupted if its binding affinity is reduced by more than 0.3 kcal/mol.[18] In addition, we calculate the antibody disruption ratio (%), which is the number of disrupted antibody and S protein complexes divided by the 130 known complexes. Ranks are computed from 683 observed RBD mutations.
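The disruption count and ratio defined in the table note can be checked with a few lines of Python. This is a minimal sketch, not the authors' code: the function names and the example list of per-antibody BFE changes are hypothetical, while the 0.3 kcal/mol threshold, the 130 complexes, and the counts come from the table and its note.

```python
# Values stated in the table note.
DISRUPTION_THRESHOLD = 0.3   # kcal/mol; a larger reduction counts as disruption
N_ANTIBODY_COMPLEXES = 130   # known antibody and S protein complexes


def disruption_count(bfe_changes):
    """Count complexes whose BFE change is below -0.3 kcal/mol.

    bfe_changes: hypothetical list with one mutation-induced BFE change
    per antibody complex (negative means weaker binding).
    """
    return sum(1 for change in bfe_changes if change < -DISRUPTION_THRESHOLD)


def disruption_ratio(count, total=N_ANTIBODY_COMPLEXES):
    """Antibody disruption ratio in percent, rounded as in the table."""
    return round(100.0 * count / total, 2)


# Disruption counts copied from the table.
table_counts = {"N501Y": 24, "L452R": 39, "T478K": 2, "Y449S": 85}
ratios = {mutation: disruption_ratio(c) for mutation, c in table_counts.items()}
print(ratios)
```

The computed ratios reproduce the table's ratio column for these four mutations, which confirms that the column is simply count divided by 130, expressed in percent.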
Figure 2. Properties of RBD comutations. (a) Illustration of RBD 2 comutations with a frequency greater than 90. (b) Illustration of RBD 3 comutations with a frequency greater than 30. (c) Illustration of RBD 4 comutations with a frequency greater than 20. Here, the x-axis lists RBD comutations and the y-axis represents the predicted total BFE change between S RBD and ACE2 of each set of RBD comutations. The number on the top of each bar is the AI-predicted number of antibody and RBD complexes that may be significantly disrupted by the set of RBD comutations, and the color of each bar represents the natural log of frequency for each set of RBD comutations. (Please check the interactive HTML files in the Supporting Information S2.2.4 for a better view of these plots.)
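As a rough illustration of a "total BFE change" for a set of comutations, the sketch below sums the single-site BFE changes listed in the table above. Additivity of single-site contributions is an assumption made here for illustration; the caption does not state how the total is computed, and the function and variable names are hypothetical.

```python
# Single-site BFE changes (kcal/mol) for the RBD and ACE2 complex,
# copied from the table above.
SINGLE_SITE_BFE = {
    "N501Y": 0.5499,
    "L452R": 0.5752,
    "T478K": 0.9994,
    "E484K": 0.0946,
    "K417N": 0.1661,
    "P384L": 0.2681,
}


def total_bfe_change(comutation):
    """Sum single-site BFE changes for a set of comutations.

    ASSUMPTION: contributions are additive and independent; the source
    does not confirm that this is how the plotted totals were obtained.
    """
    return sum(SINGLE_SITE_BFE[mutation] for mutation in comutation)


# Comutation sets named in the abstract whose members all appear in the table.
comutations = [
    ("L452R", "T478K", "N501Y"),
    ("K417N", "L452R", "T478K"),
    ("L452R", "T478K", "E484K", "N501Y"),
    ("P384L", "K417N", "E484K", "N501Y"),
]
for comutation in comutations:
    print(comutation, round(total_bfe_change(comutation), 4))
```

Under this assumption every listed set has a positive total, consistent with the abstract's description of these comutations as infectivity-strengthening.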
Figure 3. (a) Two-dimensional histograms of antibody disruption count and total BFE changes (unit: kcal/mol) for RBD 2 comutations. (b) Two-dimensional histograms of antibody disruption count and total BFE changes (unit: kcal/mol) for RBD 3 comutations. (c) Two-dimensional histograms of antibody disruption count and total BFE changes (unit: kcal/mol) for RBD 4 comutations. (d) The histograms of total BFE changes (unit: kcal/mol) for RBD comutations. (e) The histograms of the natural log of frequency for RBD comutations. (f) The histograms of antibody disruption count for RBD comutations. In panels a–c, the color bar represents the number of comutations that fall into each bin defined by the x-axis and y-axis. The reader is referred to the web version of these plots in the Supporting Information S2.2.2 and S2.2.3.
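The binning behind a two-dimensional histogram such as panels a–c can be sketched as follows. The data points, bin widths, and axis ordering are hypothetical and chosen only for illustration; the caption does not give the bin sizes or say which quantity sits on which axis.

```python
import math
from collections import Counter


def histogram_2d(points, x_width, y_width):
    """Count (x, y) points per rectangular bin.

    Keys of the returned Counter are (x_bin_index, y_bin_index) pairs;
    values are the number of points in that bin (the color-bar quantity).
    """
    bins = Counter()
    for x, y in points:
        bins[(math.floor(x / x_width), math.floor(y / y_width))] += 1
    return bins


# Hypothetical (antibody disruption count, total BFE change in kcal/mol) pairs.
points = [(24, 0.55), (39, 0.58), (41, 1.57), (2, 1.00), (45, 1.74)]
hist = histogram_2d(points, x_width=10, y_width=0.5)
for bin_index, n in sorted(hist.items()):
    print(bin_index, n)
```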
Figure 4. Illustration of the time evolution of 2, 3, and 4 comutations on the S protein RBD of SARS-CoV-2 from January 01, 2021, to July 31, 2021, in 12 COVID-19-devastated countries: the United Kingdom (UK), the United States (US), Denmark (DK), Brazil (BR), Germany (DE), Netherlands (NL), Sweden (SE), Italy (IT), Canada (CA), France (FR), India (IN), and Belgium (BE). The y-axis represents the natural log frequency of each RBD comutation. The top five high-frequency comutations in each country are marked by red, blue, green, yellow, and pink lines. The cyan line is for the RBD comutation [L452Q, F490S] on the Lambda variant, and the other comutations are marked by light gray lines. Notably, there are two blue lines in the panel of FR because [K417N, E484K, N501Y] and [E484K, N501Y] have the same frequency. (Please check the interactive HTML files in the Supporting Information S2.2.1 for a better view of these plots.)
Figure 5. (a) Illustration of genome sequence data preprocessing and BFE change predictions. (b) Comparison of experimental CT-P59 IC50 fold change (reduction)[35] and predicted BFE changes induced by mutations L452R and T478K. (c) Comparison of predicted BFE changes and relative luciferase units[25] for pseudovirus infection changes of ACE2 and S protein complex induced by mutations L452R and N501Y.