Analysis of 6.4 million SARS-CoV-2 genomes identifies mutations associated with fitness.

Pardis C Sabeti^1,2,3,4,5, Jacob E Lemieux^1,4,6, Fritz Obermeyer^1,7, Martin Jankowiak^1,7, Nikolaos Barkas^1, Stephen F Schaffner^1,2,3, Jesse D Pyle^1,8, Leonid Yurkovetskiy^9, Matteo Bosso^9, Daniel J Park^1, Mehrtash Babadi^1, Bronwyn L MacInnis^1,3,4, Jeremy Luban^1,4,10,9.

Abstract

Repeated emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants with increased fitness underscores the value of rapid detection and characterization of new lineages. We have developed PyR0, a hierarchical Bayesian multinomial logistic regression model that infers relative prevalence of all viral lineages across geographic regions, detects lineages increasing in prevalence, and identifies mutations relevant to fitness. Applying PyR0 to all publicly available SARS-CoV-2 genomes, we identify numerous substitutions that increase fitness, including previously identified spike mutations and many nonspike mutations within the nucleocapsid and nonstructural proteins. PyR0 forecasts growth of new lineages from their mutational profile, ranks the fitness of lineages as new sequences become available, and prioritizes mutations of biological and public health concern for functional characterization.

Year: 2022 PMID: 35608456 PMCID: PMC9161372 DOI: 10.1126/science.abm1208

Source DB: PubMed Journal: Science ISSN: 0036-8075 Impact factor: 63.714

The SARS-CoV-2 pandemic has been characterized by repeated waves of cases driven by the emergence of new lineages with higher fitness, where fitness encompasses any trait that affects the lineage’s growth, including its basic reproduction number (R0), ability to evade existing immunity, and generation time. Rapidly identifying such lineages as they emerge, and accurately forecasting their dynamics, is critical for guiding outbreak response. Doing so effectively would benefit from the ability to interrogate the entirety of the global SARS-CoV-2 genomic dataset. The large size (currently over 7.5 million virus genomes) and geographic and temporal variability of the available data present significant challenges that will become greater as more viruses are sequenced. Current phylogenetic approaches are computationally inefficient on datasets with more than ~5000 samples and take days to run at that scale. Ad hoc methods to estimate the relative fitness of particular SARS-CoV-2 lineages are a computationally efficient alternative ( – ), but have typically relied on models in which one or two lineages of interest are compared to all others and do not capture the complex dynamics of multiple co-circulating lineages. Furthermore, estimates of relative fitness based on lineage frequency data alone ( , , ) do not take advantage of additional statistical power that can be gained from analyzing the independent appearance and growth of the same mutation in multiple lineages. Performing a mutation-based analysis of lineage prevalence has the additional advantage of identifying specific genetic determinants of a lineage’s phenotype, which is critically important both for understanding the biology of transmission and pathogenesis and for predicting the phenotype of new lineages. 
The SARS-CoV-2 pandemic has already been dominated by several genetic changes of functional and epidemiological importance, including the spike (S) D614G mutation that is associated with higher SARS-CoV-2 loads ( , ). Mutations found in Variants of Concern (VoC), such as S:N439R, S:N501Y, and S:E484K, have been linked, respectively, to increased transmissibility ( ), enhanced binding to ACE2 ( ), and antibody escape ( , ). Despite these successes, identifying functionally important mutations in the context of a large background of genetic variants of little or no phenotypic consequence remains challenging. In modeling the relative fitness of SARS-CoV-2 lineages, we estimated their growth as a linear combination of the effects of individual mutations. To this end, we developed PyR0, a hierarchical Bayesian regression model that enables scalable analysis of the complete set of publicly available SARS-CoV-2 genomes, that can be applied to any viral genomic dataset and to other viral phenotypes. The model, which is summarized in fig. S1, and described in detail in the supplementary materials, avoids the complexity of full phylogenetic inference by first clustering genomes by genetic similarity (refining PANGO lineages ( )), and estimating the incremental effect on growth rate of each of the most common amino acid changes on the lineages in which they appear. By regressing growth rate on genome sequence, the model shares statistical strength among genetically similar lineages without explicitly relying on phylogeny. 
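The linear-additive idea described above can be sketched in a few lines. This is a minimal illustration, not the PyR0 implementation: it assumes each cluster is represented by the set of substitutions it carries and that its log relative growth rate is the sum of per-mutation coefficients. The lineage names and all coefficient values are hypothetical.

```python
import math

# Hypothetical per-mutation effects on log relative growth rate.
# These numbers are illustrative only and are not estimates from the study.
MUTATION_EFFECT = {
    "S:D614G": 0.05,
    "S:N501Y": 0.04,
    "S:E484K": 0.03,
}

# Hypothetical clusters, each described only by the mutations it carries.
CLUSTER_MUTATIONS = {
    "lineage_x": {"S:D614G"},
    "lineage_y": {"S:D614G", "S:N501Y"},
    "lineage_z": {"S:D614G", "S:N501Y", "S:E484K"},
}


def log_growth_rate(mutations, effects):
    """Linear-additive model: sum the effects of the mutations a cluster carries."""
    return sum(effects.get(m, 0.0) for m in mutations)


# Log relative growth rate for every cluster.
rates = {
    name: log_growth_rate(muts, MUTATION_EFFECT)
    for name, muts in CLUSTER_MUTATIONS.items()
}

# Additive effects on the log scale are multiplicative on the fold scale.
fold = {name: math.exp(rate) for name, rate in rates.items()}
```

Because `lineage_y` and `lineage_z` share coefficients with `lineage_x`, evidence about one cluster informs the others, which is the sense in which regressing on sequence shares statistical strength among genetically similar lineages.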
By modeling only the multinomial proportion of different lineages rather than the absolute number of samples for each lineage ( , ), and by doing so within 14-day intervals in 1,560 globally-distributed geographic regions, the model achieves robustness to a number of sources of bias that affect all lineages, across regions and over time, including differences in data collection and changes in transmission due to such factors as social behavior, public health policy, and vaccination. We fit PyR0 to 6,466,300 SARS-CoV-2 genomes available on GISAID ( ) as of January 20, 2022, in a model that contained 3,000 clusters, derived from 1,544 PANGO lineages, and 2,904 nonsynonymous mutations. The output of the model is a posterior distribution for the relative fitness (exponential growth rate) of each lineage and for the contribution to the fitness from each mutation. Fitting this large model is computationally challenging, so we used stochastic variational inference, an approximate inference method that reduced our task to solving a 75-million-dimensional optimization problem on a GPU. Inference was implemented in the Pyro ( ) probabilistic programming framework (see Supplemental Materials). The trained model can be used to infer lineage fitness, predict the fitness of completely new lineages, forecast future lineage proportions, and estimate the effects of individual mutations on fitness. The model's lineage fitness estimates (Fig. 1B) show a modest upward trend over time among all lineages, interrupted by several lineages with much higher fitness. Sensitivity analyses revealed qualitative consistency of fitness estimates across spatial data subsets (fig. S2). The upward trend may in part reflect an upward bias caused by the lineage assignment process, as can be seen in simulation studies (fig. S3), but the high tail of the distribution exhibits elevated fitness values far in excess of this trend. 
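To make the role of multinomial proportions concrete, the following sketch computes lineage shares as a softmax over per-lineage intercepts plus growth rate times time, and shows that proportions are unchanged when a region sequences more samples overall. It is a simplified single-region illustration under assumed parameter values; the hierarchical structure across regions and the priors of the actual model are omitted, and all function names are hypothetical.

```python
import math


def lineage_proportions(intercepts, growth_rates, t):
    """Softmax of (intercept + growth_rate * t) across lineages at time step t."""
    logits = [a + b * t for a, b in zip(intercepts, growth_rates)]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def observed_proportions(counts):
    """Convert per-lineage sample counts in one region and time bin to proportions."""
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical two-lineage example: the second lineage starts rare but grows faster.
intercepts = [0.0, -4.0]
growth_rates = [0.0, 0.5]  # per 14-day time step, illustrative values
trajectory = [lineage_proportions(intercepts, growth_rates, t) for t in (0, 4, 8, 12)]

# Sequencing ten times as many samples leaves the proportions unchanged,
# which is why region-wide changes in sampling effort do not bias the fit.
low_effort = observed_proportions([90, 10])
high_effort = observed_proportions([900, 100])
```

Only differences between growth rates affect the proportions, so anything that scales all lineages equally in a region and time bin (testing volume, behavior, policy) cancels out of the softmax.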
Since the virus spread into human populations in late 2019, and through early 2022, the pandemic has been marked by periods of rapid evolution in fitness and waves of increase in case counts (Fig. 1). While PANGO lineages facilitate communication by providing a stable nomenclature, we observed some PANGO lineages with multiple successive peaks in some regions, suggesting that sublineages within them had differing fitnesses. We therefore algorithmically refined the 1,544 PANGO lineages into 3,000 finer clusters, and found that our model identified significant heterogeneity within some PANGO lineages (fig. S4). When we tested the model's predictive ability (fig. S5), we found that forecasts were reliable for 1-2 months into the future for variants of concern (though not necessarily for other variants), after which they tended to be disrupted by the emergence of a completely new strain (table S1, fig. S6). The accuracy of forecasts typically stabilized within two weeks after the emergence of a new competitive lineage in a region (fig. S6).

Fig. 1.

Relative fitness versus date of lineage emergence.

Circle size is proportional to cumulative case count inferred from lineage proportion estimates and confirmed case counts. Inset table lists the 10 fittest lineages inferred by the model. R/RA is the fold increase in relative fitness over the Wuhan (A) lineage, assuming a fixed generation time of 5.5 days.

The model correctly infers WHO classification variant Omicron (PANGO BA.2) to have the highest fitness to date, 8.9-fold (95% CI, 8.6-9.2) higher than the original A lineage (Fig. 1 inset), accurately foreshadowing its rise in regions where it is circulating (fig. S7). Through systematic backtesting, we found that the model would have provided early warning and aided in the identification of VoCs had it been routinely applied to SARS-CoV-2 samples, confirming the importance for public health of timely publication of genomic data. For example, the elevated fitness of BA.2 was identified by mid-December 2021 on the basis of 76 reported sequences (fig. S8); sharing statistical strength over mutations enabled an earlier and more confident prediction that BA.2 was the fittest lineage yet observed (fig. S10). Likewise, PyR0 would have forecast the dominance of B.1.1.7 in late November 2020 (fig. S9), AY.4 by May 2021 (fig. S10), and BA.1 by early December 2021 (fig. S8). While variant-specific models were accurate and useful in predicting the rise of these lineages ( ), each modeling effort was specific to a particular lineage and geographic region. PyR0’s global approach provides similar early detection while also offering automated, rapid, and standardized unbiased consideration of all variants and lineages, together with ranking based on relative fitness. Compared to standard multinomial regression models, PyR0 estimates of lineage fitness were similar (Pearson’s R = 0.95, figs. S11 and S12), but including mutations in the model enables PyR0 to infer elevated fitness of Omicron lineages BA.1 and BA.2 faster than the model without mutations (fig. S14).
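The fold increase R/RA quoted above is expressed per generation of 5.5 days. As a rough aid to interpretation, the sketch below converts between a fold increase per generation and a per-day difference in log growth rate. It assumes simple exponential growth with a fixed generation time, so that R/RA = exp(Δr · τ); this relation and the function names are assumptions made for illustration, not a description of the paper's internal parameterization.

```python
import math

GENERATION_TIME_DAYS = 5.5  # fixed generation time stated in the Fig. 1 caption


def fold_increase(delta_log_growth_per_day, generation_time=GENERATION_TIME_DAYS):
    """Fold increase in relative fitness per generation, assuming R/RA = exp(dr * tau)."""
    return math.exp(delta_log_growth_per_day * generation_time)


def daily_log_growth_advantage(fold, generation_time=GENERATION_TIME_DAYS):
    """Inverse of fold_increase: per-day difference in log growth rate."""
    return math.log(fold) / generation_time


# The 8.9-fold estimate reported for BA.2 relative to lineage A,
# re-expressed as a per-day difference in log growth rate.
ba2_daily = daily_log_growth_advantage(8.9)
```

Under these assumptions the 8.9-fold figure corresponds to a log growth rate advantage of roughly 0.4 per day relative to lineage A.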
In contrast to non-hierarchical binomial logistic regression (fig. S13), PyR0 estimates displayed less variability as data accumulated, benefitting from the sharing of information across regions and the regularizing effect of the priors. Lineage fitness estimates were also stable between the present analysis and our initial analysis of 2.1 million genomes in August 2021 ( ), which was performed shortly after the emergence of Delta lineages and before the emergence of Omicron (Spearman’s rho = 0.78, fig. S15C). The correlation between the estimated effects of individual amino acid substitutions in the two models was weaker than that for lineages (fig. S15D-E, rho = 0.48) but still significant (test of no association for rho, p < 2 × 10^−16), reflecting both the inherent difficulty of estimating high-dimensional mutational coefficients observed indirectly through lineage counts (Supplementary Note 1) and the addition of 4.3 million sequences, including highly fit Omicron lineages distinguished by their enhanced immune escape. By jointly modeling fitness using lineage counts and individual mutations, PyR0 harnesses convergent evolution (Table 1 and fig. S16) to infer the fitness of new constellations of mutations based on the trajectories of other lineages in which they have previously emerged. This predictive capability could aid public health efforts because the model may learn faster by incorporating mutations than it would by relying on lineage counts alone (fig. S14). To test the reliability of this kind of estimate, we fit leave-one-out estimators for PANGO lineages on subsets of the dataset with that entire lineage removed, predicting the fitness of each omitted lineage based solely on its mutational content (fig. S17).
These estimators showed excellent agreement with estimators based on the observed behavior of the lineages, and they were also more accurate than naive phylogenetic estimators that assume the fitness of each new strain is equal to its parent lineage's fitness (Pearson's R = 0.983, after correcting for parent fitness, fig. S17). Together, these analyses suggest that PyR0 has the potential to aid genomic surveillance efforts by providing an automated early warning system on a similar time scale as sophisticated regional surveillance efforts ( , ).
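The comparison between mutation-based leave-one-out estimates and the naive parent-fitness baseline can be illustrated with a small sketch. All numbers below are hypothetical and chosen only to show how such a comparison might be scored; they are not results from the study, and the helper functions are illustrative rather than part of PyR0.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def mean_abs_error(predicted, observed):
    """Average absolute difference between predictions and observations."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical log relative fitness values for four held-out lineages.
observed = [0.10, 0.35, 0.60, 1.20]
# Naive baseline: each lineage is assigned the fitness of its parent lineage.
parent_baseline = [0.05, 0.30, 0.30, 0.60]
# Mutation-based estimate: sum of mutation effects learned with the lineage removed.
mutation_based = [0.12, 0.33, 0.55, 1.10]

baseline_error = mean_abs_error(parent_baseline, observed)
mutation_error = mean_abs_error(mutation_based, observed)
mutation_corr = pearson_r(mutation_based, observed)
```

In a real evaluation the mutation coefficients would be refit for every held-out lineage, so that no information from the omitted lineage leaks into its own prediction.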

Table 1.

Amino acid substitutions most significantly associated with increased fitness.

Rank	Gene	Substitution	Fold Increase in Fitness	Number of Lineages
1	S	H655Y	1.051	33
2	S	T95I	1.046	30
3	ORF1a	P3395H	1.039	5
4	S	N764K	1.04	6
5	ORF1a	K856R	1.039	2
6	S	S371L	1.041	3
7	E	T9I	1.04	5
8	S	Q954H	1.04	5
9	ORF9b	P10S	1.039	25
10	S	L981F	1.04	2
11	N	P13L	1.04	25
12	S	G339D	1.039	4
13	S	S375F	1.04	5
14	S	S477N	1.039	47
15	S	N679K	1.04	11
16	S	S373P	1.04	5
17	M	Q19E	1.039	5
18	S	D796Y	1.038	11
19	S	N969K	1.04	5
20	S	T547K	1.038	3

Significance is defined as posterior mean / posterior standard deviation. Fitness is per 5.5 days (estimated generation time of the Wuhan (A) lineage ( , )). Final column: number of PANGO lineages in which each substitution emerged independently.

Genome-wide estimates of the effect of SARS-CoV-2 mutations on fitness also provide a powerful tool for better understanding the biology of fitness. Our model allowed us to estimate the contribution of 2,904 amino acid substitutions (Fig. 2A and Table 1) to lineage fitness and to rank them by inferred statistical significance (fig. S18). Cross-validation confirmed that these results replicate qualitatively across different geographic regions (fig. S19).
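The ranking rule stated in the Table 1 note, posterior mean divided by posterior standard deviation, can be sketched as follows. The posterior summaries here are hypothetical placeholders rather than values from the study, and the type and function names are illustrative.

```python
from typing import NamedTuple


class MutationPosterior(NamedTuple):
    name: str
    mean: float  # posterior mean of the effect on log fitness
    sd: float    # posterior standard deviation of that effect


def significance(p):
    """Significance as defined in the Table 1 note: posterior mean / posterior sd."""
    return p.mean / p.sd


def rank_by_significance(posteriors):
    """Order mutations from most to least significant."""
    return sorted(posteriors, key=significance, reverse=True)


# Hypothetical posterior summaries (not values from the study).
posteriors = [
    MutationPosterior("mut_a", mean=0.050, sd=0.010),  # significance 5
    MutationPosterior("mut_b", mean=0.080, sd=0.040),  # significance 2
    MutationPosterior("mut_c", mean=0.030, sd=0.002),  # significance 15
]
ranked = [p.name for p in rank_by_significance(posteriors)]
```

Ranking by this ratio rather than by effect size explains why Table 1 is not sorted by fold increase: a smaller but tightly constrained effect can outrank a larger, more uncertain one.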

Fig. 2.

Manhattan plot of amino acid changes assessed in this study.

(A) Changes across the entire genome. (B) Changes in the first 850 amino acids of S. In each of (A) to (C) the y axis shows effect size Δ log R, the estimated change in log relative fitness due to each amino acid change. The bottom three axes show the background density of all observed amino acid changes, the density of those associated with growth (weighted by |Δ log R|), and the ratio of the two. The top 55 amino acid changes are labeled. See fig. S13 for detailed views of S, N, ORF1a, and ORF1b. (C) Changes in the first 250 amino acids of N. (D) Structure of the spike-ACE2 complex (PDB: 7KNB). Spike subunits colored light blue, light orange, and gray. Top-ranked mutations are shown as red spheres. ACE2 is shown in magenta. (E) Close-up view of the RBD interface. (F) Top-ranked mutations in the N-terminal RNA-binding domain of N. Residues 44-180 of N (PDB: 7ACT) are shown in light blue. Amino acid positions corresponding to top mutations in this region are shown as red spheres. A 10-nt bound RNA is shown in gray.

The highest concentrations of fitness-associated mutations were found in the S, N, and the ORF1 polyprotein genes (ORF1a and ORF1b, Fig. 2, A and B, and figs. S20 and S21). Using spatial autocorrelation as a measure of spatial structure, we found evidence of functional hotspots in the S, N, ORF7a, ORF3a, and ORF1a genes (table S2). Within S, we confirmed three hotspots of fitness-enhancing mutations, each within a defined functional region: the N-terminal domain, the receptor-binding domain (RBD), and the furin-cleavage site (Fig. 2B). We assessed mutational enrichment in the top-ranked set of mutations and identified an enrichment for lysine to asparagine mutations in the S gene (fig. S22C). We visualized top scoring mutations within atomic structures for the spike protein (Fig. 2, D to E), the nucleocapsid's N-terminal domain (Fig. 2F), the polymerase (fig. S23), and two proteases (fig. S24).
Many of the top mutations in the S gene occurred in the receptor binding domain (RBD) making direct contacts with the ACE2 receptor, including K417N/T and E484K (Fig. 2, D to E). Two top-ranked mutations, T478K and S477N, occur in a flexible loop adjacent to the S-ACE2 interface (Fig. 2E), suggesting that these mutations may affect the kinetics of receptor engagement or the Spike conformational changes that follow. Other mutations occurred in regions proximal to essential enzymatic active sites of the viral replication (fig. S15) or protein processing (fig. S16) machinery.
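The hotspot analysis above relies on spatial autocorrelation of effect sizes along each gene. The text here does not name the statistic, so the sketch below assumes Moran's I with equal weights for positions within a fixed distance, purely for illustration. The per-position values are hypothetical, and the function does not handle a constant sequence (zero variance).

```python
def morans_i(values, max_distance=1):
    """Moran's I for a 1D sequence, weighting pairs within max_distance positions equally."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance_sum = sum(d * d for d in deviations)
    cross_sum = 0.0
    weight_total = 0
    for i in range(n):
        for j in range(n):
            # Weight 1 for distinct positions within max_distance, 0 otherwise.
            if i != j and abs(i - j) <= max_distance:
                cross_sum += deviations[i] * deviations[j]
                weight_total += 1
    return (n / weight_total) * (cross_sum / variance_sum)


# Hypothetical per-position effect sizes along a gene.
clustered = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]    # effects concentrated in one stretch
alternating = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]  # effects spread evenly along the gene

clustered_i = morans_i(clustered)
alternating_i = morans_i(alternating)
```

A positive value indicates that neighboring positions carry similar effects, as expected for a hotspot, while a negative value indicates that neighbors tend to differ.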

Fig. 3.

(A) Infectivity relative to WT of lentiviral vectors pseudotyped with the indicated Spike mutants.

Target cells were HEK293T cells expressing ACE2 and TMPRSS2 transgenes. The genetic background of the Spike was Wuhan-Hu-1 bearing D614G. Red bars were significantly different from WT (adjusted p values shown). Black bars were not significantly different from WT. (B) For the 1701 SARS-CoV-2 clusters with at least one amino acid substitution in the RBD domain we compare: i) the PyR0 prediction for the contribution to Δ log R from RBD substitutions only; to ii) antibody binding computed using the antibody-escape calculator in ( ). The escape calculator is based on an intuitive non-linear model parameterized using deep mutational scanning data for 33 neutralizing antibodies elicited by SARS-CoV-2. PyR0 predictions exhibit high (Spearman) correlation with predictions from Greaney et al. ( ). (C to E) We dissect PyR0 Δ log R estimates into S-gene (C), RBD (D), and non-S-gene (E) contributions for 3000 SARS-CoV-2 clusters (blue dots). The horizontal axis corresponds to the date at which each cluster first emerged. Red squares denote the median Δ log R within each monthly bin. The increased importance of S-gene mutations (notably in the RBD) over non-S-gene mutations starting around November 2021 is apparent.

We tested several of the high-scoring mutations in single-cycle infectivity assays as done previously ( ), focusing on the RBD (Fig. 3A). We found that while some individual mutations increased infectivity, on average, high-scoring RBD mutations did not promote infectivity per se. We considered an alternate possibility that fitness of Spike mutations is driven by immune escape. Using RBD-aggregated mutations as a proxy for immune escape, we found that the fitness effect of these Spike mutations correlates well with antibody escape estimates from Greaney et al. ( ) (Fig. 3B). Together with the observed jump in fitness beginning in late 2021 (Fig. 3C) associated with Spike mutations, but not mutations elsewhere in the genome (Fig. 3E), these results suggest that immune escape is the dominant driver of current fitness increases. BA.1 and BA.2 had similar estimated fitness from Spike mutations, potentially consistent with similar Spike antibody neutralization of these variants ( ), whereas PyR0 inferred that the elevated fitness of BA.2 is attributed to non-Spike mutations (fig. S25).
In contrast to mutations in Spike, those in the serine-arginine rich region of N were linked to increased efficiency of SARS-CoV-2 genomic RNA packaging ( ). Within ORF1, we found fitness-associated mutations across all viral enzymes, and clusters within additional non-structural proteins (nsps). The highest concentration of fitness-associated mutations is found in nsp4, nsp6, and nsp12–14 (figs. S12B and S13C-D), suggesting unexplored function at those sites. For example, nsp4 and nsp6 have roles in assembly of replication compartments, and substitutions in these regions may influence the kinetics of replication (see Supplemental Note 3).
We caution that while convergent evolution makes it possible to identify candidate functional mutations, observational data alone is insufficient to declare mutations as causal rather than merely correlated. Our uncertainty-ranked list of important mutations can be used to prioritize hits identified by our study for functional follow-up.
Some lineages increased in fitness more than others over the course of the pandemic (fig. S4). Notably, B.1.1 displayed the greatest variability among sublineages, followed by B.1. Fitness appeared to reach a plateau over time for most lineages (Fig. 1 and fig. S4). In contrast to Omicron sublineages, Alpha and Delta showed little variability in Spike-attributable fitness (fig. S25), suggesting that the propensity to acquire new Spike mutations depends on the constellation of mutations that comprise a lineage, consistent with epistasis.
A limitation of PyR0 is that it does not incorporate epistatic interactions between mutations (Supplemental Note 1); however, our results demonstrate the feasibility of inferring genetic determinants and lineage fitness using the simplest possible linear-additive model and provide a foundation for future, more complex models that include epistatic effects between mutations and migration across geographic regions.
In summary, PyR0 provides a genome-wide, automated approach for detecting viral lineages with increased fitness. By combining a model-based assessment of lineage fitness with absolute case counts, our model provides a global picture of the events of the first two years of the pandemic. Because it assesses the contribution of individual mutations and aggregates across all lineages and geographic regions, it can identify mutations and gene regions that likely increase fitness, and mutation-level information may help detect fitter lineages earlier than case counts alone. Applied to the full set of publicly available SARS-CoV-2 genomes, it provides a genomic view of the mutations driving increased fitness of the virus, identifying experimentally established driver mutations in S and highlighting the key role of non-S mutations, particularly in N, ORF1b, and ORF1a, which have received relatively less research attention. By modeling millions of viral sequences across thousands of regions, PyR0 yields mechanistic insight into viral fitness and offers a panoramic view of viral evolution, revealing a pattern whereby major circulating lineages fragment into sublineages with modest differences in fitness before they are collectively displaced by the sudden emergence of markedly fitter variants.

41 in total

1. Sequence co-evolution gives 3D contacts and structures of protein complexes.

Authors: Thomas A Hopf; Charlotta P I Schärfe; João P G L M Rodrigues; Anna G Green; Oliver Kohlbacher; Chris Sander; Alexandre M J J Bonvin; Debora S Marks
Journal: Elife Date: 2014-09-25 Impact factor: 8.140

2. Severe acute respiratory syndrome coronavirus nonstructural protein 2 interacts with a host protein complex involved in mitochondrial biogenesis and intracellular signaling.

Authors: Cromwell T Cornillez-Ty; Lujian Liao; John R Yates; Peter Kuhn; Michael J Buchmeier
Journal: J Virol Date: 2009-07-29 Impact factor: 5.103

3. Functional screen reveals SARS coronavirus nonstructural protein nsp14 as a novel cap N7 methyltransferase.

Authors: Yu Chen; Hui Cai; Ji'an Pan; Nian Xiang; Po Tien; Tero Ahola; Deyin Guo
Journal: Proc Natl Acad Sci U S A Date: 2009-02-10 Impact factor: 11.205

4. Structure of M^pro from SARS-CoV-2 and discovery of its inhibitors.

Authors: Zhenming Jin; Xiaoyu Du; Yechun Xu; Yongqiang Deng; Meiqin Liu; Yao Zhao; Bing Zhang; Xiaofeng Li; Leike Zhang; Chao Peng; Yinkai Duan; Jing Yu; Lin Wang; Kailin Yang; Fengjiang Liu; Rendi Jiang; Xinglou Yang; Tian You; Xiaoce Liu; Xiuna Yang; Fang Bai; Hong Liu; Xiang Liu; Luke W Guddat; Wenqing Xu; Gengfu Xiao; Chengfeng Qin; Zhengli Shi; Hualiang Jiang; Zihe Rao; Haitao Yang
Journal: Nature Date: 2020-04-09 Impact factor: 49.962

5. Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants.

Authors: Yiska Weisblum; Fabian Schmidt; Fengwen Zhang; Justin DaSilva; Daniel Poston; Julio Cc Lorenzi; Frauke Muecksch; Magdalena Rutkowska; Hans-Heinrich Hoffmann; Eleftherios Michailidis; Christian Gaebler; Marianna Agudelo; Alice Cho; Zijun Wang; Anna Gazumyan; Melissa Cipolla; Larry Luchsinger; Christopher D Hillyer; Marina Caskey; Davide F Robbiani; Charles M Rice; Michel C Nussenzweig; Theodora Hatziioannou; Paul D Bieniasz
Journal: Elife Date: 2020-10-28 Impact factor: 8.140

6. Estimated transmissibility and impact of SARS-CoV-2 lineage B.1.1.7 in England.

Authors: Sam Abbott; Rosanna C Barnard; Christopher I Jarvis; Adam J Kucharski; James D Munday; Carl A B Pearson; Timothy W Russell; Damien C Tully; Alex D Washburne; Tom Wenseleers; Nicholas G Davies; Amy Gimma; William Waites; Kerry L M Wong; Kevin van Zandvoort; Justin D Silverman; Karla Diaz-Ordaz; Ruth Keogh; Rosalind M Eggo; Sebastian Funk; Mark Jit; Katherine E Atkins; W John Edmunds
Journal: Science Date: 2021-03-03 Impact factor: 63.714

7. T-CoV: a comprehensive portal of HLA-peptide interactions affected by SARS-CoV-2 mutations.

Authors: Stepan Nersisyan; Anton Zhiyanov; Maxim Shkurnikov; Alexander Tonevitsky
Journal: Nucleic Acids Res Date: 2022-01-07 Impact factor: 16.971

8. Genomic reconstruction of the SARS-CoV-2 epidemic in England.

Authors: Harald S Vöhringer; Theo Sanderson; Matthew Sinnott; Nicola De Maio; Thuy Nguyen; Richard Goater; Frank Schwach; Ian Harrison; Joel Hellewell; Cristina V Ariani; Sonia Gonçalves; David K Jackson; Ian Johnston; Alexander W Jung; Callum Saint; John Sillitoe; Maria Suciu; Nick Goldman; Jasmina Panovska-Griffiths; Ewan Birney; Erik Volz; Sebastian Funk; Dominic Kwiatkowski; Meera Chand; Inigo Martincorena; Jeffrey C Barrett; Moritz Gerstung
Journal: Nature Date: 2021-10-14 Impact factor: 69.504

9. Increased transmissibility and global spread of SARS-CoV-2 variants of concern as at June 2021.

Authors: Finlay Campbell; Brett Archer; Henry Laurenson-Schafer; Yuka Jinnai; Franck Konings; Neale Batra; Boris Pavlin; Katelijn Vandemaele; Maria D Van Kerkhove; Thibaut Jombart; Oliver Morgan; Olivier le Polain de Waroux
Journal: Euro Surveill Date: 2021-06

10. Severe acute respiratory syndrome coronavirus nonstructural proteins 3, 4, and 6 induce double-membrane vesicles.

Authors: Megan M Angelini; Marzieh Akhlaghpour; Benjamin W Neuman; Michael J Buchmeier
Journal: mBio Date: 2013-08-13 Impact factor: 7.867
