The recently emerged Omicron variant of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has quickly spread around the world. Although many consensus mutations of the Omicron variant have been recognized, little is known about its genetic variation during its transmission in the population. Here, we comprehensively analyzed the genetic differentiation and diversity of the Omicron variant during its early outbreak. We found that Omicron accumulated more structural variations, especially deletions, on the SARS-CoV-2 genome than the other four variants of concern (Alpha, Beta, Gamma, and Delta) in the same timescale. In addition to its 50 consensus mutations, the Omicron variant acquired seven notable new non-synonymous nucleotide substitutions during its spread. Three of them are on the S protein (S_A701V, S_L1081V, and S_R346K), of which S_R346K lies in the receptor-binding domain (RBD). Based on these seven novel mutations, the Omicron BA.1 branch could be divided into five divergent groups spreading across different countries and regions. Furthermore, by assessing the relationship between genetic diversity and transmission rate, we found that the higher number of mutations in the Omicron variant, compared with the other SARS-CoV-2 variants, was related to its faster transmission. These findings indicate that more attention should be paid to the significant genetic differentiation and diversity of the Omicron variant for better disease prevention and control.
Since coronavirus disease 2019 (COVID-19) was first reported in December 2019, the frequent emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants has raised significant concerns [1]. To prioritize the monitoring of noteworthy SARS-CoV-2 variants, the World Health Organization (WHO) divided them into three categories: variants of concern (VOCs), variants of interest (VOIs), and variants under monitoring (VUMs). Previously, four VOCs were designated: Alpha (B.1.1.7), Beta (B.1.351), Gamma (P.1), and Delta (B.1.617.2) [2]. On November 19, 2021, a new variant that was genetically distinct from all previous SARS-CoV-2 strains was detected in S-gene target failure (SGTF) samples in South Africa [3]. On November 24, 2021, this SARS-CoV-2 variant was defined as a new PANGO lineage (B.1.1.529), and two days later, this branch was classified as a VOC by the WHO and named the Omicron variant. As of February 16, 2022, the Omicron variant had spread to at least 140 countries and regions, leading to another COVID-19 case spike.
Omicron is currently the variant of concern with the most mutations, carrying 50 characteristic mutations, 31 of which are on the S protein [4]. Some of these characteristic mutations are also present in other variants [5], while others are unique, such as S_G339D, S_S375F, S_G446S, and S_Q498R [6]. In the first month of its circulation in the human population, Omicron split into four lineages: B.1.1.529, BA.1, BA.2, and BA.3. At the beginning of the Omicron variant’s spread, BA.1 was the dominant lineage. Recently, however, the number of sequenced BA.2 genomes in Denmark increased rapidly, and BA.2 has become the dominant lineage there [7]. These phenomena indicated that the Omicron variant can quickly evolve and differentiate while spreading at high speed.
Furthermore, some studies have confirmed that the Omicron variant has a significantly greater immune escape capability than the SARS-CoV-2 strain reported in 2019. Vaccinated people and previously infected people are still at an extremely high risk of being infected with Omicron [8], [9], [10], [11]. Although a preliminary understanding of Omicron mutations has been achieved, its internal dynamic evolution and genetic differentiation during transmission in the human population remain unknown. Therefore, it is urgent to understand the evolutionary progress of Omicron in the early outbreak stage, which will be significantly helpful in the prevention and control of the Omicron variant’s spread.
Materials and methods
Data source
The genome sequences of SARS-CoV-2 were downloaded from GISAID [12]. The multiple sequence alignment (MSA) and spatiotemporal metadata files were downloaded on December 12, 2021. Because the MSA file is updated with a delay, we also downloaded raw Omicron sequences and processed them in the same way as the sequences in the MSA file. After excluding sequences with more than 5% unknown bases (N), there were 12,304 Omicron sequences until December 20, 2021. To analyze the genetic diversity in the later period, we further downloaded 103,688 Omicron sequences collected in England on January 8, 2022. These sequences were individually aligned to the reference WIV04 (EPI_ISL_402124) by MAFFT [13]. The early-stage sequences of the Alpha, Beta, Gamma, and Delta variants were extracted from the MSA file to form two datasets, one matching the Omicron dataset described above in sequence number and one matching it in time duration. The initial time point for each of the other four variants (Alpha, Beta, Gamma, and Delta) was set to the day when it had 100 genome sequences. The mutation information of these sequences was extracted by an R package (https://github.com/wuaipinglab/genome_treatment). Mutations before the 300th and after the 29,000th base were discarded because of the low sequencing quality at the head and tail of the genome. Nucleotide substitutions occurring more than three times and insertions or deletions occurring more than once were kept. To remove the interference from low-quality sequencing data, we discarded sequences with N at any position from one base before to one base after each variant’s consensus deletions or insertions. The final numbers of sequences used for each strain are shown in Tables S4 and S7.
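The filtering rules above can be illustrated with a minimal Python sketch. It is not the R package used in this study. The function names, the (kind, position, change) tuple representation, and the reading of "occurring" as "number of sequences carrying the mutation" are all assumptions made for illustration.

```python
from collections import Counter

def n_fraction(seq):
    """Fraction of unknown bases (N) in one genome sequence."""
    seq = seq.upper()
    return seq.count("N") / len(seq) if seq else 1.0

def keep_sequence(seq, max_n_fraction=0.05):
    """Keep a sequence unless more than 5% of its bases are N."""
    return n_fraction(seq) <= max_n_fraction

def filter_mutations(per_sequence_mutations, start=300, end=29000):
    """Apply the position window and the occurrence thresholds.

    per_sequence_mutations: one set per sequence of hypothetical
    (kind, position, change) tuples, where kind is "sub", "del", or "ins".
    Assumption: occurrences are counted as sequences carrying the mutation.
    """
    counts = Counter(m for muts in per_sequence_mutations for m in set(muts))
    kept = set()
    for (kind, pos, change), n in counts.items():
        if pos < start or pos > end:
            continue  # low-quality head or tail of the genome
        # substitutions: more than three times; indels: more than once
        min_count = 4 if kind == "sub" else 2
        if n >= min_count:
            kept.add((kind, pos, change))
    return kept
```

With this representation, a substitution seen in four sequences inside the 300 to 29,000 window is retained, whereas any mutation at position 100 is dropped regardless of its frequency.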
Phylogenetic tree
The phylogenetic tree was downloaded from NextStrain [14] on December 22, 2021, as a NEXUS file that included detailed lineage information. Characteristic mutation tables of the five variants were downloaded from the website Outbreak.info on December 25, 2021 [4]. Mutations present in at least 75% of sequences were treated as characteristic mutations. Characteristic mutation tables of the four Omicron lineages (B.1.1.529, BA.1, BA.2, and BA.3) and the five cluster groups (a, b, c, d, and e) were calculated from the Omicron sequences processed as described above. Only the mutations that appeared in more than half of the sequences were shown.
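As a sketch of how such frequency thresholds can be applied, the following Python function computes the fraction of sequences carrying each mutation and keeps those meeting a cutoff. The function name, its parameters, and the string labels for mutations are hypothetical and do not reproduce the Outbreak.info or in-house pipelines.

```python
from collections import Counter

def characteristic_mutations(per_sequence_mutations, threshold=0.75, inclusive=True):
    """Return {mutation: frequency} for mutations meeting the cutoff.

    per_sequence_mutations: one set of mutation labels per sequence.
    inclusive=True keeps frequency >= threshold (the 75% rule);
    inclusive=False keeps frequency > threshold ("more than half").
    """
    total = len(per_sequence_mutations)
    if total == 0:
        return {}
    counts = Counter(m for muts in per_sequence_mutations for m in set(muts))
    result = {}
    for mutation, count in counts.items():
        freq = count / total
        if freq > threshold or (inclusive and freq == threshold):
            result[mutation] = freq
    return result
```

Calling it with threshold=0.5 and inclusive=False corresponds to the "more than half of the sequences" rule used for the lineage and group tables.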
Omicron transmission network
The spread of SARS-CoV-2 followed a scale-free distribution. Many infected people could contribute to large-scale virus spread through multiple gathering events in the early stages. In a scale-free network, a few nodes connect to a large number of other nodes. These key intermediate nodes might help to infer the transmission route of Omicron. To discover these key nodes, we first extracted all the mutations (nucleotide substitutions, deletions, and insertions) in the Omicron sequences. We then clustered these genome sequences based on their mutation similarity using the apcluster package in R [15], which yielded 253 clusters. Within each cluster, the earliest strains that appeared in different countries were selected as the representative sequences, and a total of 782 representative sequences were obtained. Two sequences were inferred to have a propagation relationship if they differed by only one nucleotide substitution, and a link was made between them. A nucleotide substitution was discarded if more than 1,000 of the more than 10,000 sequences had an N at its position. Therefore, although some nucleotide substitutions occurred many times, they were not included in further analysis. Finally, these 782 sequences formed an Omicron propagation network with 8,224 edges. We visualized this network in the software Gephi [16].
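The linking rule can be sketched in Python as follows. This is an illustrative reconstruction under stated assumptions, not the code used in this study. Each representative sequence is represented by a set of substitution labels, and "one nucleotide substitution difference" is interpreted as a symmetric difference of size one. The names build_propagation_edges and node_degrees are hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_propagation_edges(profiles):
    """Link two sequences when their substitution sets differ by exactly one.

    profiles: dict mapping a sequence name to its set of substitutions.
    Assumption: a one-substitution difference means the symmetric
    difference of the two sets has exactly one element.
    """
    names = sorted(profiles)
    edges = []
    for a, b in combinations(names, 2):
        if len(profiles[a] ^ profiles[b]) == 1:
            edges.append((a, b))
    return edges

def node_degrees(edges):
    """Count links per node; high-degree nodes are candidate key nodes."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees
```

The pairwise comparison is quadratic in the number of sequences, which is acceptable for a few hundred representatives. The resulting edge list could then be exported to a tool such as Gephi for layout.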
Mutations in the protein structure
We downloaded the monomer structure of the S protein (QHD43416.pdb) from https://zhanggroup.org//COVID-19/ on December 22, 2021. We visualized the S protein using PyMOL [17] and labeled the Omicron mutations on the structure.
Results
Omicron had more mutations than other SARS-CoV-2 VOCs
The emergence of the Omicron variant has raised significant concerns about its vast genome mutations. The Omicron variant had 50 consensus mutations, including 43 nucleotide substitutions, six deletions, and one insertion. Of them, 27 nucleotide substitutions, three deletions, and one insertion were on the S protein (Fig. 1A and Table S1). The other four VOCs possessed relatively fewer mutations: Alpha had 22 mutations, Beta had 18 mutations, Gamma had 23 mutations, and Delta had 29 mutations. In addition, systematic studies from the NextStrain website revealed that Omicron did not come from the previous dominant strain Delta [5], [14], [18] but was an individually emerging variant (Fig. 1B).
Fig. 1
Comparing the diversity of mutations among the five variants of concern (VOCs) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). A) The consensus mutations of the variants Alpha, Beta, Gamma, Delta, and Omicron are shown in a heatmap. Labels in red highlight deletions, and those in blue highlight insertions. Nucleotide substitutions are labeled in black. B) A phylogenetic tree from NextStrain is displayed, in which the five VOCs are labeled in different colors. C) The number of genome sequences of the five VOCs uploaded to GISAID grew over time in the first 47 days. D) Among these sequences, the types of nucleotide substitutions and deletions/insertions were counted for each variant of concern. E) The location distribution of these deletions (red) or insertions (blue) on the SARS-CoV-2 genome is shown in bars. F) In each row, every combination of deletions (red) or insertions (blue) in the five variants is shown in the heatmap. Each column represents a unique deletion/insertion, ordered by location on the SARS-CoV-2 genome. Their respective genes are labeled in different colors.
The number of accumulated genomes increased exponentially within 47 days, indicating that the Omicron variant spread worldwide at a relatively high speed (Fig. 1C). During the first 47 days after their emergence, the Omicron, Alpha, Beta, Gamma, and Delta variants had 12,304, 3,364, 733, 961, and 441 sequenced genomes, respectively (Fig. 1C). We compared the number of mutations that these variants accumulated in their first 47 days. Omicron contained 398 nucleotide substitutions, similar to the number in the Beta variant and half that of the other three variants (Fig. 1D and Table S2). However, deletions and insertions occurred markedly more frequently in Omicron than in the other variants, up to twice as many (Fig. 1D and Table S3). We then performed a systematic analysis of these deletions and insertions. Although all five SARS-CoV-2 variants shared a similar regional preference for deletions, deletions in the Omicron variant were distributed more widely across the genome. The deletion regions in Omicron generally covered the regions where most deletions occurred in the other variants (Fig. 1E). Furthermore, more diverse deletion combinations were observed in Omicron (Fig. 1F).
Continuous genetic differentiation in early Omicron transmission
The Omicron variant was further divided into four lineages, namely B.1.1.529, BA.1, BA.2, and BA.3, based on NextStrain [14]. The numbers of genome sequences of these four Omicron branches in the first-47-day Omicron dataset were 23, 10,754, 20, and 8, respectively (Table S4). We displayed the consensus mutations that occurred in over 50% of the sequences of each branch. For example, the consensus mutations of the BA.1 branch completely covered those of the B.1.1.529 branch (Fig. 2A). At the same time, the BA.2 and BA.3 branches contained their own unique consensus mutations (Fig. 2A and Table S5).
Fig. 2
Differentiation and diversity within the early sequences of Omicron. A) The characteristic mutations of each sub-lineage in early Omicron sequences are shown in the heatmap. Labels in red highlight deletions, and those in blue highlight insertions. Nucleotide substitution labels are in black. The frequency of the mutations in each sub-lineage is shown from yellow to red as 50% to 100%. B) For the sequences from the BA.1 sub-lineage, the nucleotide substitution combinations are shown in each row of the heatmap and further grouped into groups a, b, c, d, and e. C) A propagation network was built from the early Omicron sequences after clustering. D) The detection time of these sequences is shown in the propagation network.
Because few genomes of B.1.1.529, BA.2, and BA.3 were sequenced in the early stage of the Omicron variant, we used only BA.1 sequences to further analyze the genetic diversity after the introduction of the Omicron variant into the human population. In addition to the 50 consensus mutations, we found ten other sites with relatively high-frequency mutations, including seven non-synonymous nucleotide substitutions (nsp3_V1069I, nsp4_V94A, nsp12_F685Y, S_R346K, S_A701V, S_I1081V, and ORF3a_L106F) and three synonymous nucleotide substitutions (Fig. 2B). After a cluster analysis based on all nucleotide substitutions, including these ten high-frequency mutations, the early-stage BA.1 sequences could be divided into five groups, a–e (Fig. 2B). Except for group a, each group had one or two nucleotide substitutions. The three nucleotide substitutions on the S protein belonged to groups b, c, and d, respectively.
We then built a network. Two sequences were linked in this network if there was only one nucleotide substitution difference between them. The whole network presented a process of continuous diffusion from the center to the outside. We labeled groups a–e in this network and found that these five groups appeared in different parts of the network (Fig. 2C). Group a was at the center of the network. Groups c, d, and e were on the outside, connecting to group a through several nodes. Notably, group b did not connect with the other groups in the network. The intermediate nodes between group b and other groups were not included in our sequence dataset. When we mapped the detection time of each node onto the network, we found that groups c and d appeared earliest, followed by groups b and e (Fig. 2D and Fig. S1). The spatiotemporal analyses showed that each of these groups had a unique distribution, although in some countries, such as England and the United States, all groups were detected (Fig. S1A–S1D). These results indicated that the Omicron variant mutated and evolved during its early transmission.
Novel mutations appeared on the S protein of the Omicron variant
In the early-stage Omicron genome sequences, diverse mutations appeared in BA.1. These mutations divided BA.1 into one original group and four subgroups. We calculated the consensus mutations occurring in more than 50% of sequences within each group. Six unique nucleotide substitutions were notable (Fig. 3A and Table S6). Three unique nucleotide substitutions on the S protein belonged to groups b, c, and d. These mutations were S_R346K (group c), S_A701V (group d), and S_L1081V (group b), respectively. Besides, group b had L106F on the ORF3a protein, which could induce apoptosis [19]. L106F has been reported in India and Brazil [20], [21]. Group d had another mutation, V1069I, on the papain-like protease domain of the nsp3 protein, which is responsible for cleaving the polypeptide [22]. Finally, group e contained one unique nucleotide substitution, F685Y, located on nsp12, which encodes the RNA-dependent RNA polymerase. Group c was later named lineage BA.1.1. Outbreak.info showed that the number of BA.1.1 cases increased rapidly and that BA.1.1 accounted for more than 20% of cases in Djibouti [4]. We mapped the consensus Omicron mutations acquired from Outbreak.info, together with the three novel nucleotide substitutions, onto the three-dimensional structure of the S protein [4]. The S_R346K mutation was on the receptor-binding domain (RBD), and the other two mutations, S_A701V and S_L1081V, were on the later part of the S2 region (Fig. 3B).
Fig. 3
The mutation distribution of the early Omicron sequences in different groups. A) The characteristic mutations of each group in early Omicron sequences are shown. Word labels in red highlight deletions, and those in blue highlight insertions. Nucleotide substitution labels were in black. The frequency of the mutations in each group is shown from yellow to red as 50% to 100%. B) Consensus mutations in Omicron and remarkable mutations in group a, b, c, d, and e are mapped on the structural model of the Spike protein. N-terminal domain (NTD) is marked in green, and receptor-binding domain (RBD) is marked in pink. Blue indicates deletion positions, yellow indicates insertion positions, cyan indicates nucleotide substitutions, and red indicates remarkable mutations.
Faster spread led to higher genetic diversity in the Omicron variant
To determine whether more genome sequences could contribute to more mutations, we compared the internal diversity among different variants (Alpha, Delta, and Omicron) with the same number of sequences from England (Table S7). As of January 8, 2022, there were 103,688 sequences of the Omicron variant in England, and Omicron genomes accumulated faster than those of the other variants. The faster accumulation indicated that the Omicron variant spread faster than the other variants in this region (Fig. 4A). In our results, the deletion diversity within Omicron increased with the rapid increase in the number of sequenced genomes, significantly faster than that of the Alpha and Delta variants (Fig. 4B). In addition, the internal diversity of nucleotide substitutions showed a growth trend similar to that of insertions or deletions in the Alpha, Delta, and Omicron variants (Fig. 4C). We then mapped the locations of deletions and insertions from the different variants onto the SARS-CoV-2 genome. In the sequences covering the same time duration after the initial day of each variant in England, marked by a dotted line in Fig. 4A, the deletions of the Omicron variant were distributed more widely on the genome than those of the other variants (Fig. 4D). The wider distribution could result from the rapid sequence accumulation of the Omicron variant in the early stage (Fig. 4A). However, when these variants reached the same sequence number, their deletion distributions tended to be similar (Fig. 4E). The above results indicated that the higher genetic diversity in the Omicron variant could be related to its faster spread in the early outbreak stage.
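The comparison between "same time duration" and "same sequence number" can be made concrete with a short Python sketch of a diversity accumulation curve. The function names and the input format (one set of mutation labels per sequence, ordered by collection date) are assumptions for illustration and are not taken from the analysis code of this study.

```python
def cumulative_diversity(per_sequence_mutations):
    """Number of distinct mutation types seen after each added sequence.

    per_sequence_mutations: sets of mutation labels, assumed to be
    ordered by collection date.
    """
    seen = set()
    curve = []
    for muts in per_sequence_mutations:
        seen.update(muts)
        curve.append(len(seen))
    return curve

def diversity_at_equal_depth(curves, n_sequences):
    """Compare variants after the same number of sequences.

    curves: dict mapping a variant name to its cumulative curve.
    Variants with fewer than n_sequences sequences are skipped.
    """
    return {
        variant: curve[n_sequences - 1]
        for variant, curve in curves.items()
        if len(curve) >= n_sequences
    }
```

Reading each curve at a fixed index compares variants at equal sequencing depth, whereas reading each curve at the index reached by a fixed date compares them at equal time duration. A faster-spreading variant will sit further along its curve in the second case.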
Fig. 4
The relationship between Omicron transmission and mutation accumulation in England. A) The numbers of genome sequences of the dominant variants (Alpha, Delta, and Omicron) uploaded to GISAID grew over time in the early stage of each variant in England. A dotted line marks the time duration used to show the distribution of deletions or insertions in the sequences with the same time duration. B) and C) The types of deletions/insertions or nucleotide substitutions in these sequences were counted over time. The locations of these deletions or insertions are labeled separately on the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genome, D) in the sequences with the same time duration or E) in the same number of sequences.
Discussion
Compared to the other VOCs, the Omicron variant had almost four times the number of sequenced genomes within the same time duration after the initial day of each variant. The larger number of infection cases of this dominant variant could result in the more frequent emergence of mutations. During the first 47 days of spread, 398 nucleotide substitutions and 51 deletions/insertions were identified in the Omicron sequences. Previous studies showed that deletions could affect viral proteins more strongly than single nucleotide substitutions [23]. A systematic analysis revealed that deletions in SARS-CoV-2 had a regional preference. It was also shown that the recurrent deletions on the N-terminal domain of the S protein partially covered the binding sites of some neutralizing antibodies, indicating a potential role of the deletions in virus evolution [24], [25]. Therefore, in preventing and controlling the COVID-19 pandemic, it is necessary to pay more attention to the internal genetic diversity of the dominant variants, including nucleotide substitutions and deletions or insertions.
Previous studies have shown that the Omicron variant consisted of four sub-lineages. These sub-lineages seemed to emerge at similar times, and two of them (BA.1 and BA.2) had spread worldwide [26]. A recent study showed that the BA.2 lineage, which appeared later, might spread faster than BA.1 [7]. Our study showed that the BA.1 lineage continued to differentiate. We divided BA.1 into five groups by multiple mutations, including one original group and four subgroups. Each of these subgroups carried one novel nucleotide substitution. Group c, with S_R346K, increased rapidly and has been named the BA.1.1 lineage [27]. Groups b and d each possessed one characteristic nucleotide substitution on the S protein. These mutations on the spike protein might affect virus transmission. Finally, group e had one nucleotide substitution (nsp12_F685Y) on nsp12. Since nsp12, which encodes the RNA-dependent RNA polymerase, participates in virus replication and transcription, it would be meaningful to determine the function of this mutation.
On the S protein of the Omicron variant, there were 31 consensus mutations. Some of them, such as S477N, Q498R, and N501Y, have already been associated with an increased binding ability to the ACE2 receptor [28], [29], [30], [31]. Another consensus nucleotide substitution, S_K417N, has been confirmed to inactivate some therapeutic neutralizing antibodies [29], [30], [31]. The consensus deletion S_del69/70 has been shown to help the virus enter host cells [32]. Beyond these notable mutations, we found that a series of novel mutations continued to emerge on the S protein during the spread of the Omicron variant. Three novel nucleotide substitutions (S_R346K, S_A701V, S_L1081V) were detected in these early-stage sequences, of which S_R346K was on the receptor-binding domain. S_R346K has been shown to slightly affect the binding between SARS-CoV-2 and class 2 antibodies [33]. Another nucleotide substitution, S_A701V, was one of the dominant mutations in the third pandemic wave in Malaysia [34]. In addition, many studies have shown that Omicron has a strong ability to escape several neutralizing antibodies [8], and previously infected people were also susceptible [11]. Therefore, it is critical to determine the effects of not only the consensus mutations but also these emerging mutations of the Omicron variant on virus transmission ability and immune escape capability.
From January 2022 to February 2022, the Omicron variant spread exponentially. A large number of infections could lead to more confirmed cases with severe symptoms and a shortage of medical resources. Our study implied that a rapid increase in infected patients might lead to a rapid increase in viral diversity. More dangerous mutations may even occur in the future. Therefore, the internal diversity among the dominant variants should be followed through more mutation surveillance, and reasonable pandemic prevention and control measures should be carried out cautiously to confront the Omicron variant’s spread.