SARS-CoV-2: Possible recombination and emergence of potentially more virulent strains.

Dania Haddad¹, Sumi Elsa John¹, Anwar Mohammad², Maha M Hammad², Prashantha Hebbar¹, Arshad Channanath¹, Rasheeba Nizam¹, Sarah Al-Qabandi³, Ashraf Al Madhoun¹, Abdullah Alshukry⁴, Hamad Ali¹,⁵, Thangavel Alphonse Thanaraj¹, Fahd Al-Mulla¹.

Abstract

COVID-19 is challenging healthcare preparedness, world economies, and livelihoods. The infection and death rates associated with this pandemic are strikingly variable in different countries. To elucidate this discrepancy, we analyzed 2431 early spread SARS-CoV-2 sequences from GISAID. We estimated continental-wise admixture proportions, assessed haplotype block estimation, and tested for the presence or absence of strains' recombination. Herein, we identified 1010 unique mutations (613 of them nonsynonymous) and seven different SARS-CoV-2 clusters. In samples from Asia, a small haplotype block was identified, whereas samples from Europe and North America harbored large and different haplotype blocks with nonsynonymous variants. Variant frequency and linkage disequilibrium varied among continents, especially in North America. Recombination between different strains was only observed in North American and European sequences. In addition, we structurally modelled the two most common mutations, Spike_D614G and Nsp12_P314L, which suggested that these linked mutations may enhance viral entry and replication, respectively. Overall, we propose that genomic recombination between different strains may contribute to SARS-CoV-2 virulence and COVID-19 severity and may present additional challenges for current treatment regimens and countermeasures. Furthermore, our study provides a possible explanation for the substantial second wave of COVID-19 presented with higher infection and death rates in many countries.

Year: 2021 PMID: 34033650 PMCID: PMC8148317 DOI: 10.1371/journal.pone.0251368

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) outbreaks have grievously impacted the world in a short span of time. Understanding the factors that govern the severity of a pandemic is of paramount importance to design better surveillance systems and control policies [1]. In the case of COVID-19, three variables play a critical role in its spread and morbidity in a country: the nature of the pathogen, the genetic diversity of the host population, and the environmental factors such as public awareness and governmental health measures [2]. SARS-CoV-2 spike (S) glycoprotein plays an integral role in the viral transmission and virulence [3]. The S protein contains two functional subunits, S1 and S2, cleaved by furin protease at the host cell [4]. The S1 subunit contains the receptor binding domain and facilitates the interactions with the host cell surface receptor, Angiotensin-converting enzyme 2 (ACE2) [5,6]. The S2 subunit, activated by the host Transmembrane Serine Protease 2, harbors necessary elements for membrane fusion [7]. Mutations in the S protein may induce conformational changes leading to increased pathogenicity [8]. We were the first to report the major role of D614G (23403A>G) mutation located at the S1-S2 proximal junction. This mutation generates conformational changes in the protein structure rendering the furin cleavage site (664-RRAR-667) more flexible and thus enhancing viral entry [9]. Currently, the D614G mutation has become the focus of several studies addressing prospective drug targeting strategies [10]. Furthermore, studies on understanding the genetic diversity and evolution of SARS-CoV-2 are emerging [11]. Admixture analyses have been conducted to understand the evolution of betacoronaviruses and in particular the diversification of SARS-CoV-2 [12]. Haplotype analyses have also indicated that frequencies of certain haplotypes correlate with viral pathogenicity [13]. 
Here we extended our previous studies [9,14] by analyzing the population genetics aspects within genome sequences of SARS-CoV-2 to understand the contiguous spread of SARS-CoV-2, its rapid evolution, and the differential severity of COVID-19 among different continents. We analyzed 2431 full-length viral sequences deposited in GISAID, including those sequenced at our institute from patients in Kuwait. We found evidence of coinfection between different viral strains in Europe and in North America but not in the other continents. We also modelled two major mutations and their possible effects on the stability of the encoded protein and thereby on SARS-CoV-2 virulence.

Methods

Retrieval of complete SARS-CoV-2 genome sequences

2431 complete SARS-CoV-2 genome sequences from infected individuals were retrieved from the GISAID database (Global Initiative on Sharing All Influenza Data) [15] (accessed on April 3rd, 2020) and were used in all analyses.

Alignment and annotation of amino acid sequence variation

Multiple sequence alignment was performed using MAFFT v7.407 [16] (retree: 5, maxiter: 1000). Alignment gaps were trimmed using trimAl (automated1) [17]. SeqKit [18] was used to concatenate chopped-off sequences using NC_045512.2 as the reference genome sequence. SNP-sites [19] and Annovar [20] were used to extract and annotate single nucleotide variants (SNVs).
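The variant extraction step can be illustrated with a minimal sketch. It is not the SNP-sites implementation; it only shows the idea of comparing each aligned sample to the reference column by column. The function name `call_snvs` and the toy sequences are hypothetical, and the reported positions are alignment columns, which match genome coordinates only if the reference carries no gaps.

```python
def call_snvs(reference, sample):
    """Return (1-based column, reference base, sample base) for each substitution.

    Both sequences must come from the same multiple sequence alignment.
    Columns with gaps or ambiguous bases (for example N) are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    snvs = []
    for column, (ref_base, alt_base) in enumerate(
        zip(reference.upper(), sample.upper()), start=1
    ):
        if ref_base in valid and alt_base in valid and ref_base != alt_base:
            snvs.append((column, ref_base, alt_base))
    return snvs


# Toy aligned sequences (hypothetical, not real SARS-CoV-2 data).
toy_reference = "ATGGATTTG"
toy_sample = "ATGGGTTNG"
print(call_snvs(toy_reference, toy_sample))
```

Annotation (synonymous versus nonsynonymous) would still require a gene model, which is the role Annovar plays in the pipeline described above.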

Linkage disequilibrium and haplotype blocks analysis

PLINK2 [21] was used to extract sequence variants with minor allele frequencies (MAF) ≥ 0.5%, to estimate inter-chromosomal linkage disequilibrium (LD) as the squared correlation coefficient (r²), and to estimate haplotype blocks. We used Haploview [22] to visualize the haplotype blocks. We examined the correlation between the LD results and physical distance across all continental datasets. To detect recombination within datasets, we tested individual datasets for the pairwise homoplasy index using PhiPack software [23]. All the PhiPack tests were performed with 1000 permutations. We used the ‘Profile’ program available in the PhiPack suite with default options for window scan size and step size in all the aligned sequence datasets. Further validation was done using the RDP, GENECONV, MaxChi2, Chimaera, and 3Seq algorithms available from the RDP4 software suite [24] with default parameter settings and 1000 permutations. Admixture 1.3.0 software [25] was used to identify the genetic substructure of strains across the continental transmission for the following variant sets: variants with MAF ≥ 0.5%, variants in LD (r² > 0.5), haplotype blocks, independent variants, nonsynonymous variants, and synonymous variants. All analyses were iterated up to K = 20 and cross validation errors were examined to infer an optimal K cluster. Replicate runs were further processed using CLUMPAK [26] and results for the major modes were illustrated using the ggplot2 data visualization package for the statistical programming language R (https://www.r-project.org/).
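As a rough illustration of the two quantities used for filtering and LD estimation, the sketch below computes a minor allele frequency and a pairwise r² for biallelic sites. It assumes haploid genomes coded as 0 (reference allele) and 1 (alternate allele), so haplotypes are observed directly; PLINK2 may handle missing data and other details differently. The function names and toy vectors are hypothetical.

```python
def minor_allele_frequency(alleles):
    """MAF of one biallelic site; alleles is a list of 0/1 values, one per genome."""
    alt_freq = sum(alleles) / len(alleles)
    return min(alt_freq, 1 - alt_freq)


def r_squared(site_a, site_b):
    """Squared correlation coefficient between two biallelic sites.

    r^2 = D^2 / (pA (1 - pA) pB (1 - pB)), with D = pAB - pA * pB.
    """
    if len(site_a) != len(site_b):
        raise ValueError("both sites must cover the same genomes")
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined for a monomorphic site")
    d = p_ab - p_a * p_b
    return d * d / denominator


# Toy allele vectors (hypothetical): sites A and B are perfectly linked,
# site C is independent of site A.
site_a = [0, 0, 1, 1]
site_b = [0, 0, 1, 1]
site_c = [0, 1, 0, 1]
print(minor_allele_frequency([0, 0, 0, 1]) >= 0.005)  # passes a 0.5% MAF filter
print(r_squared(site_a, site_b), r_squared(site_a, site_c))
```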

Protein structural analysis

The cryo-EM structures of the viral RNA-dependent RNA polymerase (RdRp, PDB ID: 6M71) [27] and the S protein (PDB ID: 6VSB) [28] were used as scaffolds to model the amino acid variations observed in the viral strains. The missing amino acids, i.e. those invisible in the cryo-EM structure of the S protein, were modelled in using SWISS-MODEL [29]. The DynaMut web server [30] was used to predict the effect of the mutations on the proteins' stability and flexibility. PyMOL (Molecular Graphics System, Version 2.0, Schrödinger, LLC) was used to generate structural images.

Results

Detection and classification of mutations from global SARS-CoV-2 genome sequences

We analyzed 2431 high-quality SARS-CoV-2 genome sequences from six continental groups. Of these, 2352 sequences showed substantial genetic differences from the Wuhan reference sequence (NC_045512.2). We identified 1010 unique mutations using our variant calling pipeline: 613 variants were nonsynonymous, 387 were synonymous, 9 were stop-gain, and 1 was in the 5′ UTR. We found only 72 variants with MAF ≥ 0.5%, which were used in the admixture and haplotype block analyses. The accompanying figure shows the distribution of synonymous and nonsynonymous variants in each gene of the SARS-CoV-2 genome at varying MAF thresholds. The genes with the highest percentage of nonsynonymous variants with MAF ≥ 0.5% were ORF3a, M, and ORF8, while the genes with the highest percentage of synonymous variants with MAF ≥ 0.5% were ORF6 and ORF10.

Detection and classification of mutations from GISAID SARS-CoV-2 genome sequences.

The distribution of synonymous and nonsynonymous variants for each gene is shown for the raw dataset and at the MAF ≥ 0.5% and MAF ≥ 1% thresholds. A total of 72 variants was observed at the MAF ≥ 0.5% threshold and used in subsequent analyses. MAF: minor allele frequency.

Identification of SARS-CoV-2 genetic clusters in different continents

The principal component analysis (PCA) of the 2352 early-spread SARS-CoV-2 sequences gave three distinct clusters of samples based on their continent of origin. All three clusters diverged from a single point (red circle). The North American cluster showed the least viral genetic variance, unlike the samples from Asia and Oceania, which harbored the most genetic diversity. The European cluster is well defined with few interspersed Asian samples, which is an indication of its origin. This clustering in Europe and North America is probably associated with a founder effect where a single mutation was introduced and subsequently transmitted. This suggestion is corroborated by the fact that the collection date of the founder strain is prior to those in the European and the North American clusters.

Principal Component Analysis using 2352 GISAID sequences.

Principal Component Analysis of 2352 SARS-CoV-2 sequences shows three distinct clusters of color-coded samples (see the legend for their continent of origin). All three clusters diverge from a single point (red circle). The North American cluster (black oval) shows the least variance among the three. The European cluster (orange oval) is well defined with few interspersed Asian samples, an indication of its origin. The third cluster (blue oval) shows the most variance and includes samples from Oceania, Asia, and others.

The admixture analysis showed a gradual reduction in cross validation (CV) error for iterations up to K = 7. Subsequent iterations showed a fluctuating pattern; however, the least error was seen at K = 17. Further, we verified this CV trend in a subset of variants filtered for MAF ≥ 0.5% (comprising 72 SNVs). Interestingly, we again observed a gradual reduction in CV error up to K = 7 (with best fit at 7) and, subsequently, a trend of increasing CV error. We created subsets of strong LD, weak LD, haplotype block, nonsynonymous, and synonymous variants from the 72 variants at K = 7 across the continents. We separately performed admixture analysis on these subsets. The analysis of the detected seven datasets revealed interesting mosaic patterns. Samples from Asia formed largely two clusters (C1 and C6, with C1 as dominant), whereas samples from the European dataset were distributed into six different clusters (C1, C2, C3, C5, C6, and C7, with C2 as dominant), and the North American samples formed four clusters (C1, C2, C4, and C7, with C4 as dominant). The African and Oceanian datasets formed two clusters each (C2 and C3 with C2 as dominant, and C3 and C5 with C3 as dominant, respectively), and the South American dataset was formed by five clusters (C1, C2, C3, C5, and C6).
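The choice of K from cross validation errors can be sketched as picking the K with the smallest CV error. The numbers below are invented purely to mimic the trend described for the MAF ≥ 0.5% subset (a steady fall to K = 7, then a rise); they are not the study's values, and `best_k` is a hypothetical helper rather than part of the Admixture software.

```python
def best_k(cv_errors):
    """Return the K with the lowest cross validation error.

    cv_errors maps each tested K to its CV error. Ties are resolved in
    favour of the smallest K, the more parsimonious model.
    """
    return min(sorted(cv_errors), key=lambda k: cv_errors[k])


# Illustrative CV errors only (hypothetical values, not from the paper).
illustrative_cv = {
    1: 0.60, 2: 0.52, 3: 0.45, 4: 0.40, 5: 0.36,
    6: 0.33, 7: 0.31, 8: 0.32, 9: 0.34, 10: 0.35,
}
print(best_k(illustrative_cv))
```

On the raw variant set the text reports a global minimum at K = 17 after a fluctuating stretch, which is why the authors cross-checked the trend on the filtered subset instead of relying on the raw minimum alone.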

Fig 3

Identification of SARS-CoV-2 genetic clusters in different continents.

Illustration of the seven (C1 to C7, color-coded) genetic subdivisions of SARS-CoV-2 sequences across continents using variants with MAF ≥ 0.5%. Differential proportions of strong LD (C), weak LD (D), haplotype block (E), nonsynonymous (F), and synonymous (G) variants across continental datasets are shown.

Among the dependent variants in Asia, a strong LD block (r² ≥ 0.5) was mostly observed in two clusters (C1 and C6) and a weak LD block (r² < 0.5) was observed in three clusters (C1, C3, and C7). In Europe, a strong LD block was observed in four clusters (C1, C2, C3, and C5) and a weak LD block was observed in five clusters (C1, C2, C3, C4, and C5; predominantly from C3 and C5). Interestingly, four clusters were common between the strong and weak LD blocks, suggesting that the four strains that dominate in Europe carry significant proportions of both strong and weak LD signatures. Likewise, in North America, strong and weak LD were each observed among three clusters: (C1, C2, and C4) and (C1, C3, and C4), respectively. Further, in Africa, Oceania, and South America, strong LD was observed among (C2), (C1), and (C1 and C2), respectively; weak LD was observed in (C3 and C4), (C1, C2, C3, and C5), and (C3 and C5), respectively. Interestingly, proportions of haplotype blocks, identified using the whole data, mostly followed the pattern observed in the strong LD track but were admixed with weak LD strains of the respective continents. Of particular note, variations in the proportions of strains carrying nonsynonymous and synonymous signatures were also very evident. Two nonsynonymous clusters (C1 and C5, which have admixed with each other) and two synonymous clusters dominated in Asia. Four nonsynonymous clusters (C2, C4, C5, and C6) and five synonymous clusters dominated in Europe, while in North America, three nonsynonymous (C1, C3, C6) and two synonymous (C1, C3) clusters were in higher proportions. Similarly, the nonsynonymous clusters (C4 and C6), (C5), and (C2), along with the synonymous clusters (C1 and C6), (C2), and (C5), were in high proportions in Africa, Oceania, and South America, respectively.

Evidence of coinfection in the sequences from continental datasets

Next, we tested for the presence or absence of recombination among continental datasets using PhiPack software, which uses three different tests, namely “Pairwise Homoplasy Index (Phi)” [23], “Neighbor Similarity Score (NSS)” [31], and “Maximum χ2 (MaxChi2)” [32]. The results from these tests on the combined dataset suggested the possibility of coinfection at a global level. However, among the continental datasets, significant P-values (< 0.05) were observed only for the European and North American populations: European (NSS test, P-value = 0.001) and North American (NSS and Phi (normal), P-value = 0.007 and 0.042, respectively). These significant P-values provide evidence of recombination events at the continental level, i.e. in the European and North American populations. These two populations gave the highest numbers of informative variants, 276 and 194, respectively. Furthermore, a relatively large number of regions with significant evidence of recombination was observed in these two continents. Moreover, plausible evidence of recombination in the North American and European datasets was ascertained using algorithms available from the RDP4 software: recombination events in the European population were confirmed with significant P-values by the MaxChi2, Chimaera, and 3Seq algorithms, while possible recombination events in the North American population were validated by the MaxChi2 and 3Seq algorithms. In contrast, the African, Oceanic, South American, and Asian datasets showed no recombination in the early spread of SARS-CoV-2 in the respective continents.

Results of NSS, MaxChi2, Phi (permutation), and Phi (normal) tests using the pairwise homoplasy index test available from PhiPack software on the combined dataset of all the 2352 samples. Significant P-values suggest the possibility of coinfection on a global level.

European (NSS test, P-value of 0.001) and North American (NSS and Phi (normal), P-value of 0.007 and 0.042, respectively) datasets show evidence for the presence of recombination events, while African, Oceanic, South American, and Asian datasets show no recombination in the early spread of SARS-CoV-2 in the respective continents.

Analysis performed using RDP, GENECONV, MaxChi2, Chimaera, and 3Seq algorithms. NS = No significant P-value is observed for the recombinant event using the respective method. * = The actual breakpoint position is undetermined (it was most likely overprinted by a subsequent recombination event). ~ = It is possible that this apparent recombination signal could have been caused by an evolutionary process other than recombination.
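The intuition behind homoplasy-based recombination tests can be shown with the classical four-gamete check, used here as a much simpler stand-in and not as the PHI, NSS, or MaxChi2 statistics computed by PhiPack. For two biallelic sites, observing all four allele combinations cannot be explained by a single mutation per site on one tree, so the pair is "incompatible"; as the table footnote above notes, such a signal can also arise from processes other than recombination, such as recurrent mutation. The function name, the haplotypes, and the positions are hypothetical.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes, positions):
    """Return the pairs of positions at which all four gametes are observed.

    haplotypes: equal-length strings of '0'/'1', one per genome.
    positions: coordinate of each column in the haplotype strings.
    """
    incompatible = []
    for i, j in combinations(range(len(positions)), 2):
        gametes = {(hap[i], hap[j]) for hap in haplotypes}
        if len(gametes) == 4:
            incompatible.append((positions[i], positions[j]))
    return incompatible


# Toy haplotypes over three sites at made-up positions 100, 200, and 300.
toy_haplotypes = ["000", "010", "100", "110", "001"]
print(four_gamete_pairs(toy_haplotypes, [100, 200, 300]))
```

A permutation-based test such as PHI goes further by asking whether incompatible sites are closer together along the genome than expected by chance, which is what yields the P-values reported above.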

Estimation of LD and haplotype blocks in continental samples

Haplotype block analysis was carried out using two approaches. First, because of the small sample sizes of the African (n = 25), Oceanian (n = 69), and South American (n = 24) datasets, we compared haplotype blocks obtained from the pooled dataset of all variants with MAF ≥ 0.5%: we initially identified LD blocks in the combined dataset and then examined LD among the same variants in each continental dataset. In the second approach, we estimated haplotype blocks only within the continental datasets with large sample sizes, namely Asia, Europe, and North America. The first analysis, using the combined dataset, showed that the LD block varies among continental datasets, as seen in the extent of LD variation when the haplotype block of the combined pool is mapped to each continental dataset. Examination of the variants in the haplotype blocks suggested a clear variation in allele frequency between continental datasets, as shown by the MAF of the 18 variants in the haplotype block of the combined data in each continental dataset. These differences called for the second approach of estimating haplotype blocks and the extent of LD between variants directly from the individual continental datasets with large sample sizes. Surprisingly, we observed different sets of variants in the haplotype blocks, different haplotype block lengths, and differences in the nonsynonymous composition of the haplotype blocks among the three continental datasets; the characteristics of the haplotype blocks observed in the datasets from Asia, Europe, and North America are tabulated. The correlation between LD and physical distance suggested that strong LD was present throughout the genome in North American sequences compared to sequences from other continents. The South American and European sequences showed strong LD towards the upstream region of the ORF1a gene, while the North American and Oceanic sequences showed strong LD from the S protein region to the end of the genome.

Estimation of haplotype blocks in continental samples.

Haplotype block estimation and the extent of linkage disequilibrium observed between variants in Asia, Europe, and North America identified a single block with different lengths in Asia and Europe, while in North America two blocks were identified.

Correlation between LD and physical distance across continental SARS-CoV-2 genomes.

(A) Upward peaks show strong LD and downward peaks show weak LD. North America and Oceania showed strong LD in the region overlapping the S protein. Strong LD among variants suggested that these de novo mutations were not broken by recombination events. (B) Smooth lines showed clear differences in LD over physical distance across continental datasets. (C) The genomic structure of SARS-CoV-2 is depicted.

Display of minor allele frequency for each variant in different continents, the functional consequence of these variants, and their corresponding genes. (SNV: single nucleotide variant.)

Characteristics of haplotype blocks estimated from Asian, European, and North American datasets. Nonsynonymous variants are shown in bold font.

Structural analysis of SARS-CoV-2 mutations

D614G mutation

Wrapp et al. recently solved the cryo-EM structure of the S protein at 3.5 Å resolution [28]. The S protein has many flexible loop regions that were not visible in the structure, including the RRAR cleavage site. Therefore, we modelled in the cleavage site and the undetected flexible regions using the cryo-EM structure PDB ID: 6VSB as a scaffold [29]. The overlay of the S protein from PDB ID: 6VSB with the modelled S protein gave an RMSD of 0.25 Å. As shown in the overlay, there were more loops present in the modelled structure that were undetectable by cryo-EM. The RRAR cleavage site presents a highly accessible surface region, where the host protease enzyme can readily cleave the S protein [28]. As such, any mutation on the S protein close to the RRAR furin protease cleavage site might alter its activity. Therefore, the D614G mutation is believed to increase SARS-CoV-2 virulence [9,33]. One possibility is that the change from a negatively charged aspartate to a non-polar glycine may modify the structure and therefore the function of the protein. Charged amino acids form ionic and hydrogen bonds (H-bonds) through their side chains and stabilize proteins [34]. The targeted aspartate is present in a loop region; therefore, a mutation to glycine would cause unfolding of the loop and possibly render it more flexible, making the furin cleavage site more accessible.

Fig 6

3D modelling of SARS-CoV-2 Spike protein.

(A) Trimeric structure of the SARS-CoV-2 spike protein (PDB ID: 6VSB). (B) Overlay of the SARS-CoV-2 spike protein (PDB ID: 6VSB, blue) with the modelled SARS-CoV-2 S protein (magenta). (C) The surface of the modelled S protein with the RRAR furin cleavage site (blue).

D614 is in close vicinity to T859 of the adjacent monomer’s S2 (Chain B); thus, they can form an H-bond through both sidechains. In addition, backbone H-bonds can be formed with A646 of the same chain. It was documented that S2 domains alter their structure after furin site cleavage [35]. Therefore, the mutation of D614 to G might weaken the stability of S2 and make cell entry more aggressive. It is probably the loss of the H-bond between residue 614 (S1/Chain A) and T859 (S2/Chain B) that stops the hinging of the S2 domain, making it more flexible in the transition state when interacting with the host cell receptor. A thermodynamic analysis showed that the D614G mutation slightly destabilized the protein, with a ΔΔG of -0.086 kcal/mol, and increased the vibrational entropy, with a ΔΔSVib of 0.137 kcal·mol⁻¹·K⁻¹, as seen in the 3D model of the G614 mutation, where the red parts indicate more flexibility. Since this mutation will occur on the trimeric structure of the S protein, all three domains will be more flexible. Such flexibility will render the furin cleavage site more accessible, which is consistent with the virulence of the D614G mutation. Furthermore, Wrapp et al. suggested that the flexibility observed in the receptor binding domain (RBD) region may facilitate the binding of ACE2 to the S protein [28].

Fig 8

3D modelling of G614 mutation.

(A) S protein monomer 6VSB with the D614G mutation; the red region depicts the part of the protein made more flexible by the D614G mutation, with a decrease in stability (ΔΔG: -0.086 kcal/mol) and an increase in vibrational entropy (ΔΔSVib: 0.137 kcal·mol⁻¹·K⁻¹). (B) A zoomed-in structure of the N-terminal domain (NTD) and the G614 mutation in close vicinity to the RRAR furin cleavage site.

3D modelling of SARS-CoV-2 Spike protein showing suggested bonds for D614.

(A) Suggested hydrogen bonds (red dashed lines) between D614 (S1 domain, chain A) and T859 (S2 domain, chain B), and between D614 and A646 of the S1 domain of chain A. (B) The suggested hydrogen bond can be disrupted by the D614G mutation, altering the activity of the protein.

P314L/P323L mutation

ORF1a and ORF1b produce a set of non-structural proteins (nsp) which assemble to facilitate viral replication and transcription (nsp7, nsp8, and nsp12) [36]. The nucleoside triphosphate (NTP) entry site and the nascent RNA strand exit paths have positively charged amino acids, are solvent accessible, and are conserved in betacoronaviruses [37]. The P314L mutation (position P323 on the protein structure PDB ID: 6M71 because of a frame shift, and written as P323L hereafter) is positioned on the interface domain of the RdRp (nsp12), between residues A250 and R365. Previous studies have shown that the interface domain has functional significance in the RdRp of Flavivirus. In addition, when polar or charged residue mutations were introduced into these sites, viral replication levels were significantly affected [38]. Thus, mutations on nsp12 interface residues may affect the polymerase activity and RNA replication of SARS-CoV-2. Proline is often found in very tight turns in protein structures and can also function to introduce kinks into α-helices. We investigated the proposed intermolecular bonds that P323 can make, where the backbone COO- group of proline can form H-bonds with the backbone NH groups of T324 and S325 or the OH- group of the S325 side chain. The pyrrolidine ring, on the other hand, forms hydrophobic interactions with W268 and F275. The mutation to leucine tightens the structure and reduces the flexibility, with an increase in ΔΔG of 0.717 kcal/mol and a decrease in vibrational entropy (ΔΔSVib ENCoM: -0.301 kcal·mol⁻¹·K⁻¹). Furthermore, leucine possesses a non-polar side chain that is seldom involved in catalysis but can play a role in substrate recognition, such as the binding or recognition of hydrophobic ligands. The L323 backbone COO- forms an H-bond with the sidechain OH- group of S325, in addition to hydrophobic interactions with W268 and L270. Because L270 is positioned on top of the flexible loop region, the hydrophobic interaction it forms would displace any water molecules entering the looped region and thereby make it more compact. The overall improved stability of RdRp can make it more efficient in RNA replication and hence increase SARS-CoV-2 replication.
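The way the two reported thermodynamic quantities are read in this section can be summarised in a short sketch. It follows the sign convention used in the text (negative ΔΔG described as destabilizing, positive ΔΔSVib described as increased flexibility) and applies it to the values quoted above for D614G and P323L. The function `interpret` and its labels are hypothetical conveniences, not part of the DynaMut server.

```python
def interpret(ddg, dds_vib):
    """Classify a mutation from its predicted ddG (kcal/mol) and ddS_vib.

    Convention assumed from the text: ddG > 0 is stabilizing, ddG < 0 is
    destabilizing; ddS_vib > 0 means more flexible, ddS_vib < 0 more rigid.
    """
    if ddg > 0:
        stability = "stabilizing"
    elif ddg < 0:
        stability = "destabilizing"
    else:
        stability = "neutral"
    if dds_vib > 0:
        flexibility = "more flexible"
    elif dds_vib < 0:
        flexibility = "more rigid"
    else:
        flexibility = "unchanged"
    return stability, flexibility


# Values quoted in the text: (ddG in kcal/mol, ddS_vib in kcal/mol/K).
reported = {
    "Spike D614G": (-0.086, 0.137),
    "Nsp12 P323L": (0.717, -0.301),
}
for mutation, (ddg, dds_vib) in reported.items():
    print(mutation, interpret(ddg, dds_vib))
```

The magnitudes matter as well as the signs: the D614G ΔΔG is close to zero, which is why the text describes that mutation as only slightly destabilizing.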

Fig 10

3D modelling of P323L mutation.

(A) Suggested bonding network of P323 where the COO- group might form H-bonds with the backbone NH group of T324 and S325 and the side chain of S325. The grey dashed lines depict the hydrophobic interactions between P323 and W268 and F275. (B) The mutated L323 forms a H-bond with the side chain of S325 and forms a hydrophobic interaction with L270, which is at the curve of the loop making that region more compact.

SARS-CoV-2 RNA-dependent RNA polymerase structure in complex with nsp7 and two nsp8.

The viral RNA template and NTP entry sites are indicated by black arrowheads. The active site is a large groove with several structural pockets. (A) Wild-type RdRp complex; P323 is shown in pink. (B) The L323 mutation is shown in green. RdRp: RNA-dependent RNA polymerase.

3D depiction of the less relaxed loop caused by L323 mutation.

(A) 3D structure of the RNA-dependent RNA polymerase, where the blue region depicts a more rigid structure due to the P323L mutation, with an increase in stability (ΔΔG: 0.717 kcal/mol) and a decrease in vibrational entropy (ΔΔSVib ENCoM: -0.301 kcal·mol⁻¹·K⁻¹). (B) A zoomed-in structure showing the less flexible loop region caused by the tight hydrophobic interactions between L323 and the hydrophobic moieties.

Discussion

Our pairwise homoplasy index tests suggest that, among continental datasets, European and North American sequences show evidence for the presence of recombination events (P-value = 0.001 and 0.007, respectively), while African, Oceanic, South American, and Asian datasets show no recombination events. The recombination events in the European population were further confirmed with significant P-values by the MaxChi2, Chimaera, and 3Seq tools, while possible recombination events in the North American population were confirmed by the MaxChi2 and 3Seq tools. This indicates once more that European and North American populations are at higher risk of simultaneous coinfection with different SARS-CoV-2 variants. Recombination might also lead to the deletion of large portions of RNA, such as the one reported by Holland et al., where an 81-nucleotide deletion was detected in SARS-CoV-2 ORF7a [39]. Similar deletions might decrease viral fitness and affect the trajectory of the COVID-19 pandemic. Depending on the variance seen within the clustered samples, PCA indicated that clustering in Europe and North America is probably associated with a founder effect where a single mutation was introduced and subsequently transmitted; such founder mutations are 23403A>G in Europe and 28144T>C in North America. Admixture analysis identified differing numbers of clusters of viral strains in different continents: Europe with six clusters, South America with five, North America with four, and Africa and Asia with two each. Both strong and weak LD blocks were seen among clusters of strains in every continent; the four dominant strains in Europe have significant proportions of strong and weak LD signatures in them. Proportions of strains carrying missense variants over synonymous variants differ among continents: five clusters of missense variants were dominant in Europe and three in North America, while two clusters of missense variants were dominant in Asia, Africa, and South America. 
Regarding recombination patterns, the European and North American datasets showed evidence for the presence of recombination events among SARS-CoV-2 genomes, indicating continuous evolution among SARS-CoV-2 viral strains. Admixture analyses have shown seven different strains with differential segregation of alleles in SARS-CoV-2 isolates. Upon restricting the analysis to variants with strong LD, the proportional assignment did not change in Africa and Asia, whereas it changed in Europe, North America, Oceania, and South America. In fact, proportions in Europe (C2 and C3) and North America (C2 and C4) increased markedly, suggesting that strong LD sites were present in more than one strain in each of these two continents. The presence of an un-admixed block pattern of strong LD between strains suggests that LD sites were not broken by the significant recombination seen in these two continents. This is due either to their physical distance or to natural selection. In contrast, weak LD sites have shown clear admixture between strains. Strikingly, each continent is dominated by a different set of nonsynonymous clusters: Africa by C4, Asia by C1, Europe by C2, North America by C3, Oceania by C5, and South America by C2. This is also evident from the allele frequency variation seen in each continent. Further, estimation of continental-wise haplotype blocks enabled us to identify variations in linked nonsynonymous and synonymous sites in Asia, Europe, and North America. Although selection primarily acts on variation that causes an amino acid change, many synonymous variants were observed in haplotype blocks. This suggests that these synonymous sites hitch-hiked along with nonsynonymous variants due to their physical proximity. Another interesting feature we observed was the variation in the number of nonsynonymous sites between Asia, Europe, and North America. 
The Asian haplotype block carried three nonsynonymous variants, the European haplotype block carried ten nonsynonymous variants, and the North American haplotype block carried seven nonsynonymous variants. This suggests that the initial strain, which originated and travelled from Asia, had fewer functional sites, whereas coinfection-led recombination in Europe and North America enriched functional sites in strains. The structural analysis of the two European strain mutations, D614G located in the spike gene and P314L located in the RdRp gene, indicated that the former mutation may render the furin cleavage site more accessible while the latter would increase protein stability. Around 73% of the European samples have both mutations segregating together, while in Africa only 11% of the viral samples harbor these mutations. The European strain carries additional mutations, notably the hotspot mutations R203K and G204R, which cluster in a serine-rich linker region of the nucleocapsid (N) protein. It was suggested that these mutations might potentially enhance RNA binding and replication and may alter the response to serine phosphorylation events [40], which might further exacerbate SARS-CoV-2 virulence. We understand that other confounding factors such as SARS-CoV-2 testing, socioeconomic status, the availability of suitable medical services, and the burden of other diseases are important contributors to the disparities seen in mortality rates around the world. Nonetheless, it is imperative that more functional studies are conducted to delineate the impact of these variants on SARS-CoV-2 transmissibility, diagnostics, vaccines, and therapeutics. Finally, our data highlight the urgent need to correlate patients’ medical/infection history with viral variants to anticipate the effect of these strains on the COVID-19 pandemic.

Principal Component Analysis for the 2352 SARS-CoV-2 sequences distributed by their collection month.

Principal Component Analysis (PCA) plot based on the collection month (January, February, March) for the 2352 SARS-CoV-2 sequences extracted from GISAID data; PC1 (EV = 9.7030), PC2 (EV = 8.84814). The red circle indicates the founder strain that was sequenced in January. (PDF)

Trend of CV error in the raw dataset and in variants filtered for MAF ≥ 0.5%.

The CV error for the MAF ≥ 0.5% dataset, comprising 72 variants, is shown. K = 7 is the best fit: the CV error was consistent between the raw set of variants and the set of variants with MAF ≥ 0.5% at K = 7, so seven was selected as the optimum number of clusters; the inconsistency observed in the raw dataset from K = 8 onward may result from variants with MAF < 0.5%. This suggests that seven different SARS-CoV-2 strains existed during the early transmission of SARS-CoV-2 across continents. (CV, cross-validation procedure.) (PDF)

Linkage disequilibrium (LD) variation in haplotype block of combined dataset in each continental dataset.

Extent of LD variation observed in each continental dataset when the haplotype block comprising the set of 18 variants identified in the combined dataset was mapped to the continental datasets. (PDF)

Alignments used for the recombination tests.

(ALN)

3 Mar 2021

PONE-D-20-33456
SARS-CoV-2: Proof of recombination between strains and emergence of possibly more virulent ones
PLOS ONE

Dear Dr. Hammad,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

The manuscript was reviewed by two experts. One of the reviewers recommended a minor revision, while the other recommended rejecting the manuscript. Having read the manuscript myself, I would ask you to take into account all comments which the reviewers made when revising your manuscript. Reviewer 2 made detailed suggestions to improve the presentation of recombination analyses.

Please submit your revised manuscript by Apr 17 2021 11:59PM.

Kind regards,
Houssam Attoui, PharmD, PhD
Academic Editor
PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

1. Is the manuscript technically sound, and do the data support the conclusions?
Reviewer #1: Yes
Reviewer #2: No

2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: No

3. Have the authors made all data underlying the findings in their manuscript fully available?
Reviewer #1: Yes
Reviewer #2: No

4. Is the manuscript presented in an intelligible fashion and written in standard English?
Reviewer #1: Yes
Reviewer #2: Yes

5. Review Comments to the Author

Reviewer #1: Hammad and co-workers analyzed early spread SARS-CoV-2 using sequences from GISAID. The work is important and compelling. They should make a few changes:

Title: “SARS-CoV-2: Proof of recombination between strains and emergence of possibly more virulent ones” should be modified. Suggest: SARS-CoV-2: Possible recombination and emergence of potentially more virulent strains. The analysis does not prove recombination.

The authors should specify the gene for each mutation, at least the first time it is used. For example: Spike_D614G and Nsp12_P314L.

Specific changes:
Line 34: P314L, which is a RdRp mutation, likely does not “enhance viral entry and stability.”
Line 58: Change Beta-coronaviruses to betacoronaviruses.
Line 220: “cleavage site (Fig 5 C) shows a high surface accessible region, where the viral protein can attach to the host protein.” This should be clarified. The host protein I believe they are discussing is furin (or a furin-like protease). It is not clear that this is attachment per se, which is usually reserved for the ACE2 interaction.
Line 226 and elsewhere: change FURIN to furin.

Reviewer #2: The authors claimed that they had proof of recombination between SARS-CoV-2 strains and, therefore, of the emergence of more virulent ones. The preliminary analysis of recombination is not followed by an in-depth analysis of this process but is instead filled with other analyses that obscure the manuscript's main message.

Regarding the analysis of recombination with PhiPack, which implements the NSS, MaxChi, and Phi tests, the authors said they found evidence of recombination for the European and American datasets. The evidence of recombination in the European dataset rests on a single statistical test, with the rest of the tests largely rejecting the hypothesis. Recombination should be supported by more than one method (see Posada and Crandall 2001; doi:10.1073/pnas.241370698).

For the North America dataset, the recombination signal should be analysed more deeply. First, how many permutations did you run? At least 100 permutations should be done. It can be seen that MaxChi2 and Phi (with unknown permutations, data not given) almost reject the recombination hypothesis. From Bruen et al.: "Max χ2 and NSS falsely infer the presence of recombination under a simple model of mutation rate correlation" (doi:10.1534/genetics.105.048975). Thus, the authors need to corroborate these results further. For example, they can try the program 'profile' implemented in the PhiPack package to see, with a window approach, which regions exhibit the strongest evidence of mosaicism, and analyse them. Which variants would be affected by recombination? Are those variants the ones further analysed in this work?

Other methods can also be included. For instance, the authors determine the linkage disequilibrium (LD) but did not correlate it with the genetic distance, which is the test for recombination. As the authors do not provide enough evidence of recombination, the message of this work should be changed. Finally, the alignments used to test recombination should be made accessible in case the authors prove it.

6. Do you want your identity to be public for this peer review?
Reviewer #1: No
Reviewer #2: Yes: Beatriz Beamud

13 Apr 2021

Dear Dr. Attoui,

We would like to sincerely thank you and the reviewers for your valuable time and useful contributions. We appreciate your input, which helped improve our manuscript. As suggested, we considered all comments that the reviewers made during the revision process, especially to improve the presentation of our recombination analyses. Attached with the revised submission, you will find our point-by-point response to the referees and the changes we made based on the reviewers’ comments.

Submitted filename: Response to Reviewers.docx

26 Apr 2021

SARS-CoV-2: Possible recombination and emergence of potentially more virulent strains.
PONE-D-20-33456R1

Dear Dr. Hammad,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Kind regards,
Houssam Attoui, PharmD, PhD
Academic Editor
PLOS ONE

17 May 2021

PONE-D-20-33456R1
SARS-CoV-2: Possible recombination and emergence of potentially more virulent strains

Dear Dr. Thanaraj:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of Dr. Houssam Attoui
Academic Editor
PLOS ONE

Table 1

Evidence of recombination in the sequences from each continental dataset, and the genomic regions showing evidence of recombination as identified by the ‘Profile’ program.

Dataset	Number of informative variants	Regions showing significant (p<0.05) evidence of recombination	Tests to detect evidence of recombination	Significance of observed Phi statistics
Africa (n = 25)	19	18800–18900, 22450–23425, 29300–29325	NSS	1
			MaxChi²	0.978
			Phi (permutation)	1
			Phi (normal)	1
Asia (n = 364)	127	4600–4825, 5200–5825, 6425–6500, 6900–7375, 8800–8925, 9175–9750, 9950–10150, 21000–21050, 22650–23625	NSS	0.113
			MaxChi²	0
			Phi (permutation)	0.493
			Phi (normal)	0.305
Europe (n = 1132)	276	500–700, 2100–2125, 3300–3325, 4050–4725, 5150–5450, 7825–7850, 12925–14075, 18825–18850, 19375–19400, 20075–20475, 22975–23000, 23575–23625, 25275–25325, 26275–26325, 29200–29325	NSS	0.001
			MaxChi²	0.208
			Phi (permutation)	0.343
			Phi (normal)	0.223
North America (n = 738)	194	800–1100, 5575–6300, 6475–6525, 7300–7450, 18800–18875, 19950–19975, 21525–22625, 23500–23600, 24275–24675, 24975–25000, 25725–26050, 26375–26425, 27600–27800, 29300–29325	NSS	0.007
			MaxChi²	0.061
			Phi (permutation)	0.060
			Phi (normal)	0.042
South America (n = 24)	24	None	NSS	0.596
			MaxChi²	0.680
			Phi (permutation)	0.717
			Phi (normal)	0.454
Oceania (n = 69)	50	22625–23425	NSS	1
			MaxChi²	0.502
			Phi (permutation)	0.872
			Phi (normal)	0.361
Combined (n = 2352)	554	1625–1650, 3300–3525, 3975–4225, 4625–4725, 5175–5375, 7550–7800, 8800–8925, 9175–9750, 9925–10050, 10600–12125, 13025–14075, 18175–18550, 18850–19150, 20075–20475, 23000–23025, 23475–23525, 24700–24750, 27050–28225	NSS	0.003
			MaxChi²	0.001
			Phi (permutation)	0.008
			Phi (normal)	0.015

Results of the NSS, MaxChi², Phi (permutation), and Phi (normal) tests, run with the pairwise homoplasy index test software PhiPack, on each continental dataset and on the combined dataset of all 2352 samples. Significant P-values in the combined dataset suggest the possibility of coinfection on a global level. The European (NSS test, P = 0.001) and North American (NSS and Phi (normal) tests, P = 0.007 and 0.042, respectively) datasets show evidence of recombination events, while the African, Oceanian, South American, and Asian datasets show no recombination in the early spread of SARS-CoV-2 in the respective continents.
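One way to read Table 1 is to count, for each dataset, how many of the four tests fall below the 0.05 threshold used in the table header. The Python sketch below does this with the P-values transcribed from Table 1. The helper name `significant_tests` and the dictionary layout are illustrative choices, and counting agreeing tests is only a reading aid, not a procedure described by the authors.

```python
ALPHA = 0.05  # significance threshold quoted in the Table 1 header

# P-values transcribed from Table 1 for each dataset and test.
TABLE1 = {
    "Africa": {"NSS": 1.0, "MaxChi2": 0.978,
               "Phi (permutation)": 1.0, "Phi (normal)": 1.0},
    "Asia": {"NSS": 0.113, "MaxChi2": 0.0,
             "Phi (permutation)": 0.493, "Phi (normal)": 0.305},
    "Europe": {"NSS": 0.001, "MaxChi2": 0.208,
               "Phi (permutation)": 0.343, "Phi (normal)": 0.223},
    "North America": {"NSS": 0.007, "MaxChi2": 0.061,
                      "Phi (permutation)": 0.060, "Phi (normal)": 0.042},
    "South America": {"NSS": 0.596, "MaxChi2": 0.680,
                      "Phi (permutation)": 0.717, "Phi (normal)": 0.454},
    "Oceania": {"NSS": 1.0, "MaxChi2": 0.502,
                "Phi (permutation)": 0.872, "Phi (normal)": 0.361},
    "Combined": {"NSS": 0.003, "MaxChi2": 0.001,
                 "Phi (permutation)": 0.008, "Phi (normal)": 0.015},
}


def significant_tests(pvalues, alpha=ALPHA):
    """Return the names of the tests whose P-value is below alpha."""
    return [name for name, p in pvalues.items() if p < alpha]


# Map each dataset to the list of tests that reach significance.
summary = {dataset: significant_tests(p) for dataset, p in TABLE1.items()}

for dataset, tests in summary.items():
    print(f"{dataset}: {len(tests)} of 4 tests significant {tests}")
```

Under this tally, only the combined dataset is significant in all four tests, North America in two, and Europe and Asia in one each, which makes the strength of support behind each continental signal easy to compare.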

Table 2

Plausible recombination events validated by RDP4 suite.

Continental datasets	Event	Start	End	RDP	GENECONV	MaxChi²	Chimera	3Seq
North American	1~	530*	29326	NS	NS	3.84E-03	NS	1.14E-04
	2~	226*	29535	NS	NS	1.05E-02	NS	2.54E-04
	3~	175*	26046	NS	NS	1.02E-02	NS	4.83E-03
European	1~	820*	24825*	NS	NS	9.72E-05	2.12E-03	4.47E-04
Combined	1~	1519*	29004	NS	NS	1.09E-03	NS	1.07E-04
	2~	29575*	29835*	NS	4.98E-05	NS	NS	2.56E-02

Analysis performed using RDP, GENECONV, MaxChi2, Chimera, and 3Seq algorithms.

NS = No significant p-value is observed for the recombinant event using respective method.

* = The actual breakpoint position is undetermined (it was most likely overprinted by a subsequent recombination event).

~ = It is possible that this apparent recombination signal could have been caused by an evolutionary process other than recombination.

Table 3

MAF distribution of 18 variants involved in haplotype block of combined dataset in each continental data.

SNV	MAF (Africa)	MAF (Asia)	MAF (Europe)	MAF (North America)	MAF (South America)	MAF (Oceania)	MAF (Combined)	Functional consequence	Gene
241CT	0.08	0.0811	0.2507	0.3231	0.4583	0.1905	0.49	downstream	5’-UTR
1059CT	0.24	0.019	0.1435	0.1865	0	0.0289	0.1313	nonsynonymous	ORF1a
3037CT	0.08	0.0760	0.2502	0.313	0.4583	0.1884	0.4804	synonymous	ORF1a
8782CT	0.04	0.2304	0.0424	0.4197	0.2917	0.1884	0.2484	synonymous	ORF1a
11083GT	0.04	0.269	0.1339	0.0531	0.125	0.4928	0.1416	nonsynonymous	ORF1a
14408CT	0.08	0.0760	0.25	0.3148	0.4583	0.1884	0.481	nonsynonymous	ORF1b
14805CT	0.08	0.0047	0.1366	0.0224	0.3333	0.1159	0.0787	synonymous	ORF1b
17747CT	0	0	0.0106	0.4616	0	0.0579	0.174	nonsynonymous	ORF1b
17858AG	0	0	0.0097	0.4418	0	0.0579	0.1798	nonsynonymous	ORF1b
18060CT	0	0.0166	0.0088	0.4382	0	0.0579	0.1829	synonymous	ORF1b
23403AG	0.08	0.0783	0.25	0.3121	0.4583	0.1884	0.4808	nonsynonymous	S
25563GT	0.32	0.0213	0.1851	0.2262	0	0.0434	0.1647	nonsynonymous	ORF3a
26144GT	0.04	0.1119	0.1293	0.0211	0.125	0.1594	0.0923	nonsynonymous	ORF3a
27046CT	0	0	0.1114	0.0013	0.125	0.0144	0.0538	nonsynonymous	M
28144TC	0.04	0.2304	0.0407	0.4185	0.2917	0.1884	0.2483	nonsynonymous	ORF8
28881GA	0	0.0381	0.2396	0.0423	0.375	0.0869	0.1378	nonsynonymous	N
28882GA	0	0.0381	0.2396	0.0410	0.375	0.0869	0.1374	synonymous	N
28883GC	0	0.0381	0.2396	0.0410	0.375	0.0869	0.1374	nonsynonymous	N

Minor allele frequency of each variant in the different continents, the functional consequence of each variant, and the corresponding gene (SNV, single nucleotide variant).

Table 4

Characteristics of haplotype blocks estimated from three continental datasets.

Dataset	Haplotype block start	Haplotype block end	Length (in kb)	Number of variants	Number of nonsynonymous variants	Variant	MAF
Asia	3037	23403	20.367	5	3	3037CT	0.076
						8782CT	0.23
						11083GT	0.269
						14408CT	0.076
						23403AG	0.078
Europe	241	28883	28.643	17	10	241CT	0.25
						1059CT	0.143
						1440GA	0.052
						3037CT	0.25
						11083GT	0.134
						14408CT	0.25
						14805CT	0.136
						15324CT	0.062
						17247TC	0.0689
						20268AG	0.0734
						23403AG	0.25
						25563GT	0.185
						26144GT	0.129
						27046CT	0.111
						28881GA	0.239
						28882GA	0.239
						28883GC	0.239
North America	241	8782	8.54	4	1	241CT	0.323
						1059CT	0.186
						3037CT	0.313
						8782CT	0.419
	14408	28144	13.737	7	6	14408CT	0.3148
						17747CT	0.462
						17858AG	0.442
						18060CT	0.438
						23403AG	0.312
						25563GT	0.226
						28144TC	0.418

Characteristics of haplotype blocks estimated from Asian, European, and North American datasets. Nonsynonymous variants are shown with bold font.
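The block lengths in Table 4 appear to count both endpoints inclusively, that is, length = end - start + 1. The short Python check below recomputes each length from the tabulated start and end positions. The function name `block_length_kb` and the 0.005 kb tolerance are illustrative assumptions, chosen to allow for rounding in the table.

```python
# (dataset, block start, block end, length in kb as printed in Table 4)
BLOCKS = [
    ("Asia", 3037, 23403, 20.367),
    ("Europe", 241, 28883, 28.643),
    ("North America (block 1)", 241, 8782, 8.54),
    ("North America (block 2)", 14408, 28144, 13.737),
]


def block_length_kb(start: int, end: int) -> float:
    """Inclusive length of a haplotype block in kilobases."""
    if end < start:
        raise ValueError("block end must not precede block start")
    return (end - start + 1) / 1000


for name, start, end, reported in BLOCKS:
    computed = block_length_kb(start, end)
    # Allow a small tolerance because the table rounds some values.
    agrees = abs(computed - reported) < 0.005
    print(f"{name}: computed {computed:.3f} kb, reported {reported} kb, agrees={agrees}")
```

The same inclusive convention reproduces all four tabulated lengths within rounding, so the European block (241 to 28883) spans nearly the whole genome, whereas the North American signal is split into two shorter blocks.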
