SARS-CoV-2 exhibits intra-host genomic plasticity and low-frequency polymorphic quasispecies.

Timokratis Karamitros1, Gethsimani Papadopoulou2, Maria Bousali2, Anastasios Mexias2, Sotirios Tsiodras3, Andreas Mentis4.   

Abstract

In December 2019, an outbreak of atypical pneumonia (coronavirus disease 2019, COVID-19) associated with a novel coronavirus (SARS-CoV-2) was reported in Wuhan city, Hubei province, China. The outbreak was traced to a seafood wholesale market and human-to-human transmission was confirmed. The rapid spread and the death toll of the new epidemic warrant immediate intervention. The intra-host genomic variability of SARS-CoV-2 plays a pivotal role in the development of effective antiviral agents and vaccines, as well as in the design of accurate diagnostics. We analyzed NGS data derived from clinical samples of three Chinese patients infected with SARS-CoV-2, in order to identify small- and large-scale intra-host variations in the viral genome. We identified tens of low- or higher-frequency single nucleotide variations (SNVs) with variable density across the viral genome, affecting 7 out of 10 protein-coding viral genes. The majority of these SNVs (72/104) corresponded to missense changes. Annotation of the identified SNVs, as well as of all currently circulating strain variations, revealed colocalization of intra-host and strain-specific SNVs with primers and probes currently used in molecular diagnostics assays. Moreover, we assembled the viral genome de novo, in order to isolate and validate intra-host structural variations and recombination breakpoints. The bioinformatics analysis disclosed genomic rearrangements over poly-A / poly-U regions located in the ORF1ab and spike (S) genes, including a potential recombination hot-spot within the S gene. Our results highlight the intra-host genomic diversity and plasticity of SARS-CoV-2, pointing out genomic regions that are prone to alterations.
The isolated SNVs and genomic rearrangements reflect the intra-patient capacity of the polymorphic quasispecies, which may arise rapidly during the outbreak, allowing immunological escape of the virus, offering resistance to anti-viral drugs and affecting the sensitivity of the molecular diagnostics assays.
Copyright © 2020 Elsevier B.V. All rights reserved.

Keywords:  COVID-19 epidemic; Genomic rearrangements; Intra-host variability; Molecular diagnostics; Quasispecies; SARS-CoV-2

Year:  2020        PMID: 32818852      PMCID: PMC7418792          DOI: 10.1016/j.jcv.2020.104585

Source DB:  PubMed          Journal:  J Clin Virol        ISSN: 1386-6532            Impact factor:   3.168


Introduction

Coronaviruses (CoVs), considered to be the largest group of viruses, belong to the Nidovirales order, Coronaviridae family and Coronavirinae subfamily, which is further subdivided into four genera: the alpha- and betacoronaviruses, which infect mammalian species, and the gamma- and deltacoronaviruses, which infect mainly birds [1,2]. Small mammals (mice, dogs, cats) serve as reservoirs for Human Coronaviruses (HCoVs), with significant diversity seen in bats, which are considered to be primordial hosts of HCoVs [3]. Until 2002, minor consideration was given to HCoVs, as they were associated with mild-to-severe disease phenotypes in immunocompetent people [3,4,5]. In 2002, the severe acute respiratory syndrome (SARS) outbreak began [6]. In 2005, after the discovery of SARS-CoV-related viruses in horseshoe bats (Rhinolophus), palm civets were suggested as intermediate hosts, and bats as primordial hosts of the virus [6,7]. In 2012, the emerging Middle East respiratory syndrome coronavirus (MERS-CoV) caused an outbreak in Saudi Arabia, which affected both camels and humans, with a high mortality rate of approximately 34.3 % among humans [8]. MERS-CoV has zoonotic origins [9] and was transmitted to humans through direct contact with dromedary camels or indirect contact with contaminated meat or milk [10]. On December 31st, 2019, a novel Coronavirus (SARS-CoV-2) was first reported from the city of Wuhan, Hubei province in China, causing severe infection of the respiratory tract in humans, after the identification of a group of similar cases of patients with pneumonia of unknown etiology [11]. Similarly to SARS, epidemiological links between the majority of COVID-19 cases and Huanan South China Seafood Market, a live-animal market, have been reported. A total of 76,775 confirmed cases of “Coronavirus Disease 2019” (COVID-19) were reported up to February 21st 2020, of whom 2247 died and 18,855 recovered.
Notably, 75,447 of the confirmed cases were reported in China [12]. The size of the ssRNA genome of SARS-CoV-2 is 29,891 nucleotides; it encodes 9860 amino acids and is characterized by nucleotide identity of ∼ 89 % with bat SARS-related (SL) CoV-ZXC21 and bat-SL-CoVZC45. However, when compared to HCoVs, SARS-CoV-2 showed genetic similarity of ∼ 80 % with human SARS-CoVs BJ01 2003 and Tor2 [13] and 50 % with MERS-CoV [14,15]. CoVs are enveloped positive-sense RNA viruses, characterized by a very large non-segmented genome (26–32 kb length), ready to be translated [2,4]. The gene arrangement on the SARS-CoV-2 genome is: 5′UTR -replicase (ORF1/ab) -Spike (S) -ORF3a -Envelope (E) -Membrane (M) -ORF6 -ORF7a -ORF8 -Nucleocapsid (N) -ORF10 -3′UTR [13]. SARS-CoV-2 encodes proteins that are very similar in length to those of bat-SL-CoVZC45 and bat-SL-CoVZXC21. The SARS-CoV-2 S protein, however, is longer than those encoded by SARS-CoV and MERS-CoV [15]. At inter-host level, adaptive mutations are essential for newly emerging viruses in order to increase replication and facilitate onward transmission in the new hosts [16]. Particularly for MERS-CoV, SARS-CoV and SARS-CoV-2, the genetic diversity and frequent recombination events lead to the periodical emergence of new viruses capable of infecting a wide range of hosts [17]. Intra-host variability in viral infections emerges from genomic phenomena taking place during error-prone replication, resulting in multiple circulating quasispecies of low or higher frequency [18]. These variants, in combination with the genetic profile of the host, can potentially influence the natural history of the infection, the viral phenotype, but also the sensitivity of molecular and serological diagnostics assays [19,20].
In the case of flu epidemics for example, de novo arising mutations and intra-host diversity not only shape the intra-host evolution of Influenza A, but also greatly affect the pathogenesis of the virus [21,22,23]. Indeed, it is suggested that SARS-CoV-2 genomic variants that emerge from inter- and intra-host evolution might be associated with susceptibility to SARS-CoV-2 infection and the severity of COVID-19 [24]. Viruses have developed multiple adaptive strategies to counteract the host immunological response, which are subject to inter- and intra-host selection pressures; “selfish” strategies confer a selective advantage in a particular quasispecies, impair the immune response inside the infected cell and evolve by intra-host selection, while neutral or “unselfish” defence strategies impair the immune response outside the infected cell and evolve by inter-host selection, preferentially in viruses with low mutation rates [25]. The SARS-CoV-2 mutation rate is moderate and similar to that of other RNA viruses (0.00084 per site per year) [26], but still generally higher compared to DNA viruses [27]. Moreover, most of the suggested immune escape mechanisms of SARS-CoV-2 involve intra-cellular interactions [28], and are thus expected to evolve by intra-host selective pressure. These observations highlight the importance of SARS-CoV-2 intra-host variability in the frame of viral evolution and host-pathogen interactions. Intra-host genomic variability also leads to antigenic variability, which is of higher importance, especially for pathogens that fail to elicit long-lasting immunity in their hosts, and remains a major contributor to the complexity of vaccine design [29,30]. To date, there are no clinically approved vaccines available for protection of the general population from SARS- and MERS-CoV infections, as there is no effective vaccine to induce robust cell-mediated and humoral immune responses [31,32].
Here, we explore intra-host genomic variants and low-frequency polymorphic quasispecies in Next Generation Sequencing (NGS) data derived from patients infected by SARS-CoV-2. Intra-host genomic variability is critical for the development of novel drugs and vaccines, which are of urgent necessity, towards the containment of the pandemic.

Materials and methods

In this study, NGS data derived from three Chinese patients (oral swabs) infected by SARS-CoV-2 were analysed (SRA projects PRJNA601736 and PRJNA603194). All datasets available in SRA up to February 20th, 2020 were analysed. Two patients (SRR10903401 and SRR10903402/PRJNA601736), 39 and 21 years old respectively, experienced unusual pneumonia. Despite his anti-viral treatment, patient 1 experienced more severe symptoms. The two patients were admitted to the hospital on 25th and 22nd December 2019 and were discharged in stable condition on 12th and 11th January 2020, respectively [33]. The third patient, a 41-year-old male (SRR10971381/PRJNA603194), presented acute onset of common COVID-19 symptoms. A combinatory antiviral therapy was administered to the patient. However, he exhibited respiratory failure and was admitted to the intensive care unit. Six days after his admission, he was transferred to another hospital in Wuhan for further treatment [34]. Detailed clinical metadata of the patients are presented in the Supplementary Material. The raw read data were aligned on the complete (29,891 bp) SARS-CoV-2 reference sequence (GenBank accession no. MN975262.1, isolate 2019-nCoV_HKU-SZ-005b_2020) using bowtie2 v2.3.0 [35], after quality check with FastQC v0.11.5 [36]. The resulting alignments were visualized with the Integrative Genomics Viewer (IGV) v2.3.60 [37]. After removing PCR duplicates, SNVs were called with a Bonferroni-corrected P-value threshold of 0.05 using samtools v1.7 (htslib1.7.2) [38] and LoFreq v2.1.5. LoFreq is a very accurate SNV caller especially designed for viral and bacterial genomes; its performance depends on the sequencing depth and the quality of the NGS reads. For the datasets analyzed in this study (average read depth 133.5x – 598.2x) and based on the assessed read quality (> Q30 = 88.2–92.7 %), LoFreq has calling sensitivity = ∼1 % and PPV = 100 [39].
Variants supported by absolute read concordance (>98 %) were filtered out from intra-host variant frequency calculations. Four SNVs from sample SRR10903402 and 3 SNVs from sample SRR10971381 with statistically significant strand bias (P-value < 0.05) were also excluded from further analyses. Variations were annotated to the reference genome using snpEff v4.3p [40]; SNV effects were further filtered with snpSift v4.3p [41], and the average mutation rate per gene across the viral genome was estimated using R scripts (v3.6.2) in RStudio v1.1.456. The colocalization of the intra-host SNVs, and of population-level SNPs retrieved from www.GISAID.org on February 18th 2020, with primer and probe coordinates was also examined, to identify potential interferences with all currently available molecular diagnostic assays [42]. The impact of these SNVs on the binding affinity of primers and probes to their genomic targets was predicted using FastPCR 3.3.28 [43] and the DINAMelt webserver [44]. To investigate intra-host genomic rearrangements, de novo assembly of the SARS-CoV-2 genomes was performed using Spades v3.13.1 [45]. Spades outperforms most modern de novo assemblers in terms of viral genome retrieval and coverage, presenting the highest sensitivity (99.48 %) [46]. The resulting contigs were analyzed with BLAST v2.6.0 [47] and confirmed by remapping of the raw reads, setting a threshold of 5 non-duplicated reads for contigs suggesting rearrangements. Smaller contigs (<200 bp) were elongated where possible, after pair-wise realignment of the corresponding mapped reads. Basic computations and visualizations were implemented in the R programming language v3.6.2, using in-house scripts. The secondary structures of the genomic regions surrounding the recombination breakpoints were predicted using the RNAfold webserver [48].
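The two exclusion rules described above (consensus-level variants with >98 % read concordance, and variants with significant strand bias at P < 0.05) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study, which relied on LoFreq, samtools and R; the `Snv` record, its field names and the example read counts are hypothetical and serve only to show the filtering logic.

```python
from dataclasses import dataclass


@dataclass
class Snv:
    """Hypothetical record for one called variant (field names are illustrative)."""
    position: int
    alt_reads: int        # reads supporting the variant allele
    depth: int            # total reads covering the position
    strand_bias_p: float  # p-value of a strand-bias test, however it was derived

    @property
    def frequency(self) -> float:
        return self.alt_reads / self.depth


def keep_intra_host(snvs, consensus_cutoff=0.98, strand_bias_alpha=0.05):
    """Drop consensus-level variants and variants with significant strand bias."""
    kept = []
    for snv in snvs:
        if snv.frequency > consensus_cutoff:
            continue  # fixed difference from the reference, not an intra-host variant
        if snv.strand_bias_p < strand_bias_alpha:
            continue  # likely a strand-specific sequencing artifact
        kept.append(snv)
    return kept


# Made-up example calls, not data from the study.
calls = [
    Snv(position=100, alt_reads=12, depth=300, strand_bias_p=0.40),   # 4 %: kept
    Snv(position=200, alt_reads=297, depth=300, strand_bias_p=0.90),  # 99 %: consensus-level
    Snv(position=300, alt_reads=15, depth=300, strand_bias_p=0.01),   # strand-biased
]
kept_positions = [snv.position for snv in keep_intra_host(calls)]
print(kept_positions)
```

Keeping the two cut-offs as parameters makes it easy to test how sensitive the final variant list is to either threshold.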

Results

The mapping assembly of the viral genome was almost complete for all samples. The genome coverage and the average read depth across the genome were 100.0 % and 133.5x for sample SRR10903401, 100.0 % and 522.5x for sample SRR10903402, and 99.9 % and 598.2x for sample SRR10971381, respectively (Table 1).
Table 1

NGS read alignment and genome coverage metrics.

| Metric | SRR10903401 | SRR10903402 | SRR10971381 |
|---|---|---|---|
| Paired Reads, N (%): Total Number | 476,632 (100) | 676,694 (100) | 28,282,964 (100) |
| Paired Reads, N (%): Aligned | 13,913 (2.94) | 54,723 (8.18) | 62,288 (0.22) |
| Paired Reads, N (%): Concordantly Aligned | 11,469 (2.40) | 44,176 (6.52) | 59,261 (0.21) |
| Paired Reads, N (%): Discordantly Aligned | 2444 (0.53) | 10,547 (1.67) | 3027 (0.01) |
| Single Mates, N (%): Aligned | 244 (0.03) | 1308 (0.11) | 294 (0.001) |
| Overall Alignment Rate (%) | 2.94 | 8.18 | 0.22 |
| Quality score > Q30 (%) | 92.7 | 92.1 | 88.2 |
| Genome Coverage (%) | 100.0 | 100.0 | 99.9 |
| Average read depth (X) | 133.5 | 522.2 | 598.2 |
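The overall alignment rates in Table 1 appear consistent with counting individual mates rather than read pairs, i.e. twice the aligned pairs plus the singly aligned mates, divided by twice the total pairs. The text does not state the formula, so the following Python sketch is an assumption about how these percentages can be reproduced from the tabulated counts; the function name is illustrative.

```python
def overall_alignment_rate(total_pairs: int, aligned_pairs: int, single_mates: int) -> float:
    """Percentage of mates that aligned, assuming each pair contributes two mates."""
    aligned_mates = 2 * aligned_pairs + single_mates
    return 100 * aligned_mates / (2 * total_pairs)


# (total pairs, aligned pairs, singly aligned mates) as listed in Table 1
table1_counts = {
    "SRR10903401": (476_632, 13_913, 244),
    "SRR10903402": (676_694, 54_723, 1_308),
    "SRR10971381": (28_282_964, 62_288, 294),
}

rates = {
    sample: round(overall_alignment_rate(*counts), 2)
    for sample, counts in table1_counts.items()
}
print(rates)
```

Under this assumption the three computed rates round to the values reported in the "Overall Alignment Rate" row.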
In all samples, the same 5 SNVs, isolated with 98–100 % read concordance and thus in total divergence from the reference genome (MN975262.1), were excluded from downstream analysis. For sample SRR10903401, 34 lower-frequency SNVs were isolated in total. Of these, 33 were present with frequencies ranging between 2 and 15 %, while only one was present in 40 % of the intra-host viral population. The sequencing depth, which is also evaluated during the SNV calling by the LoFreq algorithm, ranged between 39x and 290x at the corresponding SNV positions. The sequencing depth of sample SRR10903402 at the polymorphic positions was higher (103x – 1137x), allowing the isolation of 55 SNVs with frequencies distributed between 0.9 % and 14 %. The depth over the polymorphic positions of sample SRR10971381 was between 159x and 1872x, allowing the isolation of 10 intra-host SNVs, with frequencies of 1.1 %–6.8 % (Fig. 1.A, Suppl. Table 1).
Fig. 1

Intra – host SNVs: (A) Intra host SNV frequency vs sequencing read depth (X coverage) in the corresponding alignment position. (B) Venn diagram representing unique and common SNVs isolated from the three patients (C) Boxplot of intra-host SNVs frequency vs. SNV type – synonymous, missense, nonsense (stop gained) (low, moderate and high impact respectively). Average values are in red rhombs. (D) Intra-host SNVs frequency vs. all seven genes affected (ORF1ab, S, ORF3a, ORF6, ORF7a, ORF8, N). Average values are in red rhombs. (E) Density histogram of intra-host SNVs isolated from all patients (total number of SNVs / 100 bp - blue bars) and average sequencing read depth (X coverage – green line), across the SARS-CoV-2 genome map (genes in orange, 5′ and 3′ untranslated regions in light blue). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).

Intra-host variants were distributed across 7 out of the 10 protein-coding genes of the viral genome, namely ORF1ab, S, ORF3a, ORF6, ORF7a, ORF8 and N. After normalising for the gene length (variants/kb-gene-length, “v/kbgl”), the density of the SNVs for each gene was estimated (Table 2). The majority of the SNVs corresponded to missense changes (leading to an amino-acid change) rather than synonymous changes (cumulatively 72 vs. 29 respectively, ratio 2.48:1) (Table 2), while the average number of missense changes was marginally significantly higher compared to synonymous changes (23.3 vs. 8.0 respectively, Wilcoxon rank sum test, p = 0.054). The average intra-host variant frequency did not differ significantly either between missense and synonymous polymorphisms (Wilcoxon rank sum test, p > 0.05) (Fig. 1.C), or between their hosting genes (pairwise Wilcoxon rank sum tests, p > 0.05) (Fig. 1.D). We did not detect any small-scale insertions or deletions in the samples (Suppl. Table 1).
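Table 2

Impact of Intra-host SNVs on viral genes.

| Gene | Low impact (synonymous), N | Moderate impact (missense), N | High impact (stop gained), N | Total, N (v/kbgl)* |
|---|---|---|---|---|
| ORF1ab | 19 | 53 | 2 | 74 (3.47) |
| S | 6 | 9 | 1 | 16 (4.18) |
| ORF3a | 0 | 1 | 0 | 1 (1.20) |
| E | 0 | 0 | 0 | 0 (0) |
| M | 0 | 0 | 0 | 0 (0) |
| ORF6 | 2 | 1 | 0 | 3 (16.21) |
| ORF7a | 0 | 1 | 0 | 1 (2.73) |
| ORF8 | 0 | 3 | 0 | 3 (8.21) |
| N | 2 | 4 | 0 | 6 (4.76) |
| ORF10 | 0 | 0 | 0 | 0 (0) |
| Total, N | 29 | 72 | 3 | |

* normalised variants per 1 kb gene length (variants / gene-length *1000).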
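The normalisation used in Table 2 (variants / gene-length * 1000) and the missense-to-synonymous ratio can be reproduced with a few lines of Python. The variant counts below are taken from Table 2; the gene lengths are not given in the text and are assumed here from commonly annotated coding-sequence lengths, so the computed densities may differ marginally from the tabulated ones.

```python
def variants_per_kb(n_variants: int, gene_length_nt: int) -> float:
    """Variant density normalised to 1 kb of gene length (v/kbgl)."""
    return n_variants / gene_length_nt * 1000


# Totals from the last row of Table 2.
missense_total = 72
synonymous_total = 29
ratio = missense_total / synonymous_total

# Assumed gene lengths in nucleotides (not stated in the text).
assumed_gene_length_nt = {"S": 3822, "N": 1260}
total_variants = {"S": 16, "N": 6}  # from Table 2

density = {
    gene: variants_per_kb(total_variants[gene], assumed_gene_length_nt[gene])
    for gene in total_variants
}

print(f"missense:synonymous = {ratio:.2f}:1")
for gene, value in density.items():
    print(f"{gene}: {value:.2f} v/kbgl")
```

Normalising by length matters here because ORF1ab carries the most variants in absolute terms, yet short genes such as ORF6 show the highest density.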

The comparison of all SNVs (intra-host and population level) with the genomic targets of the molecular diagnostics assays revealed colocalization of 3 intra-host SNVs and 2 isolate-specific SNVs with primers and probes currently in use in the RdRP_SARSr, HKU-N, 2019-nCoV-N1 and 2019-nCoV-N2 diagnostic reactions (Fig. 2). The thermodynamic assessment of these SNVs revealed a variable impact on the binding affinity of the corresponding primers and probes to the mutated genomic region (Suppl. Table 2).
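Conceptually, the colocalization check is an interval lookup: an SNV interferes with an assay if its position falls within the coordinates of a primer or probe. The Python sketch below illustrates this under stated assumptions. The oligo names and coordinates are placeholders, not the real assay coordinates; positions 15,474 and 28,291 are taken from the Fig. 2 legend, and position 20,000 is a made-up control.

```python
from typing import NamedTuple


class Oligo(NamedTuple):
    """Primer or probe target on the reference genome (1-based, inclusive)."""
    name: str
    start: int
    end: int


def colocalized(snv_positions, oligos):
    """Return (position, oligo name) for every SNV lying inside an oligo target."""
    hits = []
    for pos in snv_positions:
        for oligo in oligos:
            if oligo.start <= pos <= oligo.end:
                hits.append((pos, oligo.name))
    return hits


# Placeholder coordinates for illustration only; not the real assay coordinates.
oligos = [
    Oligo("hypothetical_probe", 15_460, 15_485),
    Oligo("hypothetical_forward_primer", 28_280, 28_300),
]
snv_positions = [15_474, 28_291, 20_000]

hits = colocalized(snv_positions, oligos)
print(hits)
```

For a genome of roughly 30 kb and a few dozen oligos, this nested loop is sufficient; an interval tree would only be needed at much larger scale.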
Fig. 2

Truncated map of SARS-CoV-2 genome illustrating a subset of intra-host (blue lines) and globally collected, isolate-specific SNVs (orange lines) with respect to the genomic targets of molecular diagnostics assays (red arrows – primers, red bars - probes). Three intra-host variants (orange triangles), and two strain specific variants (Wuhan/IVD-HB-04/2020 and Chongqing/YC01/2020 - red triangles), are colocalized with the RdRP_SARSr probe (15,474 T > G), the 2019-nCoV_N1 forward primer (28,291 C > T), the HKU-N reverse primer (28,971 A > G) and the 2019-nCoV-N2 probe (29,188 T > C and 29,200 C > T). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).

The de novo assembly of the viral genomes was almost complete for samples SRR10903401 and SRR10903402, covering 99.7 % of the genome with 4 overlapping contigs and 99.5 % of the genome with a single contig, respectively. The de novo assembly of sample SRR10971381 was complete, with one contig covering 100 % of the genome. Alternative contigs revealed intra-host genomic rearrangements (Fig. 3, Table 3). For samples SRR10903401 and SRR10903402, these large-scale structural events were systematically observed over poly-A / poly-U-rich genomic regions, located in the ORF1ab and S genes. All rearrangements were validated by remapping of the raw reads on the corresponding de novo assembled contigs, setting a threshold of at least 5 supporting reads of high mapping quality (>40) in each case. For sample SRR10903401, three inversions/misassemblies in ORF1ab (Suppl. Fig. 1) and one inversion/misassembly in the S gene (Fig. 4-A) were isolated. Notably, we were able to validate the same inversion in the S gene for sample SRR10903402 as well (Fig. 4-B). Apart from 2 inversions in ORF1ab supported by only 2 reads each (not passing the validation threshold), there were no further large-scale intra-host events observed for sample SRR10903402.
Similarly, one inversion/misassembly supported by only one read was identified in sample SRR10971381. The alignment coordinates of all rearrangement-supporting contigs with respect to the reference strain are presented in Table 3.
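The validation criterion described above (at least 5 non-duplicated supporting reads of mapping quality >40) can be sketched in Python as follows. The `Read` record and the example reads are hypothetical; in practice the values would be extracted from the remapped alignments.

```python
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical summary of one read remapped onto a candidate contig."""
    name: str
    mapq: int
    is_duplicate: bool


def passes_validation(reads, min_reads: int = 5, min_mapq: int = 40) -> bool:
    """True if enough distinct, non-duplicated, high-MAPQ reads support the contig."""
    supporting = {
        read.name
        for read in reads
        if not read.is_duplicate and read.mapq > min_mapq
    }
    return len(supporting) >= min_reads


# Made-up examples: five good reads plus one duplicate, versus only two reads.
well_supported = [Read(f"r{i}", 42, False) for i in range(5)] + [Read("r5", 42, True)]
weakly_supported = [Read("a", 42, False), Read("b", 42, False)]

print(passes_validation(well_supported))    # five reads remain after filtering
print(passes_validation(weakly_supported))  # only two reads: below the threshold
```

Counting distinct read names rather than alignment records avoids counting the same read twice if it appears more than once in the input.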
Fig. 3

Alignment of the de novo assembled contigs on the genomic map (bottom). Concordantly aligned contigs (correct or gapped) are in green, while discordantly aligned contigs are in red. Sequencing read depth (X coverage) across the genome (blue histograms) and relative % GC content (green line) is presented for each sample. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).

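Table 3

Alignment characteristics of de novo assembled contigs.

| Contig Name | Contig Length | Reference* start | Reference* end | Contig start | Contig end | Alignment Identity (%) | Alignment Type | Average Read Depth (x) | QC Pass# |
|---|---|---|---|---|---|---|---|---|---|
| SRR10903401 (99.7 % coverage) | | | | | | | | | |
| Contig 1 | 23,994 | 75 | 24,068 | 23,994 | 1 | 99.99 | Correct | 57.01 | + |
| Contig 2 | 5681 | 24,246 | 29,891 | 1 | 5646 | 99.96 | Correct | 71.40 | + |
| Contig 3 | 331 | 23,992 | 24,322 | 331 | 1 | 100 | Correct | 164.39 | + |
| Contig 4 | 179 | 24,221 | 24,399 | 179 | 1 | 100 | Correct | 97.56 | + |
| Contig 5 | 192 | 17,816 | 17,909 | 94 | 1 | 100 | Inversion | 7.22 | + |
| | | 17,933 | 18,030 | 95 | 192 | 100 | Correct | | |
| Contig 6 | 181 | 18,052 | 18,152 | 101 | 1 | 100 | Relocation, Inconsistency | 8.12 | + |
| | | 17,766 | 17,845 | 102 | 181 | 100 | Misassembly | | |
| Contig 7 | 169 | 1707 | 1765 | 62 | 4 | 100 | Inversion | 7.62 | + |
| | | 1815 | 1903 | 63 | 151 | 97.75 | Correct | | |
| Contig 8 | 165 | 23,992 | 24,087 | 96 | 1 | 100 | Inversion | 18.04 | + |
| | | 23,963 | 24,031 | 97 | 165 | 100 | Misassembly | | |
| SRR10903402 (99.5 % coverage) | | | | | | | | | |
| Contig 1 | 29,842 | 133 | 29,891 | 29,842 | 84 | 99.98 | Correct | 234.32 | + |
| Contig 2 | 242 | 2075 | 2139 | 178 | 242 | 100 | Partial | 1.09 | |
| Contig 3 | 242 | 21,577 | 21,629 | 242 | 190 | 100 | Partial | 1.06 | |
| Contig 4 | 173 | 23,992 | 24,090 | 102 | 4 | 100 | Inversion | 39.30 | + |
| | | 23,963 | 24,033 | 103 | 173 | 100 | Misassembly | | |
| SRR10971381 (100.0 % coverage) | | | | | | | | | |
| Contig 1 | 29,902 | 1 | 29,891 | 29,897 | 7 | 99.98 | Correct | 267.59 | + |
| Contig 2 | 241 | 516 | 559 | 163 | 120 | 100 | Inversion | 1.00 | |
| | | 472 | 501 | 119 | 90 | 100 | Misassembly | | |

* Corresponding to reference MN975262 coordinates.

# Contig supported by at least 5 non-duplicated reads of mapping quality >40.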

Fig. 4

Recombination events in S gene. Samples (A) SRR10903401 and (B) SRR10903402. Alignments of the de novo assembled contigs with respect to the reference genome (MN 975262). Donor – acceptor palindrome sequences are indicated in green bars. Raw, non-duplicated NGS reads, validating the recombination event, are represented below the corresponding contig. (C): Prediction of the secondary structure of the genomic region spanning the rearrangement breakpoint (100 bases upstream and 100 bases downstream). The corresponding donor- acceptor sequences, exposed in internal loops, are indicated in green bars. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).


Discussion

The rapid spread and the death toll of the new SARS-CoV-2 epidemic warrant the immediate identification and development of effective antiviral agents and vaccines, as well as the design of accurate diagnostics. Intra- and inter-patient variability affects the compatibility of molecular diagnostics and also impairs the effectiveness of vaccines and serological assays by altering the antigenicity of the virus. All patients whose samples were analysed in this study were probably infected by the same viral strain, since the samples shared the same set of consensus SNVs. However, apart from 3 intra-host SNVs that were common between SRR10903401 and SRR10903402, there was no other overlap observed between the low-frequency variants of each sample (Fig. 1-B). This indicates that these variations have occurred in a rather random fashion and are not subject to selective pressures, which is also supported by the fact that missense mutations were consistently more numerous than synonymous mutations [49]. On the other hand, missense substitutions are more common in loci involving pathogen resistance, indicating positive selection [50]. The analysed viral RNA might have originated from functional/packed virions, but also from unpacked viral genomes, unable to replicate and infect other host cells. Even if a viral genome is unable to replicate independently, its abundant presence in the pool of viral quasispecies implies some functionality regarding intra-host evolution and adaptation. For example, defective viral genomes might affect infection dynamics such as viral persistence, as well as the natural history of the infection [51,52]. At the same time, these variants may arise rapidly during an outbreak and can be used for tracking the transmission chains and the spatiotemporal characteristics of the epidemic [53,54,55].
More studies based on genomic datasets accompanied by clinical metadata are needed, in order to accurately define associations between intra-host SARS-CoV-2 genomic variants and the progression and clinical outcome of COVID-19. SNVs and quasispecies observed at low frequency could represent viral variations of low impact on the functionality of the genome. Bal et al. suggest that the development of quasispecies may promote viral evolution; however, high depth of coverage is essential for the study of intra-host adaptation [56]. The abundance of low-frequency variations is largely affected by the population size and the epidemic characteristics. For example, a neutral substitution in a region that represents a primer target for a molecular diagnostic assay can drift to fixation rather quickly in a rapidly spreading virus, jeopardizing the sensitivity of the assay [57,58]. Here, we highlight three intra-host and two fixed variants that are colocalized with primers or probes of real-time PCR diagnostics assays that are currently in use (Fig. 2). Since the binding affinity of these oligos to their genomic targets (Suppl. Table 2) is directly linked to the performance of the corresponding diagnostic assays, the community should pay extra attention to the evaluation of these potentially emerging variations and be alerted in case redesigning of these oligos is needed. As is well documented, recombination events lead to substantial changes in the genetic diversity of RNA viruses [49,59]. In CoVs, discontinuous RNA synthesis is commonly observed, resulting in high frequencies of homologous recombination [60], which can be up to 25 % across the entire CoV genome [61]. For pathogenic HCoVs such as HCoV-OC43 [62], HCoV-NL63 [63], SARS-CoV [64,62] and MERS-CoV [65], genomic rearrangements are frequently reported during the course of epidemic outbreaks.
We have isolated intra-host genomic rearrangements, located in poly-A and poly-U enriched palindrome regions across the SARS-CoV-2 genome (Fig. 4). We conclude that these rearrangements do not represent artifacts derived from the NGS library preparation (e.g. PCR crosstalk artifacts), especially since none of the supporting reads were duplicates and, in some cases, they differed in polymorphic positions (Suppl. Fig. 1). Recombination processes involving the S gene in particular have been reported for SARS- and SARS-like CoV but also for HCoV-OC43. In the case of the sister species HCoV-NL63 and HCoV-229E, recombination breakpoints are located near the 3′- and 5′-ends of the gene [1,65]. S is a trimeric protein, which is cleaved into two subunits, the globular N-terminal S1 and the C-terminal S2 [66]. Our analysis revealed that, similarly to other genomic regions, the S1 subunit hosts many low-frequency SNVs, characterized by higher density compared to the rest of the S gene sequence (Fig. 1-E). The S2 subunit is highly conserved [13] and contains two fusion peptides (FP, IFP) [66]. In the S gene, the same rearrangement event has taken place in two of the samples analyzed in this study, located at nt 24,000, which corresponds to the ∼200 nt linking region between FP and IFP (aa 812-813). This observation highlights a potential recombination hot-spot. Examining closely the secondary structure of the RNA genome around the breakpoints, we suggest a model where the palindromes 5′-UGGUUUU-3′ and 5′-AAAACCAA-3′ have served as donor-acceptor sequences during the recombination event, since they are both exposed in the single-stranded internal loops formed in a highly structured RNA pseudoknot (Fig. 4-C). The RB domain of the S protein has been tested as a potential immunogen, as it contains neutralization epitopes which appear to have a role in the induction of neutralizing antibodies [31].
It should be mentioned, though, that the S protein of SARS-CoV is the most divergent in all strains infecting humans [67], as variations arise rapidly in both the C- and N-terminal domains, allowing immunological escape [68]. Our findings support that, apart from these variations, the N-terminal region also hosts a recombination hot-spot, which, together with the rest of the observed rearrangements, indicates the genomic instability of SARS-CoV-2 over poly-A and poly-U regions.
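The relationship between the two proposed donor-acceptor motifs, 5′-UGGUUUU-3′ and 5′-AAAACCAA-3′, can be checked with a short Python sketch: the reverse complement of the first motif matches the first seven bases of the second. The helper below is illustrative and uses standard Watson-Crick pairing only; which motif is labelled donor and which acceptor is an arbitrary choice here, and the check says nothing about the predicted secondary structure itself.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


motif_a = "UGGUUUU"   # 5'-UGGUUUU-3'
motif_b = "AAAACCAA"  # 5'-AAAACCAA-3'

rc_a = reverse_complement(motif_a)
print(rc_a)                      # AAAACCA
print(motif_b.startswith(rc_a))  # True
```

The seven-base reverse complementarity is what would allow the two motifs to pair with each other, in line with the template-switching model suggested in the text.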

Ethics approval and consent to participate

Not applicable.

CRediT authorship contribution statement

Timokratis Karamitros: Conceptualization, Data curation, Formal analysis, Methodology, Supervision, Validation, Visualization, Writing - original draft, Writing - review & editing. Gethsimani Papadopoulou: Data curation, Formal analysis, Writing - original draft, Writing - review & editing. Maria Bousali: Visualization, Writing - review & editing. Anastasios Mexias: Writing - original draft, Writing - review & editing. Sotirios Tsiodras: Writing - original draft, Writing - review & editing. Andreas Mentis: Writing - original draft, Writing - review & editing.

Declaration of Competing Interest

The authors report no declarations of interest. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.