Literature DB >> 36075217

Remdesivir-induced emergence of SARS-CoV2 variants in patients with prolonged infection.

Andreas Heyer¹, Thomas Günther², Alexis Robitaille², Marc Lütgehetmann³, Marylyn M Addo⁴, Dominik Jarczak⁵, Stefan Kluge⁵, Martin Aepfelbacher³, Julian Schulze Zur Wiesch⁶, Nicole Fischer⁷, Adam Grundhoff⁸.

Abstract

We here investigate the impact of antiviral treatments such as remdesivir on intra-host genomic diversity and emergence of SARS-CoV2 variants in patients with a prolonged course of infection. Sequencing and variant analysis performed in 112 longitudinal respiratory samples from 14 SARS-CoV2-infected patients with severe disease progression show that major frequency variants do not generally arise during prolonged infection. However, remdesivir treatment can increase intra-host genomic diversity and result in the emergence of novel major variant species harboring fixed mutations. This is particularly evident in a patient with B cell depletion who rapidly developed mutations in the RNA-dependent RNA polymerase gene following remdesivir treatment. Remdesivir treatment-associated emergence of novel variants is of great interest in light of current treatment guidelines for hospitalized patients suffering from severe SARS-CoV2 disease, as well as the potential use of remdesivir to preventively treat non-hospitalized patients at high risk for severe disease progression.

Entities: Chemical

Keywords: SARS-CoV-2 variants; escape mutations; intra-host evolution; prolonged SARS-CoV-2 infection; remdesivir treatment

Mesh：

Substances：

Year: 2022 PMID： 36075217 PMCID： PMC9378267 DOI： 10.1016/j.xcrm.2022.100735

Source DB: PubMed Journal: Cell Rep Med ISSN： 2666-3791

Introduction

Both cell-mediated and humoral immunity are required to control SARS-CoV2 infection.1, 2, 3, 4, 5 Recently, several case reports have reported prolonged respiratory tract-associated or systemic infection in patients with pre-existing conditions, sometimes with positive viral detection over several months.1, 2, 3, 4, 5, 6, 7, 8, 9 Prolonged infection was primarily detected in immunodeficient patients suffering from conditions such as chronic lymphocytic leukemia, follicular lymphoma, or B cell immunodeficiency. Prolonged infections in such patients have been repeatedly discussed as major contributors to viral evolution. It is thought that decreased immune restriction may result in a broad increase of viral intra-host diversity, thus favoring the emergence of novel variants, especially if antiviral treatments such as remdesivir or convalescent plasma exert selection pressure for the acquisition of escape mutations. , Most recently, the emergence of the Omicron variant of concern (VOC) has refueled interest in the mechanisms contributing to variant emergence. In addition to patients suffering from primary or secondary immune deficiencies such as HIV-1, prolonged infections are mainly observed in patients on B cell depleting therapies or in patients on CAR T cell therapy who have failed to clear the virus efficiently.4, 5, 6, 7, 8 , In these patients, the presence of novel mutations was observed particularly in the orf1ab and spike region, as well as deletions in the spike region and in orf7 and orf8, which are thought to have an immunoregulatory function. While there are also a few reports of prolonged infections in patients without known impairment of the humoral and cellular immune response, these are typically patients suffering from underlying conditions such as diabetes or cardiovascular disease, known risk factors for a severe COVID-19 progression. , Treatment of COVID-19 has changed throughout the pandemic with limited therapeutic options, including antiviral therapy with remdesivir, administration of convalescent plasma, or anti-inflammatory therapy through the use of dexamethasone, hydrocortisone, JAK inhibitors, and tocilizumab. While current ICU guidelines in Germany do not generally recommend remdesivir or convalescent plasma treatment, remdesivir can be given to hospitalized patients on oxygen supplementation. Similarly, NIH guidelines give a moderate recommendation for remdesivir treatment for patients who require supplemental oxygen, or remdesivir together with dexamethasone for patients who additionally require high-flow oxygen therapy or invasive ventilation. Furthermore, several agencies have recommended remdesivir administration as an option in treatment guidelines for high-risk patients who have not been hospitalized and are within the 5–7 day window following symptom onset.17, 18, 19, 20 Considering the use of antiviral treatments and increasing reports on prolonged SARS-CoV2 infection, it is of high interest to investigate whether patients with prolonged diseases in general display fundamentally increased intra-host evolution that may allow more rapid emergence of SARS-CoV2 variants, or whether specific treatment regimens foster the emergence of novel variant species.

Results

We addressed this by investigating intra-host genomic diversity in longitudinal samples from 14 patients with prolonged viral persistence (30–146 days) during severe COVID-19 disease, including immune-compromised and immune-competent patients with or without antiviral treatment to evaluate the emergence of mutations with and without selection pressure. We tested respiratory samples of patients for SARS-CoV-2 by qRT-PCR. Viral loads measured by cycle threshold (Ct) values ranged from 20 to 38, and only samples with Ct value < 33 were included in viral sequencing. Fourteen patients, in total 112 respiratory samples, with severe COVID-19 disease and prolonged detection of SARS-CoV-2 in the respiratory tract were included (Table 1 ). The infection time, defined as time with detectable SARS-CoV-2 RNA in the respiratory tract, ranged from 30 to 146 days. The mean age of the patients was 62.2 years (±12.2 years), and 46% of the patients were female. All patients suffered from underlying diseases, such as diabetes, renal insufficiency, obesity, and arterial hypertension (Table 1). Few patients showed pre-existing conditions due to chronic obstructive pulmonary disease (COPD) (P10, P16, and P17) or rheumatoid arthritis (P13). Two patients were immunosuppressed due to follicular lymphoma (P01) and liver transplantation (P14), and two patients received mild immunoregulatory treatment against rheumatic arthritis (P13, P17). All patients except one, P13, who died of multi-organ failure, were eventually discharged from the hospital. In the majority without symptoms, two patients presented symptoms such as atrial fibrillation or polyneuropathy, P07 and P08, respectively (Table 1). Patients were classified based on their treatments as individuals receiving (1) antiviral therapy (n = 2) including remdesivir (R), (2) antiviral (R) and anti-inflammatory therapy (dexamethasone) (n = 1), (3) anti-inflammatory therapy (dexamethasone, hydrocortisone, or both) (n = 10), and (4) neither antiviral nor anti-inflammatory therapy (n = 1).

Table 1

Patient data

ID	Age/gender	Infection timea	Pango lineage	Seq. interval	Comorbidities	Symptoms	Immunosuppressive medicationb	Complications	ICU	Ventilationc	Treatment (R/D/P)d	Outcome
P01	53 F	90	B.1.8	0–83 days	Follicular lymphoma (in remission)	Dyspnea, cough	Obinutuzumab (3 months prior to admission)	Pneumonia	No	–	R (d63-d72/d91-95),P (d95; d97, d99)	Discharged with no symptoms on day 138
P03	34 M	30	A	0–28 days	Diabetes, renal failure, adiposity	Cough, fever	–	–	No	LFOT	R (d9/d12)	Discharged with no symptoms on day 33
P05	61 M	65	B.1.36.1	0–65 days	Diabetes	Hypoxia with accompanying syncope	–	Severe ARDS, septic shock, acute renal failure	Yes	Invasive	D (continuously), Hydrocortisone (d4-12)	Discharged with no symptoms on day 87
P06	43 M	47	B.1.177.81	0–46 days	Pre-diabetes, adiposity	Dyspnea, cough	–	Severe ARDS, acute renal failure, septic shock	Yes	Invasive	D (d2-d18), Hydrocortisone	Transfer to rehabilitation center on day 35
P07	75 F	57	B.1.1	0–47 days	Renal failure, granulomatosis with polyangiitis, hypertension	Dyspnea, cough	Cyclophosphamide (1 month prior to admission)	Severe ARDS	Yes	LFOT/HFOT/invasive	D (d35-d41)	Discharged with atrial fibrillation and dyspnea on day 43
P08	50 F	38	B.1.1	0–24 days	Diabetes, adiposity, hypertension	Pyrexia, dyspnea, nausea, cephalalgia	–	Severe ARDS, acute renal failure, asystole with successful cardiopulmonary resuscitation	Yes	Invasive	D (before admission and d12-d16)	Discharged with Dyspnea and on day 41
P09	62 M	39	B.1.177.81	0–12 days	Diabetes, adiposity, hypertension	Pyrexia, dyspnea, fatigue	–	Severe ARDS, acute renal failure, pneumogenic sepsis	Yes	LFOT/HFOT/invasive	D (before admission and d1-d9)	Discharged with no symptoms on day 54
P10	75 M	55	A	0–24 days	COPD, adiposity, hypertension	Progressive pulmonary failure	–	Moderate ARDS, acute renal failure, sepsis, circulatory failure with successful cardiopulmonary resuscitation	Yes	Invasive	–	Discharged with no symptoms on day 56
P13	60 F	57	A	0–57 days	Rheumatoid arthritis	–	Methotrexate (weekly)Rituximab (most recent administration unknown)	Severe ARDS, acute renal failure, sepsis	Yes	LFOT/invasive	R (before admission and d9-15), D (before admission and d9 –d39)	Deceased 41 days after admission, septic shock with multi-organ failure
P14	59 M	57	B.1.1.277	0–29 days	Liver transplant recipient, hypertension	Dyspnea	Tacrolimus (on admission)Basiliximab (most recent administration unknown)	Pneumonia	Yes	LFOT/HFOT/invasive	D (d20 – d24)	Discharged with no symptoms on day 67
P15	70 F	63	AC.1	0–33 days	Diabetes, adiposity, hypertension	Dyspnea, cough	–	Severe ARDS, heparin induced thrombocytopenia	Yes	Invasive	D (d4 - d6)	Discharged with no symptoms on day 66
P16	75 M	26	B.1	0–25 days	COPD, hypertension	Pyrexia, dyspnea, fatigue, progressive pulmonary failure	–	ARDS, acute atrial fibrillation, acute renal failure, intrahepatic cholestasis	Yes	Invasive	Prednisolone	Discharged with no symptoms on day 92
P17	77 M	52	B.1.1	0–13 days	COPD, rheumatoid arthritis, hypertension, diabetes, renal failure	Dyspnea, pyrexia, diarrhea	Methotrexate (weekly)	Severe ARDS, acute renal failure, multiple septic episodes, critical illness polyneuropathy, and myopathy	Yes	LFOT/invasive	Prednisolone,Hydrocortisone continuously	Discharged with no respiratory symptoms on day 205
P18	66 F	146	A.9	0–117 days	Dementia, hypertension, adiposity, renal failure, Goodpasture syndrome	Dyspnea, altered vigilance	Single dose rituximab, cyclophosphamide (8–10 months prior to admission)	ARDS, AV block I°, sepsis, cardiac decompensation (NYHA IV)	Yes	HFOT/invasive	Hydrocortisone for 3days,Prednisolone continuously	Discharged with no symptoms on day 160

Interval between admission to the hospital and last qRT-PCR positive SARS-CoV-2 report in the respiratory tract.

Immunosuppressive medication upon admission.

LFOT: low-flow oxygen therapy; HFOT: high-flow oxygen therapy.

D: dexamethasone; R: remdesivir; P: convalescent plasma.

Patient data Interval between admission to the hospital and last qRT-PCR positive SARS-CoV-2 report in the respiratory tract. Immunosuppressive medication upon admission. LFOT: low-flow oxygen therapy; HFOT: high-flow oxygen therapy. D: dexamethasone; R: remdesivir; P: convalescent plasma. All patients had been hospitalized between April 2020 and January 2021, prior to the spread of the Alpha variant or other VOCs in the Hamburg metropolitan area. Accordingly, whole-genome SARS-CoV-2 sequencing from the respiratory samples confirmed that no SARS-CoV-2 VOC contributed to the infections at any time (Table 1 and Figures S2, S3, S4, and S5). For all patients included, the overall viral load after an initial decrease remained at a low level with Ct values of 30–33, indicating insufficient immune control to clear the infection. Except for patient P10, all patients were considered viremic due to low viral loads detectable in the blood in the first days of disease progression. However, blood viral loads were too low to perform whole viral genome sequencing and variant profiling. We investigated changes in genomic diversity by amplicon sequencing of whole SARS-CoV-2 genomes in longitudinal respiratory samples that were collected over time periods of up to 117 days (mean 43 days; see Figure 1A for an overview of overall sampling periods, as well as individual sample collection intervals for each patient). In each sample, we identified nucleotide variants (NVs) and evaluated changes in NV abundance and frequency over time. Figures 1B and 1C show a summary analysis of all patients and samples, in which the distance of each sample from the first investigated time point is represented by changes in the total count (A) or frequency (B) across all detected NVs. We furthermore employed a statistical model to identify transition events (defined as events whenever an NV transits between minor and major frequency states with abundance values below or above 50%, respectively), and we calculated p values for the hypothesis that the total number of transition events in a particular sample is higher than the expected average across all samples (see STAR Methods for calculation details and Figure S1). Figures 1D and 1E show the total number of transitions and associated p values for each sample, respectively. Heat maps with individual NV frequencies, along with corresponding major or minor calls are shown in Figure 2B for patient P01, and Figures S2 through S5 for the remaining patients.

Figure 1

Cross-comparison of NV diversity in longitudinal patient samples

(A) Overview of sequenced samples with each sample sequenced once, and sample collection time points, relative to the collection date of the first investigated sample. Consecutive longitudinal samples are shown as stacked columns marked in alternating shades of gray, with the height of each segment denoting the time span between sampling dates. Samples exhibiting NV transition counts significantly above the mean expectation value (see below) are shown in orange or red for significance levels of 0.01 or 0.001, respectively, and are additionally labeled with the corresponding sample collection time point.

(B–E) Summary analysis of patient samples. Individual samples are denoted by filled circles, and mean values for each patient are shown as short horizontal lines. (B) and (C) show the aggregated NV distance relative to the first investigated sample, as measured by either the relative number of SNVs with an estimated frequency of at least 1% (B) or the sum of frequency alterations across all SNVs (C). (D) shows the total number of transitions (i.e., events in which SNVs transition from a non-significant or minor to a major state, or vice versa) between consecutive longitudinal samples. (E) shows p values for the hypothesis that the number of transitions observed in a given sample is greater than that expected from the estimated average transition frequency across all samples (pT; see STAR Methods for calculation details). The 0.05 significance cutoff is shown as a horizontal line. Circles filled in blue, orange, or red across all panels denote samples for which this value reaches significance levels of 0.05, 0.01, or 0.001, respectively.

Figure 2

Summary of key clinical parameters, treatment regimen, and sequencing results in patient 1

(A) Temporal representation of the infection interval indicating dates when respiratory samples were collected and whether they tested positive for SARS-CoV-2, along with remdesivir treatment dates (light gray boxes) and inflammatory markers (CRP and IL-6). CRP values are shown in green and IL-6 in turquoise. Viral loads are shown for respiratory samples (green diamonds) and blood samples (gray triangles). Filled symbols depict samples subjected for sequencing.

(B) Upper panel: Heatmap depicting the frequency of nucleotide variants (NVs) in longitudinal samples from patient 1. Each longitudinal sample was sequenced once. Nucleotide variants (relative to reference sequence NC_045512.2) and potentially resulting amino acid mutations are indicated at the bottom or top of the map, respectively. Nucleotide variant C28628T only occurred in the background of C28629T, thus resulting in an A > F amino acid exchange in a fraction of C28629T mutations that alone led to an A > V exchange. The corresponding amino acid exchanges are marked with an asterisk. Variant frequency is indicated by heatmap colors, ranging from gray (reference) over yellow to dark blue, as indicated in the legend shown to the right. Ct values and treatment regimen are shown to the left. None of the investigated samples exhibited coverage levels below 10 (as labeled with a blue dot in Figures S2, S3, S4, and S5). Lower panel: Color code map indicating classification of nucleotide variants according to the observed coverage/frequency values, as further described in the STAR Methods section and illustrated in Figure S1. Classification calls corresponding to individual colors are shown in the legend to the right of the map (n.s.: not significant).

Cross-comparison of NV diversity in longitudinal patient samples (A) Overview of sequenced samples with each sample sequenced once, and sample collection time points, relative to the collection date of the first investigated sample. Consecutive longitudinal samples are shown as stacked columns marked in alternating shades of gray, with the height of each segment denoting the time span between sampling dates. Samples exhibiting NV transition counts significantly above the mean expectation value (see below) are shown in orange or red for significance levels of 0.01 or 0.001, respectively, and are additionally labeled with the corresponding sample collection time point. (B–E) Summary analysis of patient samples. Individual samples are denoted by filled circles, and mean values for each patient are shown as short horizontal lines. (B) and (C) show the aggregated NV distance relative to the first investigated sample, as measured by either the relative number of SNVs with an estimated frequency of at least 1% (B) or the sum of frequency alterations across all SNVs (C). (D) shows the total number of transitions (i.e., events in which SNVs transition from a non-significant or minor to a major state, or vice versa) between consecutive longitudinal samples. (E) shows p values for the hypothesis that the number of transitions observed in a given sample is greater than that expected from the estimated average transition frequency across all samples (pT; see STAR Methods for calculation details). The 0.05 significance cutoff is shown as a horizontal line. Circles filled in blue, orange, or red across all panels denote samples for which this value reaches significance levels of 0.05, 0.01, or 0.001, respectively. Summary of key clinical parameters, treatment regimen, and sequencing results in patient 1 (A) Temporal representation of the infection interval indicating dates when respiratory samples were collected and whether they tested positive for SARS-CoV-2, along with remdesivir treatment dates (light gray boxes) and inflammatory markers (CRP and IL-6). CRP values are shown in green and IL-6 in turquoise. Viral loads are shown for respiratory samples (green diamonds) and blood samples (gray triangles). Filled symbols depict samples subjected for sequencing. (B) Upper panel: Heatmap depicting the frequency of nucleotide variants (NVs) in longitudinal samples from patient 1. Each longitudinal sample was sequenced once. Nucleotide variants (relative to reference sequence NC_045512.2) and potentially resulting amino acid mutations are indicated at the bottom or top of the map, respectively. Nucleotide variant C28628T only occurred in the background of C28629T, thus resulting in an A > F amino acid exchange in a fraction of C28629T mutations that alone led to an A > V exchange. The corresponding amino acid exchanges are marked with an asterisk. Variant frequency is indicated by heatmap colors, ranging from gray (reference) over yellow to dark blue, as indicated in the legend shown to the right. Ct values and treatment regimen are shown to the left. None of the investigated samples exhibited coverage levels below 10 (as labeled with a blue dot in Figures S2, S3, S4, and S5). Lower panel: Color code map indicating classification of nucleotide variants according to the observed coverage/frequency values, as further described in the STAR Methods section and illustrated in Figure S1. Classification calls corresponding to individual colors are shown in the legend to the right of the map (n.s.: not significant). Most patients exhibited surprisingly stable viral populations, with a relatively low number of transition events observed even over long infection periods (Figure 1D). Statistical analysis suggested a total of four samples with a significantly higher number of transitions (Figure 1E) when compared to expected mean values. One of these was from patient P01, a remdesivir-treated patient who also exhibited the overall highest relative genomic diversity among all patients (Figures 1B and 1C). The remaining three were collected from P07 on day 38 (relative to the initial sampling date) and from patient P14 on days 20 and 23. Transitioning NVs in these samples include silent mutations in nsp13, ORF8, and the 3′-UTR in patient P07, as well as silent and non-silent nsp3 mutations in addition to deletions in the spike gene in patient 14 (see Figure S3, P07: nsp13:F422, ORF8:F120, and G29751A, and Figure S4, P14: nsp3:T725I, nsp3:N1369). Interestingly, both patients also exhibit deletions spanning amino acids 141–144 or 141–145 of the spike gene. This segment in the N-terminal region of the spike protein has been described as a recurrent deleted region, and deletions with this region have been shown to influence antigenicity. However, as with other mutations in patients receiving anti-inflammatory treatment only, these deletions did not become fixed (i.e., did not reach near-100% allelic frequency), indicating insufficient selection advantage to allow displacement of the original genotype(s). We made substantially different observations in patient P01, a patient suffering from prolonged COVID-19 disease and SARS-CoV-2 viremia that required two rounds of remdesivir treatment (the second one combined with convalescent plasma therapy) to achieve a sustained response. A detailed report of this patient’s clinical course has been published before; key parameters are recapitulated in Figure 2A. As shown in Figure 2B, longitudinal samples exhibited a total of nine newly acquired NVs that reached fixation within the investigated time period. Strikingly, six of these underwent transition at day 64, shortly after initiation of the first round of remdesivir treatment. Another two transitioned at day 67, and the ninth mutation (nsp12:V166L) registered 6 days later (day 73). All nine mutations had reached 100% allelic frequency by day 79. Except for a silent mutation in the nsp7-coding region, all mutations result in amino acid exchanges within the envelope, nucleocapsid (N), spike (S), nsp4, nsp8, nsp12, and nsp14 gene products. The two mutations at positions 166 and 536 of nsp 12, the gene encoding the catalytic subunit of the RNA-dependent RNA polymerase (RdRp), are of particular interest (Table 2 and Figures 2B, 3A, and 3B): nsp12:536:I:V resides in a region responsible for nsp8 binding, (another structural component of the active multi-protein RdRp complex), while nsp12:166:V:L maps to the nidovirus RdRp-associated nucleotidyl transferase (NiRAN) domain. Figure 3B shows a structural view of this residue’s location relative to a remdesivir molecule (distance: 19.28Å) bound at the catalytically active polymerization site of nsp12. Four additional variants in nsp3 and nsp4 (a protein contributing to the binding to the endoplasmic reticulum and subsequent formation of double-membrane vesicles involved in viral replication) were registered on day 73, but they did not become fixed and remained at frequency levels below 50% (Figure 1B, nsp4:S209P and A457V).

Table 2

Mutations arising under remdesivir treatment

Patient	Treatment	Location of samplingc	100% allele frequency aa changes	<100% allele frequency aa changes	Transient aa changes
P01	Ra: d63–d72; R: d91–95	LRT	d83nsp4:457:A:Vnsp7:S61dnsp8:20:E:Knsp12:166:V:Lnsp12:536:I:VS:50:S:LS:260:A:DS:490:F:SE:30:T:I	d83nsp3:1614:R:Knsp4:209:S:Pnsp4:272:I:Lnsp4:336:S:L	d64orf8:1614:R:KQ91dN:119:A:FN:119:A:F/VN:119:A:G
P03	R: d9; R: d12–d15	URT	–	d28NC:29,742:G:A	–
P13	R: before admission and d38–36; Db:d28–d36	LRT	–	–	d42nsp10:138:L:FE:30:T:IS:850:I:M orf8:84:L:Sorf8:92:E:K

Mutations observed in multiple patients are shown in bold.

R: remdesivir.

D: dexamethasone.

LRT: lower respiratory tract, URT: upper respiratory tract.

Silent mutation.

Figure 3

Schematic representation of the mutations identified in patient 1

(A) Overview of the location of mutations in the SARS-CoV-2 genome emerging over the infection interval 0–83 days. Open reading frames (ORFs) in which mutations emerge are shown in blue, whereas unaffected ORFs are shown in gray. Non-synonymous mutations that establish during infection are shown in black, synonymous mutations in gray, and mutations that are gradually lost during infection are shown in green.

(B) Representation of mutation 166:V:L within the nsp12 structure (PDB:7BV2). Nsp12 structural elements are depicted in green, the RNA structure in magenta, and remdesivir in pink.

Mutations arising under remdesivir treatment Mutations observed in multiple patients are shown in bold. R: remdesivir. D: dexamethasone. LRT: lower respiratory tract, URT: upper respiratory tract. Silent mutation. Schematic representation of the mutations identified in patient 1 (A) Overview of the location of mutations in the SARS-CoV-2 genome emerging over the infection interval 0–83 days. Open reading frames (ORFs) in which mutations emerge are shown in blue, whereas unaffected ORFs are shown in gray. Non-synonymous mutations that establish during infection are shown in black, synonymous mutations in gray, and mutations that are gradually lost during infection are shown in green. (B) Representation of mutation 166:V:L within the nsp12 structure (PDB:7BV2). Nsp12 structural elements are depicted in green, the RNA structure in magenta, and remdesivir in pink. Besides patient P01, two other patients had received remdesivir treatment, either together with or without anti-inflammatory drugs (P13 and P03; see Table 1 for treatment regimen). Patient P03 received single-dose remdesivir treatments on d9 and d12. We did not observe emergence of variants passing our stringency filters (Figure S2A). However, our ability to detect such variants may have been hampered by a limited number of available post-treatment samples and low viral loads (Ct 33) in the last investigated sample (d28). In patient 13, a mutation in the envelope protein (E), E:30:T:I, emerged on day 28, shortly after remdesivir treatment (Figure S2B). The mutation reached a maximum allelic frequency of 95% during the following 2 weeks but eventually was lost on day 57, a total of 28 days after cessation of antiviral treatment. A mutation in the nsp10 region (nsp10:138:L:F) showed similar patterns but generally remained below frequencies of 50%. Similar to transiently increased intra-host diversity observed in patients with anti-inflammatory treatment, emergence of these mutations coincided with an increase in viral replication, as signified by a decrease of Ct values from 32 to 24 between days 29 and 34 (Figure S2B).

Discussion

Thus far, the appearance of novel SARS-CoV-2 mutations resulting from intra-host evolution has been described mainly in case reports with small patient numbers or few longitudinal samples from immunocompromised individuals with prolonged viral infection.1, 2, 3, 4, 5, 6 , , , , Our study extends these reports by a systematic side-by-side analysis of intra-host diversity and emerging mutations in prolonged SARS-CoV-2 infection in patients with severe COVID-19 under different therapeutic regimens. We demonstrate marked viral intra-host diversification, with emerging mutations that reach frequency levels of 100%, in a patient with prolonged SARS-CoV-2 infection and antiviral remdesivir treatment. In contrast, in patients receiving exclusively anti-inflammatory treatment, we only sporadically observed the emergence of novel variants. Associated mutations often appeared only transiently, and their allelic frequency generally remained below 50%. Hence, our data suggest that prolonged infection per se is unlikely to promote the emergence of novel variant species. Our observation differs from some previous studies, for example, a study that examined two time points of SARS-CoV-2-positive individuals with mild symptoms in 13 cases with an infection interval of 14 days and generally observed a detectable increase in intra-host diversity. In addition to the increased number of serial samples included in our study, we suspect that the stringent variant calling (variants are only included in the analysis if they are present in at least two consecutive longitudinal samples, with an allelic frequency of minimally 20%) may contribute to these differences. While this method does not allow us to observe mutations that may be present at very low frequency, it efficiently eliminates noise, including sequencing errors as well as technical artifacts introduced during reverse transcription and PCR amplification. Interestingly, Lythgoe and colleagues examined intra-host diversity in temporal samples of infected individuals in households in a 20-day time window and observed little consistency in the frequency of minor variants between time points within individuals. The authors concluded that these within-host dynamics suggest inherent stochasticity associated with low viral load samples and that within-host emergence of vaccine- and therapeutic-escape mutations are likely to be relatively rare when viral loads are high. The authors furthermore suggest that potentially beneficial adaptive mutations might be present even in the absence of immunological or therapeutic selection pressure and could then become established under therapy. We tend to side with this view, especially considering the fact that our study preferentially registered changes in intra-host diversity following a low-to-high switch in viral replication levels. This is in line with previous observations and suggests that a given low-frequency variant may rapidly achieve dominance after a severe replication bottleneck is encountered. Our study strongly supports the notion that antiviral treatments such as remdesivir represent such a bottleneck. However, whether the resulting dominant species emerge in a stochastic fashion or whether they may represent escape mutants is another issue. In our study, we observed emergence of variants in nsp10 or the envelope gene, regions of the viral genome that are unlikely to be associated with a potential increase in remdesivir resistance, in one of the remdesivir-treated patients. In contrast, the pattern of mutations observed after the first round of remdesivir treatment in patient P1, and the fact that a viral rebound in this patient could only be controlled by another combinatorial remdesivir and convalescent plasma treatment suggest an at least partial escape to the initial treatment regimen. A prime candidate for a potential resistance mutation was V166L. Interestingly, while we did not reverse-engineer variant genotypes, an independent study performing an experimental remdesivir resistance screen observed selection for another mutation (V166A) at the exact same position of nsp12. We thus opine that our findings suggest a critical role for V166L in providing increased remdesivir resistance, although proof of this hypothesis will ultimately require experimental testing in appropriate in vitro systems. Taken together, our work demonstrates that, while not generally observed in long-lasting infections, novel viral intra-host species may rapidly emerge in the context of antiviral treatments such as remdesivir therapy. Differentiating between true escape and stochastically arising variants in such patients, however, represents a particular challenge that will greatly benefit from additional data. Our findings are significant in light of recent discussions regarding the use of remdesivir to treat non-hospitalized patients at high risk for disease progression, as well as potentially new antiviral therapeutics that are about to be incorporated into patient treatment strategies (e.g., protease inhibitors such as Paxlovid or RdRp-inhibiting compounds such as molnupiravir , ). Unlike remdesivir, for which patient reports and in vitro studies on resistance mutations are available, , , , , , , similar data for Paxlovid or mulnopiravir treatment regimen are still lacking. Therefore, we suggest monitoring patients with long-lasting viral infections by frequent testing of respiratory samples with whole viral genome analysis and variant determination and extending this monitoring to new directly acting therapeutics that may impose a different form of selection pressure when used in these patient populations.

Limitations of the study

The present study was performed in a relatively small number of patients with prolonged disease courses due to limited numbers of patients with sustained infection, comparable concomitant diseases and treatments during the pandemic. Some of the longitudinal samples exhibited low viral loads, and only one replicate per time point was sequenced. Despite the measures described in the STAR Methods section, there thus is the potential risk of potential false positive as well as false negative calls, which may have resulted from amplification biases or stringent filtering procedures implemented to avoid erroneous calls. Our primary conclusion—that remdesivir treatment increases diversity within the host and selects for mutations in the RNA polymerase—has to be validated in a larger number of samples. In addition, our study does not include direct experimental validation to confirm a direct linkage between observed mutations such as V166L and remdesivir resistance.

STAR★Methods

Key resources table

Resource availability

Lead contact

Further information and requests for resources should be directed to and will be fulfilled by the lead contact, Nicole Fischer (nfischer@uke.de).

Material availability

This study did not generate new unique reagents.

Experimental model and subject details

Fourteen patients with severe COVID-19 and persistent viral detection in the respiratory tract were included in this study. Patients were admitted to the University Medical Center Hamburg-Eppendorf between April 2020 and January 2021. RNA was extracted from routine diagnostic respiratory samples, and subjected to RT-qPCR. SARS-CoV-2 infection was considered prolonged if the infection period, defined as the period from hospital admission to the last PCR positive result in the respiratory tract, exceeded 14 days. Longitudinal sampling of respiratory samples of ICU patients was conducted every three to four days. Clinical data included patients' preexisting conditions, ventilation, and therapies. Additional parameters including inflammatory markers, such as CRP, and IL-6 levels were documented. The study was approved by the local ethics committee (PV7306).

Method details

SARS-CoV-2 qRT-PCR

SARS-CoV-2 loads were determined by qRT-PCR in a certified diagnostic environment as described previously. ,

SARS-CoV-2 amplicon sequencing and variant calling

Whole genome amplicon sequencing and downstream bioinformatic analysis was performed as described. , Sequencing libraries were generated using the CleanPlex SARS-CoV-2 Panel (Paragon Genomics, CA, USA). Libraries were multiplexed and sequenced on a Mid output NextSeq500 (Illumina). Complete or partially remaining sequencing adapters sequences are trimmed from the reads pairs extremities using cutadapt (-u -1 -U -1 -e 0.1 -O 9). Mate reads pairs are then merged with vsearch, allowing merging of staggered reads, and filtered for a minimum base quality of 15 (--fastq_qminout 15 –fastq_allowmergestagger). Unmerged reads are kept for downstream processing. A more stringent quality step in applied on merged reads using FASTX-Toolkit where only merged reads with a minimum base quality of 20 on at least 95% of the reads length are kept (fastq_quality_filter -q 20 -p 95). Merged reads are aligned to NC_045512.2 using minimap2 32 with default settings for short read alignment (-ax sr). Reads clipped from a minimum of 1 base are filtered out using samclip (https://github.com/tseemann/samclip) for downstream processing. Previously identified unmerged reads are aligned to NC_045512.2 using minimap2 32 with default settings for short read alignment (-ax sr) and all reads clipped from a minimum of 1 base are filtered out using samclip (https://github.com/tseemann/samclip) to be use further in the analysis stream. All original template of clipped reads from merged and unmerged pairs are recovered from filtered fastq files using seqtk (https://github.com/lh3/seqtk) to be aligned with STAR 33 for detection of larger deletions. The four obtained alignment files (merged reads not-clipped aligned with minimap2, unmerged reads not-clipped aligned with minimap2, merged reads clipped during minimap2 alignment and realigned with STAR, unmerged reads clipped during minimap2 alignment and realigned with STAR) are merged in a singled BAM file per sample. The final BAM file is then analyzed by iVAR to remove forward and reversed multiplex primers sequences at the edges of the reads. Coverage table are calculated using genomecov from BEDTools. Major and minor allele frequency variants within the samples of individual patients (≥1% allelic frequency) were called using FreeBayes Bayesian haplotype caller v1.3.1 with ploidy and haplotype independent detection parameters to generate frequency-based calls for all variants supported by a minimum coverage of 10 (-K -F 0.05 -min-coverage 9) in at least one data point per patient. The resulting variants were annotated using ANNOVAR 34 and consensus sequences based on major alleles (>90% AF) were generated using in-house scripts. Pangolin lineage assignment of consensus sequences was performed using the pangolin package.

Variant filtering and classification

Our study aims at identifying nucleotide variant frequency changes across longitudinal patient samples. In this context, variable viral loads (and consequently coverage levels) pose a particular challenge to sample cross-comparison. To mitigate this problem, we employ a statistical model to estimate the significance of observed frequency changes. In this model, for each NV and sampling point, we assume that the frequency distribution at a given coverage c is equivalent to a random draw of c reads from a virtually unlimited pool of amplicons with a composition that approximates the authentic distribution of positive and negative parental genomes (note that this model does not consider potential PCR biases, and therefore must be considered optimistic). If so, we can calculate the minimum coverage required to conclude that, within a given error margin e, the observed frequency value does not substantially deviate from the unknown authentic NV frequency as:where is the observed NV frequency (0 ≥ ≥ 1), is the z score at the chosen confidence interval, and e is the acceptable error margin (for example, for = 0.3 and e = 0.05, the authentic frequency may fall between 0.25 and 0.35). Accordingly, we can test if the coverage value at which a given NV was observed is sufficient to reasonably conclude that the authentic NV frequency is above or below a specific threshold . If so, error intervals centered on and must be smaller than the absolute distance between and : The minimum coverage required to differentiate observed frequency values from a given threshold can then be calculated as: In our longitudinal analysis, we test observed frequency and coverage values against a set of threshold values to filter datapoints and classify SNVs. Figure S1 shows a graphical depiction of the filtering and classification scheme. Firstly, to exclude spurious hits, we generally require the coverage to be minimally 10 and, depending on the observed frequency, sufficiently high to conclude that the authentic frequency is greater than 1% ( > 0.01 in Figure S1). Secondly, for any given NV, at least one datapoint must exhibit coverage and frequency values that allow the conclusion that the authentic NV frequency is significantly above 20% ( > 0.2 in Figure S1). We assume that there is little evidence for a biological selection advantage (or disadvantage) associated with a given NV if frequency values do not reach 20% in any of the investigated samples. Lastly, to principally allow us to establish whether a NV may undergo a minor-to-major or major-to-minor transition, values from at least two sampling points (which may or may not include the data point meeting the 20% frequency cutoff) must allow the reasonable classification of a NV as a major or minor variant. To meet this condition, coverage values must be sufficient to differentiate observed frequency values from the midpoint value (i.e, < 0.5 or > 0.5, for observed frequency value below or above 0.5, respectively). SNVs with a sufficient number of qualifying data points are then subjected to classification as not significant, minor or major according to observed frequencies and coverage levels across all samples. Hybrid calls are being made in cases where a NV cannot be placed in a distinct category (see Figure S1).

Calling of transition events and calculation of transition p-values

To detect changes in variant frequencies that may reflect parental genotypes which acquire or loose dominance, we identify time points in which NV states transition from not significant or minor and major (or vice versa) when compared to the closest preceding time point meeting the selection criteria. Data points having ambiguous classification calls (4 and 5 in Figure S1), or coverage values below 10 do not qualify for this analysis. For each variant, we then calculate a transition probability corresponding to the mean of transition events divided by the number of intervening days across all qualifying samples. This value provides a lower estimate for the probability that a variant may undergo a transition per day. We then calculate an average variant transition probability across all samples by calculating the mean of individual transition probabilities. We calculate p-values for the hypothesis that, in a given sample, the total number of observed transitions is greater than that expected from the average variant transition probability. To do so, for each variant which meets the above selection criteria (i.e., sufficient coverage levels such that a transition event could theoretically be observed), we first calculate an expected value by multiplying the total number of days between the time point in question and that of the closest preceding sample in which the variant meets selection criteria. The expected sample mean then corresponds to the sum of values from all variants:where is the number of variants meeting selection criteria, d is the time period between the considered time point and the closest preceding sample in which meets selection criteria, and is the average variant transition probability. The p-value for above hypothesis then corresponds to the cumulative Poisson probability that the number of transition events is between the observed value and the total number of variants meeting selection criteria. Note that this value does not provide an absolute measure; it only indicates an unexpectedly high number of transitions under the null hypothesis that transition events occur with equal likelihood across all patient samples.

REAGENT or RESOURCE	SOURCE	IDENTIFIER
Critical commercial assays

CleanPlex SARS-CoV-2 Flex Panel	Paragon Genomics, CA, USA	Cat.# 918015
Mid output NextSeq500	Illumina, CA, USA	Cat.# 20024905
CleanPlex plated UniqueDual-Indexed PCR Primers for Illumina Set C	Paragon Genomics, CA, USA	Cat.# 716037
CleanPlex plated UniqueDual-Indexed PCR Primers for Illumina Set D	Paragon Genomics, CA, USA	Cat.# 716038
CleanPlex plated UniqueDual-Indexed PCR Primers for Illumina Set E	Paragon Genomics, CA, USA	Cat.# 716039
CleanPlex plated UniqueDual-Indexed PCR Primers for Illumina Set F	Paragon Genomics, CA, USA	Cat.# 716040
cobas® SARS-CoV-2	Roche	Cat.# 09175431190
cobas® SARS-CoV-2 Control Kit	Roche	Cat.# 09175440190
cobas® 6800/8800 Buffer Negative Control Kit	Roche	Cat.# 07002238190
MagNA Pure 96 DNA and Viral Nucleic Acid Small Volume Kit	Roche	Cat.# 06543588001

Deposited data

All SARS-CoV2 sequences are available at	ENA	PRJEB50471

Software and algorithms

FastQC	Andrews S. (2010).	https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Cutadapt	Martin M. (2011).	https://cutadapt.readthedocs.io/en/stable/
Vsearch	Rognes T. (2016).	https://github.com/torognes/vsearch
FASTX-Toolkit	Hannon, G.J. (2010)	http://hannonlab.cshl.edu/fastx_toolkit/
Samtools	Danecek P. (2021).	https://github.com/samtools/samtools
Sambamba	Tarasov A. (2015).	https://lomereiter.github.io/sambamba/
Samclip	N/A	https://github.com/tseemann/samclip
bamaddrg	N/A	https://github.com/ekg/bamaddrg
seqtk	N/A	https://github.com/lh3/seqtk
iVAR	Grubaugh, N.D. (2019).	https://github.com/andersen-lab/ivar
bedtools	Aaron R. Quinlan (2010).	https://bedtools.readthedocs.io/en/latest/
Frrebayes	Garrison E. (2012).	https://github.com/freebayes/freebayes
bcftools	Danecek P. (2021).	https://samtools.github.io/bcftools/bcftools.html
vcflib	Garrison E.	https://github.com/vcflib/vcflib
Pangolin	O’Toole A. (2021).	https://github.com/stevenlovegrove/Pangolin
Nexstrain	Aksamentov, I. (2021).	https://clades.nextstrain.org/

36 in total

1. VSEARCH: a versatile open source tool for metagenomics.

Authors: Torbjørn Rognes; Tomáš Flouri; Ben Nichols; Christopher Quince; Frédéric Mahé
Journal: PeerJ Date: 2016-10-18 Impact factor: 2.984

2. Clinical evaluation of a SARS-CoV-2 RT-PCR assay on a fully automated system for rapid on-demand testing in the hospital setting.

Authors: Dominik Nörz; Nicole Fischer; Alexander Schultze; Stefan Kluge; Ulrich Mayer-Runge; Martin Aepfelbacher; Susanne Pfefferle; Marc Lütgehetmann
Journal: J Clin Virol Date: 2020-04-28 Impact factor: 3.168

3. Remdesivir is a delayed translocation inhibitor of SARS-CoV-2 replication.

Authors: Jack P K Bravo; Tyler L Dangerfield; David W Taylor; Kenneth A Johnson
Journal: Mol Cell Date: 2021-01-28 Impact factor: 17.970

4. In vitro selection of Remdesivir resistance suggests evolutionary predictability of SARS-CoV-2.

Authors: Agnieszka M Szemiel; Andres Merits; Richard J Orton; Oscar A MacLean; Rute Maria Pinto; Arthur Wickenhagen; Gauthier Lieber; Matthew L Turnbull; Sainan Wang; Wilhelm Furnon; Nicolas M Suarez; Daniel Mair; Ana da Silva Filipe; Brian J Willett; Sam J Wilson; Arvind H Patel; Emma C Thomson; Massimo Palmarini; Alain Kohl; Meredith E Stewart
Journal: PLoS Pathog Date: 2021-09-17 Impact factor: 6.823

5. Mutations in the SARS-CoV-2 RNA-dependent RNA polymerase confer resistance to remdesivir by distinct mechanisms.

Authors: Laura J Stevens; Andrea J Pruijssers; Hery W Lee; Calvin J Gordon; Egor P Tchesnokov; Jennifer Gribble; Amelia S George; Tia M Hughes; Xiaotao Lu; Jiani Li; Jason K Perry; Danielle P Porter; Tomas Cihlar; Timothy P Sheahan; Ralph S Baric; Matthias Götte; Mark R Denison
Journal: Sci Transl Med Date: 2022-08-03 Impact factor: 19.319

Review 6. SARS-CoV-2 variants, spike mutations and immune escape.

Authors: William T Harvey; Alessandro M Carabelli; Ben Jackson; Ravindra K Gupta; Emma C Thomson; Ewan M Harrison; Catherine Ludden; Richard Reeve; Andrew Rambaut; Sharon J Peacock; David L Robertson
Journal: Nat Rev Microbiol Date: 2021-06-01 Impact factor: 78.297

7. SARS-CoV-2 outbreak investigation in a German meat processing plant.

Authors: Thomas Günther; Adam Grundhoff; Manja Czech-Sioli; Daniela Indenbirken; Alexis Robitaille; Peter Tenhaken; Martin Exner; Matthias Ottinger; Nicole Fischer; Melanie M Brinkmann
Journal: EMBO Mol Med Date: 2020-10-27 Impact factor: 12.137

Review 8. Characteristics of SARS-CoV-2 and COVID-19.

Authors: Ben Hu; Hua Guo; Peng Zhou; Zheng-Li Shi
Journal: Nat Rev Microbiol Date: 2020-10-06 Impact factor: 78.297