Brenda Martínez-González1,2, María Eugenia Soria1,3,4, Lucía Vázquez-Sirvent1, Cristina Ferrer-Orta5, Rebeca Lobo-Vega1, Pablo Mínguez6,7,8, Lorena de la Fuente6,7,8, Carlos Llorens9, Beatriz Soriano9, Ricardo Ramos-Ruíz10, Marta Cortón6,7, Rosario López-Rodríguez6,7, Carlos García-Crespo3,4, Pilar Somovilla3,11, Antoni Durán-Pastor3, Isabel Gallego3,4, Ana Isabel de Ávila3,4, Soledad Delgado12, Federico Morán13, Cecilio López-Galíndez14, Jordi Gómez4,15, Luis Enjuanes2, Llanos Salar-Vidal1, Mario Esteban-Muñoz1, Jaime Esteban1, Ricardo Fernández-Roblas1, Ignacio Gadea1, Carmen Ayuso6,7, Javier Ruíz-Hornillos16,17,18, Nuria Verdaguer5, Esteban Domingo3,4, Celia Perales1,2,4.
Abstract
Populations of RNA viruses are composed of complex and dynamic mixtures of variant genomes that are termed mutant spectra or mutant clouds. This also applies to SARS-CoV-2, and mutations that are detected at low frequency in an infected individual can be dominant (represented in the consensus sequence) in subsequent variants of interest or variants of concern. Here we briefly review the main conclusions of our work on mutant spectrum characterization of hepatitis C virus (HCV) and SARS-CoV-2 at the nucleotide and amino acid levels and address the following two new questions derived from previous results: (i) how the composition of the SARS-CoV-2 mutant and deletion spectrum in diagnostic samples changes when examined at progressively lower cut-off mutant frequency values in ultra-deep sequencing; (ii) how the frequency distribution of minority amino acid substitutions in SARS-CoV-2 compares with that of HCV, also sampled from infected patients. The main conclusions are the following: (i) the number of different mutations found at low frequency in SARS-CoV-2 mutant spectra increases dramatically (50- to 100-fold) as the cut-off frequency for mutation detection is lowered from 0.5% to 0.1%, and (ii) contrary to HCV, SARS-CoV-2 mutant spectra exhibit a deficit of intermediate frequency amino acid substitutions. The possible origin and implications of mutant spectrum differences among RNA viruses are discussed.
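The dependence of mutation counts on the cut-off frequency can be illustrated with a minimal sketch. It assumes (the abstract does not spell this out) that "different" mutations are counted once per genomic position and change regardless of how many samples carry them, whereas "total" mutations are counted once per sample in which they appear. The record layout, the function name `count_mutations`, and all values are hypothetical.

```python
from typing import Iterable, NamedTuple


class Call(NamedTuple):
    """One variant call in one sample (hypothetical record layout)."""
    sample: str
    position: int      # genomic residue number
    change: str        # e.g. "C>T" or "del3"
    frequency: float   # fraction of reads carrying the variant (0..1)


def count_mutations(calls: Iterable[Call], cutoff: float) -> tuple[int, int]:
    """Return (different, total) mutations at or above the cut-off frequency."""
    kept = [c for c in calls if c.frequency >= cutoff]
    # "different": unique (position, change) pairs across all samples (assumed definition)
    different = len({(c.position, c.change) for c in kept})
    # "total": every occurrence, counted once per sample (assumed definition)
    total = len({(c.sample, c.position, c.change) for c in kept})
    return different, total


# Made-up calls, for illustration only.
calls = [
    Call("P1", 14600, "C>T", 0.0120),
    Call("P1", 15000, "A>G", 0.0020),
    Call("P2", 14600, "C>T", 0.0015),
    Call("P2", 23000, "T>C", 0.0011),
    Call("P3", 23000, "T>C", 0.0008),  # below both cut-offs
]

for cutoff in (0.005, 0.001):
    different, total = count_mutations(calls, cutoff)
    print(f"cut-off {cutoff:.1%}: {different} different, {total} total")
```

With these made-up calls, lowering the cut-off from 0.5% to 0.1% raises the counts from 1 different and 1 total mutation to 3 different and 4 total mutations.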
Keywords: COVID-19; RNA virus; deletion; mutation; nsp12 (polymerase); spike; ultra-deep sequencing; viral quasispecies
Year: 2022 PMID: 35745516 PMCID: PMC9227345 DOI: 10.3390/pathogens11060662
Source DB: PubMed Journal: Pathogens ISSN: 2076-0817
Figure 1. Representation of the SARS-CoV-2 genome, encoded proteins, and amplicons analyzed by UDS. In the two boxes below the scheme of the genome, the two genomic regions under study have been expanded, with genome residue numbers according to reference genome NCBI accession number NC_045512.2. The position of relevant protein domains is indicated. Left box: polymerase A to G motifs in the RdRp, and other domains of nsp12. Right box: spike (S) receptor binding motif (RBM) within the receptor binding domain (RBD), and the S1/S2 cleavage site. The amplicons analyzed in the present study are depicted as horizontal boxes [A1 to A4 for the nsp12 (polymerase) region, and A5, A6 for the S region]. Residue numbers delimiting each of the amplicons are shown, and the amino acid residues analyzed in the two proteins are indicated in the bottom lines. Materials and procedures used for amplicon preparation are detailed in Materials and Methods.
Figure 2. Heat map of point mutation and deletion frequencies for the nsp12 (polymerase)-coding region (genomic residues 14,534–16,054) in mutant spectra of SARS-CoV-2 from individual patients with a cut-off value of 0.1%, divided according to associated COVID-19 severity (indicated on the left of each map). Data are presented in four blocks that correspond to the four amplicons (A1 to A4); the genomic residues spanned by each amplicon are shown in Figure 1. Only positions with a mutation or those affected by a deletion (arrow symbols at the top of each block) are represented; the complete list of mutations, their position, type, deduced amino acid substitutions, their acceptability, and association with disease severity, are listed in Tables S3 and S4. The mutant frequency has been visualized with a color code displayed in the heading box. Each row corresponds to a patient whose clinical profile and identification code were previously reported [47]. Mutations and deletions have been identified relative to NCBI reference sequence NC_045512.2. Procedures are detailed in Materials and Methods.
Figure 3. Heat map of point mutation and deletion frequencies for the spike-coding region (genomic residues 22,872–23,645) in mutant spectra of SARS-CoV-2 from individual patients with a cut-off value of 0.1%, divided according to associated COVID-19 severity (indicated on the left of each map). Data are presented in two blocks that correspond to the two amplicons (A5 and A6); the genomic residues spanned by each amplicon are shown in Figure 1. Only positions with a mutation or those affected by a deletion (arrow symbols at the top of the lower block) are represented; the complete list of mutations, their position, type, deduced amino acid substitutions, their acceptability, and association with disease severity, are listed in Tables S3 and S4. The mutant frequency has been visualized with a color code displayed in the heading box. Each row corresponds to a patient whose clinical profile and identification code were previously reported [47]. Mutations and deletions have been identified relative to NCBI reference sequence NC_045512.2. Procedures are detailed in Materials and Methods.
Figure 4. Number of genetic lesions in mutant spectra of SARS-CoV-2, distributed according to associated COVID-19 severity, determined at 0.5% and 0.1% cut-off frequency (codes in upper box). The genomic region is indicated at the top of each panel group, and the amplicons are depicted in Figure 1. (A) Number of different and total point mutations. The number of mutations determined with a 0.1% frequency cut-off is indicated on top of each bar, and the number previously determined with a 0.5% cut-off frequency [47] is given above the discontinuous horizontal line within each bar. (B) Same as (A) but for deletions. The complete information on mutations and deletions is listed in Tables S3 and S4. Only statistically significant differences in the number of mutations or deletions are shown (*, p < 0.05; ***, p < 0.001; proportion test).
Figure 5. Number of point mutations and deletions detected with different frequency cut-off values in diagnostic samples of SARS-CoV-2, grouped according to associated COVID-19 severity (color code in upper box). The genomic region is indicated at the top of each panel group. Insets include the statistical significance of relevant differences. (A) Number of different and total point mutations, as indicated in ordinate. The complete list of point mutations detected with a 0.1% frequency cut-off is given in Table S3. (B) Number of different and total deletions, as indicated in ordinate. The complete list of deletions detected with a 0.1% frequency cut-off is given in Table S4, and their location in the genomic regions is depicted in Figure S1. Experimental and bioinformatics procedures are described in Materials and Methods. Statistically significant differences in the number of mutations or deletions are shown (n.s., p > 0.05; *, p < 0.05; ***, p < 0.001; proportion test).
Number of different and total mutations in SARS-CoV-2 isolates, classified in mild, moderate, and exitus patients.
| Patient Category | Total | Mild | Moderate | Exitus |
|---|---|---|---|---|
| | 578 (98.97%) | 544 (99.63%) | 344 (99.42%) | 416 (99.05%) |
| | 6 (1.03%) | 2 (0.37%) | 2 (0.58%) | 4 (0.95%) |
| Ratio | 96.33 | 272 | 172 | 104 |
| p-value | <0.001 | <0.001 | <0.001 | <0.001 |
| Significance | *** | *** | *** | *** |
| | 7587 (99.82%) | 2883 (99.93%) | 2254 (99.65%) | 2451 (99.84%) |
| | 14 (0.18%) | 2 (0.07%) | 8 (0.35%) | 4 (0.16%) |
| Ratio | 541.93 | 1441.50 | 281.75 | 612.75 |
| p-value | <0.001 | <0.001 | <0.001 | <0.001 |
| Significance | *** | *** | *** | *** |
| | 297 (99.33%) | 273 (100%) | 209 (99.52%) | 210 (99.53%) |
| | 2 (0.67%) | 0 (0%) | 1 (0.48%) | 1 (0.47%) |
| Ratio | 148.50 | - | 209 | 210 |
| p-value | <0.001 | <0.001 | <0.001 | <0.001 |
| Significance | *** | *** | *** | *** |
| | 3718 (99.95%) | 1343 (100%) | 1187 (99.92%) | 1188 (99.92%) |
| | 2 (0.05%) | 0 (0%) | 1 (0.08%) | 1 (0.08%) |
| Ratio | 1859 | - | 1187 | 1188 |
| p-value | <0.001 | <0.001 | <0.001 | <0.001 |
| Significance | *** | *** | *** | *** |
a The statistical significance of the difference is given (***, p < 0.001).
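The percentages and ratios in the table can be recomputed from the two counts of each block. The following is a minimal sketch that assumes the percentage is each count divided by the sum of the pair, and that the ratio is the first count divided by the second; the function name `summarize` is hypothetical.

```python
def summarize(count_a: int, count_b: int) -> tuple[float, float, float]:
    """Return (percent_a, percent_b, ratio a/b) for a pair of mutation counts."""
    total = count_a + count_b
    percent_a = 100 * count_a / total
    percent_b = 100 * count_b / total
    # The ratio is undefined when the second count is zero ("-" in the table).
    ratio = count_a / count_b if count_b else float("nan")
    return percent_a, percent_b, ratio


# Counts taken from the "Total" column of the first block of the table.
pa, pb, ratio = summarize(578, 6)
print(f"{pa:.2f}% {pb:.2f}% ratio {ratio:.2f}")  # 98.97% 1.03% ratio 96.33
```

Applying the same function to any other column is a quick consistency check on the tabulated values.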
Number of different and total mutations in SARS-CoV-2 isolates, classified in mild, moderate and exitus patients.
| Patient Category | Total | Mild | Moderate | Exitus |
|---|---|---|---|---|
| | 238 (40.75%) | 218 (39.93%) | 146 (42.20%) | 175 (41.67%) |
| | 346 (59.25%) | 328 (60.07%) | 200 (57.80%) | 245 (58.33%) |
| Ratio | 0.69 | 0.66 | 0.73 | 0.71 |
| p-value | <0.001 | <0.001 | <0.001 | <0.001 |
| Significance | *** | *** | *** | *** |
| | 2971 (39.08%) | 1130 (39.17%) | 877 (38.78%) | 964 (39.27%) |
| | 4631 (60.92%) | 1755 (60.83%) | 1385 (61.23%) | 1491 (60.73%) |
| Ratio | 0.64 | 0.64 | 0.63 | 0.65 |
| p-value | <0.001 | <0.001 | <0.001 | <0.001 |
| Significance | *** | *** | *** | *** |
| | 125 (41.95%) | 115 (42.28%) | 90 (43.06%) | 89 (42.38%) |
| | 173 (58.05%) | 157 (57.72%) | 119 (56.94%) | 121 (57.62%) |
| Ratio | 0.72 | 0.73 | 0.76 | 0.74 |
| p-value | <0.001 | <0.001 | 0.006 | 0.002 |
| Significance | *** | *** | ** | ** |
| | 1659 (44.60%) | 606 (45.12%) | 525 (44.19%) | 528 (44.41%) |
| | 2061 (55.40%) | 737 (54.88%) | 663 (55.81%) | 661 (55.59%) |
| Ratio | 0.80 | 0.82 | 0.79 | 0.80 |
| p-value | <0.001 | <0.001 | <0.001 | <0.001 |
| Significance | *** | *** | *** | *** |
a The statistical significance of the difference is given (**, p < 0.01; ***, p < 0.001).
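The text names the statistic only as a "proportion test". As a hedged illustration, the sketch below assumes a two-sample test of equal proportions with pooled variance and Yates' continuity correction (the default behavior of, for example, R's `prop.test`), applied to the two counts of a block with the sum of the pair as the common group size. The function name `two_proportion_test` is hypothetical, and the choice of test is an assumption rather than something the source confirms.

```python
import math


def two_proportion_test(x1: int, x2: int, n: int) -> float:
    """Two-sided p-value comparing x1/n with x2/n (equal group sizes n).

    Assumed method: pooled-variance z statistic with Yates' continuity
    correction; equivalent to a 1-df chi-square test on the 2x2 table.
    """
    p1, p2 = x1 / n, x2 / n
    pooled = (x1 + x2) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * (2 / n))
    yates = 1 / n  # 0.5 * (1/n + 1/n) for two groups of size n
    diff = max(abs(p1 - p2) - yates, 0.0)
    z = diff / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(z / math.sqrt(2))


# Counts from the block of the table that lists p-values of 0.006 and 0.002.
print(f"moderate: {two_proportion_test(90, 119, 90 + 119):.3f}")
print(f"exitus:   {two_proportion_test(89, 121, 89 + 121):.3f}")
```

Under these assumptions the two calls give approximately 0.006 and 0.002, in agreement with the tabulated p-values for the moderate and exitus columns.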
Figure 6. Percentage of amino acid substitutions in SARS-CoV-2 and HCV populations sampled from infected patients that fall in each frequency range, with 1% as the low frequency limit. The virus identification code is given in the upper boxes. (A) Distribution of amino acid substitutions deduced from all amplicons analyzed (A1 to A6 for SARS-CoV-2 as depicted in Figure 1, and amplicons corresponding to proteins NS3, NS5A, and NS5B for HCV, as described in [51,52,53,54]); the complete list of SARS-CoV-2 amino acid substitutions is given in Table S3 and the complete list for HCV is given in Table S1 of [53]. (B) Same as (A) but for the comparison restricted to the two polymerase proteins, nsp12 for SARS-CoV-2, and NS5B for HCV. Data origin, experimental procedures, and bioinformatics pipelines are described in Materials and Methods.
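A distribution of this kind can be computed by binning substitution frequencies and expressing each bin as a percentage of all substitutions above the low frequency limit. The sketch below is illustrative only: the bin edges in `EDGES` are hypothetical (the caption specifies only the 1% lower limit), as are the function name and the input values.

```python
from bisect import bisect_right

# Hypothetical bin edges in percent; only the 1% lower limit comes from the caption.
EDGES = [1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]


def frequency_distribution(frequencies_pct):
    """Percentage of substitutions falling in each range [EDGES[i], EDGES[i+1])."""
    counts = [0] * (len(EDGES) - 1)
    for f in frequencies_pct:
        if f < EDGES[0]:
            continue  # below the low frequency limit, not counted
        # The last bin is closed on the right so that 100% is included.
        i = min(bisect_right(EDGES, f) - 1, len(counts) - 1)
        counts[i] += 1
    kept = sum(counts)
    return [100 * c / kept if kept else 0.0 for c in counts]


# Made-up substitution frequencies (percent), for illustration only.
dist = frequency_distribution([0.5, 1.2, 3.0, 45.0, 99.9])
for lo, hi, pct in zip(EDGES, EDGES[1:], dist):
    print(f"{lo:>3}-{hi:<3}%: {pct:.1f}%")
```

A deficit of intermediate frequency substitutions would appear as low percentages in the middle bins relative to the lowest and highest ones.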