Marco Gerdol, Klevia Dishnica, Alejandro Giorgetti.
Abstract
Tracking the evolution of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) through genomic surveillance programs is undoubtedly one of the key priorities in the current pandemic situation. Although the genome of SARS-CoV-2 acquires mutations at a slower rate compared with other RNA viruses, evolutionary pressures derived from the widespread circulation of SARS-CoV-2 in the human population have progressively favored the global emergence, through natural selection, of several variants of concern that carry multiple non-synonymous mutations in the spike glycoprotein. These are often placed in key sites within major antibody epitopes and may therefore confer resistance to neutralizing antibodies, leading to partial immune escape, or otherwise compensate for infectivity deficits associated with other non-synonymous substitutions. As previously shown by other authors, several emerging variants carry recurrent deletion regions (RDRs) that display a partial overlap with antibody epitopes located in the spike N-terminal domain (NTD). Comparatively, very little attention had been directed towards spike insertion mutations prior to the emergence of the B.1.1.529 (omicron) lineage. This manuscript describes a single recurrent insertion region (RIR1) in the N-terminal domain of the SARS-CoV-2 spike protein, characterized by at least 49 independent acquisitions of 1-8 additional codons between Val213 and Leu216 in different viral lineages. Even though RIR1 is unlikely to confer antibody escape, its association with two distinct formerly widespread lineages (A.2.5 and B.1.214.2), with the quickly spreading omicron and with other VOCs and VOIs warrants further investigation concerning its effects on spike structure and viral infectivity.
Keywords: Coronavirus; Covid-19; Evolution; Immune escape; Omicron; Sarbecovirus; Variant
Year: 2022 PMID: 35021068 PMCID: PMC8743576 DOI: 10.1016/j.virusres.2022.198674
Source DB: PubMed Journal: Virus Res ISSN: 0168-1702 Impact factor: 3.303
Fig. 1 Schematic representation of the SARS-CoV-2 spike protein, indicating the two functional subunits S1 and S2, which are separated by a furin-like proteolytic cleavage site, the N-terminal domain (NTD), the receptor binding domain (RBD) and receptor binding motif (RBM), and the SD1 and SD2 subdomains. The absolute number of observed deletion mutations along the S gene is reported (https://mendel.bii.a-star.edu.sg/ was last accessed on January 5th, 2022). Bars were truncated at 2000 observed genomes; in such cases, the approximate absolute number of observations is reported above the truncated bars, together with the main VOCs and VOIs associated with each indel, each indicated by its Greek letter. The positions of RDR1-RDR4 from a previous study (McCarthy et al., 2021), as well as the deletion 157/158 characterizing the delta variant and the ins145T insertion characterizing the mu variant, are reported.
Fig. 2 Multiple sequence alignment of the nucleotide sequences of the SARS-CoV-2 S gene of the viral lineages characterized by an insertion at RIR1, compared with the reference sequence Wuhan-Hu-1. The alignment displays only a small portion of the S gene and of the encoded spike protein, zoomed in and centered on RIR1 (i.e. codons 212–217). Red vertical bars indicate codon boundaries, with the encoded amino acids (in the Wuhan-Hu-1 reference sequence) indicated below. The number of observed GISAID entries for each insertion and the encoded amino acid sequences are shown near the insertion name. Please note that the exact position of the insertion could not be unambiguously determined in all cases; insertions with ambiguous placement are marked with an asterisk (see Table 1 for details).
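Table 1 Summary of the 49 independent RIR1 insertions found in the SARS-CoV-2 genome, ordered by the earliest date of detection, as of January 5th, 2022.
| Insertion | Position:inserted residues | Frame (codon and phase) | Lineage | GISAID entries | Other spike mutations | Earliest detection | Latest detection |
|---|---|---|---|---|---|---|---|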
| I | 214:AKKN | ambiguous (codon 215 phase 0/I) | B | 1 | none | Mar 5th, 2020 | Mar 5th, 2020 |
| II | 214:KLGP | in-frame (codon 215 phase 0) | B.1.177 | 1 | E154K, A222V, D614G | Nov 13th, 2020 | Nov 13th, 2020 |
| III | 214:AAG | out-of-frame (codon 215 phase I) | A.2.5 | 1844 | L141del, G142del, V143del, D215Y, L452R, D614G | Nov 20th, 2020 | Nov 5th, 2021 |
| IV | 214:TDR | in-frame (codon 215 phase 0) | B.1.214.2 | 1228 | Q414K, N450K, D614G, T716I | Nov 22nd, 2020 | Jun 28th, 2021 |
| V | 214:ANRN | ambiguous (codon 215 phase 0/I) | γ(P.1) | 12 | L18F, P26S, D138Y, K417T, E484K, N501Y, D614G, D1139H, V1176F | Dec 23rd, 2020 | Apr 5th, 2021 |
| VI | 214:KRI | in-frame (codon 215 phase 0) | B | 4 | V367F, E990A | Dec 28th, 2020 | Mar 15th, 2021 |
| VII | 214:AQER | ambiguous (codon 215 phase 0/I) | ε(B.1.429) | 2 | S13I, P26S, S98F, W152C, L452R, D614G, T1027I | Jan 15th, 2021 | Jan 18th, 2021 |
| VIII | 214:DLA | ambiguous (codon 215 phase 0/I/II, codon 216 phase 0/I/II) | B.1.2 | 5 | D614G | Jan 17th, 2021 | Feb 2nd, 2021 |
| IX | 214:QAS | in-frame (codon 215 phase 0) | B.1.639 | 45 | H69del, V70del, Y144del, M153T, T478K, E484K, D614G, T859N, D936Y | Jan 19th, 2021 | Nov 4th, 2021 |
| X | 216:ADL | ambiguous (codon 216 phase I/II) | B.1.2 | 1 | D614G | Jan 25th, 2021 | Jan 25th, 2021 |
| XI | 214:DRS | out-of-frame (codon 215 phase I/II) | B.1 | 1 | D215N, V382L, D614G, M1237I | Feb 1st, 2021 | Feb 1st, 2021 |
| XII | 214:KFH | in-frame (codon 215 phase 0) | α(B.1.1.7) | 3 | H69del, V70del, Y144del, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H | Feb 12th, 2021 | Feb 22nd, 2021 |
| XIII | 214:KAFKQ | in-frame (codon 215 phase 0) | α(B.1.1.7) | 2 | H69del, V70del, Y144del, A262S, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H | Feb 25th, 2021 | Feb 25th, 2021 |
| XIV | 214:ER | out-of-frame (codon 215 phase II) | α(B.1.1.7) | 1 | H69del, V70del, Y144del, D215G, N501Y, A570D, D614G, P681H, I712V, T716I, S982A, D1118H | Mar 11th, 2021 | Mar 11th, 2021 |
| XV | 214:NWAHW | in-frame (codon 215 phase 0) | B.1.547 | 1 | T19I, T95I, H69del, V70del, D614G, E484A, A879T, T1027I | Mar 22nd, 2021 | Mar 22nd, 2021 |
| XVI | 214:APR | ambiguous (codon 215 phase 0/I) | α(B.1.1.7) | 1 | H69del, V70del, Y144del, A262S, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H | Mar 31st, 2021 | Mar 31st, 2021 |
| XVII | 213:KAR | out-of-frame (codon 213 phase II) | B.1.177 | 1 | A222V, A262S, P272L, D614G, P681H, M1229I | Apr 2nd, 2021 | Apr 2nd, 2021 |
| XVIII | 214:KLRS | ambiguous (codon 215 phase II, codon 216 phase 0) | A.28 | 6 | T76I, D215S, N501T, F592S, H655Y | Apr 23rd, 2021 | Jun 22nd, 2021 |
| XIX | 214:KGE | in-frame (codon 215 phase 0) | B.1.1.519 | 1 | T732A, T478K, D614G, P681H | Apr 24th, 2021 | Apr 24th, 2021 |
| XX | 214:W | in-frame (codon 215 phase 0) | B.1 | 1 | T19I, F140del, P139del, L141del, G142del, V143del, Y144del, Y145del, I210del, L242del, A243del, L244del, T470N, S494P, D614G, H655Y, T859N | May 14th, 2021 | May 14th, 2021 |
| XXI | 216:QMR | out-of-frame (codon 216 phase I) | α(B.1.1.7) | 1 | L5F, S13I, H69del, V70del, Y144del, D215R, E484K, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H | May 25th, 2021 | May 25th, 2021 |
| XXII | 214:KLLIRGD | in-frame (codon 215 phase 0) | B.1 | 2 | Q14del, C15del, V16del, N17del, L18del, W64R, T95I, C136Y, N137del, L141del, G142del, V143del, W152R, I210del, G252V, T415A, N440T, E484K, D614G, H655Y, P681H, T859N, Q1011H, G1219C | Jun 4th, 2021 | Jun 4th, 2021 |
| XXIII | 213:KRLR | out-of-frame (codon 213 phase II) | B.1.1 | 1 | R214Q, D614G, E484D | Jun 6th, 2021 | Jun 6th, 2021 |
| XXIV | 214:EAR | ambiguous (codon 215 phase I/II/III) | δ(B.1.617.2) | 1 | T19R, N137K, G142D, E156G, F157del, R158del, L452R, T478K, E484Q, D614G, P681R, D950N | Jun 8th, 2021 | Jun 8th, 2021 |
| XXV | 214:QCEE | in-frame (codon 215 phase 0) | B.1.247 | 1 | P209S, A222V, T572I | Jul 15th, 2021 | Jul 15th, 2021 |
| XXVI | 214:ELCD | ambiguous (codon 215 phase I/II/III) | δ(AY.12) | 1 | T19R, T95I, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Jul 17th, 2021 | Jul 17th, 2021 |
| XXVII | 214:NFGGG | in-frame (codon 215 phase 0) | δ(AY.4) | 2 | T19R, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Jul 22nd, 2021 | Jul 22nd, 2021 |
| XXVIII | 214:NES | in-frame (codon 215 phase 0) | δ(AY.16) | 1 | T19R, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Jul 30th, 2021 | Jul 30th, 2021 |
| XXIX | 214:VPWI | ambiguous (codon 215 phase 0/I) | δ(AY.4) | 9 | T19R, T95I, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, V622F, D950N | Aug 4th, 2021 | Aug 21st, 2021 |
| XXX | 214:HSG | in-frame (codon 215 phase 0) | δ(AY.4) | 7 | T19R, T95I, D138H, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Aug 4th, 2021 | Aug 25th, 2021 |
| XXXI | 214:SYCTKSKC | in-frame (codon 215 phase 0) | η(B.1.525) | 1 | A67V, H69del, V70del, Y144del, E484K, D614G, A653V, N679del, Q677H, F888L | Aug 5th, 2021 | Aug 5th, 2021 |
| XXXII | 214:ESH | ambiguous (codon 215 phase 0/I/II) | B.1.240 | 1 | C15F, L141del, G142del, V143del, Y144del, L242del, A243del, L244del, G446V, E484A, D614G, A688V, V1176F | Aug 6th, 2021 | Aug 6th, 2021 |
| XXXIII | 214:NFG | in-frame (codon 215 phase 0) | δ(AY.25) | 3 | T19R, S112L, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Aug 20th, 2021 | Sep 14th, 2021 |
| XXXIV | 214:DQAF | out-of-frame (codon 214 phase II) | δ(B.1.617.2) | 1 | T19R, G142D, E156G, F157del, R158del, A222V, V289I, L452R, T478K, D614G, P681R, D950N | Aug 21st, 2021 | Aug 21st, 2021 |
| XXXV | 214:EGAE | ambiguous (codon 215 phase 0/I/II) | δ(AY.4) | 2 | T19R, T95I, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Sep 2nd, 2021 | Sep 2nd, 2021 |
| XXXVI | 214:RLR | in-frame (codon 215 phase 0) | δ(AY.3) | 2 | T19R, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Sep 20th, 2021 | Sep 20th, 2021 |
| XXXVII | 214:GLKG | out-of-frame (codon 215 phase I) | δ(AY.101) | 1 | T19R, T95I, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Oct 1st, 2021 | Oct 1st, 2021 |
| XXXVIII | 215:EAR | out-of-frame (codon 215 phase II) | δ(AY.4) | 1 | T19R, T95I, E156G, F157del, R158del, D215N, L452R, T478K, D614G, P681R, D950N | Oct 3rd, 2021 | Oct 3rd, 2021 |
| XXXIX | 214:RGG | out-of-frame (codon 214 phase II) | δ(AY.100) | 2 | T19R, T95I, G142D, E156G, F157del, R158del, N394S, D614G, P681R, D950N | Oct 19th, 2021 | Nov 8th, 2021 |
| XL | 214:VRR | out-of-frame (codon 215 phase I) | δ(AY.113) | 1 | T19R, G142D, E156G, F157del, R158del, D215H, L452R, D614G, P681R, D950N | Oct 26th, 2021 | Oct 26th, 2021 |
| XLI | 214:EPE | ambiguous (codon 215 phase 0/I/II) | ο(BA.1) | >140,000 | A67V, H69del, V70del, T95I, G142D, V143del, Y144del, Y145del, N211del, L212I, G339D, S371L, S373P, S375F, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y, Y505H, T547K, D614G, H655Y, P681H, N679K, N764K, D796Y, N856K, Q954H, N969K, L981F | Nov 8th, 2021 | Jan 2nd, 2022 |
| XLII | 214:ASPN | out-of-frame (codon 215 phase I) | α(B.1.1.7) | 1 | H69del, V70del, Y144del, L242del, A243del, L244del, G446V, E484K, Q498K, N501Y, A570D, D614G, P681H, T716I, W886L, S982A, D1118H | Nov 14th, 2021 | Nov 14th, 2021 |
| XLIII | 215:KGD | ambiguous (codon 215 phase II, codon 216 phase 0) | δ(AY.4.2) | 1 | T19R, T95I, Y145H, E156G, F157del, R158del, G142D, A222V, L452R, T478K, D614G, P681R, D950N | Nov 18th, 2021 | Jan 3rd, 2022 |
| XLIV | 213:RWR | out-of-frame (codon 213 phase II) | δ(AY.4) | 2 | T19R, T95I, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Nov 19th, 2021 | Nov 22nd, 2021 |
| XLV | 215:AGS | out-of-frame (codon 215 phase II) | δ(AY.122) | 1 | T19R, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Nov 29th, 2021 | Nov 29th, 2021 |
| XLVI | 215:GGD | in-frame (codon 216 phase 0) | δ(AY.25) | 5 | T19R, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Dec 5th, 2021 | Dec 7th, 2021 |
| XLVII | 214:SRA | in-frame (codon 215 phase 0) | δ(AY.4.2.1) | 1 | T19R, V36F, T95I, G142D, Y145H, E156G, F157del, R158del, A222V, L452R, T478K, P681R, D614G, D950N | Dec 6th, 2021 | Dec 6th, 2021 |
| XLVIII | 214:AAVE | out-of-frame (codon 215 phase I) | δ(AY.4) | 1 | T19R, T95I, G142D, E156G, F157del, R158del, D215N, N282S, L452R, T478K, D614G, P681R, D950N, A1020S | Dec 9th, 2021 | Dec 9th, 2021 |
| XLIX | 213:SG | in-frame (codon 214 phase 0) | δ(AY.4) | 1 | T19R, T95I, G142D, E156G, F157del, R158del, L452R, T478K, D614G, P681R, D950N | Dec 16th, 2021 | Dec 16th, 2021 |
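The insertion labels and frame annotations in the table follow a regular pattern that is easy to check programmatically. The sketch below is illustrative only: it uses a handful of rows taken from the table, and it assumes, based on how the table labels read, that "phase 0" means the insertion falls exactly on a codon boundary while phases I and II mean it falls after the first or second nucleotide of the named codon. The function names are hypothetical and are not part of the original study.

```python
from collections import Counter

# A few (insertion label, frame annotation) pairs taken from the table above.
ROWS = [
    ("214:AKKN", "ambiguous (codon 215 phase 0/I)"),
    ("214:KLGP", "in-frame (codon 215 phase 0)"),
    ("214:AAG", "out-of-frame (codon 215 phase I)"),
    ("214:TDR", "in-frame (codon 215 phase 0)"),
    ("214:SYCTKSKC", "in-frame (codon 215 phase 0)"),
    ("214:W", "in-frame (codon 215 phase 0)"),
    ("213:SG", "in-frame (codon 214 phase 0)"),
]


def parse_label(label):
    """Split a 'position:residues' label into (int position, residue string)."""
    position, residues = label.split(":")
    return int(position), residues


def frame_class(annotation):
    """Return the leading class word: in-frame, out-of-frame or ambiguous."""
    return annotation.split(" ", 1)[0]


def phase_of(nt_offset):
    """Codon number (1-based) and phase (0, 1 or 2) of an insertion that
    follows `nt_offset` reference nucleotides of the coding sequence.

    Assumed convention: phase 0 = insertion sits on a codon boundary.
    """
    return nt_offset // 3 + 1, nt_offset % 3


if __name__ == "__main__":
    lengths = [len(parse_label(label)[1]) for label, _ in ROWS]
    print("inserted codons, min and max:", min(lengths), max(lengths))
    print("frame classes:", Counter(frame_class(a) for _, a in ROWS))
    # 214 complete codons precede the insertion point: 214 * 3 = 642 nucleotides.
    print("codon and phase after 642 nt:", phase_of(642))
```

Because every inserted sequence spans a whole number of codons, an "out-of-frame" insertion in this sense splits a reference codon without shifting the downstream reading frame.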
Fig. 3 Panel A: circular time tree exemplifying the phylogeny of the A.2.5 lineage and related sublineages. Only high-quality, complete genomes have been included. The Wuhan-Hu-1 strain was used to root the tree; the sister lineage A.2.4 is also indicated. The acquisition of relevant spike mutations placed in the receptor binding domain (i.e. S477N, K417T, E484K and N501Y) is marked with arrows. Please note that the monophyletic clade linked with the acquisition of S477N corresponds to the A.2.5.3 sublineage. Panel B: key mutations associated with the A.2.5 lineage. Genes associated with mutations (compared with the reference strain Wuhan-Hu-1) are indicated; only mutations detected in > 50% of the genomes belonging to this lineage and associated sublineages are shown. Modified from https://outbreak.info/. Panel C: root-to-tip genetic distance (number of nucleotide substitutions) of the genomes belonging to the A.2.5 lineage and related sublineages, compared with the reference genome Wuhan-Hu-1. The black dashed line represents the average rate of mutation of all SARS-CoV-2 sequenced genomes, according to GISAID (i.e. 25 substitutions per genome per year, as of January 5th, 2022). The red dashed line represents the rate of mutation computed for A.2.5. Note that insertions and deletions were excluded from this calculation.
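A rate such as the one drawn in Panel C is commonly obtained as the slope of a straight line fitted to root-to-tip distance against sampling time. The sketch below assumes ordinary least squares, which this section does not confirm as the authors' method; the function name and the sample values are hypothetical and are not data from the study.

```python
def substitution_rate(times_years, distances):
    """Ordinary least-squares fit of root-to-tip distance against time.

    times_years: sampling times in years since the reference genome.
    distances:   nucleotide substitutions relative to the reference.
    Returns (slope, intercept); the slope is in substitutions per year.
    """
    n = len(times_years)
    mean_t = sum(times_years) / n
    mean_d = sum(distances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_years)
    sxy = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(times_years, distances))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    return slope, intercept


if __name__ == "__main__":
    # Made-up values for illustration only, not data from the study.
    times = [0.0, 0.5, 1.0, 1.5]
    dists = [2.0, 14.5, 27.0, 39.5]
    slope, intercept = substitution_rate(times, dists)
    print(f"rate: {slope:.1f} substitutions per genome per year")
```

Comparing the fitted slope for a lineage with the 25 substitutions per genome per year quoted for all sequenced genomes indicates whether that lineage accumulates substitutions faster or slower than average.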
Fig. 4 Upper panel: global spread of A.2.5 and related sublineages. Lower panel: detailed timing of the detection of sequenced genomes belonging to A.2.5 and related sublineages in different countries. Only countries with >= 10 unique days of detection are reported, whereas the others were collapsed into geographic macroareas (i.e. Asia + Oceania, Europe, South America and Central America). The reported dates refer to the dates of sampling reported in GISAID. Gray boxes indicate periods of time with no sequencing data available for a given country.
Fig. 5 Panel A: RMSF plot for the models of the wild-type and A.2.5 SARS-CoV-2 spike proteins, indicating the point and insertion mutations present in the two viral lineages targeted by this study, compared with the wild-type virus. Panel B: Three-dimensional structural models obtained for the wild-type and A.2.5 spike proteins. The location of the NTD and RBD (within the S1 subunit) and of the S2 subunit in the spike trimer are shown on the left-hand side. The regions with the most significant fluctuations are marked in red.
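For readers unfamiliar with the quantity plotted in Panel A, the root-mean-square fluctuation (RMSF) of an atom is the square root of its mean squared displacement from its own average position over a set of frames. The sketch below is a generic pure-Python illustration on toy coordinates; it assumes the frames have already been superimposed on a common reference, and it does not reproduce the software or settings used in the study, which this section does not state.

```python
import math


def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    one per atom, with frames assumed to be pre-aligned.
    Returns a list with one RMSF value per atom.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Average position of atom i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared displacement from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


if __name__ == "__main__":
    # Toy example: atom 0 moves along x, atom 1 stays fixed.
    frames = [
        [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
        [(2.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    ]
    print(rmsf(frames))
```

In a plot like Panel A, residues with high values are the flexible ones, which is why the regions with the largest fluctuations are the ones highlighted in the structural models.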