Mouna Ben Sassi, Sana Ferjani, Imen Mkada, Marwa Arbi, Mouna Safer, Awatef Elmoussi, Salma Abid, Oussema Souiai, Alya Gharbi, Asma Tejouri, Emna Gaies, Hanene Eljabri, Samia Ayed, Aicha Hechaichi, Riadh Daghfous, Riadh Gouider, Jalila Ben Khelil, Maher Kharrat, Imen Kacem, Nissaf Ben Alya, Alia Benkahla, Sameh Trabelsi, Ilhem Boutiba-Ben Boubaker.
Abstract
Since the beginning of the coronavirus disease 2019 (COVID-19) pandemic, there has been growing interest in exploring SARS-CoV-2 genetic variation to understand the origin and spread of the pandemic, improve diagnostic methods, and develop appropriate vaccines. The objective of this study was to identify the SARS-CoV-2 lineages circulating in Tunisia and to explore their amino acid signatures in order to follow their genome dynamics. Whole genome sequencing and genetic analyses were performed on fifty-eight SARS-CoV-2 samples collected over one year, between March 2020 and March 2021, from the National Influenza Center, using three sampling strategies. Multiple lineage introductions were noted during the initial phase of the pandemic, including B.4, B.1.1, B.1.428.2, B.1.540 and B.1.1.189. Subsequently, lineages B.1.160 (24.2%) and B.1.177 (22.4%) were dominant throughout the year. The Alpha variant (B.1.1.7 lineage) was identified in February 2021 and was first observed in the center of the country. In addition, a clear diversity of lineages was observed in the north of the country. A total of 335 mutations, including 10 deletions, were found. The SARS-CoV-2 proteins ORF1ab, Spike, ORF3a, and Nucleocapsid were identified as mutation hotspots, with a mutation frequency exceeding 20%. The two most frequent mutations, D614G in the S protein and P314L in Nsp12, appeared simultaneously and are often associated with increased viral infectivity. Interestingly, deletions in coding regions, causing amino acid deletions and frameshifts, were identified in the NSP3, NSP6, S, E, ORF7a, ORF8 and N proteins. These findings help to define the COVID-19 outbreak in Tunisia. Despite the country's limited resources, surveillance of SARS-CoV-2 genomic variation should be continued to control the emergence of new variants.
Keywords: Amino acid change analysis; Amino acid signature; Coronavirus disease-2019; Lineages phylogenetic; SARS-CoV-2; Tunisia; Whole genome sequencing
Year: 2022 PMID: 35552003 PMCID: PMC9085353 DOI: 10.1016/j.meegid.2022.105300
Source DB: PubMed Journal: Infect Genet Evol ISSN: 1567-1348 Impact factor: 4.393
Fig. 1. Distribution of the SARS-CoV-2 genetic lineages among the 58 Tunisian samples during the first year of the pandemic (March 2020 to March 2021). *No COVID-19 cases were detected in Tunisia during June 2020. **According to our sampling strategy, no COVID-19 cases were included during August 2020.
Fig. 2. Phylogenetic analysis of the 58 Tunisian SARS-CoV-2 genome sequences.
Phylogenetic analysis of 58 Tunisian SARS-CoV-2 sequences, compared with the SARS-CoV-2 Wuhan reference sequence* (NC_045512), inferred by the Neighbor-Joining method. Branches are colored according to the Nextclade clade nomenclature. The evolutionary distances were computed using the Maximum Likelihood method.
Non-synonymous mutations among SARS-CoV-2 clades from 58 Tunisian samples.
| Gene | Protein | 20B | 20I (Alpha) | 20C | 20A | 20E | 19B | 19A |
|---|---|---|---|---|---|---|---|---|
| ORF1a | NSP1 | E93K (1), R124C (1), G192D (1) | | | | | | |
| | NSP2 | G392C (1), E489D (1) | L730F (4) | T265I (6), S318L (1), T346I (3), H388Y (3), A482V (1) | K292R (1), E342G (1), H1141Y (1), M1312I (1) | D194N (1) | P286L (6) | V378I (2) |
| | NSP3 | A591V (1) | T1001I (6) | P1596L (3), T1908I (1), T2154I (1), S2535L (1), A2690V (3) | P1596L (1), K1895N (1), L2688F (1), M3087I (14), L3201I (10) | V559M (1), K1247N (12), S1515F (1), P1659S (1), P1803S (2) | | |
| | NSP4 | P1158S (1) | A1708D (6) | T3058I (1) | T3284I (1), K3353R (1) | D2980G (6), T3082I (4) | G3072C (2) | |
| | NSP5 | I1232V (1), P3359S (2) | I2230T (6) | S3384L (1), L3711F (1), P4223L (1) | P2018S (1) | S3386F (1) | | |
| | NSP6 | C2210F (2), T3716A (1) | S3675- (6), G3676- (6), F3677- (6) | A2345V (1), A3209V (1), A3497V (1) | N3651S (6) | L3606F (2) | | |
| | NSP7 | S2500F (2), S3884L (1) | L3606F (2) | | | | | |
| | NSP8 | A3623S (12) | | | | | | |
| | NSP9 | P4197S (1) | | | | | | |
| | NSP10 | T4304I (1) | | | | | | |
| ORF1b | NSP12 | P314L (6) | T132I (1), P314L (6) | P314L (6) | A176S (14), P314L (17), T730I (1), V767L (14), S904L (1) | D275Y (2), P314L (14) | | |
| | NSP13 | K1383R (2) | M1156I (1), M1499I (1) | P976L (1), K1141R (14), E1184D (14), M1352I (2), S1408L (1) | P1001S (1), Y1229C (1) | P1000L (6) | | |
| | NSP14 | T1545A (1) | E1871G (2) | T1540I (1), T1555I (1), R1737L (1), T1747N (1), V1840F (1), T2040I (1) | | | | |
| | NSP15 | M2269I (1) | D2179Y (1), Q2247H (2), E2253G (1), D2263N (1) | P2313S (1) | | | | |
| | NSP16 | Q2635H (2) | A2559S (1) | | | | | |
| S-Protein | | D111N (1), D614G (6), A684V (2), I770V (1), A892S (1) | V6A (2), H69- (5), V70- (5), D138H (1), Y144- (6), N501Y (6), A570D (6), D614G (6), P681H (6), T716I (6), S982A (6), D1118H (6) | D614G (6), E780Q (1) | L5F (1), V120I (1), L176F (1), I233V (2), G261S (1), S477N (14), D614G (17), D627E (1), S640F (1), D936Y (1), A1020S (1), H1101Y (1) | L5F (2), L18F (1), Q23H (1), A222V (14), T572I (1), D614G (14), G932C (1), H1101Y (1), K1191N (3) | L18F (6), V227A (4), L452R (6), N501Y (6), A653V (5), H655Y (6), Q677H (1), D796Y (6), K1191N (5), G1219V (6) | |
| ORF3a | | F230V (1) | L140F (1), G174D (3) | Q57H (6), A99V (1), G224C (1) | Q57H (15), V97I (1), T223I (1) | A39T (1), D155Y (1), L101F (1), W131C (1), T223I (2) | V50A (1) | |
| M-Protein | | H125Y (1) | V70F (1) | | | | | |
| ORF7a | | F6S (1) | T14I (1) | T120I (1) | I10L (1) | | | |
| ORF8 | | S21N (2) | Q27* (6), R52I (6), Y73C (5) | H28Y (1), A51S (1), Q72L (1), C83F (1) | V5I (1), E64* (1) | D119- (6), F120- (6), A65S (4), L84S (6) | | |
| ORF9b | | S6I (1), Q18H (1) | | | | | | |
| N-Protein | | R203K (6), G204L (1), G204R (5), 5G321F (2), T325I (1) | D3L (5), R203K (6), G204R (6), S235F (6) | S186Y (1), T205I (3), D348H (1) | K80E (3), G19R (1), Q83R (3), M234I (14), A376T (14), Q384H (2) | Q9H (1), D22Y (14), A220V (1), G236C (1), Q418L (1) | S202N (6) | M1I (2), S188P (2) |

Number in parentheses: number of isolates that harboured the mutation; -: deletion; *: stop codon; E: envelope protein; M: membrane glycoprotein; N: nucleocapsid phosphoprotein; ORF: open reading frame; S: spike glycoprotein.
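The mutation labels in the table follow a compact notation: the reference amino acid, the residue position, then the new residue, with "-" marking a deletion and "*" a stop codon. The sketch below shows one way such labels could be parsed once the isolate count has been separated from the label. The names `AAChange` and `parse_aa_change` are hypothetical helpers introduced for illustration and are not part of the study's pipeline.

```python
import re
from typing import NamedTuple, Optional


class AAChange(NamedTuple):
    ref: str   # reference amino acid (one-letter code)
    pos: int   # residue position
    alt: str   # new residue, "-" for a deletion, "*" for a stop codon
    kind: str  # "substitution", "deletion" or "stop"


# One reference letter, a position, then a residue letter, "-" or "*".
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z*-])$")


def parse_aa_change(label: str) -> Optional[AAChange]:
    """Parse a label such as 'D614G', 'S3675-' or 'Q27*'; return None if it does not match."""
    match = _PATTERN.match(label.strip())
    if match is None:
        return None
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    kind = {"-": "deletion", "*": "stop"}.get(alt, "substitution")
    return AAChange(ref, pos, alt, kind)


for label in ("D614G", "S3675-", "Q27*"):
    print(label, parse_aa_change(label))
```

Labels that do not fit the pattern return `None` rather than raising, so malformed entries can be collected and reviewed by hand.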
Deletion characteristics in the 58 Tunisian SARS-CoV-2 whole genome sequences.
| Deletions | Number of nucleotides | Position (bp) | AA change | Frameshift | Corresponding Protein | Clades | Number of affected sequences | Sequences reference |
|---|---|---|---|---|---|---|---|---|
| Deletion 1 | 7 | 6,833 | I-K | Yes | Nsp3 | 20E | 1 | 55400 |
| Deletion 2 | 9 | 11,288 | S-G-F | No | Nsp6 | 20I (Alpha, V1) | 6 | Q8734/18267/19152/18506/18507/18915 |
| Deletion 3 | 6 | 21,765 | H-H | No | Protein S | 20I (Alpha, V1) | 6 | Q8734/18267/19152/18506/18507/18915 |
| Deletion 4 | 3 | 21,992 | Y | No | Protein S | 20I (Alpha, V1) | 6 | Q8734/18267/19152/18506/18507/18915 |
| Deletion 5 | 4 | 26,158 | V | Yes | Protein E | 20B | 1 | 55304 |
| Deletion 6 | 8 | 26,161 | N-P | Yes | Protein E | 19B | 6 | G6590/19153/G6575bis/5509/4409/14670 |
| Deletion 7 | 1 | 27,293 | – | Yes | Orf7a | 20A | 1 | 55319 |
| Deletion 8 | 1 | 27,388 | – | Yes | Orf7a | 19B | 6 | G6590/19153/G6575bis/5509/4409/14670 |
| Deletion 9 | 6 | 28,248 | D-F | No | Orf8 | 19B | 6 | G6590/19153/G6575bis/5509/4409/14670 |
| Deletion 10 | 1 | 28,271 | – | Yes | Protein N | 20I (Alpha, V1) | 5 | 18267/19152/18506/18507/18915 |
bp: base pairs; AA: amino acids; Nsp: non-structural protein; Protein S: spike glycoprotein; Protein E: envelope protein; Protein N: nucleocapsid protein.
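The Frameshift column of the deletion table appears to follow the usual codon rule: removing a number of nucleotides that is not a multiple of three shifts the reading frame. The following minimal sketch checks that assumption against the ten deletions listed above. The function name `causes_frameshift` is hypothetical, and the rule is applied to deletion length only, without considering where the deletion falls within a gene.

```python
# (label, number of deleted nucleotides, frameshift as reported in the table)
DELETIONS = [
    ("Deletion 1", 7, True),
    ("Deletion 2", 9, False),
    ("Deletion 3", 6, False),
    ("Deletion 4", 3, False),
    ("Deletion 5", 4, True),
    ("Deletion 6", 8, True),
    ("Deletion 7", 1, True),
    ("Deletion 8", 1, True),
    ("Deletion 9", 6, False),
    ("Deletion 10", 1, True),
]


def causes_frameshift(n_deleted: int) -> bool:
    """A deletion keeps the reading frame only if its length is a multiple of 3."""
    return n_deleted % 3 != 0


# Collect any deletion whose reported status disagrees with the codon rule.
mismatches = [
    label
    for label, n_deleted, reported in DELETIONS
    if causes_frameshift(n_deleted) != reported
]
print(mismatches)  # prints [] because every row agrees with the rule
```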
Fig. 3. Genomic variation frequency of Tunisian SARS-CoV-2 sequences (n = 58). Genomic variants were identified relative to the Wuhan reference genome NC_045512 using MEGA X (Kumar et al., 2018). The locations and frequencies of the mutations are plotted along the genomic sequence of NC_045512. The open reading frames (ORFs) of SARS-CoV-2 are shown as rectangles aligned with the nucleotide positions of the genome. The frequency of each mutation in the population is represented by color-coded circles. Abbreviations: ORF: open reading frame; E: envelope protein; M: membrane protein; N: nucleocapsid protein.
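As a rough illustration of what a per-mutation frequency means here, the sketch below assumes that frequency is the share of the 58 genomes carrying a given mutation. It uses the per-clade D614G isolate counts listed in the table of non-synonymous mutations (6, 6, 6, 17 and 14). The function `mutation_frequency` is a hypothetical helper, and the exact definition used to draw the figure may differ.

```python
TOTAL_GENOMES = 58  # number of Tunisian genomes analysed in the study

# D614G isolate counts, one entry per clade in which the mutation is listed.
D614G_COUNTS = [6, 6, 6, 17, 14]


def mutation_frequency(carriers: int, total: int = TOTAL_GENOMES) -> float:
    """Return the percentage of genomes that carry a mutation."""
    if total <= 0 or not 0 <= carriers <= total:
        raise ValueError("carriers must lie between 0 and total")
    return 100.0 * carriers / total


d614g_carriers = sum(D614G_COUNTS)
d614g_frequency = mutation_frequency(d614g_carriers)
print(f"D614G: {d614g_carriers}/{TOTAL_GENOMES} genomes ({d614g_frequency:.1f}%)")
```

Under these assumptions D614G is carried by 49 of the 58 genomes, about 84.5%.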