Alessia Lai, Annalisa Bergna, Stefano Toppo, Marina Morganti, Stefano Menzo, Valeria Ghisetti, Bianca Bruzzone, Mauro Codeluppi, Vito Fiore, Emmanuele Venanzi Rullo, Guido Antonelli, Loredana Sarmati, Gaetano Brindicci, Annapaola Callegaro, Caterina Sagnelli, Daniela Francisci, Ilaria Vicenti, Arianna Miola, Giovanni Tonon, Daniela Cirillo, Ilaria Menozzi, Sara Caucci, Francesco Cerutti, Andrea Orsi, Roberta Schiavo, Sergio Babudieri, Giuseppe Nunnari, Claudio M Mastroianni, Massimo Andreoni, Laura Monno, Davide Guarneri, Nicola Coppola, Andrea Crisanti, Massimo Galli, Gianguglielmo Zehender.
Abstract
The aims of this study were to characterize new SARS-CoV-2 genomes sampled all over Italy and to reconstruct the origin and the evolutionary dynamics in Italy and Europe between February and June 2020. The cluster analysis showed only small clusters including < 80 Italian isolates, while most of the Italian strains were intermixed in the whole tree. Pure Italian clusters were observed mainly after the lockdown and distancing measures were adopted. Lineages B and B.1 spread between late January and early February 2020, from China to Veneto and Lombardy, respectively. Lineage B.1.1 (20B) most probably evolved within Italy and spread from central to southern Italian regions, and to European countries. Lineage B.1.1.1 (20D) most probably developed in other European countries, entered Italy only in the second half of March, and remained localized in Piedmont until June 2020. In conclusion, within the limitations of phylogeographical reconstruction, the estimated ancestral scenario suggests an important role of China and Italy in the widespread diffusion of the D614G variant in Europe in the early phase of the pandemic and more dispersed exchanges involving several European countries from the second half of March 2020.
Year: 2022 PMID: 35388091 PMCID: PMC8986836 DOI: 10.1038/s41598-022-09738-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Characteristics of the studied populations.
| Characteristic | Category | Sequences (n = 192) |
|---|---|---|
| Age | Median (min–max) | 68 (9–99) |
| Gender | M | 89 |
| | F | 76 |
| Region | Apulia | 9 |
| | Campania | 4 |
| | Emilia Romagna | 14 |
| | Lazio | 20 |
| | Liguria | 15 |
| | Lombardy | 10 |
| | Marche | 11 |
| | Piedmont | 17 |
| | Sardinia | 13 |
| | Sicily | 12 |
| | Umbria | 2 |
| | Veneto | 65 |
| Travel History | Yes | 0 |
| | No | 121 |
| | n.a.* | 16 |
*n.a.: not available.
Figure 1. Spatial distribution of lineages and clades. (a, b) Map of Italy reporting the lineage distribution (a) and the clade assignment (b) in every region.
Amino acid substitutions found in more than 10% of sequences, stratified according to lineage and clade.
| Gene | Substitution | B n = 73 (%) | B.1 n = 222 (%) | B.1.1 n = 141 (%) | B.1.1.1 n = 29 (%) | 19A n = 85 (%) | 20A n = 207 (%) | 20B n = 141 (%) | 20C n = 4 (%) | 20D n = 29 (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| ORF1a^a | T265I | – | – | – | – | | – | – | 4 (100) | – |
| | T1246I | – | – | – | 29 (100) | | – | – | – | 29 (100) |
| | T1543I | 11 (15.6) | – | – | – | 11 (13.1) | – | – | – | – |
| | G3278S | – | – | – | – | | – | – | – | 29 (100) |
| | M3752L | – | – | – | 5 (17.2) | | – | – | – | 5 (17.2) |
| | M3752T | – | – | – | 5 (17.2) | | – | – | – | 5 (17.2) |
| | L3606F | 67 (91.8) | – | – | – | 69 (82.1) | – | – | – | – |
| | F3753I | – | – | – | 8 (27.6) | | – | – | – | 8 (27.6) |
| ORF1b^b | P314L | – | 214 (96.4) | 130 (92.2) | 29 (100) | | 207 (100) | 130 (92.2) | 4 (100) | 29 (100) |
| S^c | D614G | – | 205 (92.3) | 18 (84.3) | 29 (100) | 9 (10.7) | 197 (95.2) | 118 (83.7) | 1 (25) | 29 (100) |
| ORF3a^d | Q57H | – | – | – | – | | – | – | 4 (100) | – |
| | A99V | – | – | – | – | | – | – | 3 (75) | – |
| | G251V | 67 (91.8) | – | – | – | 69 (82.1) | – | – | – | – |
| M^e | D3G | – | 51 (22.9) | – | – | – | 49 (23.7) | – | – | |
| N^f | R203K | – | – | 140 (99.3) | 29 (100) | – | – | 140 (99.3) | – | 29 (100) |
| | G204R | – | – | 140 (99.3) | 29 (100) | – | – | 140 (99.3) | – | 29 (100) |
| ORF14^g | G50R | – | – | 138 (98.6) | 29 (100) | – | – | 138 (97.9) | – | 29 (100) |
a, Open reading frame 1a.
b, Open reading frame 1b.
c, Spike gene.
d, Open reading frame 3a.
e, Membrane gene.
f, Nucleocapsid gene.
g, Open reading frame 14.
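The percentages in the table above are consistent with dividing each substitution count by the number of sequences in the corresponding lineage or clade. The following is a minimal Python sketch of that arithmetic and of the 10% reporting threshold named in the table title. The function names are hypothetical and are not taken from the paper.

```python
def substitution_frequency(count: int, group_size: int) -> float:
    """Percentage of sequences in a group carrying a substitution, to one decimal."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return round(100.0 * count / group_size, 1)


def is_reported(count: int, group_size: int, threshold: float = 10.0) -> bool:
    """True when the substitution occurs in more than `threshold` percent of the group.

    The 10% default mirrors the table title; treating it as a strict
    inequality is an assumption of this sketch.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return 100.0 * count / group_size > threshold


# Counts taken from the table: P314L in B.1 (214 of 222) and D3G in 20A (49 of 207).
print(substitution_frequency(214, 222))
print(substitution_frequency(49, 207))
```

Under this reading, a substitution carried by only 4 of the 222 B.1 sequences would fall below the threshold and appear as a dash in the B.1 column.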
Figure 2. SARS-CoV-2 Bayesian phylogeographic tree of 479 strains. Large red and purple circles indicate the highest posterior probabilities, ranging from 1 to 0.9. The branches are coloured based on the most probable lineage of the descendent nodes.
Time of the Most Recent Common Ancestor (tMRCA) estimates and confidence intervals (CI) of the main lineages.
| Node | Maximum likelihood median | Maximum likelihood CI_low | Maximum likelihood CI_up | Bayesian median | Bayesian CI_low | Bayesian CI_up | spp* |
|---|---|---|---|---|---|---|---|
| Tree root | 17/12/2019 | 05/11/2019 | 28/12/2019 | 20/12/2019 | 09/12/2019 | 28/12/2019 | 1 |
| B | 24/12/2019 | 30/11/2019 | 10/01/2020 | 04/01/2020 | 28/12/2019 | 04/01/2020 | 0.82 |
| B IT | 29/01/2020 | 20/01/2020 | 29/01/2020 | 19/01/2020 | 08/01/2020 | 26/01/2020 | 0.92 |
| B.1 | 24/12/2019 | 30/11/2019 | 10/01/2020 | 15/01/2020 | 09/01/2020 | 23/01/2020 | 0.95 |
| B.1 IT** | 24/01/2020 | 13/01/2020 | 24/01/2020 | 19/01/2020 | 16/01/2020 | 23/01/2020 | 0.99 |
| B.1.1 | 12/02/2020 | 31/01/2020 | 16/02/2020 | 17/02/2020 | 10/02/2020 | 21/02/2020 | 0.73 |
| B.1.1.1 | 22/02/2020 | 10/02/2020 | 05/03/2020 | 03/03/2020 | 03/03/2020 | 10/03/2020 | 0.99 |
*spp, state posterior probability.
**IT, Italy.
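The tMRCA dates above are written day-first (dd/mm/yyyy), so comparing the width of the maximum likelihood and Bayesian intervals requires parsing them explicitly. A minimal Python sketch, assuming that date format; the helper names are hypothetical:

```python
from datetime import date, datetime


def parse_day_first(text: str) -> date:
    """Parse a dd/mm/yyyy string as used in the tMRCA table."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def interval_width_days(ci_low: str, ci_up: str) -> int:
    """Width of an interval in days; raises if the bounds are out of order."""
    width = (parse_day_first(ci_up) - parse_day_first(ci_low)).days
    if width < 0:
        raise ValueError(f"lower bound {ci_low} is later than upper bound {ci_up}")
    return width


# Tree root row: maximum likelihood interval versus Bayesian interval.
print(interval_width_days("05/11/2019", "28/12/2019"))
print(interval_width_days("09/12/2019", "28/12/2019"))
```

For the tree root, the maximum likelihood interval spans 53 days and the Bayesian interval 19 days. The ordering check is also a cheap way to flag transcription errors in the bounds.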
Figure 3. Ancestral reconstruction of SARS-CoV-2 lineage B.1 using the Italian dataset. The figure shows the compressed visualization produced by PastML using marginal posterior probability approximation (MPPA) with an F81-like model. Different colours correspond to different Italian geographical regions and lineages. Numbers inside (or next to) the circles indicate the number of strains assigned to the specific node.
Main characteristics of the identified clusters.
| Cluster_ID | Num Seqs | IT^a | EU^b | CN^c | Lineage | Clade | ML^d median | CI^e_low | CI_up | Type of cluster* |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 4 | 0 | 0 | B.1 | 20A | 20/01/2020 | 08/01/2020 | 24/01/2020 | IT |
| 6 | 11 | 2 | 8 | 1 | B.1 | 20A | 24/01/2020 | 10/01/2020 | 21/02/2020 | M |
| 22 | 3 | 3 | 0 | 0 | B.1 | 20A | 31/01/2020 | 11/01/2020 | 03/03/2020 | IT |
| 2 | 6 | 6 | 0 | 0 | B.1 | 20A | 01/02/2020 | 17/01/2020 | 10/02/2020 | IT |
| 8 | 3 | 3 | 0 | 0 | B.1.1 | 20B | 10/02/2020 | 28/01/2020 | 12/03/2020 | IT |
| 3 | 7 | 5 | 2 | 0 | B.1 | 20A | 13/02/2020 | 26/01/2020 | 22/02/2020 | M |
| 7 | 3 | 2 | 1 | 0 | B.1 | 20A | 17/02/2020 | 17/01/2020 | 01/03/2020 | M |
| 11 | 3 | 3 | 0 | 0 | B.1 | 20A | 20/02/2020 | 18/01/2020 | 11/03/2020 | IT |
| 10 | 4 | 2 | 2 | 0 | B.1.1 | 20B | 20/02/2020 | 31/01/2020 | 12/03/2020 | M |
| 9 | 5 | 3 | 2 | 0 | B.1 | 20A | 20/02/2020 | 26/01/2020 | 13/03/2020 | M |
| 12 | 6 | 1 | 5 | 0 | B.1.1 | 20B | 22/02/2020 | 06/02/2020 | 27/02/2020 | S |
| 13 | 3 | 3 | 0 | 0 | B.1 | 20A | 23/02/2020 | 22/01/2020 | 24/03/2020 | IT |
| 5 | 11 | 11 | 0 | 0 | B | 19A | 24/02/2020 | 14/02/2020 | 24/02/2020 | IT |
| 4 | 3 | 1 | 0 | 0 | B | 19A | 24/02/2020 | 28/01/2020 | 28/02/2020 | S |
| 14 | 3 | 3 | 0 | 0 | B.1 | 20A | 01/03/2020 | 29/01/2020 | 01/03/2020 | IT |
| 15 | 11 | 3 | 8 | 0 | B.1.1.1 | 20D | 02/03/2020 | 22/02/2020 | 02/03/2020 | M |
| 17 | 5 | 1 | 4 | 0 | B.1.1.1 | 20D | 02/03/2020 | 22/02/2020 | 02/03/2020 | S |
| 16 | 8 | 8 | 0 | 0 | B.1.1.1 | 20D | 02/03/2020 | 22/02/2020 | 02/03/2020 | IT |
| 19 | 4 | 4 | 0 | 0 | B.1 | 20C | 08/03/2020 | 07/02/2020 | 11/03/2020 | IT |
| 18 | 3 | 3 | 0 | 0 | B.1.1 | 20B | 08/03/2020 | 06/02/2020 | 17/03/2020 | IT |
| 20 | 5 | 5 | 0 | 0 | B.1.1 | 20B | 18/03/2020 | 24/02/2020 | 24/03/2020 | IT |
| 21 | 4 | 4 | 0 | 0 | B.1 | 20A | 31/03/2020 | 05/03/2020 | 15/04/2020 | IT |
a, Italian strains.
b, European strains, with the exception of Italy.
c, Chinese strains.
d, Maximum likelihood.
e, Confidence interval.
*Type of cluster: M, mixed; IT, Italian; S, single Italian isolate.
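The cluster types in the table follow a pattern that can be read off the IT, EU, and CN columns. The Python sketch below encodes that pattern as inferred from the legend and the table rows; it is an assumption of this sketch, not the authors' stated criteria, and the function name is hypothetical.

```python
def classify_cluster(n_italian: int, n_european: int, n_chinese: int) -> str:
    """Return 'S', 'IT', or 'M' from the per-origin sequence counts of a cluster.

    Inferred rule (an assumption, not the paper's stated method):
      S  - the cluster contains a single Italian isolate
      IT - more than one Italian isolate and no European or Chinese strains
      M  - more than one Italian isolate mixed with non-Italian strains
    """
    if min(n_italian, n_european, n_chinese) < 0:
        raise ValueError("counts must be non-negative")
    if n_italian < 1:
        raise ValueError("every listed cluster contains at least one Italian isolate")
    if n_italian == 1:
        return "S"
    if n_european + n_chinese == 0:
        return "IT"
    return "M"


# Rows from the table: cluster 1 (4, 0, 0), cluster 6 (2, 8, 1), cluster 12 (1, 5, 0).
for counts in [(4, 0, 0), (2, 8, 1), (1, 5, 0)]:
    print(counts, classify_cluster(*counts))
```

Checking the single-isolate case first matters: cluster 4 lists one Italian strain and no European or Chinese strains, yet is typed S rather than IT.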
Figure 4. Ancestral reconstruction of SARS-CoV-2 lineage B.1 using the European dataset. The figure shows the compressed visualization produced by PastML using marginal posterior probability approximation (MPPA) with an F81-like model. Different colours correspond to different European countries and lineages. Numbers inside (or next to) the circles indicate the number of strains assigned to the specific node. The joint ancestral scenario (Joint) and maximum a posteriori (MAP) predictions are shown for the uncertain nodes (shown as octagonal icons). CN, China; IT, Italy; EU, Europe.
Figure 5. Ancestral reconstruction of SARS-CoV-2 lineage B using the European dataset. The figure shows the compressed visualization produced by PastML using marginal posterior probability approximation (MPPA) with an F81-like model. Different colours correspond to different European countries and lineages. Numbers inside (or next to) the circles indicate the number of strains assigned to the specific node. The joint ancestral scenario (Joint) and maximum a posteriori (MAP) predictions are shown for the uncertain nodes (shown as octagonal icons). CN, China; IT, Italy; EU, Europe.