Jacob E Lemieux1,2, Katherine J Siddle1,3, Bennett M Shaw1,2, Christine Loreth1, Stephen F Schaffner1,3,4, Adrianne Gladden-Young1, Gordon Adams1, Timelia Fink5, Christopher H Tomkins-Tinch1,3, Lydia A Krasilnikova1,3, Katherine C DeRuff1, Melissa Rudy1, Matthew R Bauer1,6, Kim A Lagerborg1,6, Erica Normandin1,7, Sinead B Chapman1, Steven K Reilly1,3, Melis N Anahtar8, Aaron E Lin1,3, Amber Carter1, Cameron Myhrvold1,3, Molly E Kemball1,7, Sushma Chaluvadi1, Caroline Cusick1, Katelyn Flowers1, Anna Neumann1, Felecia Cerrato1, Maha Farhat9,10, Damien Slater2, Jason B Harris2,11, John Branda8, David Hooper2, Jessie M Gaeta12,13, Travis P Baggett12,14,15, James O'Connell12,14,15, Andreas Gnirke1, Tami D Lieberman1,16, Anthony Philippakis1, Meagan Burns5, Catherine M Brown5, Jeremy Luban1,17,18, Edward T Ryan2,4,15, Sarah E Turbett2,8,15, Regina C LaRocque2,15, William P Hanage19, Glen R Gallagher5, Lawrence C Madoff5,20, Sandra Smole5, Virginia M Pierce8,21,22, Eric Rosenberg2,8, Pardis C Sabeti1,3,4,18,23, Daniel J Park1, Bronwyn L MacInnis1,4,18.
Abstract
SARS-CoV-2 has caused a severe, ongoing outbreak of COVID-19 in Massachusetts with 111,070 confirmed cases and 8,433 deaths as of August 1, 2020. To investigate the introduction, spread, and epidemiology of COVID-19 in the Boston area, we sequenced and analyzed 772 complete SARS-CoV-2 genomes from the region, including nearly all confirmed cases within the first week of the epidemic and hundreds of cases from major outbreaks at a conference, a nursing facility, and among homeless shelter guests and staff. The data reveal over 80 introductions into the Boston area, predominantly from elsewhere in the United States and Europe. We studied two superspreading events covered by the data, events that led to very different outcomes because of the timing and populations involved. One produced rapid spread in a vulnerable population but little onward transmission, while the other was a major contributor to sustained community transmission, including outbreaks in homeless populations, and was exported to several other domestic and international sites. The same two events differed significantly in the number of new mutations seen, raising the possibility that SARS-CoV-2 superspreading might encompass disparate transmission dynamics. Our results highlight the failure of measures to prevent importation into MA early in the outbreak, underscore the role of superspreading in amplifying an outbreak in a major urban area, and lay a foundation for contact tracing informed by genetic data.
Year: 2020 PMID: 32869040 PMCID: PMC7457619 DOI: 10.1101/2020.08.23.20178236
Source DB: PubMed Journal: medRxiv
Fig 1. Epidemiology of SARS-CoV-2 in Massachusetts and of sequenced viral genomes.
A. Cumulative confirmed and presumed cases reported state-wide in MA (7) from March 1 through May 1, 2020, and the number of these cases that were processed (orange) and successfully yielded complete genomes with >98% coverage (green) in this study. B. Cumulative proportion of all MA confirmed positive cases with complete genome sequences from unique individuals that are part of this dataset over time. C. Daily reported cases across MA from March 1 through June 15 statewide (blue) and at MGH (orange). D. Total number of cases compared to cases per 100,000 people for cities across MA. Cities in blue are highly represented in the genome dataset. E. Distribution of MA cases with sequenced viral genomes by county. F. As in E but showing only Middlesex and Suffolk counties, the two counties with the highest number of sequenced samples, by zip code. Cases associated with congregate living environments were excluded from the maps in E and F.
Fig 2.
A. Time tree of 772 MA genomes and a global set of 4,011 high-quality genomes from GISAID. The embedded panel shows the C2416T clade in detail (outlined in gray on the main tree). To view an interactive version of this tree and for more information on specific sub-groupings within the MA dataset see auspice.broadinstitute.org. B. Estimated allele frequency in sequenced genomes over time for major Boston-area lineages. C. Frequency of the C2416T allele in 58,043 GISAID samples reported through July 14, 2020. D. Proportion of genomes that were inferred as imported (ancestral state as not from MA) in the early (prior to March 28, 2020), middle (March 28 - April 14, 2020) and late (after April 15, 2020) time periods of the MA epidemic.
Ancestral trait inference. Results of discrete trait inference using a binary model (MA vs non-MA) and regional model (regional geographic categories) are shown, divided into date ranges representing the early, middle, and late period of the first wave of the MA epidemic.
| Region | Before March 28 | March 28 - April 15 | After April 15 |
|---|---|---|---|
| Binary model | | | |
| Imported (Non-MA) | 44 | 24 | 14 |
| Not imported (MA) | 90 | 289 | 172 |
| Regional model | | | |
| North America | 18 | 17 | 5 |
| Europe | 18 | 3 | 4 |
| Oceania | 1 | 0 | 0 |
| Asia | 2 | 2 | 0 |
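
The proportion of imported genomes per period (the quantity plotted in Fig 2D) can be recomputed from the binary-model rows of the table. The following is a minimal sketch of that arithmetic only, not the authors' pipeline; the period keys `early`, `middle`, and `late` and the helper `imported_fraction` are illustrative names, not taken from the source.

```python
# Counts copied from the binary-model rows of the table above.
# The keys "early", "middle", "late" are illustrative labels for the
# three date ranges (columns 2, 3 and 4 of the table).
counts = {
    "early": {"imported": 44, "not_imported": 90},
    "middle": {"imported": 24, "not_imported": 289},
    "late": {"imported": 14, "not_imported": 172},
}


def imported_fraction(row):
    """Fraction of genomes in one period whose inferred ancestral state is non-MA."""
    total = row["imported"] + row["not_imported"]
    return row["imported"] / total


fractions = {period: imported_fraction(row) for period, row in counts.items()}
total_imported = sum(row["imported"] for row in counts.values())

for period, frac in fractions.items():
    print(f"{period}: {frac:.1%} of genomes inferred as imported")
print(f"total inferred imports: {total_imported}")
```

Under these counts the imported share falls from roughly a third of genomes in the early period to under a tenth in the middle and late periods, and the three periods sum to 82 inferred imports, in line with the "over 80 introductions" stated in the abstract.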
Fig 3.
A. Time-measured maximum-likelihood phylogeny of 772 MA genomes. B. Maximum clade credibility tree with tips labeled by clade. Nodes with posterior support > 0.8 are labeled. C. Violin plots of tMRCA for the major Boston-area clades.
Fig 4. SARS-CoV-2 superspreading events.
A. Haplotype network of SARS-CoV-2 haplotypes in the MA dataset with major known superspreading events highlighted. B, C. Gene graphs showing clusters of highly similar sequences among viral genomes from the SNF (B) and BHCHP (C) cohorts. Sequences are clustered when they are separated by < 4 SNPs, and the lengths of lines between points reflect genetic distance. D. Detection of common respiratory viruses from metagenomic sequencing data. Samples with >10 reads mapped to at least 1 of these viruses using Kraken2 are shown in red. Enterovirus and Rhinovirus species have been grouped due to difficulty in discriminating at the sequence level.
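
The clustering rule in panels B and C (sequences grouped when separated by fewer than 4 SNPs) can be illustrated with a short sketch. This is not the tool used in the study. It assumes the genomes are already aligned to the same length, that `N` and gap characters are ignored when counting differences, and that clusters are formed by single linkage; all three are assumptions made for illustration. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count differing positions between two aligned sequences, skipping N and gaps."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return sum(1 for x, y in zip(a, b) if x != y and x not in skip and y not in skip)


def cluster_by_snps(seqs, max_snps=4):
    """Single-linkage clustering: link two genomes when their distance is < max_snps."""
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) < max_snps:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Toy aligned sequences (hypothetical, far shorter than a real genome).
toy = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAT",  # 1 SNP from s1
    "s3": "ACGAACGTAT",  # 1 SNP from s2, 2 SNPs from s1
    "s4": "TGCATGCAAC",  # 8 or more SNPs from every other sequence
}
result = cluster_by_snps(toy)
print(result)
```

With single linkage, a genome joins a cluster if it is within the threshold of any member, so a chain of closely related samples ends up in one group even when its two ends differ by more than the threshold.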
Major Boston-area lineages identified by lineage-defining mutation.
| Lineage | Root | C20099T | G3892T | C2416T | G105T | G28899T |
|---|---|---|---|---|---|---|
| Number of Genomes | 772 | 21 | 77 | 288 | 98 | 34 |
| Epidemiology | | BHCHP | SNF | Conference, BHCHP | BHCHP | |
| Amino acid substitution | | ORF1b: A2211V; NSP15: A160V | ORF1a: E1209D; NSP3: E391D | | | N: R56I, ORF14: E56* |
| Median tMRCA (95% HPD) | 2019-12-15 (2019-11-20 – 2020-01-04) | 2020-04-04 (2020-03-30 – 2020-04-08) | 2020-03-19 (2020-03-13 – 2020-03-23) | 2020-02-14 (2020-02-04 – 2020-02-20) | 2020-03-10 (2020-03-01 – 2020-03-16) | 2020-03-15 (2020-03-04 – 2020-03-21) |