Literature DB >> 32386545

Coast-to-Coast Spread of SARS-CoV-2 during the Early Epidemic in the United States.

Joseph R Fauver1, Mary E Petrone2, Emma B Hodcroft3, Kayoko Shioda2, Hanna Y Ehrlich2, Alexander G Watts4, Chantal B F Vogels2, Anderson F Brito2, Tara Alpert5, Anthony Muyombwe6, Jafar Razeq6, Randy Downing6, Nagarjuna R Cheemarla7, Anne L Wyllie2, Chaney C Kalinich2, Isabel M Ott8, Joshua Quick9, Nicholas J Loman9, Karla M Neugebauer5, Alexander L Greninger10, Keith R Jerome10, Pavitra Roychoudhury10, Hong Xie11, Lasata Shrestha11, Meei-Li Huang10, Virginia E Pitzer2, Akiko Iwasaki12, Saad B Omer13, Kamran Khan14, Isaac I Bogoch15, Richard A Martinello16, Ellen F Foxman17, Marie L Landry18, Richard A Neher3, Albert I Ko2, Nathan D Grubaugh19.   

Abstract

The novel coronavirus SARS-CoV-2 was first detected in the Pacific Northwest region of the United States in January 2020, with subsequent COVID-19 outbreaks detected in all 50 states by early March. To uncover the sources of SARS-CoV-2 introductions and patterns of spread within the United States, we sequenced nine viral genomes from early reported COVID-19 patients in Connecticut. Our phylogenetic analysis places the majority of these genomes with viruses sequenced from Washington state. By coupling our genomic data with domestic and international travel patterns, we show that early SARS-CoV-2 transmission in Connecticut was likely driven by domestic introductions. Moreover, the risk of domestic importation to Connecticut exceeded that of international importation by mid-March regardless of our estimated effects of federal travel restrictions. This study provides evidence of widespread sustained transmission of SARS-CoV-2 within the United States and highlights the critical need for local surveillance.
Copyright © 2020 Elsevier Inc. All rights reserved.

Entities:  

Keywords:  COVID-19; MinION sequencing; SARS-CoV-2; coronavirus; genomic epidemiology; phylogenetics; travel risk

Mesh:

Year:  2020        PMID: 32386545      PMCID: PMC7204677          DOI: 10.1016/j.cell.2020.04.021

Source DB:  PubMed          Journal:  Cell        ISSN: 0092-8674            Impact factor:   41.582


Introduction

A novel coronavirus, known as SARS-CoV-2, was identified as the cause of an outbreak of pneumonia in Wuhan, China, in December 2019 (Gorbalenya et al., 2020, Wu et al., 2020, Zhou et al., 2020). Travel-associated cases of coronavirus disease 2019 (COVID-19) were reported outside of China as early as January 13, 2020, and the virus has subsequently spread to nearly all nations (World Health Organization, 2020a, World Health Organization, 2020b). The first detection of SARS-CoV-2 in the United States was a travel-associated case from Washington state on January 19, 2020 (Centers for Disease Control and Prevention, 2020a). The majority of early COVID-19 cases in the United States were (1) associated with travel to a “high-risk” country or (2) close contacts of previously identified cases according to the testing criteria adopted by the Centers for Disease Control and Prevention (CDC) (Centers for Disease Control and Prevention, 2020b). In response to the risk of more travel-associated cases, the United States placed travel restrictions on multiple regions with SARS-CoV-2 transmission, including China on January 31, Iran on February 29, and Europe on March 11 (Taylor, 2020). However, community transmission of SARS-CoV-2 was detected in the United States in late February, when a California resident contracted the virus despite meeting neither testing criterium (Moon et al., 2020). From March 1–19, 2020, the number of reported COVID-19 cases in the United States rapidly increased from 74 to 13,677, and the virus was detected in all 50 states (Dong et al., 2020). It was recently estimated that the true number of COVID-19 cases in the United States is likely in the tens of thousands (Perkins et al., 2020), suggesting substantial undetected infections and spread within the country. We hypothesized that, with the growing number of COVID-19 cases in the United States and the large volume of domestic travel, new United States outbreaks are now more likely to result from interstate rather than international spread. Because of its proximity to several high-volume airports, southern Connecticut is a suitable location in which to test this hypothesis. By sequencing SARS-CoV-2 from local cases and comparing their relatedness to virus genome sequences from other locations, we used “genomic epidemiology” (Grubaugh et al., 2019a) to identify the likely sources of SARS-CoV-2 in Connecticut. We supplemented our viral genomic analysis with airline travel data from major airports in southern New England to estimate the risk of domestic and international importation therein. Our data suggest that the risk of domestic importation of SARS-CoV-2 into this region now far outweighs that of international introductions regardless of federal travel restrictions and provide evidence for coast-to-coast SARS-CoV-2 spread in the United States.

Results

Phylogenetic Clustering of Connecticut SARS-CoV-2 Genomes Demonstrates Interstate Spread

To delineate the roles of domestic and international virus spread in the emergence of new United States COVID-19 outbreaks, we sequenced SARS-CoV-2 viruses collected from cases identified in Connecticut. Our phylogenetic analyses showed that the outbreak in Connecticut was caused by multiple virus introductions and that most of these viruses were related to those sequenced from other states rather than international locations (Figure 1 ).
Figure 1

The COVID-19 Outbreak in Connecticut Is Phylogenetically Linked to SARS-CoV-2 from Washington

(A) We constructed a maximum-likelihood tree using 168 global SARS-CoV-2 protein coding sequences, including 9 sequences from COVID-19 patients identified in Connecticut from March 6–14, 2020. The total number of nucleotide differences from the root of the tree quantifies evolution since the putative SARS-CoV-2 ancestor. We included clade-defining nucleotide substitutions to directly show the evidence supporting phylogenetic clustering. The number of SARS-CoV-2 genomes used in this phylogenetic tree from each location is shown in parentheses.

(B) We enlarged the United States clade consisting primarily of SARS-CoV-2 sequences from Washington state and Connecticut. The map shows the location and number of SARS-CoV-2 genomes that cluster within this clade. The MinION sequencing statistics are enumerated in Data S1, and the SARS-CoV-2 sequences used and author acknowledgments can be found in Data S2. A root-to-tip plot showing the genetic diversity and substitution rate of the data can be found in Figure S1. The genomic data can be visualized and interacted with at https://nextstrain.org/community/grubaughlab/CT-SARS-CoV-2/paper1.

The COVID-19 Outbreak in Connecticut Is Phylogenetically Linked to SARS-CoV-2 from Washington (A) We constructed a maximum-likelihood tree using 168 global SARS-CoV-2 protein coding sequences, including 9 sequences from COVID-19 patients identified in Connecticut from March 6–14, 2020. The total number of nucleotide differences from the root of the tree quantifies evolution since the putative SARS-CoV-2 ancestor. We included clade-defining nucleotide substitutions to directly show the evidence supporting phylogenetic clustering. The number of SARS-CoV-2 genomes used in this phylogenetic tree from each location is shown in parentheses. (B) We enlarged the United States clade consisting primarily of SARS-CoV-2 sequences from Washington state and Connecticut. The map shows the location and number of SARS-CoV-2 genomes that cluster within this clade. The MinION sequencing statistics are enumerated in Data S1, and the SARS-CoV-2 sequences used and author acknowledgments can be found in Data S2. A root-to-tip plot showing the genetic diversity and substitution rate of the data can be found in Figure S1. The genomic data can be visualized and interacted with at https://nextstrain.org/community/grubaughlab/CT-SARS-CoV-2/paper1.
Figure S1

Root-to-Tip Plot Showing the Evolutionary Rate of the SARS-CoV-2 Genomes in Our Dataset, Related to Figure 1

We sequenced SARS-CoV-2 genomes from nine of the first COVID-19 cases reported in Connecticut, with sample collection dating from March 6–14, 2020 (Data S1). These individuals are residents of eight different cities in Connecticut. According to the Connecticut State Department of Public Health, none of the cases were associated with international travel. Using our amplicon sequencing approach, “PrimalSeq” (Grubaugh et al., 2019b, Quick et al., 2017), with the portable Oxford Nanopore Technologies (ONT) MinION platform, we generated the first SARS-CoV-2 genome approximately 14 h after receiving the sample (CT-Yale-006), demonstrating our ability to perform near-real-time clinical sequencing and bioinformatics. Our complete workflow included RNA extraction, PCR testing, validation of PCR results, library preparation, sequencing, and live base calling and read mapping. We shared the genomes of these viruses publicly as we generated them (GISAID EPI_ISL_416416-416424). We combined our genomes with other publicly available sequences for a final dataset of 168 SARS-CoV-2 genomes (Figure 1; Data S2). The dataset can be visualized on our “community” Nextstrain page (https://nextstrain.org/community/grubaughlab/CT-SARS-CoV-2/paper1). We built phylogenetic trees using a maximum likelihood reconstruction approach, and we used shared nucleotide substitutions to assess clade support (Figure 1; Data S3). Our first nine SARS-CoV-2 genomes clustered into three distinct phylogenetic clades, indicating multiple independent virus introductions into Connecticut. Our SARS-CoV-2 genome CT-Yale-001 clusters closely with other viruses sequenced from Asia (China), whereas the close genetic relatedness of genomes from Europe and Washington state in the clade that contains CT-Yale-006 makes it difficult to track the origins of this virus (Figure 1A). Regardless, neither the CT-Yale-001 nor the CT-Yale-006 COVID-19 cases were travel-associated, which indicates that these patients were part of domestic transmission chains that stemmed from undetected introductions. The other seven SARS-CoV-2 genomes clustered with a large, primarily United States clade, within which the majority of genomes were sequenced from cases in Washington state (Figure 1B). Because of a paucity of SARS-CoV-2 genomes from other regions within the United States, we could not determine the exact domestic origin of these viruses in Connecticut. We also cannot yet determine whether the higher number of substitutions observed in CT-Yale-007 and CT-Yale-008 (Figure 1B) compared with the other Connecticut virus genomes within this clade was the result of multiple introductions or of significant undersampling. However, given that seven of our nine Connecticut SARS-CoV-2 genomes fell within this clade versus the many other international clades, these were most likely the result of a common domestic source(s) rather than repeated international introductions. Importantly, our data indicate that, by early to mid-March, there had already been interstate spread during the early COVID-19 epidemic in the United States.

Travel and Epidemiological Patterns Reveal Significant Domestic Importation Risk

Our phylogenetic analysis shows that the COVID-19 outbreak in Connecticut was driven, in part, by domestic virus introductions. To compare the roles of interstate and international SARS-CoV-2 spread in the United States, we used airline travel data and the epidemiological dynamics in regions where travel routes originated to evaluate importation risk. We found that, because of the large volume of daily domestic air passengers, the dominant importation risk into the Connecticut region switched from international to domestic by early to mid-March (Figure 2 ).
Figure 2

Domestic Outbreaks and Travel Are a Rising Source of SARS-CoV-2 Importation Risk

(A) To compare the relative risks of SARS-CoV-2 importations from domestic and international sources, we selected five international (China, Italy, Iran, Spain, and Germany) and out-of-region states (Washington, California, Florida, Illinois, and Louisiana) with the highest number of reported COVID-19 cases as of March 19, 2020.

(B) We selected three international airports in the region that are commonly used by Connecticut residents: Hartford (BDL), Boston (BOS), and New York (JFK). We used data from January to March 2019 to estimate relative differences in daily air passenger volumes from the selected origins to the airport destinations. These daily estimates were then combined by either international or domestic travel.

(C and D) The cumulative number of daily COVID-19 cases were divided by 100,000 population to calculate normalized disease prevalence for each international location (China, Italy, Iran, Spain, and Germany) (C). The cumulative number of daily COVID-19 cases were divided by 100,000 population to calculate normalized disease prevalence for each international location (Washington, California, Florida, Illinois, and Louisiana) (D).

(E) We calculated importation risk by modeling the number of daily prevalent COVID-19 cases in each potential importation source and then estimating the number of infected travelers using the daily air travel volume from each location. The data, criteria, and analyses used to create this figure can be found in Data S3.

Domestic Outbreaks and Travel Are a Rising Source of SARS-CoV-2 Importation Risk (A) To compare the relative risks of SARS-CoV-2 importations from domestic and international sources, we selected five international (China, Italy, Iran, Spain, and Germany) and out-of-region states (Washington, California, Florida, Illinois, and Louisiana) with the highest number of reported COVID-19 cases as of March 19, 2020. (B) We selected three international airports in the region that are commonly used by Connecticut residents: Hartford (BDL), Boston (BOS), and New York (JFK). We used data from January to March 2019 to estimate relative differences in daily air passenger volumes from the selected origins to the airport destinations. These daily estimates were then combined by either international or domestic travel. (C and D) The cumulative number of daily COVID-19 cases were divided by 100,000 population to calculate normalized disease prevalence for each international location (China, Italy, Iran, Spain, and Germany) (C). The cumulative number of daily COVID-19 cases were divided by 100,000 population to calculate normalized disease prevalence for each international location (Washington, California, Florida, Illinois, and Louisiana) (D). (E) We calculated importation risk by modeling the number of daily prevalent COVID-19 cases in each potential importation source and then estimating the number of infected travelers using the daily air travel volume from each location. The data, criteria, and analyses used to create this figure can be found in Data S3. We first estimated daily passenger volumes arriving in the region from the five countries (China, Italy, Iran, Spain, and Germany) and out-of-region states (Washington, California, Florida, Illinois, and Louisiana) that have reported the most COVID-19 cases to date (Figures 2A–2D). By March 18, the five countries comprised 78% of reported non-United States cases, whereas the five states comprised 48% of reported domestic cases outside of Connecticut and New York. To this end, we collected passenger volumes arriving in three major airports in southern New England: Bradley International Airport (BDL; Hartford, Connecticut), General Edward Lawrence Logan International Airport (BOS; Boston, Massachusetts), and John F. Kennedy International Airport (JFK; New York, New York; Figure 2B). Because travel data for 2020 are not yet available, we calculated the total passenger volume from each origin and destination pair between January and March 2019, and estimated the number of daily passengers. We found that the daily domestic passenger volumes were ∼100 times greater than international in Hartford, ∼10 times greater in Boston, and ∼4 times greater in New York in our dataset (Figure 2B). By combining daily passenger volumes (Figure 2B) with COVID-19 prevalence at the travel route origin (Figures 2C and 2D) and accounting for differences in reporting rates, we found that the domestic and international SARS-CoV-2 importation risk started to increase dramatically at the beginning of March 2020 (Figure 2E). Without accounting for the effects of international travel restrictions, our estimated domestic importation risk from the selected five states surpassed the international importation risk by March 10. Using previous assumptions around travel restrictions (Chinazzi et al., 2020), we also modeled two scenarios where federal travel restrictions reduced passenger volume by 40% and by 90% from the restricted countries (Figure 2E). Because of the overall low prevalence of COVID-19 in China, we did not find any significant effects of travel restrictions from China that were enacted on February 1 (Data S3). Also, we did not find significant changes to the importation risk following travel restrictions from Iran on March 1, likely because of the relatively small number of passengers arriving from that country (Data S3). Although we did find a dramatic decrease in international importation risk following the restrictions on travel from Europe (March 13), this decrease occurred after our estimates of domestic travel importation risk had already surpassed that of international importation (Figure 2E). The dramatic rises in domestic and international importation risk preceded the state-wide COVID-19 outbreak in Connecticut (Figure 2E), and the recent increase in risk of domestic importation may give rise to new outbreaks in the region.

Discussion

The combined results of our genomic epidemiology and travel pattern analyses suggest that domestic spread recently became a significant source of new SARS-CoV-2 infections in the United States. We find strong evidence that outbreaks on the East Coast (Connecticut) are linked to outbreaks on the West Coast (Washington), demonstrating that trans-continental spread has already occurred. As of March 25, there are more than 1,000 SARS-CoV-2 genomes sequenced from around the world, including more than 350 from the United States (https://nextstrain.org/ncov); however, most of the latter were obtained from a small number of states. Therefore, we cannot determine the exact origins of the viral introductions into Connecticut. Recent domestic travel history of the nine reported cases was not available, but it is unlikely that all of the infections originated in Washington state. Furthermore, because of low genetic diversity between these early sequences from Connecticut and Washington, we cannot yet quantify the rate at which the virus may be spreading between the United States coasts or whether an introduction from a common source is responsible for phylogenetic grouping. There are likely other large, multi-state phylogenetic SARS-CoV-2 clades that exist in the United States. As testing capacity increases and more viral genome sequences become available from new locations, more granular reconstructions of virus spread throughout the United States will be possible (Grubaugh et al., 2019a). Specifically, elucidating the phylogenetic relationship of viral genomes collected in Connecticut to those collected in neighboring states, especially states with a high burden of disease, like New York, will improve our understanding of critical interstate dynamics. Our estimates of domestic importation risk are likely conservative despite some important limitations of our air travel analysis. Because we do not have access to current airline data, we could not exactly quantify the effect of government restrictions on international travel. In addition, even without explicit government restrictions, general social distancing and work-from-home guidelines are reducing all airline travel. By using airline data available from 2019, we did not account for these decreases in our international or domestic travel patterns. Although such variations may lower our domestic risk estimates, we also did not account for the large volumes of regional automobile and rail travel, especially along the corridor that connects Massachusetts, New York, New Jersey, Pennsylvania, and Washington D.C. to Connecticut. We do not believe that Connecticut is more closely connected to its neighbors than states in other regions of the country. Therefore, our risk estimates indicate that this interconnectedness will perpetuate the domestic spread of SARS-CoV-2 and that domestic spread will likely become the primary source of new infections in the United States. We argue that, although simplistic, our model demonstrates the urgent need to focus control efforts in the United States on preventing further domestic virus spread. As this epidemic progresses, domestic introductions of the virus could undermine control efforts in areas that have successfully mitigated local transmission. In China, local outbreak dynamics were highly correlated with travel between Wuhan and the outbreak dynamics therein during the early months of the epidemic (Kraemer et al., 2020). Similarly, if interstate introductions are not curtailed in the United States with improved surveillance measures, more robust diagnostic capabilities, and proper clinical care, quelling local transmission within states will be a Sisyphean task. We therefore propose that a unified effort to detect and prevent new COVID-19 cases will be essential for mitigating the risk of future domestic outbreaks. This effort must ensure that states have sufficient personal protective equipment, sample collection materials, and testing reagents because these supplies enable effective surveillance. Finally, state- and local-level policymakers must recognize that the health and well-being of their constituents are contingent on that of the nation. If spread between states is now occurring, as our results indicate, then the United States will struggle to control COVID-19 in the absence of a unified surveillance strategy.

STAR★Methods

Key Resources Table

Resource Availability

Lead Contact

Further information and requests for data, resources, and reagents should be directed to and will be fulfilled by the Lead Contact, Nathan D. Grubaugh (nathan.grubaugh@yale.edu).

Materials Availability

This study did not generate new unique reagents, but raw data and code generated as part of this research can be found in the Supplemental Files, as well as on public resources as specified in the Data and Code Availability section below.

Data and Code Availability

The accession number for the SARS-CoV-2 sequence data reported in this paper is NCBI BioProject:PRJNA614976 and GISAID:EPI_ISL_416416-416424. Sequencing data have been made available via SRA. Data used to create the figures can be found in the supplemental files. The interactive Nextstain page to visualize the genomic data can be found at: https://nextstrain.org/community/grubaughlab/CT-SARS-CoV-2/paper1. The raw data, results, and analyses can be found at: https://github.com/grubaughlab/CT-SARS-CoV-2.

Experimental Model and Subject Details

Ethics Statement

Residual de-identified nasopharyngeal samples testing positive for SARS-CoV-2 by reverse-transcriptase quantitative (RT-q)PCR were obtained from the Yale-New Haven Hospital Clinical Virology Laboratory or the Connecticut State Department of Public Health. In accordance with the guidelines of the Yale Human Investigations Committee and the Connecticut State Department of Public Health, this work with de-identified samples is considered non-human subjects research. All samples were de-identified before receipt by the study investigators.

Method Details

Sample collection and processing

Samples for this study were collected during an early testing phase by the Connecticut State Department of Public Health or the Yale Clinical Virology Laboratory at the Yale School of Medicine. None of the cases that we sequenced in this study were associated with international travel. All samples included in this study had CT values less than 35, sufficient volume of RNA for library preparation, and were collected by March 14. As early samples were crucial for validating PCR diagnostics in multiple laboratories, the number of samples meeting these criteria were limited. Nasopharyngeal swabs were collected from patients presenting with symptoms of SARS-CoV-2 infection at multiple medical centers in Connecticut. These patients are all Connecticut residents, but we do not have access to location data associated with each of these early SARS-CoV-2 genomes to avoid patient identification. Swabs were placed in virus transport media (BD Universal Viral Transport Medium) immediately upon collection. Samples (200 μL) were subjected to total nucleic acid extraction using the NUCLISENS easyMAG platform (BioMérieux, France) at the Yale Clinical Virology Laboratory. The recommended CDC RT-qPCR assay was used to test for the presence of SARS-CoV-2 RNA (Centers for Disease Control and Prevention, 2020c). A total of 10 samples from 10 different individuals met our inclusion criteria and were selected to to move forward with next generation sequencing (NGS). Of these, we were successfully able to generate sequencing libraries from nine samples.

SARS-CoV-2 Sequencing

SARS-CoV-2 positive samples were processed for NGS using a highly multiplexed PCR amplicon approach for sequencing on the Oxford Nanopore Technologies (ONT; Oxford, United Kingdom) MinION using the V1 primer pools (Quick et al., 2017). Sequencing libraries were barcoded and multiplexed using the Ligation Sequencing Kit and Native Barcoding Expansion pack (ONT) following the ARTIC Network’s library preparation protocol (V1 primers) (Quick, 2020) with the following minor modifications: cDNA was generated with SuperScriptIV VILO Master Mix (ThermoFisher Scientific, Waltham, MA, USA), a total of 20 ng of each sample was used as input into end repair, end repair incubation time was increased to 25 min followed by a 1:1 bead-based clean up, and Blunt/TA ligase (New England Biolabs, Ipswich, MA, USA) was used to ligate barcodes to each sample. cDNA synthesis and amplicon generation was performed concurrently for each sample. Samples were processed by CT value to reduce the likelihood of contamination from high titer samples to low titer samples. Barcoding, adaptor ligation, and sequencing was performed on samples with CT values between 25-35 (low titer group) prior to samples with CT values below 25 (high titer group) (Data S1). Two samples, Yale-006 and Yale-007, were diluted 1:100 in nuclease-free water prior to cDNA synthesis. A no template control was created at the cDNA synthesis step and amplicon generation step to detect cross-contamination between samples. Controls were barcoded and sequenced with both the high and low titer sample groups. A total of 24 ng of the low titer group was loaded onto a MinION R9.4.1 flow cell and sequenced for a total of 5.5 h and generated 2.1 million reads. The flow cell was nuclease treated, flushed, and primed prior to loading 25 ng of the high titer group library. These samples were sequenced for a total of 9 h and generated 1.4 million reads (Data S1). The RAMPART software from the ARTIC Network was used to monitor the sequencing run to estimate the depth of coverage across the genome for each barcoded sample in both runs https://github.com/artic-network/rampart). Following completion of the sequencing runs, .fast5 files were basecalled with Guppy (v3.5.1, ONT) using the high accuracy module. Basecalling was performed on a single GPU node on the Yale HPC. Consensus genomes were generated for input into phylogenetic analysis according to the ARTIC Network bioinformatic pipeline (Artic Network). Variants in the consensus genomes were called using nanopolish per the bioinformatic pipeline (Loman et al., 2015). Amplicons that were not sequenced to depth of 20x were not included in the final consensus genome, and these positions are represented by stretches of NNN’s (Data S1).

Phylogenetic analysis

To investigate the origin and diversity of SARS-CoV-2 in Connecticut, we compiled a dataset of our nine genomes with another 159 representative sample of SARS-CoV-2 genomes that were available from GenBank (https://www.ncbi.nlm.nih.gov/genbank/sars-cov-2-seqs/) and GISAID (https://www.gisaid.org/). See Data S2 for a list of sequences and acknowledgments to the originating and submitting labs. No data that was only released on GISAID was used without consent from the authors (see Acknowledgments). We aligned consensus genomes using the augur toolkit version 6.4.2 (Hadfield et al., 2018). Specifically, we aligned sequences using mafft (Katoh et al., 2002), masked sites at the 5′ and 3′ ends of the alignment as well as a small number of sites that likely vary due to assembly artifacts (see https://github.com/nextstrain/ncov), and reconstructed a phylogeny using IQ-Tree (Nguyen et al., 2015). These trees are further processed using augur and treetime to add ancestral reconstructions (Sagulenko et al., 2018). The tree is rooted on the ancestor of the two genomes “Wuhan-Hu-1/2019” and “Wuhan/WH01/2019.” Sequences in this sample differ from the root by 10 or fewer nucleotide substitutions. Bootstrap values are not a meaningful measure of branch support in this case. Here, many of the branches are supported by one substitution, which would correspond to a bootstrap support of 0.63. For a branch supported by two substitutions the bootstrap support value would correspondingly be 0.86. Given this approximate one-to-one mapping between bootstrap values and the number of substitutions, we directly show mutations supporting the major splits in the tree as it is more informative. The substitutions defining these clades are compatible with the tree topology and are not homoplastic. The probability that all clade defining substitutions arose multiple times independently in a manner compatible with the tree topology is vanishingly small. For example, with a rate of 2 nucleotide substitutions per month in a genome of length approximately 29’000 bases, the probability of this happening for any pair of six sister clades within a 2 month time frame is < 0.01. A root-to-tip plot can be found in Figure S1 . The data can be visualized at: https://nextstrain.org/community/grubaughlab/CT-SARS-CoV-2/paper1. Root-to-Tip Plot Showing the Evolutionary Rate of the SARS-CoV-2 Genomes in Our Dataset, Related to Figure 1

International and U.S. COVID-19 cases

Daily COVID-19 cases from international locations were obtained from the European Centre for Disease Prevention and Control via Our World in Data (https://ourworldindata.org/coronavirus-source-data). International data were accessed on March 19, 2020. Daily COVID-19 cases from Connecticut and other U.S. locations (Washington, California, Florida, Illinois, and Louisiana) were obtained from the repository (https://github.com/CSSEGISandData/COVID-19) hosted by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (Dong et al., 2020). These represent the international and out-of-region domestic (i.e., excluding New York, Massachusetts, and New Jersey) locations with the most reported COVID-19 cases.

Air passenger volumes

To investigate the domestic and international spread of SARS-CoV-2, we obtained air passenger volumes from the International Air Transport Association (IATA; http://www.iata.org/). IATA data consist of global ticket sales, which account for true origins and final destinations, and represents 90% of all commercial flights. We obtained the monthly number of passengers traveling by air from five international (China, Italy, Iran, Spain, and Germany) and five U.S. locations (Washington, California, Florida, Illinois, and Louisiana) to airports that are commonly used by Connecticut residents: Bradley International Airport (BDL, Hartford, Connecticut; ranked 53rd in U.S. in yearly passenger volume; https://www.faa.gov/airports/planning_capacity/passenger_allcargo_stats/passenger/), General Edward Lawrence Logan International Airport (BOS, Boston, Massachusetts; ranked 16th), and John F. Kennedy International Airport (JFK, New York, New York; ranked 6th). Air passenger data from 2020 is not currently available; thus, we used data from January to March 2019 to represent general trends in passenger volumes, as done previously (Bogoch et al., 2020). We took the average of the 3-month passenger volumes to estimate the daily number of travelers entering each airport from the specified origin. To account for passenger reductions following U.S. government alerts and restrictions (Taylor, 2020), we modeled two scenarios: a 40% reduction in passenger volume and a 90% reduction in passenger volume. These thresholds were determined based on previously reported estimates and assumptions around travel restrictions (Chinazzi et al., 2020).

Travel importation risk estimates

We estimated the true number of incident cases per day by adjusting the number of reported incident cases to reflect the ascertainment period and reporting rate using:where C is the number of reported incident cases of COVID-19 on day t, d is the number of days from symptom onset to testing, and ρ is the reporting rate. We assumed a constant ascertainment period of d = 5 days between symptom onset and testing (Ferguson et al., 2020). Because of the evidence of pre-symptomatic transmission (Tindale et al., 2020), we also assumed that cases become infectious one day before symptom onset. To account for substantial uncertainty around reporting rates, we assigned different reporting rates to individual locations based on the testing criteria enacted in that location (Niehus et al., 2020). For each country and state, we first extracted testing criteria from the department or ministry of health website. We assumed that countries or states with similar testing criteria policies captured similar proportions of true infections. Using the respective testing criteria, we categorized countries or states as having narrow, moderate, or broad testing levels. We then assigned reporting rates to each testing level by using the mean and 95% confidence interval of the reporting rate estimated by Nishiura et al. (2020): 0.092 (95% CI = 0.05–0.20). The reporting rate for the broadest testing level, ρ = 0.20, also corresponded to the reporting rate in Mainland China (Chinazzi et al., 2020). We thus assigned Iran, Florida, Washington, and Illinois to a “narrow” testing level (ρ = 0.05); Spain, Italy, and Louisiana to a “moderate'' testing level (ρ=0.092); and China, Germany, and California to a “broad” testing level (ρ = 0.20; Data S2, “testing-criteria”). To estimate the number of prevalent infectious individuals on day t (P ), we multiplied the number of incident infections up to day t by the probability that an individual who became infectious on day i was still infectious on day t:Where γ(t-i) is the cumulative distribution function of the infectious period. We modeled the infectious period as gamma distribution with mean 7 days and standard deviation 4.5 days which aligns with other modeling studies (Prem et al., 2020, Zhao et al., 2020). We assumed that cases would not travel once they were diagnosed and therefore removed them from our estimate of infectious travelers (T ):The first term of Equation 3 accounts for the assumption that some cases had been diagnosed by day t and thus would not travel. The second and third terms capture cases who are infectious on day t and have not yet been diagnosed. We calculated daily risk of importation as a function of the population-adjusted density of infectious travelers and passenger volume:where T is the number of infectious travelers on day t, pop is the population of location A, and n is the number of passengers traveling from each location to southern New England on day t. We summed the calculated risk across the three airports (BDL, BOS, JFK) and then across domestic and international travelers to arrive at our final estimates.

Maps

The maps presented in our figures were generated using shape files from Natural Earth (http://www.naturalearthdata.com/). The basemaps are open source and freely available to anyone.

Quantification and Statistical Analysis

Statistical analyses were performed using R version 3.5.2 (R Core Team, 2017) and are described in the figure legends and in the Method Details.
REAGENT or RESOURCESOURCEIDENTIFIER
Biological Samples

Clinical samplesCT State Dept Public HealthN/A
Clinical samplesYale Clinical Virology LabN/A

Critical Commercial Assays

SuperScript IV VILO Master MixThermoFisher11756050
Q5 High-Fidelity 2X Master MixNew England BioLabsM0492S
Qubit High Sensitivity dsDNA kitThermoFisherQ32851
Nuclisens easyMAGBioMérieux280135
Mag-Bind TotalPure NGSOmega Bio-TekM1378-01
Ligation Sequencing KitOxford Nanopore TechSQK-LSK109
Native Barcoding KitOxford Nanopore TechEXP-NBD114
R9.4.1 Flow cellOxford Nanopore TechFLO-MIN106D
Blunt/TA Ligase Master MixNew England BioLabsMO367L
NEBNext Ultra II End Repair/dA-Tailing ModuleNew England BioLabsE7546S
NEBNext Quick Ligation ModuleNew England BioLabsE6056S

Deposited Data

International COVID-19 casesECDChttps://ourworldindata.org/coronavirus-source-data
U.S. COVID-19 casesJHUDong et al., 2020
Air passenger volumes (commercial)IATAhttps://www.iata.org/pages/default.aspx
SARS-CoV-2 GenomesGISAID (EPI_ISL_416416-416424)https://www.gisaid.org/
SARS-CoV-2 Sequencing DataSRA, NCBIhttps://www.ncbi.nlm.nih.gov/bioproject/PRJNA614976

Software and Algorithms

RCRANhttps://cran.r-project.org/
IQ-Treehttp://www.iqtree.org/Nguyen et al., 2015
augur toolkithttps://bedford.io/projects/augur/Hadfield et al., 2018
MAFFThttps://mafft.cbrc.jp/alignment/software/Katoh et al., 2002
treetimehttps://github.com/neherlab/treetimeSagulenko et al., 2018
RAMPARTARTIC Networkhttps://github.com/artic-network/rampart
ARTIC Network Bioinformatic protocolARTIC Networkhttps://artic.network/ncov-2019/ncov2019-bioinformatics-sop.html
Nextstrainhttps://nextstrain.org/Hadfield et al., 2018

Other

Amplicon sequencing protocolPrimalSeqQuick et al., 2017
  134 in total

1.  Systematic Review of Mutations Associated with Isoniazid Resistance Points to Continuing Evolution and Subsequent Evasion of Molecular Detection, and Potential for Emergence of Multidrug Resistance in Clinical Strains of Mycobacterium tuberculosis.

Authors:  Siavash J Valafar
Journal:  Antimicrob Agents Chemother       Date:  2021-02-17       Impact factor: 5.191

2.  Fundamental evolution of all Orthocoronavirinae including three deadly lineages descendent from Chiroptera-hosted coronaviruses: SARS-CoV, MERS-CoV and SARS-CoV-2.

Authors:  Denis Jacob Machado; Rachel Scott; Sayal Guirales; Daniel A Janies
Journal:  Cladistics       Date:  2021-04-26       Impact factor: 5.254

3.  COVID-19 in Patients With Inflammatory Arthritis: A Prospective Study on the Effects of Comorbidities and Disease-Modifying Antirheumatic Drugs on Clinical Outcomes.

Authors:  Rebecca H Haberman; Rochelle Castillo; Alan Chen; Di Yan; Deborah Ramirez; Vaish Sekar; Robert Lesser; Gary Solomon; Andrea L Neimann; Rebecca B Blank; Peter Izmirly; Dan E Webster; Alexis Ogdie; Andrea B Troxel; Samrachana Adhikari; Jose U Scher
Journal:  Arthritis Rheumatol       Date:  2020-10-25       Impact factor: 10.995

4.  Evolution and Epidemiology of SARS-CoV-2 Virus.

Authors:  Yu-Nong Gong; Kuo-Ming Lee; Shin-Ru Shih
Journal:  Methods Mol Biol       Date:  2022

5.  Shotgun Transcriptome and Isothermal Profiling of SARS-CoV-2 Infection Reveals Unique Host Responses, Viral Diversification, and Drug Interactions.

Authors:  Daniel J Butler; Christopher Mozsary; Cem Meydan; David Danko; Jonathan Foox; Joel Rosiene; Alon Shaiber; Ebrahim Afshinnekoo; Matthew MacKay; Fritz J Sedlazeck; Nikolay A Ivanov; Maria Sierra; Diana Pohle; Michael Zietz; Undina Gisladottir; Vijendra Ramlall; Craig D Westover; Krista Ryon; Benjamin Young; Chandrima Bhattacharya; Phyllis Ruggiero; Bradley W Langhorst; Nathan Tanner; Justyna Gawrys; Dmitry Meleshko; Dong Xu; Peter A D Steel; Amos J Shemesh; Jenny Xiang; Jean Thierry-Mieg; Danielle Thierry-Mieg; Robert E Schwartz; Angelika Iftner; Daniela Bezdan; John Sipley; Lin Cong; Arryn Craney; Priya Velu; Ari M Melnick; Iman Hajirasouliha; Stacy M Horner; Thomas Iftner; Mirella Salvatore; Massimo Loda; Lars F Westblade; Melissa Cushing; Shawn Levy; Shixiu Wu; Nicholas Tatonetti; Marcin Imielinski; Hanna Rennert; Christopher E Mason
Journal:  bioRxiv       Date:  2020-05-01

6.  Patterns of within-host genetic diversity in SARS-CoV-2.

Authors:  Gerry Tonkin-Hill; Inigo Martincorena; Roberto Amato; Andrew R J Lawson; Moritz Gerstung; Ian Johnston; David K Jackson; Naomi Park; Stefanie V Lensing; Michael A Quail; Sónia Gonçalves; Cristina Ariani; Michael Spencer Chapman; William L Hamilton; Luke W Meredith; Grant Hall; Aminu S Jahun; Yasmin Chaudhry; Myra Hosmillo; Malte L Pinckert; Iliana Georgana; Anna Yakovleva; Laura G Caller; Sarah L Caddy; Theresa Feltwell; Fahad A Khokhar; Charlotte J Houldcroft; Martin D Curran; Surendra Parmar; Alex Alderton; Rachel Nelson; Ewan M Harrison; John Sillitoe; Stephen D Bentley; Jeffrey C Barrett; M Estee Torok; Ian G Goodfellow; Cordelia Langford; Dominic Kwiatkowski
Journal:  Elife       Date:  2021-08-13       Impact factor: 8.140

7.  Genomic epidemiology of COVID-19 in care homes in the east of England.

Authors:  William L Hamilton; Gerry Tonkin-Hill; Emily R Smith; Dinesh Aggarwal; Charlotte J Houldcroft; Ben Warne; Luke W Meredith; Myra Hosmillo; Aminu S Jahun; Martin D Curran; Surendra Parmar; Laura G Caller; Sarah L Caddy; Fahad A Khokhar; Anna Yakovleva; Grant Hall; Theresa Feltwell; Malte L Pinckert; Iliana Georgana; Yasmin Chaudhry; Colin S Brown; Sonia Gonçalves; Roberto Amato; Ewan M Harrison; Nicholas M Brown; Mathew A Beale; Michael Spencer Chapman; David K Jackson; Ian Johnston; Alex Alderton; John Sillitoe; Cordelia Langford; Gordon Dougan; Sharon J Peacock; Dominic P Kwiatowski; Ian G Goodfellow; M Estee Torok
Journal:  Elife       Date:  2021-03-02       Impact factor: 8.140

8.  Genomic epidemiology of SARS-CoV-2 in Esteio, Rio Grande do Sul, Brazil.

Authors:  Vinícius Bonetti Franceschi; Gabriel Dickin Caldana; Amanda de Menezes Mayer; Gabriela Bettella Cybis; Carla Andretta Moreira Neves; Patrícia Aline Gröhs Ferrareze; Meriane Demoliner; Paula Rodrigues de Almeida; Juliana Schons Gularte; Alana Witt Hansen; Matheus Nunes Weber; Juliane Deise Fleck; Ricardo Ariel Zimerman; Lívia Kmetzsch; Fernando Rosado Spilki; Claudia Elizabeth Thompson
Journal:  BMC Genomics       Date:  2021-05-20       Impact factor: 3.969

9.  Longitudinal immune profiling of a SARS-CoV-2 reinfection in a solid organ transplant recipient.

Authors:  Jon Klein; Anderson Brito; Paul Trubin; Peiwen Lu; Patrick Wong; Tara Alpert; Mario Pena-Hernandez; Winston Haynes; Kathy Kamath; Feimei Liu; Chantal Vogels; Joseph Fauver; Carolina Lucas; Ji Eun Oh; Tianyang Mao; Julio Silva; Anne Wyllie; M Catherine Muenker; Arnau Casanovas-Massana; Adam Moore; Mary Petrone; Chaney Kalinich; Charles Dela Cruz; Shelli Farhadian; Aaron Ring; John Shon; Albert Ko; Nathan Grubaugh; Benjamin Goldman-Israelow; Akiko Iwasaki; Marwan Azar
Journal:  Res Sq       Date:  2021-05-05

10.  The coSIR model predicts effective strategies to limit the spread of SARS-CoV-2 variants with low severity and high transmissibility.

Authors:  Longchen Xu; Haohang Zhang; Hengyi Xu; Han Yang; Lei Zhang; Wei Zhang; Fei Gu; Xun Lan
Journal:  Nonlinear Dyn       Date:  2021-07-23       Impact factor: 5.022

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.