The role of viral genomics in understanding COVID-19 outbreaks in long-term care facilities.

Dinesh Aggarwal1,2,3,4, Richard Myers2, William L Hamilton1,3, Tehmina Bharucha2,5,6, Niamh M Tumelty7, Colin S Brown2,5,6, Emma J Meader8, Tom Connor9,10,11, Darren L Smith12, Declan T Bradley13,14, Samuel Robson15, Matthew Bashton12, Laura Shallcross16, Maria Zambon2, Ian Goodfellow17, Meera Chand2,18, Justin O'Grady11, M Estée Török1,3, Sharon J Peacock1,4, Andrew J Page11.   

Abstract

We reviewed all genomic epidemiology studies on COVID-19 in long-term care facilities (LTCFs) that had been published to date. We found that staff and residents were usually infected with identical, or near identical, SARS-CoV-2 genomes. Outbreaks usually involved one predominant cluster, and the same lineages persisted in LTCFs despite infection control measures. Outbreaks were most commonly due to single or few introductions followed by a spread rather than a series of seeding events from the community into LTCFs. The sequencing of samples taken consecutively from the same individuals at the same facilities showed the persistence of the same genome sequence, indicating that the sequencing technique was robust over time. When combined with local epidemiology, genomics allowed probable transmission sources to be better characterised. The transmission between LTCFs was detected in multiple studies. The mortality rate among residents was high in all facilities, regardless of the lineage. Bioinformatics methods were inadequate in a third of the studies reviewed, and reproducing the analyses was difficult because sequencing data were not available in many facilities.
Crown Copyright © 2021 Published by Elsevier Ltd.

Year:  2021        PMID: 34608459      PMCID: PMC8480962          DOI: 10.1016/S2666-5247(21)00208-1

Source DB:  PubMed          Journal:  Lancet Microbe        ISSN: 2666-5247


Introduction

Many studies of COVID-19 in long-term care facilities (LTCFs) have reported high mortality.1, 2, 3 Possible explanations for this finding include recognised risk factors such as increased age and comorbidities.2, 4 In England and Wales, it has been estimated that nearly 30% (15 819 of 54 325 total in the week ending Oct 16, 2020) of all deaths due to COVID-19 occurred in LTCFs with outbreaks reported in 45% of all LTCFs. Northern Ireland reported even higher rates; 37% (363 of 988 total in the week ending Oct 23, 2020) of deaths in LTCFs were due to COVID-19. Globally, 24·9% of superspreading events were linked to LTCFs. The drivers for the introduction and transmission of SARS-CoV-2 in the care sector are under investigation and are incompletely understood. There are multiple tools available to investigate and manage SARS-CoV-2 outbreaks. These include: surveillance-based testing, where PCR testing is preferred to the serology testing of staff and residents; the testing of individuals who are symptomatic with PCR testing; the identification and self-isolation of close contacts; environmental measures such as disinfection; personal protective equipment use; and the self-isolation of individuals who test positive. The genome sequencing of SARS-CoV-2 has been established as a powerful supplementary tool to characterise the transmission dynamics in health-care settings.10, 11, 12 This method entails the investigation of the genetic relatedness of appropriately assembled SARS-CoV-2 sequences, with multiple tools available to identify clusters of infection. The genomic epidemiological investigations of outbreaks are effective in ruling out links between clusters suggested through contact tracing. 
However, SARS-CoV-2 sequencing alone can encounter limitations, such as difficulty in proving the directionality of the spread or little genomic diversity falsely showing possible transmission events; these limitations can be overcome by integrating this information with epidemiological data.14, 15 This integration can enable the investigation of the dynamics of outbreaks within and between LTCFs and the wider community. Several studies have used genomic epidemiology to advance the understanding of the transmission of SARS-CoV-2 within LTCFs. These studies vary in size, methods used, and quality. Here, we review the available genomic epidemiology studies on COVID-19 in LTCFs that have been done to date and provide a summary and interpretation of the key findings (table 1; appendix).
Table 1

Overview of studies using SARS-CoV-2 genome sequencing of samples taken for routine surveillance or during investigation of outbreaks in LTCFs

Study | Location | Start and end date of study (in 2020) | Type of study | Number of LTCFs | Total number of residents and staff tested | Number of residents testing positive | Number of staff testing positive | Cases sequenced | Number of clusters
Dautzenberg et al (2020)16 | Southeast Netherlands | March–April | Surveillance | 2 | 621 | NR | 133 | 22 | 3
van den Besselaar et al (2021)17 | South Holland | May–June | Outbreak | 1 | 425 | 113 | 56 | 60§ | 1
Hamilton et al (2021)18 | East of England, UK | February–May | Surveillance | 292 | 6600 | 1167 | NR | 700 | 409
Page et al (2021)19 | Norfolk, UK | March–August | Surveillance | 6 | 1035 | 76 | 9 and 3 | 89 | 2
Graham et al (2020)20 | London, UK | April | Outbreak | 4 | 383 | 126 | 3 | 19 | NR
Ladhani et al (2020)21 and Ladhani et al (2020)22 | London, UK | April | Outbreak | 6 | 518 | 105 | 53 | 99 | 2
Lemieux et al (2020)23 | Boston, MA, USA | January–May | Surveillance | 1 | 194 | 82 | 36 | 83 | 3
Zhang et al (2020)24 | CA, USA | March–April | Surveillance | 2 | 10 | 6 and 1** | 3 | 192 | 1
Gallichote et al (2020)25 | CO, USA | Unknown | Surveillance | 5 | 454 | NR | 70 | 38 | 1
Taylor et al (2020)26 | MN, USA | April–June | Outbreak | 2 | 600 | 165 | 114 | 105 | 4
Arons et al (2020)27 | WA, USA | March | Outbreak | 1 | 89 | 57 | 26 | 34 | 2

LTCFs=long-term care facilities. NR=not reported.

Surveillance studies are defined as those which involve serial testing to identify positive cases, and outbreak investigations are those which involve the testing or sequencing, or both, of positivity after a case (or a defined number of cases) of SARS-CoV-2 have been identified.

Clusters are not uniformly defined in all papers.

Preprint before peer review.

Six of these samples were from an epidemiologically linked hospital outbreak.

Family members of a single staff member.

Paper states both 17 and 19 samples sequenced, so it is not clear which is correct.

Family member of resident.

Study screening

The database searches identified 110 studies. After the removal of duplicates and the addition of papers identified through the hand-searching of preprint servers, there were 55 studies remaining. An independent review by authors of this study (AJP and NMT) of titles and abstracts identified 27 studies for full text review. After a full text review, 11 genomic epidemiology studies in LTCFs were identified for inclusion in this analysis.

Study characteristics and quality of outcome measures

Of the 11 studies included, five were done in the USA, four in the UK, and two in the Netherlands. The studies varied widely in the number of LTCFs (1–292), participants (10–6600), and positive cases (6–1167). Six studies reported findings from the prospective surveillance of individuals and five studies reported genomic sequencing findings from an outbreak investigation. The serial sampling of residents and health-care workers provided information about the duration of infection in individuals, the duration of outbreaks in LTCFs, and the reproducibility of genome sequencing and lineage identification. Nine studies sequenced samples from both staff and patients to better understand the transmission dynamics within a LTCF. One study assessed transmission in staff alone and one study in patients alone. The studies were done between February and August, 2020. The study characteristics are detailed in table 2. Bioinformatics methods differed between studies, tailored to the sequencing technologies (Illumina, San Diego, CA, USA, or Oxford Nanopore Technologies, Oxford, UK) and sample preparation methods (ARTIC amplicon,29, 30 metagenomic, and whole genome sequencing), so direct comparisons between studies cannot be made. Three studies were found to display deficiencies relating to the bioinformatics methods used or the results presented. These deficiencies included assembling amplicons, using poor-quality sequencing data in phylogenetic analysis, and imputing reference bases to replace missing bases; the effect of these methods on downstream analysis is unknown.
Table 2

Sequencing and bioinformatics methods used in the long-term care facilities genomic epidemiology studies

Study | Location | Sample preparation method | Sequencing | Genome construction method | Software used to infer phylogenetic trees | Data availability
Dautzenberg et al (2020)16 | Southeast Netherlands | Amplicon | Nanopore | Consensus | NR | Not available
van den Besselaar et al (2021)17 | South Holland | Amplicon | Nanopore | Consensus | IQ-TREE | Not available
Hamilton et al (2021)18 | East of England, UK | Amplicon | Nanopore or Illumina | Consensus | IQ-TREE and PhyML | Available but not linked
Page et al (2021)19 | Norfolk, UK | Amplicon | Illumina | Consensus | IQ-TREE | Available
Graham et al (2020)20 | London | Amplicon | Illumina | Reference-guided assembly | IQ-TREE | Not available
Ladhani et al (2020)21 | London | Whole genome sequencing | Illumina | Consensus | IQ-TREE | Available but not linked
Lemieux et al (2020)23 | Boston, MA, USA | Metagenomic | Illumina | Reference-guided assembly | IQ-TREE and Bayesian Evolutionary Analysis Sampling Trees | Available
Zhang et al (2020)24 | CA, USA | Metagenomic | Illumina | Consensus | IQ-TREE | Available
Gallichote et al (2020)25 | CO, USA | Amplicon | Illumina | Consensus gap filled with reference | Geneious | Not available
Taylor et al (2020)26 | MN, USA | Amplicon | NR | NR | IQ-TREE | Available but not linked
Arons et al (2020)27 | WA, USA | NR | Nanopore | Consensus | Geneious | Available

NR=not reported. The companies for the sequencing methods are: Nanopore, Oxford Nanopore Technologies, Oxford, UK, and Illumina, San Diego, CA, USA.

If data are present in the Global Initiative on Sharing Avian Influenza Data or the International Nucleotide Sequence Database Collaboration database they are labelled as available, and when there is no linkage information between the samples used in the article and the data in the public archives, they are labelled as not linked.

Amplicon sequencing uses the ARTIC protocol.

Most studies did phylogenetic analysis of their datasets as a final step and presented the results as a dendrogram. Of the 11 included studies, the open source bioinformatics software IQ-TREE was used in eight studies, PhyML was used in one study, a combination of Molecular Evolutionary Genetics Analysis across computing platforms software and commercial software Geneious (Biomatters, Auckland, New Zealand) was used in two studies, and the software used was not stated in one study. One study also did an analysis with Bayesian Evolutionary Analysis Sampling Trees version 2.6.2. The way in which the analysis was done differed greatly, which had an effect on the granularity presented and largely prevented direct comparisons. Three studies defined a consistent method in their study to identify clusters.18, 19, 27 These studies used Phylogenetic Assignment of Named Global Outbreak Lineages software or a more granular clustering algorithm in the COVID-19 Genomics UK (COG-UK) Consortium pipeline, transcluster algorithm, or single nucleotide polymorphism (SNP) distance. One study provides a rationale for how clusters were identified without providing defined criteria for selecting genomic clusters. 
Furthermore, assuming a mutation rate of approximately 2·5 SNPs per month allows for the estimation of the amount of variation expected in a phylogeny at any particular timepoint in a series. Additionally, only two studies mention the use of negative controls,18, 19 and no studies have released the sequencing reads found in the negative controls publicly. Many sample preparation protocols use amplification techniques that can also amplify contamination and give false results, particularly when the viral load in the source material is low.18, 19, 20, 25 Without having all underlying raw data (including controls) alongside the sample preparation and sequencing metadata, reanalysis and comparison between studies are difficult and prone to error. Many studies publicly release their raw sequencing data or consensus or assembled genomes, or a combination,18, 19, 23, 24 through the International Nucleotide Sequence Database Collaboration (INSDC) and the Global Initiative on Sharing Avian Influenza Data (GISAID). This sharing allows for independent reanalysis, overcoming the effect of variation among methods between studies. However, in genomics it is a common poor practice to not release data publicly or to provide insufficient metadata, such as accession numbers, thus making reanalysis unfeasible. The use of open metadata standards that are internationally agreed upon for SARS-CoV-2 genomics enables genomic epidemiology on a global scale. In some cases, there are legitimate reasons to withhold data, such as to maintain patient confidentiality where identification might be possible (eg, as a result of small sample numbers or the location of LTCFs). In other cases, even if the authors wish to deposit all clinically important samples, some of their samples might not meet the minimum quality-control thresholds (>90% completeness for GISAID) enforced by the public databases (to aid high-quality phylogenetic analysis). 
For example, samples with a low viral load often sequence poorly, leading to incomplete datasets and making reanalysis impossible. Researchers might have a well-meaning desire to make publicly available data that do not meet the stringent quality-control thresholds by imputing missing data from a reference genome, a common technique in human genetics, but this leads to erroneous results in the phylogenetic analysis of SARS-CoV-2. These high-quality thresholds on data inhibit reanalysis and reduce available data. The COG-UK Consortium has overcome this challenge by making these data available on their website and through the INSDC, which has lower quality-control thresholds than GISAID (>50% genome completeness). Convergence on a small number of open-source bioinformatics workflows using best practices should mitigate future issues in this regard (eg, a Nextflow pipeline with a focus on COVID-19, one global resource on the Galaxy platform for the analysis of SARS-CoV-2 data, and a workspace on the Terra app with COVID-19 genomic data and workflows).
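The completeness thresholds quoted in this section (>90% for GISAID, >50% for the INSDC route) can be illustrated with a minimal Python sketch. It assumes that completeness means the fraction of reference positions covered by an unambiguous A, C, G, or T call, and it uses the 29 903-base length of the Wuhan-Hu-1 reference (MN908947.3) as the denominator; the archives' own definitions might differ, and the function names are hypothetical.

```python
# Minimal sketch: check a consensus genome against the completeness
# thresholds quoted in the text. The definition of completeness used
# here (unambiguous bases / reference length) is an assumption.

REFERENCE_LENGTH = 29903  # Wuhan-Hu-1 reference (MN908947.3)
GISAID_MIN_COMPLETENESS = 0.90  # ">90% completeness for GISAID"
INSDC_MIN_COMPLETENESS = 0.50   # ">50% genome completeness"


def completeness(consensus: str, reference_length: int = REFERENCE_LENGTH) -> float:
    """Return the fraction of the reference covered by unambiguous base calls."""
    called = sum(1 for base in consensus.upper() if base in "ACGT")
    return called / reference_length


def eligible_archives(consensus: str, reference_length: int = REFERENCE_LENGTH) -> list:
    """List the archives whose quoted threshold the genome would pass."""
    fraction = completeness(consensus, reference_length)
    archives = []
    if fraction > INSDC_MIN_COMPLETENESS:
        archives.append("INSDC")
    if fraction > GISAID_MIN_COMPLETENESS:
        archives.append("GISAID")
    return archives


if __name__ == "__main__":
    # Toy 100-base "genome" with 40 masked positions, checked against a
    # toy reference length of 100 rather than the real genome length.
    toy = "ACGT" * 15 + "N" * 40
    print(completeness(toy, 100), eligible_archives(toy, 100))
```

Masked positions are counted as missing rather than filled from the reference, in line with the caution above about imputing reference bases.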

Summary of findings

To date, genomic epidemiology studies of SARS-CoV-2 in LTCFs provide many insights into transmission in this susceptible population. The diversity of studies ranged from outbreak investigations with detailed epidemiological data in single LTCFs to the prospective surveillance of hundreds of LTCFs, as summarised earlier. Key findings from the included studies are shown in the panel.

Panel

Community or hospital acquisition of SARS-CoV-2 in residents of LTCFs
1. Most LTCF infections were community-acquired (moderate).18, 19
2. Approximately 6% of residents with SARS-CoV-2 infection in LTCFs had suspected or confirmed hospital-acquired infections in one UK region (moderate).
3. There was little genomic diversity among the SARS-CoV-2 infections in staff and residents from the same LTCFs. This finding indicated a small number of introductions rather than a series of seeding events from the community (weak).16, 19, 21, 23, 26
4. Shared clusters between separate LTCFs could be identified (moderate).18, 19, 21

Transmission and outcomes within LTCFs
5. The use of genomic data allowed independent clusters of infections to be identified within LTCFs (strong).17, 18, 20, 21, 23, 26, 27
6. In LTCF outbreaks, initial sequencing was useful to identify whether genomes were similar, but the subsequent sequencing of large numbers of samples did not add much value (moderate).23, 26
7. Once two symptomatic individuals were identified in a LTCF, the outbreak was already widespread (moderate).
8. The sequencing of samples taken consecutively from the same residents of LTCFs showed that viral lineages persisted over an extended period of time despite infection prevention and control measures. It also showed that the sequencing technique was reproducible (moderate).
9. Residents of LTCFs were more likely to be infected with identical genome sequences if their bedrooms were in close proximity to each other (moderate).
10. The mortality rate among residents of LTCFs was high in all facilities, with no link to particular lineages (moderate).20, 21, 26
11. The temporal analysis of genomic data allows for the estimation of when an introduction was likely to have occurred (moderate).

Reproducibility of genomic analysis
12. The genomic studies reviewed commonly misapplied bioinformatics methods (strong).20, 21, 25
13. Minimum quality thresholds set by public archives on SARS-CoV-2 data limit data availability and reproducibility (moderate).18, 19, 21
14. Most studies did not provide adequate epidemiological data or metadata to allow analysis to be reproduced (strong).16, 17, 18, 20, 21, 25, 26

Strength of findings: strong indicates multiple sources of evidence, supported by in-depth analysis or experiments; moderate indicates one or more sources of evidence, supported by analysis or experiments; weak indicates one or more sources of evidence that are potentially contradictory. LTCFs=long-term care facilities.

Large outbreaks in LTCFs, such as in the study by Lemieux and colleagues, generally shared the same characteristics: a single cluster with rapid expansion, resulting in most samples being identical or near identical (a difference of only one SNP). Residents and staff, including staff who had no contact with residents, were usually infected with the same (identical) genome sequence. The direction of transmission cannot be established from genomic data alone, but the addition of traditional epidemiological data (such as sample dates and the co-location of individuals) might allow inferences to be drawn. In many outbreaks more than one cluster was observed, but these sporadic introductions usually represented only a few facilities. The temporal analysis of genomic data allows estimation of when an introduction into a LTCF is likely to have occurred, making genomics useful for providing an estimate for when an outbreak began. 
The paper by Lemieux and colleagues in Boston, MA, USA, estimated that, after an introduction, 85% of residents were infected within 2–3 weeks, despite extensive infection prevention and control measures being in place. By the time two symptomatic individuals are identified in a LTCF, the outbreak is likely to already be widespread.21, 27 An analysis of lineages circulating in a region compared with lineages found within LTCFs19, 23 shows that there is little diversity within LTCFs, indicating a small number of introductions rather than repeated introductions from the community. Distinct clusters are usually (but not always) seen between LTCFs, with genomics identifying a small number of shared genomic clusters in different LTCFs.18, 19, 21 Taking the sequence diversity found in 292 LTCFs in a region as a whole and comparing it to a similar number of residents not in a LTCF in the same region, similar numbers of SNP differences were identified in the genomes (the median number of SNP differences in residents in a LTCF was eight, and in residents not in a LTCF was nine). When looking at a single LTCF, knowing the diversity of the circulating lineages within the locality helped to rule out local inward transmission. Looking more closely at the dynamics of an outbreak, the study by Arons and colleagues in WA, USA, overlaid unique sequences on a map of the residents' bedrooms and showed a clear spatial signal, with residents more likely to be infected with identical genome sequences if their bedrooms were in close proximity, even with strict infection controls. Genome sequencing also identified examples of links between outbreaks at LTCFs located in the same geographical areas.19, 21 In one study, two LTCFs located within 1 km of each other had residents infected with identical genomes; a paramedic who visited both facilities also tested positive. In another study a genetically distinct sub-lineage was found in six different LTCFs within one small region. 
Genomics reveals that the inter-LTCF transmission of SARS-CoV-2 is a real risk, and is potentially enabled by the use of shared staff or temporary agency workers. When there is an outbreak at a LTCF, the genomes identified in residents and staff, including non-health-care workers, are usually the same. A high percentage of asymptomatic individuals is common, with staff usually accounting for a higher percentage of those who are asymptomatic. Therefore, the same SARS-CoV-2 genome can result in both symptomatic and asymptomatic infections. It is important to include staff in testing, although it has been noted that participation rates are often low. Even with intensive consecutive testing every week, enhanced infection control, and the transfer of residents who test positive to dedicated isolation units, the outbreak continued in the study in MN, USA, with the genomically similar clusters found over an extended period of time. The intensive sequencing of all residents and staff in an outbreak does not appear to provide additional genomic information after the first few sequences. The strategic subsampling of staff and residents should be adequate to understand the number of clusters and their relative proportions. However, inadequate sampling does have a large effect on the usefulness of genomics. Genomics has reduced usefulness once there is a large outbreak; however, it does provide useful information about how SARS-CoV-2 might enter a care home, such as via staff or patient movements, and continued ongoing monitoring using genomics can identify new sources of infection (new seeding events), which can help to inform policy.18, 27 Because visitors were restricted from visiting LTCFs early in the pandemic, no data are available on their role as a source of introduction. 
Genomes sequenced through prospective surveillance have proven to be useful for identifying linked outbreaks that might have been missed otherwise.10, 18, 19, 24 The limitation is that it might take time for an outbreak to be recognised through surveillance activities, where even if the intention is to sequence every positive sample, a large percentage of genome sequences are not available.18, 19 Residents of LTCFs who develop severe COVID-19 are often admitted to hospital, which might be the first indication of an outbreak. Samples that are sequenced as part of surveillance studies can provide early insight into outbreaks in LTCFs.18, 19, 23, 24 5·8% of COVID-19 infections in residents of LTCFs were suspected to be acquired in hospital. Furthermore, 33·1% of patients were discharged within 7 days of their first positive test and could therefore have been infectious at the time of hospital discharge. These findings have important implications for infection control in LTCFs and for public health policies. So far, most genomic epidemiology studies of SARS-CoV-2 in LTCFs have been done in the UK, the Netherlands, and the USA. Furthermore, two thirds of the global SARS-CoV-2 genomes sequenced to date have been generated by the COG-UK Consortium. This endeavour has enabled detailed analyses on a large scale, but also introduced a risk of bias. The dynamics of SARS-CoV-2 transmission in LTCFs in other countries might be different.
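Two ideas that recur in the findings above, grouping identical or near-identical genomes (eg, a difference of one SNP) and relating SNP counts to elapsed time with the approximate rate of 2·5 SNPs per month quoted earlier, can be illustrated with a short Python sketch. It is not the method of any reviewed study: it assumes the genomes are already aligned to the same coordinates, uses single-linkage grouping with a hypothetical threshold, ignores ambiguous bases, and treats a month as 30 days. All function names and sample labels are illustrative.

```python
from itertools import combinations

SNPS_PER_MONTH = 2.5   # approximate rate quoted in the text
DAYS_PER_MONTH = 30.0  # simplifying assumption


def snp_distance(a: str, b: str) -> int:
    """Count differing positions between two aligned genomes.

    Positions where either genome has an ambiguous call (eg, N) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("genomes must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x in valid and y in valid and x != y
    )


def cluster_by_snp_threshold(genomes: dict, max_snps: int = 1) -> list:
    """Single-linkage grouping: samples within max_snps share a cluster."""
    parent = {name: name for name in genomes}

    def find(name):
        # Follow parent links to the cluster root, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(genomes, 2):
        if snp_distance(genomes[a], genomes[b]) <= max_snps:
            parent[find(a)] = find(b)

    clusters = {}
    for name in genomes:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=lambda members: sorted(members))


def expected_snps(days: float) -> float:
    """Expected number of SNPs accumulated over a period at the quoted rate."""
    return SNPS_PER_MONTH * days / DAYS_PER_MONTH


if __name__ == "__main__":
    toy = {
        "resident_1": "ACGTACGTAC",
        "resident_2": "ACGTACGTAC",
        "staff_1": "ACGTACGTAT",
        "resident_3": "TTGTACCTAA",
    }
    print(cluster_by_snp_threshold(toy, max_snps=1))
    print(expected_snps(14))
```

At this rate, only about one SNP is expected over a 2-week outbreak, which is consistent with the observation that genomes within a single LTCF outbreak are usually identical or near identical and that genomic data alone rarely resolve who infected whom.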

Recommendations

It is clear from the studies summarised here that genomics plays a crucial role in understanding the transmission dynamics within LTCFs. Having reviewed the available literature, we have drafted some recommendations for the use of genomics to evaluate SARS-CoV-2 in LTCFs, which are summarised in table 3.
Table 3

Recommendations for measures derived from the use of SARS-CoV-2 genomics in LTCFs

Recommendation | Point from Key findings panel | Effect of these measures

Transmission of SARS-CoV-2
Limiting the spread of SARS-CoV-2 between hospitals, health-care workers, and residents of LTCFs is an urgently needed infection control measure and public health priority | 2–5, 8–10 | Control transmission
All staff, not just individuals with direct contact with residents, should be treated as one cohort and subject to the same infection prevention and control measures | 3 | Control transmission
Genomics identifies transmission between staff, between staff and residents, and between care facilities. Findings should direct future control measures | 2–4 | Control transmission
Clustering based on physical proximity to the bedroom of a resident infected with SARS-CoV-2 supports its use as an additional factor to identify at-risk individuals and prioritise testing | 9 | Control transmission and resource allocation

LTCF sequencing strategy
A targeted approach weighted towards sequencing early positive samples in an outbreak coupled with potential epidemiological links can help to highlight the source of introduction; widespread sequencing within a care home is unlikely to yield substantially more information | 3–4, 6, 8 | Control transmission and resource allocation
Genomic surveillance in a proportion of samples from LTCFs should be done including both patients and staff, allowing the genomic epidemiology of a LTCF to be put into context | 3–4, 6–8, 11 | Control transmission and resource allocation
Residents with a recent hospital admission who subsequently test positive should have their genome sequenced to identify the hospital seeding of outbreaks in LTCFs | 2, 5 | Control transmission and resource allocation
Ongoing community surveillance with SARS-CoV-2 sequencing allows outbreaks in LTCFs to be better characterised | 1–2, 3–4 | Control transmission and resource allocation

Recommendations for future research
Modelling of subsampling strategies within LTCFs is needed to optimally use genomic surveillance | 6 | Control transmission and research need
Epidemiological and genomic data should be released to public archives with sufficient metadata to enable genomic epidemiology | 13–14 | Control transmission and research need
Appropriate and validated bioinformatics methods should be applied to genomic analysis with domain experts reviewing results to avoid erroneous results | 12 | Control transmission and research need
A focus on rapid integrated epidemiological and genomic analysis will have the most clinical benefit | 4–5, 7–10, 14 | Control transmission and resource allocation

LTCFs=long-term care facilities.

All staff working in a LTCF (regardless of their role) should be treated as a single cohort and subject to infection prevention and control measures that are uniform, including the appropriate use of personal protective equipment, regular screening for SARS-CoV-2, and genome sequencing of any positive samples. Genome sequencing has shown that staff who do not have direct contact with residents have the same lineages in an outbreak as residents and staff with direct contact with residents. The early identification and exclusion of asymptomatic staff through regular screening might reduce the risk of transmission to residents and other staff. However, it should be noted that the regular screening of staff for asymptomatic infections might still sometimes be unable to identify an individual who is infectious, and that could lead to a superspreading event. Sequencing every genome in an outbreak is not recommended because it provides rapidly diminishing returns. Instead, the strategic sequencing of a subset of samples should be undertaken. The strategy for sequencing positive samples should be weighted towards staff rather than residents because they are at risk of community acquisition and subsequent transmission, whereas residents are less likely to have external contact. The modelling of subsampling strategies within LTCFs is needed. Once a resident in a LTCF tests positive, other residents with bedrooms in close proximity should be considered to be at a high risk of infection, regardless of contact patterns and other infection control measures, because genomics shows identical genomes are more likely to be found in those in close proximity. 
Residents who have had a recent hospital admission and who subsequently test positive (within 14 days of hospital discharge) should have their viral genomes sequenced to distinguish hospital acquisition from care-home acquisition, thus informing outbreak investigation and management. Limiting the spread of COVID-19 between residents in LTCFs, health-care workers, and hospitals should be a key target for infection control and prevention. Raw sequencing data and consensus or assembled genomes should be made available in the public archives in a timely manner with the internationally recommended minimal set of metadata to enable genomic epidemiological analysis at local, national, and international levels. This data sharing is essential to provide context for transmission analysis and outbreak investigations. Bioinformatics analysis of viral data requires additional considerations compared with other organisms. To increase the quality of the analysis, and reduce the probability of missing one of these domain-specific considerations, we recommend the use of validated and tested SARS-CoV-2 pipelines, with the involvement of domain-specific experts to assist with the analysis and review of results. A follow-up of the study done in London, UK, by Ladhani and colleagues using serological testing showed that by 5 weeks, most individuals had seroconverted, including 66·4% of staff and 67·0% of residents who were asymptomatic and tested negative by RT-PCR. This finding highlights the need to combine various surveillance methods, including genomic epidemiology, to accurately characterise the dynamics of transmission within LTCFs; this is planned in a prospective study across 105 care homes in the UK. Ultimately, genomics provides the most clinical benefit and insight if it is integrated with detailed epidemiological data in a timely fashion. 
Meredith and colleagues established weekly infection prevention control meetings, combining phylogenetic analysis to assist outbreak investigation and contact-tracing efforts at a health-care facility. Furthermore, although the routine genomic surveillance of hospital patients and staff, residents and staff at LTCFs, and community cases will provide greater insight into transmission dynamics, integrating additional epidemiological information such as hospital discharges, patient movements, and discharge locations would provide a much more informative approach. We recommend a focus not only on rapidly generating and analysing sequencing data, but also on rapidly collecting and integrating epidemiological data, which is often held in many different databases in different organisations. The ability to combine genomic and epidemiological analysis in a clinically actionable time frame (days rather than months) is crucial for leveraging the clinical benefits of sequencing.
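The recommendation to sequence samples from residents who test positive within 14 days of hospital discharge is a simple rule that depends on linking two pieces of epidemiological data. A minimal Python sketch of that rule is shown here; the record structure, field names, and the choice to treat day 14 as inside the window are assumptions made for illustration, not part of the recommendation itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

DISCHARGE_WINDOW_DAYS = 14  # window stated in the recommendation


@dataclass
class ResidentCase:
    """Hypothetical record linking a positive test to discharge data."""
    resident_id: str
    first_positive: date
    last_hospital_discharge: Optional[date] = None


def needs_priority_sequencing(
    case: ResidentCase, window_days: int = DISCHARGE_WINDOW_DAYS
) -> bool:
    """Flag a case when the first positive test falls within the window
    after hospital discharge (inclusive bounds are an assumption)."""
    if case.last_hospital_discharge is None:
        return False
    days_since_discharge = (case.first_positive - case.last_hospital_discharge).days
    return 0 <= days_since_discharge <= window_days


if __name__ == "__main__":
    cases = [
        ResidentCase("R1", date(2020, 4, 20), date(2020, 4, 10)),
        ResidentCase("R2", date(2020, 4, 20), date(2020, 3, 1)),
        ResidentCase("R3", date(2020, 4, 20)),
    ]
    flagged = [c.resident_id for c in cases if needs_priority_sequencing(c)]
    print(flagged)
```

A rule like this only works if discharge dates and test dates can be joined quickly, which is the practical point behind the call to integrate epidemiological data held in different databases.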

Conclusions

We have presented findings from multiple genomic epidemiology studies of transmission in LTCFs in an evolving pandemic. We have amalgamated the data to provide clinical recommendations from the findings and recommendations for refining methods for such studies in the future. Genomics can help to understand the initial seeding of outbreaks in LTCFs. For example, it can link existing outbreaks to other LTCFs, identify the likelihood of inter-LTCF transmission, and link outbreaks to hospital cases, indicating nosocomial infection. Placing these outbreaks in the context of the wider circulating lineages in the locality also provides information about routes of transmission. For example, this method can separate local community transmission from other routes of transmission, which informs policy and helps to limit future outbreaks. The genome sequencing of SARS-CoV-2 has been proven to provide useful insights into the transmission and dynamics of outbreaks. Prospective genomic surveillance provides a backbone of information, helping to inform outbreak analysis. Genomics uncovers hidden transmission links, which helps with the interpretation of epidemiology and with contact-tracing efforts. Consecutive sampling provides yet more insight into virus longevity and transmission within LTCFs, and into the reproducibility of genome sequencing for lineage identification when the same patient is sampled and sequenced repeatedly. The ability to integrate epidemiological and genomic analysis in a clinically actionable timeframe is a major challenge to realising the clinical benefits of genomics.

Search strategy and selection criteria

The studies we included in this Review were identified by searches of PubMed, Web of Science, and Scopus from Jan 1 to Nov 3, 2020. We used the search terms (“COVID-19” OR “SARS-CoV-2”) and (“long term care facility”, “care home”, “skilled nursing facility”, “nursing home” or “residential home”) and (“sequenc*” or “genom*” or “WGS”) to identify relevant English-language publications and preprints since January, 2020. Because of the limitations of the systematic search functionality on medRxiv and bioRxiv, these servers were hand-searched for additional papers that met the inclusion criteria. We focused particularly on studies where genomic epidemiology was used to enhance the interpretation of outbreaks. Articles that did not use genomic sequencing as a method were excluded. Studies were screened by authors NMT and AJP. The following reported outcomes were extracted (where documented): the location of the study, the time period, the number of long-term care facilities involved, the total number of positive cases broken down by staff and residents, the total number of genomes sequenced, the wider effect, the sample preparation methods, the sequencing equipment used, the genome creation method, the phylogeny, and the data availability. We report the practical difficulties encountered where reanalysis was attempted, but we did not formally attempt to reproduce the analyses presented. Papers were classified as follows: outbreak investigation, surveillance, first case, and genomic reanalysis of public data. Because this topic is a rapidly emerging field of research, preprints were included, but it should be noted that these are not peer-reviewed.
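
The structure of the search terms above (three groups of alternatives, all of which must match) can be sketched as a short Python snippet that assembles a Boolean query string. This is only an illustration: the helper names are hypothetical, the choice to leave wildcard terms unquoted is an assumption, and the exact query syntax accepted by PubMed, Web of Science, and Scopus differs, so the output would need adapting for each database.

```python
# Term groups copied from the search strategy described in the text.
covid_terms = ["COVID-19", "SARS-CoV-2"]
setting_terms = [
    "long term care facility",
    "care home",
    "skilled nursing facility",
    "nursing home",
    "residential home",
]
method_terms = ["sequenc*", "genom*", "WGS"]


def format_term(term: str) -> str:
    """Quote a term as a phrase; leave truncated (wildcard) terms unquoted.

    Leaving wildcard terms unquoted is an assumption of this sketch.
    """
    return term if term.endswith("*") else f'"{term}"'


def or_group(terms: list[str]) -> str:
    """Join alternatives with OR and wrap them in parentheses."""
    return "(" + " OR ".join(format_term(t) for t in terms) + ")"


def build_query(*groups: list[str]) -> str:
    """Require a match from every group by joining the groups with AND."""
    return " AND ".join(or_group(g) for g in groups)


query = build_query(covid_terms, setting_terms, method_terms)
print(query)
```

Keeping the term groups as separate lists makes it straightforward to report the search strategy exactly and to rerun it later with an extended date range.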

Declaration of interests

We declare no competing interests.
  30 in total

1.  New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0.

Authors:  Stéphane Guindon; Jean-François Dufayard; Vincent Lefort; Maria Anisimova; Wim Hordijk; Olivier Gascuel
Journal:  Syst Biol       Date:  2010-03-29       Impact factor: 15.683

2.  Genomic epidemiology of COVID-19 in care homes in the east of England.

Authors:  William L Hamilton; Gerry Tonkin-Hill; Emily R Smith; Dinesh Aggarwal; Charlotte J Houldcroft; Ben Warne; Luke W Meredith; Myra Hosmillo; Aminu S Jahun; Martin D Curran; Surendra Parmar; Laura G Caller; Sarah L Caddy; Fahad A Khokhar; Anna Yakovleva; Grant Hall; Theresa Feltwell; Malte L Pinckert; Iliana Georgana; Yasmin Chaudhry; Colin S Brown; Sonia Gonçalves; Roberto Amato; Ewan M Harrison; Nicholas M Brown; Mathew A Beale; Michael Spencer Chapman; David K Jackson; Ian Johnston; Alex Alderton; John Sillitoe; Cordelia Langford; Gordon Dougan; Sharon J Peacock; Dominic P Kwiatowski; Ian G Goodfellow; M Estee Torok
Journal:  Elife       Date:  2021-03-02       Impact factor: 8.140

3.  Are presymptomatic SARS-CoV-2 infections in nursing home residents unrecognized symptomatic infections? Sequence and metadata from weekly testing in an extensive nursing home outbreak.

Authors:  Judith H van den Besselaar; Reina S Sikkema; Fleur M H P A Koene; Laura W van Buul; Bas B Oude Munnink; Ine Frénay; René Te Witt; Marion P G Koopmans; Cees M P M Hertogh; Bianca M Buurman
Journal:  Age Ageing       Date:  2021-05-07       Impact factor: 10.668

4.  The International Nucleotide Sequence Database Collaboration.

Authors:  Guy Cochrane; Ilene Karsch-Mizrachi; Toshihisa Takagi
Journal:  Nucleic Acids Res       Date:  2015-12-10       Impact factor: 16.971

5.  GISAID: Global initiative on sharing all influenza data - from vision to reality.

Authors:  Yuelong Shu; John McCauley
Journal:  Euro Surveill       Date:  2017-03-30

6.  A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology.

Authors:  Andrew Rambaut; Edward C Holmes; Áine O'Toole; Verity Hill; John T McCrone; Christopher Ruis; Louis du Plessis; Oliver G Pybus
Journal:  Nat Microbiol       Date:  2020-07-15       Impact factor: 17.745

7.  Increased risk of SARS-CoV-2 infection in staff working across different care homes: enhanced CoVID-19 outbreak investigations in London care Homes.

Authors:  Shamez N Ladhani; J Yimmy Chow; Roshni Janarthanan; Jonathan Fok; Emma Crawley-Boevey; Amoolya Vusirikala; Elena Fernandez; Marina Sanchez Perez; Suzanne Tang; Kate Dun-Campbell; Edward Wynne-Evans; Anita Bell; Bharat Patel; Zahin Amin-Chowdhury; Felicity Aiano; Karthik Paranthaman; Thomas Ma; Maria Saavedra-Campos; Richard Myers; Joanna Ellis; Angie Lackenby; Robin Gopal; Monika Patel; Meera Chand; Kevin Brown; Susan Hopkins; CoG Consortium; Nandini Shetty; Maria Zambon; Mary E Ramsay
Journal:  J Infect       Date:  2020-07-29       Impact factor: 6.072

8.  Risk Factors Associated With Mortality Among Residents With Coronavirus Disease 2019 (COVID-19) in Long-term Care Facilities in Ontario, Canada.

Authors:  David N Fisman; Isaac Bogoch; Lauren Lapointe-Shaw; Janine McCready; Ashleigh R Tuite
Journal:  JAMA Netw Open       Date:  2020-07-01

9.  Serial Testing for SARS-CoV-2 and Virus Whole Genome Sequencing Inform Infection Risk at Two Skilled Nursing Facilities with COVID-19 Outbreaks - Minnesota, April-June 2020.

Authors:  Joanne Taylor; Rosalind J Carter; Nicholas Lehnertz; Lilit Kazazian; Maureen Sullivan; Xiong Wang; Jacob Garfin; Shane Diekman; Matthew Plumb; Mary Ellen Bennet; Tammy Hale; Snigdha Vallabhaneni; Sarah Namugenyi; Deborah Carpenter; Darlene Turner-Harper; Marcus Booth; E John Coursey; Karen Martin; Melissa McMahon; Amanda Beaudoin; Alan Lifson; Stacy Holzbauer; Sujan C Reddy; John A Jernigan; Ruth Lynfield
Journal:  MMWR Morb Mortal Wkly Rep       Date:  2020-09-18       Impact factor: 17.586

10.  Large-scale sequencing of SARS-CoV-2 genomes from one region allows detailed epidemiology and enables local outbreak management.

Authors:  Andrew J Page; Alison E Mather; Thanh Le-Viet; Emma J Meader; Nabil-Fareed Alikhan; Gemma L Kay; Leonardo de Oliveira Martins; Alp Aydin; David J Baker; Alexander J Trotter; Steven Rudder; Ana P Tedim; Anastasia Kolyva; Rachael Stanley; Muhammad Yasir; Maria Diaz; Will Potter; Claire Stuart; Lizzie Meadows; Andrew Bell; Ana Victoria Gutierrez; Nicholas M Thomson; Evelien M Adriaenssens; Tracey Swingler; Rachel A J Gilroy; Luke Griffith; Dheeraj K Sethi; Dinesh Aggarwal; Colin S Brown; Rose K Davidson; Robert A Kingsley; Luke Bedford; Lindsay J Coupland; Ian G Charles; Ngozi Elumogo; John Wain; Reenesh Prakash; Mark A Webber; S J Louise Smith; Meera Chand; Samir Dervisevic; Justin O'Grady
Journal:  Microb Genom       Date:  2021-06
