Smruthi Karthikeyan1, Joshua I Levy2, Peter De Hoff3,4,5, Greg Humphrey1, Amanda Birmingham6, Kristen Jepsen7, Sawyer Farmer1, Helena M Tubb1, Tommy Valles1, Caitlin E Tribelhorn1, Rebecca Tsai1, Stefan Aigner3, Shashank Sathe3, Niema Moshiri8, Benjamin Henson7, Adam M Mark6, Abbas Hakim3,4,5, Nathan A Baer3, Tom Barber3, Pedro Belda-Ferre3, Marisol Chacón3, Willi Cheung3,4,5, Evelyn S Cresini3, Emily R Eisner3, Alma L Lastrella3, Elijah S Lawrence3, Clarisse A Marotz3, Toan T Ngo3, Tyler Ostrander3, Ashley Plascencia3, Rodolfo A Salido3, Phoebe Seaver3, Elizabeth W Smoot3, Daniel McDonald1, Robert M Neuhard9,10, Angela L Scioscia4,11, Alysson M Satterlund12, Elizabeth H Simmons13, Dismas B Abelman10, David Brenner10, Judith C Bruner10, Anne Buckley10, Michael Ellison10, Jeffrey Gattas10, Steven L Gonias14, Matt Hale10, Faith Hawkins10, Lydia Ikeda10, Hemlata Jhaveri10, Ted Johnson10, Vince Kellen10, Brendan Kremer10, Gary Matthews10, Ronald W McLawhon10, Pierre Ouillet10, Daniel Park10, Allorah Pradenas10, Sharon Reed10, Lindsay Riggs10, Alison Sanders10, Bradley Sollenberger10, Angela Song9,10, Benjamin White10, Terri Winbush10, Christine M Aceves2, Catelyn Anderson2, Karthik Gangavarapu2, Emory Hufbauer2, Ezra Kurzban2, Justin Lee2, Nathaniel L Matteson2, Edyth Parker2, Sarah A Perkins2, Karthik S Ramesh2, Refugio Robles-Sikisaka2, Madison A Schwab2, Emily Spencer2, Shirlee Wohl2, Laura Nicholson15, Ian H McHardy15, David P Dimmock16, Charlotte A Hobbs16, Omid Bakhtar17, Aaron Harding17, Art Mendoza17, Alexandre Bolze18, David Becker18, Elizabeth T Cirulli18, Magnus Isaksson18, Kelly M Schiabor Barrett18, Nicole L Washington18, John D Malone19, Ashleigh Murphy Schafer19, Nikos Gurfield19, Sarah Stous19, Rebecca Fielding-Miller20,21, Richard S Garfein20, Tommi Gaines21, Cheryl Anderson20, Natasha K Martin21, Robert Schooley21, Brett Austin17, Duncan R MacCannell22, Stephen F Kingsmore16, William Lee18, Seema Shah19, Eric McDonald19, Alexander T Yu5, Mark Zeller2, 
Kathleen M Fisch4,6, Christopher Longhurst1,23, Patty Maysent24, David Pride14,25, Pradeep K Khosla8, Louise C Laurent3,4,26, Gene W Yeo3,26,27, Kristian G Andersen2, Rob Knight28,29,30.
Abstract
As SARS-CoV-2 continues to spread and evolve, detecting emerging variants early is critical for public health interventions. Inferring lineage prevalence by clinical testing is infeasible at scale, especially in areas with limited resources, participation, or testing and/or sequencing capacity, which can also introduce biases [1-3]. SARS-CoV-2 RNA concentration in wastewater successfully tracks regional infection dynamics and provides less biased abundance estimates than clinical testing [4,5]. Tracking virus genomic sequences in wastewater would improve community prevalence estimates and detect emerging variants. However, two factors limit wastewater-based genomic surveillance: low-quality sequence data and inability to estimate relative lineage abundance in mixed samples. Here we resolve these critical issues to perform a high-resolution, 295-day wastewater and clinical sequencing effort, in the controlled environment of a large university campus and the broader context of the surrounding county. We developed and deployed improved virus concentration protocols and deconvolution software that fully resolve multiple virus strains from wastewater. We detected emerging variants of concern up to 14 days earlier in wastewater samples, and identified multiple instances of virus spread not captured by clinical genomic surveillance. Our study provides a scalable solution for wastewater genomic surveillance that allows early detection of SARS-CoV-2 variants and identification of cryptic transmission.
Year: 2022 PMID: 35798029 PMCID: PMC9433318 DOI: 10.1038/s41586-022-05049-6
Source DB: PubMed Journal: Nature ISSN: 0028-0836 Impact factor: 69.504
Fig. 1. Campus sampling locations and SARS-CoV-2 testing statistics.
a, Geospatial distribution of the 131 actively deployed wastewater autosamplers and the corresponding 360 university buildings on the campus sewer network. Building-specific data have been de-identified in accordance with university reporting policies. b, Campus wastewater (WW) and diagnostic testing statistics over the 295-day sampling period (positivity is the fraction of WW samplers with a positive qPCR signal). c, Virus diversity in wastewater and clinical samples; boxplots of Shannon entropy (top) and richness (bottom) for each sample type (n = 153 WW, a subset chosen to maximize sample independence (see Methods), and n = 5,888 clinical). Box edges specify the first and third quartiles, the solid line indicates the median, and the whiskers delimit the maximum and minimum values. The map in a is the intellectual property of Esri and its licensors and is used herein under license. Copyright © 2022 Esri and its licensors. All rights reserved.
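The two diversity measures named in panel c can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each sample is a mapping from lineage name to relative abundance (a hypothetical input format), and it uses the natural logarithm, since the caption does not state the base.

```python
import math

def shannon_entropy(abundances):
    """Shannon entropy (natural log, an assumption) of lineage abundances.

    Abundances are renormalized to sum to 1; zero entries are ignored.
    """
    total = sum(abundances.values())
    probs = [a / total for a in abundances.values() if a > 0]
    return -sum(p * math.log(p) for p in probs)

def richness(abundances, min_abundance=0.0):
    """Number of lineages whose abundance exceeds min_abundance."""
    return sum(1 for a in abundances.values() if a > min_abundance)

# Hypothetical mixed wastewater sample (values are illustrative only).
sample = {"B.1.1.7": 0.5, "B.1.429": 0.25, "B.1.617.2": 0.25}
print(shannon_entropy(sample), richness(sample))
```

Under these definitions, a sample containing a single lineage has entropy 0 and richness 1, and a mixed sample scores higher on both.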
Extended Data Fig. 1. Relationship of daily UCSD campus wastewater sampler positivity and campus clinical positives.
Black line indicates the linear regression fit (slope = 1.88% per clinical positive, intercept = −0.45%) to the data (n = 321), with the bootstrap 95% confidence interval (resampled 1,000 times with replacement) shown in gray (median slope = 1.88% per clinical positive, intercept = −0.47%).
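A fit of this kind, with a bootstrap interval on the slope, could be sketched as below. The function names and the toy data are hypothetical, and the percentile method for the interval is an assumption: the caption only states that the data were resampled 1,000 times with replacement.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def bootstrap_slope_ci(xs, ys, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # a resample with one distinct x has no defined slope
        by = [ys[i] for i in idx]
        slopes.append(fit_line(bx, by)[0])
    slopes.sort()
    lower = slopes[int(alpha / 2 * n_boot)]
    upper = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data lying exactly on a line (illustrative only, not the study data).
clinical_positives = [0, 1, 2, 3, 4]
positivity_percent = [2 * x - 0.5 for x in clinical_positives]
print(fit_line(clinical_positives, positivity_percent))
print(bootstrap_slope_ci(clinical_positives, positivity_percent))
```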
Extended Data Fig. 2. Relationship between genome coverage and cycle quantification values.
10x genome coverage (fraction of sites with 10 reads or greater) remains high, even for Cq values of nearly 38 (n = 786). Points indicate median value in each bin, while error bars indicate the median absolute deviation.
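The 10x coverage metric defined in this caption reduces to a simple fraction. The sketch below assumes per-site read depths are available as a list of integers (a hypothetical input; the source does not describe the file format used).

```python
def coverage_fraction(depths, min_depth=10):
    """Fraction of genome sites covered by at least min_depth reads."""
    if not depths:
        raise ValueError("depths must contain at least one site")
    return sum(1 for d in depths if d >= min_depth) / len(depths)

# Four illustrative sites: two reach the 10-read threshold.
print(coverage_fraction([0, 9, 10, 250]))
```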
Fig. 2. Sample deconvolution robustly recovers relative virus abundance.
a, Subset of the lineage-defining mutation ‘barcode’ matrix. Each row represents one lineage (out of more than 1,000 lineages included in the UShER global phylogenetic tree), and individual nucleotide mutations are represented as columns. b, Single-nucleotide variant (SNV) frequencies obtained from iVar, used for recovering the relative abundance of each lineage. c, Schematic of the spike-in validation experiment. d, Depth-weighted demixing estimates of the virus abundance versus expected or known abundance. Details on lineage-specific predictions are provided in Extended Data Fig. 3. Error bars indicate s.d. of estimates across mixture replicates. e, Comparison of wastewater sample deconvolution with VOC qPCR panel, with lookup table (bottom) showing amino acid mutations corresponding to each variant.
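To make the idea in panels a, b and d concrete, the sketch below recovers lineage fractions from a barcode matrix and observed SNV frequencies. It is not the Freyja implementation: it swaps in a weighted least-squares objective solved by projected gradient descent, and the `log1p` depth weighting is an assumption, since the caption says only that demixing is depth-weighted. All function names are hypothetical.

```python
import math

def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]

def demix(barcodes, freqs, depths, n_steps=2000):
    """Estimate lineage fractions x minimizing sum_j w_j^2 (sum_i B[i][j] x_i - f_j)^2.

    barcodes: one 0/1 row per lineage, one column per mutation.
    freqs:    observed frequency of each mutation in the sample.
    depths:   read depth at each mutation site (weights are an assumed log1p).
    """
    n_lin, n_mut = len(barcodes), len(freqs)
    w2 = [math.log1p(d) ** 2 for d in depths]
    # The trace of the weighted Gram matrix bounds the gradient's Lipschitz constant.
    trace = sum(w2[j] * barcodes[i][j] ** 2
                for i in range(n_lin) for j in range(n_mut))
    step = 1.0 / trace if trace > 0 else 0.0
    x = [1.0 / n_lin] * n_lin
    for _ in range(n_steps):
        resid = [sum(barcodes[i][j] * x[i] for i in range(n_lin)) - freqs[j]
                 for j in range(n_mut)]
        grad = [sum(w2[j] * barcodes[i][j] * resid[j] for j in range(n_mut))
                for i in range(n_lin)]
        x = project_to_simplex([x[i] - step * grad[i] for i in range(n_lin)])
    return x

# Toy example: three lineages, four mutations, true mixture 70% / 30% / 0%.
barcodes = [[1, 1, 0, 0],
            [0, 1, 1, 0],
            [0, 0, 0, 1]]
freqs = [0.7, 1.0, 0.3, 0.0]
print(demix(barcodes, freqs, depths=[100, 100, 100, 100]))
```

The simplex projection keeps the estimates non-negative and summing to one, so they can be read directly as relative abundances.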
Extended Data Fig. 3. Lineage-specific prediction of variant abundance in spike-in validation samples.
A. Schematic of “spike-in” sample design. B-F. Lineage-specific prediction. Proportions of each lineage in the sample are shown as a pie chart marker (Grey = Lineage A, Orange = Alpha, Pink = Beta, Turquoise = Delta, and Purple = Gamma) with error bars indicating the standard deviation from the mean, across four replicates (n = 380, four samples per mixture type).
Plate map of spike-in mixtures used for method validation
Extended Data Fig. 4. Freyja more accurately estimates virus abundance, with fewer false positives.
A-B. Estimated vs expected fraction of each lineage in the mixture (n = 95, one sample per mixture type). The Kallisto-based approach from Baaijens et al. shows a wider range of estimates for each known mix fraction, and generally underestimates the fraction. C. False positives with abundance greater than 0.5%.
Extended Data Fig. 5. The rise of the Delta variant during Summer 2021.
A. Mean SARS-CoV-2 viral gene copies/L of raw sewage (blue) collected from the Point Loma Wastewater Treatment Plant and caseload (gray) reported by the county during the same period. SARS-CoV-2 concentrations were normalized by PMMoV (pepper mild mottle virus) concentration to adjust for load changes. B. Lineage distribution in UCSD campus wastewater. C. Monthly lineage averages for wastewater collected at Point Loma Wastewater Treatment Plant during the Delta surge (N = 5, 20, 25, 7).
Fig. 3. Freyja recovers early and cryptic transmission of SARS-CoV-2 variants of concern.
a, Timeline and normalized epidemiological curves for VOC detection in both wastewater and clinical sequences from San Diego County (includes wastewater samples collected from Point Loma wastewater treatment plant, UCSD, as well as public schools in the San Diego districts) for the three major VOCs in circulation during the sampling period (n = 475 wastewater, n = 22,504 clinical). Both Alpha and Delta variants are detected first in wastewater before clinical samples. Markers for clinical detections correspond to the ceiling of the daily detection count divided by 30 (for example, 1–30 samples = one marker, 31–60 = two markers), whereas wastewater markers correspond to a single detection. b, Timeline and epidemiological curves for VOC detection in the campus samples (n = 364 wastewater, n = 333 clinical). Markers correspond to a single detection event for both clinical and wastewater surveillance. All wastewater detections correspond to an estimated VOC prevalence of at least 10%.
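The marker rule for clinical detections in panel a is a ceiling division, which can be written as a one-line helper. The function name is hypothetical, and integer arithmetic is used to avoid any floating-point rounding.

```python
def clinical_marker_count(daily_detections, per_marker=30):
    """Markers drawn for one day: ceiling of detections divided by per_marker."""
    return (daily_detections + per_marker - 1) // per_marker

for n in (1, 30, 31, 60, 61):
    print(n, clinical_marker_count(n))
```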
Extended Data Fig. 6. Quantification of deconvolution uncertainty in first detection of VOCs.
A-D. Bootstrap distributions of Freyja abundance estimates, obtained by resampling (1,000 times with replacement) the read data from each sample corresponding to the first detection of that VOC in San Diego. In all boxplots, box edges specify the first and third quartiles, the solid line indicates the median, and whiskers delimit the maximum and minimum values within 1.5 times the inter-quartile range (IQR) of box edges. Outliers are denoted with individual markers. Two samplers were found to contain Delta on the same day. First detections were also confirmed using a VOC qPCR panel, as shown in Fig. 2 and Extended Data Table 3. 95% confidence intervals for variant prevalence for each first detection event: A. Alpha: (0.232, 0.278), B. Delta: (0.336, 0.397), C. Delta: (0.676, 0.772), D. Omicron: (0.017, 0.021). E. Estimated proportion of Omicron sequences in clinical data. Omicron estimates were tracked via S-gene target failure (SGTF, characteristic of Omicron lineage BA.1 and its descendants) qPCR assays for clinical samples in San Diego between November 27th, 2021 and February 7th, 2022. First detection of Omicron through clinical genomic sequencing in San Diego was December 8th. Dotted line shows a rolling average with a window size of seven days.
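The boxplot convention described in this caption (quartile box, whiskers limited to 1.5 times the IQR, outliers drawn individually) can be sketched as below. The quartile interpolation method is an assumption, since plotting libraries differ and the caption does not specify one; the function name and data are illustrative.

```python
import statistics

def tukey_box_stats(values):
    """Quartiles, 1.5*IQR whiskers, and outliers for one boxplot."""
    # "inclusive" interpolation is an assumed choice of quartile method.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),    # most extreme value within the fences
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values
                           if v < low_fence or v > high_fence),
    }

# Illustrative values with one clear outlier.
print(tukey_box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```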
Fig. 4. Deconvolution recovers a fine-grained estimate of virus population dynamics.
a,b, Prevalence of SARS-CoV-2 variants in UCSD clinical surveillance (a) and variant prevalence in all clinical samples collected in San Diego County (b). c,d, Variant prevalence in wastewater at UCSD (c) and the greater San Diego County (d). Further analysis of Point Loma wastewater samples is shown in Extended Data Fig. 5. All curves show the rolling average, with a window of ±10 days. ‘Other’ contains all lineages not designated as VOCs. The bottom panels show the number of sequenced samples per day.
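A centered rolling average with a ±10-day window, as used for these curves, might look like the sketch below. It assumes irregularly spaced sampling dates and a window that is simply truncated at the ends of the series; neither detail is stated in the caption, and the function name is hypothetical.

```python
from datetime import date

def rolling_average(observations, half_window_days=10):
    """Centered rolling mean over (date, value) pairs.

    Each output value averages all observations whose date lies within
    half_window_days of the output date (window truncated at the edges).
    """
    obs = sorted(observations)
    smoothed = []
    for day, _ in obs:
        window = [v for d, v in obs
                  if abs((d - day).days) <= half_window_days]
        smoothed.append((day, sum(window) / len(window)))
    return smoothed

# Illustrative prevalence values on three sampling dates.
series = [(date(2021, 1, 1), 0.0),
          (date(2021, 1, 6), 1.0),
          (date(2021, 1, 20), 0.2)]
print(rolling_average(series))
```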
Fig. 5. Community wastewater enables early Omicron detection and reveals lineage dynamics.
a, Prevalence of SARS-CoV-2 VOCs in wastewater collected from the Point Loma wastewater treatment plant from late September 2021 to early February 2022. b, Estimated VOC concentrations; prevalence estimates were scaled by normalized viral load in wastewater. c,d, Lineage-specific estimates of prevalence (c) and concentration (d). All curves show an adaptive rolling average calculated using a local linear approximation (Savitzky–Golay filter) of virus copies per litre, with a window size of ±1 sampling date.
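The scaling in panel b, where prevalence estimates are multiplied by the normalized viral load, can be written as a short helper. The input format (a mapping from variant to prevalence fraction) and the function name are assumptions for illustration, and the units of the result follow whatever units the load is expressed in.

```python
def variant_concentrations(prevalence, normalized_load):
    """Scale each variant's prevalence fraction by the sample's normalized viral load."""
    return {variant: fraction * normalized_load
            for variant, fraction in prevalence.items()}

# Illustrative sample: 75% Delta, 25% Omicron, normalized load of 8.0.
print(variant_concentrations({"Delta": 0.75, "Omicron": 0.25}, 8.0))
```

Scaling this way lets a variant's estimated concentration fall even while its share of the mixture rises, which is why the prevalence and concentration panels can tell different stories.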
Omicron surveillance at Point Loma Wastewater Treatment Plant
Fig. 6. Wastewater identifies clinically known and unknown virus transmission.
a–c, Maximum likelihood phylogenetic trees for each of the dominant VOCs (Epsilon (a), Alpha (b) and Delta (c)) using high-quality samples obtained at UCSD, as well as a representative set of sequences from the entire United States. Wastewater sequences from the same sampler that differ by one or fewer SNPs are denoted with a red asterisk. For all sequences, consensus bases were called at sites with more than 50% nucleotide frequency. Location information is provided for select outbreaks. d, Pairwise comparison of collection date for matching and near-matching wastewater and nasal swab samples obtained at UCSD. Positive values indicate earlier collection in nasal swabs and negative values indicate earlier detection in wastewater.
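Two rules in this caption lend themselves to a short sketch: calling a consensus base only where one nucleotide exceeds 50% frequency, and counting SNP differences between sequences. The function names are hypothetical, and two choices are assumptions not stated in the source: sites without a majority base are written as `N`, and `N` positions are not counted as differences.

```python
def consensus_base(counts, min_fraction=0.5, ambiguous="N"):
    """Return the base whose frequency exceeds min_fraction, else `ambiguous`."""
    total = sum(counts.values())
    if total == 0:
        return ambiguous
    base, n = max(counts.items(), key=lambda item: item[1])
    return base if n / total > min_fraction else ambiguous

def snp_distance(seq_a, seq_b, ambiguous="N"):
    """Number of sites where two aligned sequences differ (ambiguous sites skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b)
               if a != b and ambiguous not in (a, b))

print(consensus_base({"A": 6, "G": 4}))   # majority base is called
print(consensus_base({"A": 5, "G": 5}))   # no base exceeds 50%
print(snp_distance("ACGT", "ACGA"))
```

With these helpers, two sequences would count as matching or near-matching when `snp_distance` returns 0 or 1.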
Extended Data Fig. 7. Temporal and spatial dynamics of an Epsilon outbreak at UCSD.
After initial detection on January 3rd, 2021, infected individuals were transferred to isolation housing, where they continued to shed virus. At the end of January, a matching virus was detected in a residence near the original site of detection. All four samples have perfectly matching virus genomes. Maps are the intellectual property of Esri and its licensors and are used herein under license. Copyright © 2022 Esri and its licensors. All rights reserved.
Consistency of Lineage A Cq values across repeated measurements