Yongtao Cao, Roland Francis.
Abstract
Building an effective wastewater-based epidemiological model that can translate SARS-CoV-2 concentrations in wastewater into the prevalence of virus shedders within a community is a significant challenge for wastewater surveillance. The objectives of this study were to investigate the association between SARS-CoV-2 wastewater concentrations and COVID-19 cases at the community level, and to assess how SARS-CoV-2 wastewater concentrations should be integrated into a wastewater-based epidemiological statistical model that can provide reliable forecasts of the number of COVID-19 infections and their evolution over time. Weekly variations in the SARS-CoV-2 wastewater concentrations and COVID-19 cases from April 29, 2020 through February 17, 2021 were obtained for the Borough of Indiana, PA. Vector autoregression (VAR) models with different data forms were fitted to the data from April 29, 2020 through January 27, 2021, and their performance in forecasting three weeks ahead (February 3, 10, and 17) was compared using the Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). A stationary block bootstrapping VAR method was also presented to reduce the variability in the forecast values. Our results demonstrate that a VAR(1) estimated with the logged data gives the best interpretation of the data, but a VAR(1) estimated with the original data has stronger forecasting ability. The forecast accuracy, measured by MAPE, for 1 week, 2 weeks, and 3 weeks ahead can be as low as 11.85%, 8.97%, and 21.57%, respectively. The forecasting performance of the model on a short time span is unfortunately not very impressive. Also, a single increase in the SARS-CoV-2 concentration can affect COVID-19 cases in an inverted-U-shaped pattern, with the maximum impact occurring in the third week afterward.
The flexibility of this approach and its easy-to-follow explanations make it suitable for many different locations where a wastewater surveillance system has been implemented.
Keywords: COVID-19; Community wastewater surveillance; Forecasting; SARS-CoV-2
Year: 2021 PMID: 33971608 PMCID: PMC8084610 DOI: 10.1016/j.scitotenv.2021.147451
Source DB: PubMed Journal: Sci Total Environ ISSN: 0048-9697 Impact factor: 7.963
Fig. 1. The 15701 and 15705 (white area in the center) ZIP Code postal districts. Please note that 7.55% of the 15701 ZIP district is not sewered.
Part of the wastewater-based epidemiology data from Indiana Borough WWTP.
| Time | Average daily flow (MGD) | Influent BOD (mg/L) | COVID-19 viral copies (×1000) | Weekly clinical change (15701 and 15705) |
|---|---|---|---|---|
| 4/29/2020 | 6.09 | 59 | 0.1 | 21 |
| 5/6/2020 | 8.55 | 114 | 0 | 1 |
| 5/13/2020 | 3.7 | 137 | 0 | 1 |
| ⁝ | ⁝ | ⁝ | ⁝ | ⁝ |
| 2/3/2021 | 3.47 | 169 | 180.80 | 17 |
| 2/10/2021 | 3.26 | 207 | 397.27 | 37 |
| 2/17/2021 | 5.23 | 83.0 | 159.75 | 31 |
Note: There are two missing values in this data set, which occurred on November 4, 2020 and January 6, 2021.
Data from Green Bay MSD and Salt Lake City (WRF) are available from https://www.dhs.wisconsin.gov/covid-19/wastewater.htm and https://deq.utah.gov/water-quality/sars-cov-2-sewage-monitoring, respectively.
The non-parametric stationary block bootstrapping algorithm with a VAR model.
| Step 0 | Estimate a VAR model of order |
| Step 1 | Draw random samples with the stationary resampling procedure from the original data with the same size, estimate the VAR model of order |
| Step 2 | Repeat step 1 for |
| Step 3 | Combine the |
| Step 4 | Calculate the point forecast(s) using the mean if the bootstrap distribution is symmetric and the median if the bootstrap distribution is skewed. Construct the 95% CI from the 2.5th and 97.5th percentiles. |
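The steps above can be sketched in code. This is a minimal illustration and not the authors' implementation: the model order and the number of repetitions are elided in the table, so the sketch assumes a VAR(1) fitted by ordinary least squares, and the parameters `n_boot` and `mean_block` as well as all function names are hypothetical. It also assumes that each bootstrap forecast starts from the last observation of the original series, and it reports the median as the point forecast (the mean could be used instead when the distribution is symmetric).

```python
import numpy as np

def stationary_bootstrap_indices(n, mean_block, rng):
    """Stationary bootstrap indices: blocks with geometric lengths
    (average length mean_block), wrapping around the end of the series."""
    p = 1.0 / mean_block
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(n)
    for t in range(1, n):
        if rng.random() < p:
            idx[t] = rng.integers(n)        # start a new block
        else:
            idx[t] = (idx[t - 1] + 1) % n   # continue the current block
    return idx

def fit_var1(y):
    """OLS fit of y_t = c + A y_{t-1} + e_t; returns (c, A)."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return coef[0], coef[1:].T

def forecast_var1(c, A, last, steps):
    """Iterate the fitted VAR(1) forward from the vector `last`."""
    out = []
    for _ in range(steps):
        last = c + A @ last
        out.append(last)
    return np.array(out)

def bootstrap_forecasts(y, steps=3, n_boot=1000, mean_block=4, seed=0):
    """Refit the VAR(1) on each resampled series, collect the forecasts,
    and summarize them by the median and the 2.5th/97.5th percentiles."""
    rng = np.random.default_rng(seed)
    draws = np.empty((n_boot, steps, y.shape[1]))
    for b in range(n_boot):
        yb = y[stationary_bootstrap_indices(len(y), mean_block, rng)]
        c, A = fit_var1(yb)
        draws[b] = forecast_var1(c, A, y[-1], steps)  # assumption: start from last observed week
    point = np.median(draws, axis=0)
    lower, upper = np.percentile(draws, [2.5, 97.5], axis=0)
    return point, lower, upper
```

Here `y` would be an array with one row per week and one column per series (for example concentration and cases); the choice of `mean_block` controls how much serial dependence each resampled block preserves.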
Summary statistics of SARS-CoV-2 Concentrations and COVID-19 Cases for the three studied locations in the U.S.
| Location | SARS-CoV-2 concentrations: Mean (sd) | SARS-CoV-2 concentrations: Median (IQR) | COVID-19 cases: Mean (sd) | COVID-19 cases: Median (IQR) |
|---|---|---|---|---|
| Indiana Borough | 318.37 (463.98) | 137.9 (458.72) | 45.5 (50) | 25.5 (61.25) |
| Salt Lake City | 116.8 (163.98) | 69 (141.8) | 231.68 (157.43) | 185.6 (238.55) |
| Green Bay | 14.043 (8.15) | 11.717 (11.636) | 55.71 (21.45) | 54.12 (37.97) |
Notes: (1) For the Indiana Borough data: the timespan is 4/29/2020 through 1/27/2021 (n = 38, due to 2 missing values); the unit for SARS-CoV-2 concentrations is thousand genome copies per liter of sewage; the unit for COVID-19 cases is weekly confirmed cases. (2) For the Salt Lake City data: the timespan is 5/7/2020 through 1/19/2021 (n = 38); the unit for SARS-CoV-2 concentrations is million gene copies per liter of sewage; the unit for COVID-19 cases is weekly confirmed cases per 100,000 people. (3) For the Green Bay data: the timespan is 8/31/2020 through 1/13/2021 (n = 20); the unit for SARS-CoV-2 concentrations is million gene copies per person; the unit for COVID-19 cases is the 7-day rolling average per 100,000 people.
Fig. 2. Distributions of the COVID-19 cases, logged COVID-19 cases, SARS-CoV-2 concentrations, and logged SARS-CoV-2 concentrations in the Indiana Borough data.
Fig. 3. Time series plots of SARS-CoV-2 and logged SARS-CoV-2 concentrations and their ACF and PACF plots for the Indiana Borough data.
Fig. A1
Fig. A2
Fig. 4. Time series plots of COVID-19 cases and logged COVID-19 cases and their ACF and PACF plots for the Indiana Borough data.
Fig. A3
Fig. A4
Fig. 5. The correlation between weekly change in COVID-19 cases and SARS-CoV-2 concentration, and between logged weekly change in COVID-19 cases and logged SARS-CoV-2 concentration, in the IB data.
Fig. A5
Fig. A6
Comparison of the model fitting and forecasting performance of a VAR(1) model in the original form and log-log form on the three data sets.
| Measures | IB data: VAR(1) | IB data: VAR(1) log-log | SL data: VAR(1) | SL data: VAR(1) log-log | GB data: VAR(1) | GB data: VAR(1) log-log |
|---|---|---|---|---|---|---|
| AIC | 869.42 | 190.54 | 878.31 | 125.5 | 273.3 | 21.63 |
| Actual.1 | 17 | 17 | 277.6 | 277.6 | 32.35 | 32.35 |
| Forecast.1 | 27 | 25 | 310.5 | 349.3 | 42.6 | 47 |
| Actual.2 | 37 | 37 | 264.3 | 264.3 | 27.66 | 27.66 |
| Forecast.2 | 30 | 26 | 280.4 | 340.4 | 49.22 | 51.4 |
| Actual.3 | 31 | 31 | 180 | 180 | 17.76 | 17.76 |
| Forecast.3 | 32 | 27 | 264.2 | 333.6 | 51 | 53.5 |
| MAE.1 | 10 | 8 | 32.9 | 71.7 | 10.25 | 14.65 |
| MAPE.1 | 58.8% | 47.06% | 11.85% | 25.83% | 31.65% | 45.29% |
| MAE.2 | 8.5 | 9.5 | 24.5 | 73.9 | 15.9 | 19.19 |
| MAPE.2 | 38.85% | 38.4% | 8.97% | 27.31% | 54.8% | 65.56% |
| MAE.3 | 6 | 5.67 | 44.4 | 100.47 | 21.68 | 24.71 |
| MAPE.3 | 27% | 29.9% | 21.57% | 69.97% | 98.87% | 110.8% |
Notes: (1) VAR(1) means the model was estimated with the original data, while VAR(1) log-log means the model was estimated using the logged data. (2) Actual.1, Actual.2, and Actual.3 denote the actual values in the first, second, and third week after the model fitting data. For IB, these three weeks are 2/3/2021, 2/10/2021, and 2/17/2021; for SL these three weeks are 1/25/2021, 2/1/2021, and 2/8/2021; for GB these three weeks are 1/20/2021, 1/27/2021, and 2/3/2021. (3) Forecast.1, Forecast.2, and Forecast.3 are the model forecast values for the first, second, and third week ahead. (4) MAE.1, MAE.2, and MAE.3 are the MAE measures for the forecasting accuracy for 1 week, 2 weeks and 3 weeks in the future, while MAPE.1, MAPE.2, and MAPE.3 are the MAPE measures for the forecasting accuracy for 1 week, 2 weeks and 3 weeks in the future.
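The reported MAE.k and MAPE.k values appear consistent with averaging the absolute (percentage) errors over the first k forecast weeks; this is an inference from the table, not something the notes state explicitly. The sketch below illustrates that reading using the rounded IB values from the VAR(1) column; the function name `cumulative_errors` is hypothetical.

```python
def cumulative_errors(actual, forecast):
    """Return (MAE, MAPE) lists where entry k-1 averages the errors
    over the first k forecast horizons. MAPE is in percent."""
    mae, mape = [], []
    abs_sum = 0.0
    pct_sum = 0.0
    for k, (a, f) in enumerate(zip(actual, forecast), start=1):
        abs_sum += abs(a - f)
        pct_sum += abs(a - f) / abs(a)
        mae.append(abs_sum / k)
        mape.append(100.0 * pct_sum / k)
    return mae, mape

# Rounded IB values taken from the table (VAR(1), original data).
actual = [17, 37, 31]
forecast = [27, 30, 32]
mae, mape = cumulative_errors(actual, forecast)
```

With these inputs the MAE values come out as 10, 8.5, and 6, matching the table. The MAPE values agree with the table to within a few hundredths of a percentage point, and the small gap is presumably because the tabulated forecasts are rounded.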
Fig. 6. The orthogonal impulse response of logged weekly change in COVID-19 cases from the logged change in SARS-CoV-2 concentration with 95% bootstrap CI (b = 500).
Fig. 7. The bootstrap distributions for the first-week, second-week, and third-week forecasts. Results are from 1000 non-parametric stationary block bootstraps. The black dot denotes the median value in the distribution.
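For readers unfamiliar with the orthogonal impulse response shown in Fig. 6, the following sketch shows how such a response can be computed for a VAR(1): the response at horizon h is the h-th power of the coefficient matrix multiplied by the Cholesky factor of the residual covariance. The coefficient matrix `A` and covariance `sigma` below are hypothetical illustration values, not estimates from the paper, and the variable ordering (concentration first, cases second) is an assumption.

```python
import numpy as np

def orthogonal_irf(A, sigma, horizons):
    """Orthogonalized impulse responses of a VAR(1).
    Entry [h, i, j] is the response of variable i at horizon h to a
    one-standard-deviation orthogonalized shock in variable j."""
    P = np.linalg.cholesky(sigma)   # lower-triangular factor of the covariance
    power = np.eye(A.shape[0])      # A**0
    responses = []
    for _ in range(horizons + 1):
        responses.append(power @ P)
        power = A @ power           # next power of A
    return np.array(responses)

# Hypothetical values for illustration only.
A = np.array([[0.6, 0.0],
              [0.3, 0.5]])
sigma = np.array([[1.0, 0.2],
                  [0.2, 1.0]])
irf = orthogonal_irf(A, sigma, horizons=4)
cases_response = irf[:, 1, 0]       # cases responding to a concentration shock
```

Confidence bands such as those in Fig. 6 would additionally require re-estimating the model on bootstrap samples and taking percentiles of the resulting responses.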