Estee Y Cramer, Yuxin Huang, Yijin Wang, Evan L Ray, Matthew Cornell, Johannes Bracher, Andrea Brennen, Alvaro J Castro Rivadeneira, Aaron Gerding, Katie House, Dasuni Jayawardena, Abdul Hannan Kanji, Ayush Khandelwal, Khoa Le, Vidhi Mody, Vrushti Mody, Jarad Niemi, Ariane Stark, Apurv Shah, Nutcha Wattanchit, Martha W Zorn, Nicholas G Reich.
Abstract
Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
Year: 2022 PMID: 35915104 PMCID: PMC9342845 DOI: 10.1038/s41597-022-01517-w
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Fig. 1 Time series of weekly incident deaths at the national level and forecasts from the COVID-19 Forecast Hub ensemble model for selected weeks in 2020 and 2021. Ensemble forecasts (blue) with 50%, 80% and 95% prediction intervals shown in shaded regions and the ground-truth data (black) for incident cases (A), incident hospitalizations (B), incident deaths (C) and cumulative deaths (D). The truth data come from JHU CSSE (panels A, C, D) and HealthData.gov (panel B).
Fig. 2 Schematic of the data storage and related infrastructure surrounding the COVID-19 Forecast Hub. (A) Forecasts are submitted to the COVID-19 Forecast Hub GitHub repository and undergo data format validation before being accepted into the system. (B) A continuous integration service ensures that the GitHub repository and PostgreSQL database stay in sync with mirrored versions of the data. (C) Truth data for visualization, evaluation, and ensemble building are retrieved once per week using both the covidHubUtils and the covidData R packages. Truth data are stored in both repositories. (D) Once per week, an ensemble forecast submission is made using the covidEnsembles R package. It is submitted to the GitHub repository and undergoes the same validation as other submissions. (E) Using the covidHubUtils R package, forecast and truth data may be extracted from either the GitHub repository or the PostgreSQL database in a standard format for tasks such as scoring or plotting.
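The caption for panel (A) mentions data format validation without listing the checks. As a rough illustration only, the sketch below shows the kind of checks such a step might perform on point and quantile forecasts. The `ForecastRow` fields and the `validate` function are hypothetical names chosen for this example; they are not the Hub's actual schema or validation code.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ForecastRow:
    """One forecast value. All field names here are hypothetical."""
    location: str
    target: str
    kind: str                  # "point" or "quantile"
    quantile: Optional[float]  # quantile level; None for point forecasts
    value: float


def validate(rows: List[ForecastRow]) -> List[str]:
    """Return a list of human-readable problems (empty if none are found)."""
    errors: List[str] = []
    for i, row in enumerate(rows):
        if row.kind not in ("point", "quantile"):
            errors.append(f"row {i}: unknown kind {row.kind!r}")
        elif row.kind == "quantile":
            if row.quantile is None or not 0.0 < row.quantile < 1.0:
                errors.append(f"row {i}: quantile level must lie in (0, 1)")
        elif row.quantile is not None:
            errors.append(f"row {i}: point forecast must not carry a quantile level")
        if row.value < 0:
            errors.append(f"row {i}: counts cannot be negative")

    # Within each (location, target) group, predicted values must not
    # decrease as the quantile level increases.
    groups: Dict[Tuple[str, str], List[Tuple[float, float]]] = {}
    for row in rows:
        if row.kind == "quantile" and row.quantile is not None:
            groups.setdefault((row.location, row.target), []).append(
                (row.quantile, row.value)
            )
    for key, pairs in groups.items():
        pairs.sort()
        values = [v for _, v in pairs]
        if any(b < a for a, b in zip(values, values[1:])):
            errors.append(f"{key}: values decrease as the quantile level increases")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a submitter see every issue in one pass, which suits an automated check run on each submission.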
Forecast characteristics for all four outcomes.
| Outcome | Scale | Location: County | Location: State | Location: National | Horizons stored | Number of quantiles for probabilistic forecasts | Earliest forecast date | First date of standardized truth data | Date of first ensemble forecast |
|---|---|---|---|---|---|---|---|---|---|
| Incident Cases | Weekly | X | X | X | 1-8 weeks | 7 | 2020-07-05 | 2020-03-15 | 2020-07-18 |
| Incident Hospitalizations | Daily | | X | X | 1-130 days | 23 | 2020-03-27 | 2020-11-16 | 2020-12-05 |
| Incident Deaths | Daily | | X | X | 1-130 days | 23 | 2020-03-15 | 2020-03-15 | NA |
| Incident Deaths | Weekly | | X | X | 1-20 weeks | 23 | 2020-03-15 | 2020-03-15 | 2020-06-20 |
| Cumulative Deaths | Daily | | X | X | 1-130 days | 23 | 2020-03-15 | 2020-03-15 | NA |
| Cumulative Deaths | Weekly | | X | X | 1-20 weeks | 23 | 2020-03-15 | 2020-03-15 | 2020-04-13 |
The table shows the temporal scale, spatial scale of locations, horizons stored, number of quantiles, and dates of the earliest forecast, earliest standardized truth data, and earliest ensemble build.
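The table gives the number of stored quantiles per outcome, and the Fig. 1 caption refers to 50%, 80% and 95% prediction intervals. A minimal sketch of how a central prediction interval can be read off a set of quantile forecasts follows. It assumes a seven-level grid of 0.025, 0.1, 0.25, 0.5, 0.75, 0.9 and 0.975; the source states only the count of quantiles, not the levels, so this grid and the function name `central_interval` are assumptions, and the numeric values are made up for illustration.

```python
from typing import Dict, Tuple


def central_interval(quantiles: Dict[float, float], coverage: float) -> Tuple[float, float]:
    """Return (lower, upper) bounds of the central interval with the given coverage.

    `quantiles` maps a quantile level to its predicted value. A central
    interval with coverage c uses the levels (1 - c) / 2 and 1 - (1 - c) / 2.
    """
    alpha = 1.0 - coverage
    # Round to three decimals to avoid floating-point mismatches on lookup.
    lower_level = round(alpha / 2.0, 3)
    upper_level = round(1.0 - alpha / 2.0, 3)
    levels = {round(level, 3): value for level, value in quantiles.items()}
    if lower_level not in levels or upper_level not in levels:
        raise KeyError(f"levels {lower_level} and {upper_level} are required")
    return levels[lower_level], levels[upper_level]


# Illustrative values only; the quantile grid is an assumption.
example = {0.025: 10.0, 0.1: 20.0, 0.25: 30.0, 0.5: 40.0,
           0.75: 50.0, 0.9: 60.0, 0.975: 70.0}
interval_50 = central_interval(example, 0.50)
interval_95 = central_interval(example, 0.95)
```

With a grid like this one, only coverages whose two endpoint levels are both stored can be recovered directly; any other coverage would need interpolation, which this sketch does not attempt.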
Fig. 3 Number of primary forecasts submitted for each outcome per week from April 27th, 2020 through May 3rd, 2022. In the initial weeks of submission, fewer than 10 models provided forecasts. Over time, the number of teams submitting forecasts for each forecasted outcome increased into early 2021 and then saw a small decline through the end of 2021, with some renewed interest in 2022.
Fig. 4 Visualization tool updated weekly by the US COVID-19 Forecast Hub displays model forecasts and truth data at selected forecast dates, locations, forecast outcomes and prediction interval (PI) levels. US national level incident death forecasts from 39 models are shown with point values and a 50% PI. These forecasts are for 1 through 4 week ahead horizons. Data used for forecasting were generated on July 24th, 2021. The visualization tool is available at: https://viz.covid19forecasthub.org.