Abstract
This database provides the daily time-series of COVID-19 cases, deaths, recovered people, tests, vaccinations, and hospitalizations, for more than 230 countries, 760 regions, and 12,000 lower-level administrative divisions. The geographical entities are associated with identifiers to match with hydrometeorological, geospatial, and mobility data. The database includes policy measures at the national and, when available, sub-national levels. The data acquisition pipeline is open-source and fully automated. As most governments revise the data retrospectively, the database always updates the complete time-series to mirror the original source. Vintage data, immutable snapshots of the data taken each day, are provided to ensure research reproducibility. The latest data are updated on an hourly basis, and the vintage data are available since April 14, 2020. All the data are available in CSV files or SQLite format. By unifying the access to the data, this work makes it possible to study the pandemic on a global scale with high resolution, taking into account within-country variations, nonpharmaceutical interventions, and environmental and exogenous variables.
Year: 2022 PMID: 35351921 PMCID: PMC8964767 DOI: 10.1038/s41597-022-01245-1
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
The table reports the epidemiological variables included in the database and their coverage as of November 27, 2021.
| Field | Description | Level 1: N. of administrative areas (N. of countries) | Level 2: N. of administrative areas (N. of countries) | Level 3: N. of administrative areas (N. of countries) |
|---|---|---|---|---|
| confirmed | Cumulative number of confirmed cases | 226 (226) | 759 (33) | 12,119 (16) |
| deaths | Cumulative number of deaths | 214 (214) | 672 (29) | 11,657 (12) |
| recovered | Cumulative number of recovered people | 206 (206) | 482 (21) | 1,762 (5) |
| tests | Cumulative number of tests | 139 (139) | 382 (19) | 686 (6) |
| vaccines | Cumulative number of total doses administered | 223 (223) | 305 (16) | 6,837 (7) |
| people_vaccinated | Cumulative number of people who received at least one vaccine dose | 223 (223) | 310 (17) | 11,269 (10) |
| people_fully_vaccinated | Cumulative number of people who received all doses prescribed by the vaccination protocol | 223 (223) | 310 (17) | 11,269 (10) |
| hosp | Number of hospitalized patients on date | 43 (43) | 195 (10) | 164 (3) |
| icu | Number of hospitalized patients in intensive therapy on date | 39 (39) | 202 (9) | 164 (3) |
| vent | Number of patients in intensive therapy requiring invasive ventilation on date | 8 (8) | 62 (5) | 11 (1) |
The last three columns report the number of countries (level 1), sub-national regions (level 2), and lower-level administrative areas (level 3) for which the corresponding variable is available. The number of unique countries is given in parentheses.
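Because the count fields above (`confirmed`, `deaths`, `recovered`, `tests`, `vaccines`, and so on) are cumulative, daily new counts are obtained by differencing consecutive observations within each area. The following is a minimal Python sketch of that step, not part of the original pipeline. It assumes each row carries an `id`, a cumulative value, and a `date` column in ISO format (the `date` column name is an assumption, as it is not listed in the table); the sample rows are invented for illustration.

```python
from collections import defaultdict

def daily_increments(rows, field="confirmed"):
    """Return {id: [(date, new_count), ...]} computed from cumulative counts.

    rows: iterable of dicts with keys "id", "date" (ISO string), and `field`.
    Rows where `field` is missing (None or "") are skipped, so an increment
    may span more than one day when observations are missing in between.
    """
    by_area = defaultdict(list)
    for row in rows:
        value = row.get(field)
        if value is None or value == "":
            continue
        by_area[row["id"]].append((row["date"], int(value)))

    result = {}
    for area, series in by_area.items():
        series.sort()  # ISO dates sort chronologically as strings
        increments = []
        previous = None
        for date, cumulative in series:
            if previous is not None:
                # Kept as is: retrospective revisions can make this negative.
                increments.append((date, cumulative - previous))
            previous = cumulative
        result[area] = increments
    return result

# Invented sample rows, for illustration only.
sample = [
    {"id": "A", "date": "2021-01-01", "confirmed": 10},
    {"id": "A", "date": "2021-01-02", "confirmed": 15},
    {"id": "A", "date": "2021-01-03", "confirmed": 14},  # downward revision
    {"id": "B", "date": "2021-01-01", "confirmed": 3},
    {"id": "B", "date": "2021-01-02", "confirmed": ""},  # missing value
    {"id": "B", "date": "2021-01-03", "confirmed": 7},
]
increments = daily_increments(sample)
```

Negative increments are deliberately not clipped to zero here, since they signal a source-side revision that an analyst may want to inspect. The `hosp`, `icu`, and `vent` fields are counts on the given date rather than cumulative totals, so they should not be differenced this way.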
Fig. 1. The figure displays the spatial coverage and the granularity of the database as of November 27, 2021. In light blue: countries for which only national-level data are available. In blue: regions for which sub-national data are available. In dark blue: lower-level areas for which finer-grained data are available.
The table reports the covariates, other than epidemiological variables, included in the database and their coverage as of November 27, 2021.
| Field | Description | Level 1: N. of administrative areas | Level 2: N. of administrative areas | Level 3: N. of administrative areas |
|---|---|---|---|---|
| id | Unique identifier for the geographical entity | 236 | 763 | 12,124 |
| administrative_area_level | Level of the administrative area: 1 for countries; 2 for states, regions, cantons, or local equivalent; 3 for cities, municipalities, or local equivalent | 236 | 763 | 12,124 |
| administrative_area_level_1 | Name of the administrative area of level 1 | 236 | 763 | 12,124 |
| administrative_area_level_2 | Name of the administrative area of level 2 | | 763 | 12,124 |
| administrative_area_level_3 | Name of the administrative area of level 3 | | | 12,124 |
| latitude | Latitude | 229 | 763 | 12,124 |
| longitude | Longitude | 229 | 763 | 12,124 |
| population | Total population | 235 | 763 | 12,124 |
| iso_alpha_3 | 3-letter code of the country according to the standard ISO 3166-1 Alpha-3 | 233 | 763 | 12,124 |
| iso_alpha_2 | 2-letter code of the country according to the standard ISO 3166-1 Alpha-2 | 233 | 763 | 12,124 |
| iso_numeric | Numeric code of the country according to the standard ISO 3166-1 Numeric | 232 | 763 | 12,124 |
| iso_currency | 3-letter code of the currency used in the country according to the standard ISO 4217 | 233 | 763 | 12,124 |
| key_local | The administrative area identifier used by the local authorities, usually the national institute of statistics or local equivalent. E.g., FIPS codes for the United States, IBGE codes for Brazil, ISTAT codes for Italy, etc. | | 568 | 12,053 |
| key_google_mobility | The place_id used in Google Mobility Reports. The identifier can also be used to interact with the Google Places and Google Maps API | 135 | 539 | 6,825 |
| key_apple_mobility | The administrative area identifier used in Apple Mobility Reports. This is constructed by concatenating region and, when not null, sub-region in Apple Mobility Reports: i.e., <region>, <sub-region> | 67 | 475 | 2,159 |
| key_jhu_csse | The administrative area identifier used in the JHU CSSE Unified COVID-19 Dataset. In particular, this makes it possible to match administrative areas with the Hydromet dataset | 192 | 508 | 10,708 |
| key_nuts | The 2021 NUTS codes for Europe, by Eurostat | | 232 | 1,864 |
| key_gadm | The identifier (GID) used in the GADM database version 3.6 | 233 | 760 | 11,977 |
The last three columns report the number of countries (level 1), sub-national regions (level 2), and lower-level administrative areas (level 3) for which the corresponding variable is available.
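The `key_*` fields in the table exist so that database rows can be matched with external datasets. A minimal Python sketch of such a match follows, written as a left join on one identifier with an optional filter on `administrative_area_level`. The function name, the external table, its `code` and `median_age` columns, and all sample values are hypothetical and serve only to illustrate the idea.

```python
def join_on_key(db_rows, external_rows, key_field, external_key, level=None):
    """Left-join database rows to external rows through an identifier.

    db_rows: dicts holding the database fields (e.g. "id", "key_local").
    external_rows: dicts from an external dataset, keyed by `external_key`.
    level: if given, keep only rows with that administrative_area_level.
    Database rows with no match are kept unchanged.
    """
    lookup = {row[external_key]: row for row in external_rows}
    joined = []
    for row in db_rows:
        if level is not None and int(row["administrative_area_level"]) != level:
            continue
        merged = dict(row)
        key = row.get(key_field)
        match = lookup.get(key) if key else None
        if match is not None:
            for name, value in match.items():
                if name != external_key:  # do not duplicate the join key
                    merged[name] = value
        joined.append(merged)
    return joined

# Hypothetical rows: identifiers and values are invented for illustration.
db_rows = [
    {"id": "x1", "administrative_area_level": 1, "key_local": ""},
    {"id": "x2", "administrative_area_level": 3, "key_local": "K1"},
    {"id": "x3", "administrative_area_level": 3, "key_local": "K2"},
]
external_rows = [{"code": "K1", "median_age": 40}]

joined = join_on_key(db_rows, external_rows, "key_local", "code", level=3)
```

A left join is used so that areas without a counterpart in the external dataset remain in the result, which keeps the coverage gaps reported in the table visible instead of silently dropping rows.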
Fig. 2. Workflow design. For a given level of granularity and for each country, the epidemiological data are downloaded from several sources, mapped into a standardized data frame, and merged using the lookup tables. Then, a top-level function collects the data for all the countries at the desired level of granularity. Finally, the epidemiological data are merged with additional covariates and with policy measures. Examples of data sources, variables, and countries are given in parentheses.
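The workflow described in the caption can be sketched in Python as three steps: standardize each source, merge the sources through a lookup table, and collect all countries. This is an illustrative sketch under stated assumptions, not the authors' implementation. All function names, the source column names, the lookup contents, and the `date` column are hypothetical, and the final merge with covariates and policy measures is omitted.

```python
def standardize(raw_rows, column_map):
    """Rename source-specific columns to the standard field names."""
    return [
        {std: row[src] for src, std in column_map.items() if src in row}
        for row in raw_rows
    ]

def merge_sources(tables, lookup):
    """Merge standardized tables on (id, date).

    lookup maps a source's local key to the database `id`; rows whose key
    is absent from the lookup table are dropped.
    """
    merged = {}
    for table in tables:
        for row in table:
            area_id = lookup.get(row["key_local"])
            if area_id is None:
                continue
            record = merged.setdefault(
                (area_id, row["date"]), {"id": area_id, "date": row["date"]}
            )
            for field, value in row.items():
                if field not in ("key_local", "date"):
                    record[field] = value
    return [merged[key] for key in sorted(merged)]

def collect(countries):
    """Top-level step: gather the merged data for every country."""
    out = []
    for country in countries:
        tables = [standardize(fetch(), cmap) for fetch, cmap in country["sources"]]
        out.extend(merge_sources(tables, country["lookup"]))
    return out

# Two hypothetical sources for one hypothetical country (invented values).
def fetch_cases():
    return [{"codice": "01", "data": "2021-01-01", "casi": 5}]

def fetch_doses():
    return [{"code": "01", "day": "2021-01-01", "doses": 2},
            {"code": "99", "day": "2021-01-01", "doses": 7}]  # not in lookup

country = {
    "sources": [
        (fetch_cases, {"codice": "key_local", "data": "date", "casi": "confirmed"}),
        (fetch_doses, {"code": "key_local", "day": "date", "doses": "vaccines"}),
    ],
    "lookup": {"01": "id_a"},
}
data = collect([country])
```

Keeping the column mapping and the lookup table as plain data, separate from the merging logic, means that adding a source only requires a new fetch function and a new mapping.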
| Measurement(s) | cases • deaths • recovered • tests • vaccines • hospitalizations • population |
| Technology Type(s) | digital curation |
| Sample Characteristic – Organism | coronaviridae |
| Sample Characteristic – Location | global |