Abstract
The presented cross-sectional dataset can be employed to analyze the governmental, trade, and competitiveness relationships of official COVID-19 reports. It contains 18 COVID-19 variables generated from the official reports of 138 countries (European Centre for Disease Prevention and Control, 2020 [1] and Beltekian et al. [2]), as well as an additional 2203 governance, trade, and competitiveness indicators from the World Bank Group GovData360 (World Bank Group, 2020 [3]) and TCdata360 (World Bank Group, 2020 [4]) platforms. From these platforms, only annual indicators from 2015 and later were collected, and their missing values were replaced with the values of previous years, going back year by year to 2015 at the earliest. During preprocessing, indicators (columns) were filtered out when their ratio of missing values exceeded 50%. Then, the same filter was applied to countries (rows) whose ratio of missing values exceeded 25%. Finally, duplicated variables were removed from the dataset. As a result of these steps, the missing value rate of the employed indicators was reduced to 4.25% on average. In addition to the database, the Kendall rank correlation matrix is provided to facilitate subsequent analysis. The dataset and the correlation matrix can be updated and customized with an R Notebook file, which is also publicly available in Mendeley Data (Kurbucz, 2020 [5]).
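The filtering steps described in the abstract can be illustrated with a minimal Python sketch. It is not the author's R Notebook: the function names, the dictionary layout (country code mapped to indicator values, with `None` for a missing value), and the toy data are assumptions made for illustration. Only the 50% and 25% thresholds and the order of the steps come from the text.

```python
from typing import Dict, List, Optional

# Hypothetical layout: one row per country, indicator name -> value (None = missing).
Row = Dict[str, Optional[float]]


def missing_ratio(values: List[Optional[float]]) -> float:
    """Share of missing (None) entries in a list of values."""
    return sum(v is None for v in values) / len(values) if values else 0.0


def preprocess(data: Dict[str, Row], c_max: float = 0.50, r_max: float = 0.25) -> Dict[str, Row]:
    """Filter indicators, then countries, then drop duplicated indicators."""
    countries = list(data)
    indicators = list(next(iter(data.values()))) if data else []

    # 1) Drop indicators (columns) whose missing ratio exceeds c_max.
    kept_cols = [c for c in indicators
                 if missing_ratio([data[k][c] for k in countries]) <= c_max]

    # 2) Drop countries (rows) whose missing ratio exceeds r_max,
    #    measured on the indicators that survived step 1.
    kept_rows = [k for k in countries
                 if missing_ratio([data[k][c] for c in kept_cols]) <= r_max]

    # 3) Drop indicators that repeat an earlier column value for value.
    seen = set()
    unique_cols = []
    for c in kept_cols:
        signature = tuple(data[k][c] for k in kept_rows)
        if signature not in seen:
            seen.add(signature)
            unique_cols.append(c)

    return {k: {c: data[k][c] for c in unique_cols} for k in kept_rows}


# Toy data (invented): "y" is mostly missing, "DDD" is an empty row, "z" duplicates "x".
sample = {
    "AAA": {"x": 1.0, "y": None, "z": 1.0},
    "BBB": {"x": 2.0, "y": None, "z": 2.0},
    "CCC": {"x": 3.0, "y": 5.0, "z": 3.0},
    "DDD": {"x": None, "y": None, "z": None},
}
cleaned = preprocess(sample)
```

In this toy example, `y` is removed by the column filter, `DDD` by the row filter, and `z` as a duplicate of `x`. The row filter is evaluated after the column filter, which mirrors the order given in the abstract.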
Keywords: COVID-19; Competitiveness; Data driven approach; Governance; Trade
Year: 2020 PMID: 32632375 PMCID: PMC7303609 DOI: 10.1016/j.dib.2020.105881
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Fig. 2. The relationship between the COVID-19, GovData360, and TCdata360 variables (COVID-19 variables, except for dyssincefstcase, dyssincefstdeath, and dyssincefsttest, are divided by population).
Fig. 3. An example: relationship of a COVID-19 variable to air transport indicators (for more information about GCI indicators, see the metadata or [12]).
Variables description.
| Variable ID | Type | Description | Missing | Source | Dataset |
|---|---|---|---|---|---|
| | char | ISO3 country code. | 0% | | a, c |
| | char | ISO2 country code. | 0% | | a |
| | char | The capital city of the country. | 0% | | a |
| | float | The latitude coordinates of the country's capital. | 0% | | a |
| | float | The longitude coordinates of the country's capital. | 0% | | a |
| | int | The population of the countries (2018). | 0% | | a |
| | char | The ID of the indicator. | 0% | [ | b |
| | char | The name of the indicator. | 0% | [ | b |
| | char | The definition of the indicator. | 0% | [ | b |
| | char | The type of the indicator. | 0% | [ | b |
| | char | Type of the sub-indicator. | 0% | [ | b |
| | char | The unit of the indicator. | 0% | [ | b |
| | char | The ID of the dataset that contains the indicator. | 0% | [ | b |
| | char | The name of the dataset that contains the indicator. | 0% | [ | b |
| | char | The URL of the dataset that contains the indicator. | 0% | [ | b |
| | int | The number of days since the first case. | 0%* | | c, d |
| | int | The number of days since the first death. | 12.3%* | | c, d |
| | int | The number of days since the first test. | 42.8%* | | c, d |
| | int | The total number of cases after 15 days from the first case. | 0.7%* | | c, d |
| | int | The total number of deaths after 15 days from the first death. | 14.5%* | | c, d |
| | int | The total number of tests after 15 days from the first test. | 42.8%* | | c, d |
| | int | The total number of cases after 30 days from the first case. | 1.4%* | | c, d |
| | int | The total number of deaths after 30 days from the first death. | 19.6%* | | c, d |
| | int | The total number of tests after 30 days from the first test. | 44.2%* | | c, d |
| | int | The total number of cases after 45 days from the first case. | 1.4%* | | c, d |
| | int | The total number of deaths after 45 days from the first death. | 22.5%* | | c, d |
| | int | The total number of tests after 45 days from the first test. | 47.1%* | | c, d |
| | int | The total number of cases after 60 days from the first case. | 5.1%* | | c, d |
| | int | The total number of deaths after 60 days from the first death. | 50.7%* | | c, d |
| | int | The total number of tests after 60 days from the first test. | 55.1%* | | c, d |
| | int | The total number of cases. | 0%* | | c, d |
| | int | The total number of deaths. | 0%* | | c, d |
| | int | The total number of tests. | 42.8%* | | c, d |
| | int, float, boolean | The IDs of indicators obtained from | 3.30% | [ | c, d |
| | int, float, boolean | The IDs of indicators obtained from | 5.22% | [ | c, d |
*These variables were generated by the author. If the given number of days has not yet elapsed since the specified event, the value is missing. The R Notebook can be used to update the dataset.
**The complete list of GovData360 and TCdata360 indicators is contained in the metadata. For these variables, the average ratio of missing values is indicated.
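The generated COVID-19 variables in the table above can be sketched as follows. This is an illustrative Python reading of the footnote, not the author's R code. It assumes that the input is a series of cumulative counts keyed by report date, that "the first case" is the first date with a positive count, and that "after N days" means the latest report on or before day N. All names and numbers are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def first_event_date(cumulative: Dict[date, int]) -> Optional[date]:
    """Earliest date on which the cumulative count is positive."""
    positive = [d for d, v in cumulative.items() if v > 0]
    return min(positive) if positive else None


def days_since_first(cumulative: Dict[date, int], as_of: date) -> Optional[int]:
    """Number of days between the first event and the reference date."""
    first = first_event_date(cumulative)
    return (as_of - first).days if first is not None else None


def total_after(cumulative: Dict[date, int], n_days: int, as_of: date) -> Optional[int]:
    """Cumulative count n_days after the first event; None if not yet elapsed."""
    first = first_event_date(cumulative)
    if first is None:
        return None
    target = first + timedelta(days=n_days)
    if target > as_of:
        # The given number of days has not yet elapsed: the value is missing.
        return None
    # Assumption: use the latest report on or before the target date.
    known = [d for d in cumulative if d <= target]
    return cumulative[max(known)]


# Invented example series of cumulative cases for one country.
cases = {
    date(2020, 3, 1): 0,
    date(2020, 3, 2): 2,
    date(2020, 3, 17): 40,
    date(2020, 4, 1): 300,
}
as_of = date(2020, 4, 10)
summary = {
    "days_since_first_case": days_since_first(cases, as_of),
    "cases_after_15": total_after(cases, 15, as_of),
    "cases_after_30": total_after(cases, 30, as_of),
    "cases_after_45": total_after(cases, 45, as_of),
}
```

Here the 45-day value is missing because fewer than 45 days separate the first case from the reference date. This is the mechanism that makes the missing ratios in the table grow with the length of the window.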
The steps of the data generation.
| Step | Description | Remark |
|---|---|---|
| 1 | Installing packages and loading libraries | The program recognizes installed packages. |
| 2 | Setting parameters | Default settings: |
| 3 | Collecting | With missing value imputation. |
| 4 | Collecting | With missing value imputation. |
| 5 | Collecting COVID-19 variables | |
| 6 | Generating new COVID-19 variables | |
| 7 | Compiling and preprocessing the joint dataset | |
| 8 | Compiling the correlation matrix | Kendall |
| 9 | Compiling the country dataset and metadata | |
| 10 | Writing datasets into TSV files | New files have the same name as uploaded ones. |
*The data generation process can be customized with these parameters. lastyr is the earliest year whose values are still taken into account when indicators are collected from the GovData360 and TCdata360 platforms and their missing values are replaced (2015 in the published dataset). During preprocessing, indicators whose missing value ratio exceeds cmaxmissing are filtered out (50% in the published dataset). Then, the same filter is applied to countries whose missing value ratio exceeds rmaxmissing (25% in the published dataset).
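The role of lastyr in the missing value replacement can be illustrated with a short Python sketch. It assumes one annual series per country and indicator, stored as a dictionary from year to value with `None` for a missing value. The function name and the `last_yr` argument (mirroring lastyr) are hypothetical, and the actual R Notebook may organize this step differently.

```python
from typing import Dict, Optional


def latest_value(series: Dict[int, Optional[float]], last_yr: int = 2015) -> Optional[float]:
    """Most recent non-missing annual value from a year >= last_yr, else None."""
    # Walk the years in descending order and stop once we pass last_yr.
    for year in sorted(series, reverse=True):
        if year < last_yr:
            break
        if series[year] is not None:
            return series[year]
    return None


# Invented series: the two most recent years are missing, 2016 is available.
recent_gap = {2013: 9.0, 2014: 8.0, 2016: 7.0, 2018: None, 2019: None}
# Invented series: the only available value is older than last_yr.
too_old = {2014: 1.0, 2017: None}

filled = latest_value(recent_gap)         # falls back to the 2016 value
still_missing = latest_value(too_old)     # 2014 is not used with last_yr=2015
relaxed = latest_value(too_old, last_yr=2014)
```

Lowering `last_yr` trades recency for coverage: more gaps are filled, but with older values.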
Kendall rank correlation between COVID-19 variables.
| 1.00 | ||||||||||||||||||
| -0.19 | 1.00 | |||||||||||||||||
| -0.06 | 0.72 | 1.00 | ||||||||||||||||
| 0.07 | 0.56 | 0.78 | 1.00 | |||||||||||||||
| 0.14 | 0.44 | 0.62 | 0.80 | 1.00 | ||||||||||||||
| 0.02 | 0.02 | 0.05 | 0.07 | 0.05 | 1.00 | |||||||||||||
| -0.12 | 0.37 | 0.35 | 0.30 | 0.25 | 0.20 | 1.00 | ||||||||||||
| -0.11 | 0.27 | 0.27 | 0.24 | 0.20 | 0.33 | 0.77 | 1.00 | |||||||||||
| -0.11 | 0.26 | 0.26 | 0.23 | 0.20 | 0.39 | 0.69 | 0.89 | 1.00 | ||||||||||
| -0.03 | 0.26 | 0.29 | 0.29 | 0.26 | 0.29 | 0.60 | 0.80 | 0.92 | 1.00 | |||||||||
| 0.25 | -0.06 | 0.00 | 0.05 | 0.06 | -0.14 | 0.00 | 0.02 | 0.01 | 0.07 | 1.00 | ||||||||
| 0.04 | 0.28 | 0.25 | 0.27 | 0.31 | 0.07 | 0.11 | 0.00 | 0.02 | -0.04 | -0.41 | 1.00 | |||||||
| 0.07 | 0.31 | 0.31 | 0.35 | 0.38 | 0.05 | 0.15 | 0.04 | 0.05 | 0.00 | -0.30 | 0.84 | 1.00 | ||||||
| 0.13 | 0.34 | 0.36 | 0.40 | 0.41 | 0.05 | 0.17 | 0.06 | 0.07 | 0.03 | -0.17 | 0.72 | 0.85 | 1.00 | |||||
| 0.13 | 0.31 | 0.35 | 0.43 | 0.47 | 0.10 | 0.20 | 0.12 | 0.13 | 0.08 | -0.13 | 0.67 | 0.73 | 0.83 | 1.00 | ||||
| 0.36 | 0.25 | 0.39 | 0.56 | 0.72 | 0.05 | 0.16 | 0.12 | 0.11 | 0.20 | 0.05 | 0.37 | 0.41 | 0.44 | 0.47 | 1.00 | |||
| 0.35 | 0.17 | 0.35 | 0.49 | 0.60 | 0.08 | 0.10 | 0.12 | 0.14 | 0.25 | 0.12 | 0.21 | 0.25 | 0.25 | 0.29 | 0.72 | 1.00 | ||
| 0.19 | 0.30 | 0.35 | 0.44 | 0.51 | 0.00 | 0.22 | 0.15 | 0.13 | 0.19 | 0.14 | 0.38 | 0.50 | 0.63 | 0.76 | 0.55 | 0.36 | 1.00 |
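For readers who want to check individual entries of such a matrix, the following Python sketch computes a Kendall rank correlation for one pair of variables. The source does not state which variant of the coefficient or which missing value handling was used. The tau-b variant and the pairwise deletion of missing values below are therefore assumptions, and the function name is hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import List, Optional


def kendall_tau_b(x: List[Optional[float]], y: List[Optional[float]]) -> Optional[float]:
    """Kendall tau-b on the observations where both values are present."""
    # Pairwise deletion: keep only complete (x, y) observations.
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]

    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(pairs, 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue          # tied in both variables: contributes to no term
        elif dx == 0:
            ties_x += 1       # tied only in x
        elif dy == 0:
            ties_y += 1       # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1

    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    # Undefined when one variable is constant or fewer than two pairs remain.
    return (concordant - discordant) / denom if denom else None


# Invented example: the last observation is dropped because x is missing.
tau = kendall_tau_b([1, 2, 3, None], [1, 3, 2, 5])
```

With pairwise deletion, each cell of the matrix may be based on a different set of countries, which is worth keeping in mind when comparing coefficients for variables with high missing ratios.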
Fig. 1. The relationship between uploaded files (without raw data of figures and tables).
| Social Sciences | |
| The role of governmental, trade and competitiveness considerations in the formation of official COVID-19 data | |
| Tab-separated text files (.txt) and an R Notebook file (.Rmd). | |
| Datasets are compiled in R. | |
| Preprocessed and preanalyzed secondary data. | |
| 2015 was the last year for which the values were taken into account during the collection of | |
| To obtain the | |
| Today's data on the geographic distribution of COVID-19 cases worldwide | |
| Repository name: Mendeley Data |