Carlos Martin-Barreiro, John A Ramirez-Figueroa, Xavier Cabezas, Víctor Leiva, M Purificación Galindo-Villardón.
Abstract
In this paper, we group South American countries based on the number of infected cases and deaths due to COVID-19. The countries considered are Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Peru, Paraguay, Uruguay, and Venezuela. The data are collected from a database of Johns Hopkins University, an institution dedicated to sensing and monitoring the evolution of the COVID-19 pandemic. A statistical analysis based on principal components, using recent techniques, is conducted. First, standard components and varimax rotations are calculated from the correlation matrix. Then, the countries are grouped using disjoint components and functional components. An algorithm is designed that keeps the principal component analysis updated with a sensor in the data warehouse. As reported in the conclusions, this grouping changes depending on the number of components considered, the type of principal component (standard, disjoint, or functional), and the variable considered (infected cases or deaths). The results are compared with those of the k-means technique. The numbers of COVID-19 cases and deaths vary across countries for diverse reasons, which are also reported in the conclusions.
Keywords: R software; SARS-Cov2; data science; disjoint and functional components; infectious diseases; k-means clustering; multivariate statistical methods; sensing and data extraction
Year: 2021 PMID: 34198627 PMCID: PMC8232170 DOI: 10.3390/s21124094
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Data matrix of the number of COVID-19 infected cases per million inhabitants for the indicated month and country.
| Month | ARG | BOL | BRA | CHI | COL | ECU | PER | PRY | URY | VEN |
|---|---|---|---|---|---|---|---|---|---|---|
| 2020-03 | 23.320 | 9.168 | 26.887 | 148.671 | 17.808 | 126.961 | 32.300 | 9.112 | 97.303 | 4.748 |
| 2020-04 | 74.653 | 90.807 | 383.281 | 777.248 | 110.077 | 1286.283 | 1089.142 | 28.181 | 87.802 | 6.963 |
| 2020-05 | 274.872 | 755.160 | 2011.963 | 5537.081 | 449.583 | 802.809 | 3866.934 | 100.945 | 51.820 | 41.393 |
| 2020-06 | 1054.943 | 1990.654 | 4173.853 | 8152.400 | 1345.499 | 982.482 | 3661.822 | 173.150 | 32.533 | 151.992 |
| 2020-07 | 2804.951 | 3732.537 | 5929.842 | 3990.017 | 3884.646 | 1639.341 | 3708.588 | 437.013 | 94.426 | 448.096 |
| 2020-08 | 5010.048 | 3410.341 | 5860.889 | 2932.538 | 6280.816 | 1610.378 | 7269.051 | 1727.860 | 95.286 | 990.088 |
| 2020-09 | 7373.831 | 1603.095 | 4246.639 | 2681.757 | 4217.234 | 1319.497 | 4992.196 | 3238.125 | 129.832 | 998.524 |
| 2020-10 | 9202.697 | 552.215 | 3409.259 | 2472.506 | 4805.250 | 1765.282 | 2681.440 | 3144.328 | 310.331 | 594.001 |
| 2020-11 | 5699.844 | 252.801 | 3764.939 | 2170.254 | 4768.245 | 1388.254 | 1891.010 | 2697.360 | 786.763 | 365.067 |
| 2020-12 | 4446.897 | 1320.651 | 6304.567 | 2993.796 | 6406.262 | 1123.783 | 1595.511 | 3576.292 | 3817.801 | 392.603 |
| 2021-01 | 6675.956 | 4858.291 | 7192.142 | 6179.886 | 8885.288 | 2171.731 | 3733.550 | 3546.429 | 6511.452 | 470.146 |
| 2021-02 | 3985.461 | 2756.354 | 6334.831 | 5101.220 | 3081.706 | 2002.317 | 5629.776 | 3679.898 | 4679.701 | 428.651 |
| 2021-03 | 1379.088 | 638.395 | 3062.888 | 2266.347 | 673.508 | 605.678 | 1703.270 | 2038.407 | 2444.060 | 147.877 |
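The abstract describes standard components computed from the correlation matrix, followed by a varimax rotation. A minimal sketch of that pipeline on the table above is given here in Python with NumPy. The abstract mentions R software, so the language, the `varimax` helper, and the choice of a raw (non-normalized) varimax criterion are assumptions of this sketch. The rotated values are therefore not guaranteed to match the published loading table.

```python
import numpy as np

countries = ["ARG", "BOL", "BRA", "CHI", "COL", "ECU", "PER", "PRY", "URY", "VEN"]

# Infected cases per million, rows 2020-03 to 2021-03, copied from the table above.
X = np.array([
    [23.320, 9.168, 26.887, 148.671, 17.808, 126.961, 32.300, 9.112, 97.303, 4.748],
    [74.653, 90.807, 383.281, 777.248, 110.077, 1286.283, 1089.142, 28.181, 87.802, 6.963],
    [274.872, 755.160, 2011.963, 5537.081, 449.583, 802.809, 3866.934, 100.945, 51.820, 41.393],
    [1054.943, 1990.654, 4173.853, 8152.400, 1345.499, 982.482, 3661.822, 173.150, 32.533, 151.992],
    [2804.951, 3732.537, 5929.842, 3990.017, 3884.646, 1639.341, 3708.588, 437.013, 94.426, 448.096],
    [5010.048, 3410.341, 5860.889, 2932.538, 6280.816, 1610.378, 7269.051, 1727.860, 95.286, 990.088],
    [7373.831, 1603.095, 4246.639, 2681.757, 4217.234, 1319.497, 4992.196, 3238.125, 129.832, 998.524],
    [9202.697, 552.215, 3409.259, 2472.506, 4805.250, 1765.282, 2681.440, 3144.328, 310.331, 594.001],
    [5699.844, 252.801, 3764.939, 2170.254, 4768.245, 1388.254, 1891.010, 2697.360, 786.763, 365.067],
    [4446.897, 1320.651, 6304.567, 2993.796, 6406.262, 1123.783, 1595.511, 3576.292, 3817.801, 392.603],
    [6675.956, 4858.291, 7192.142, 6179.886, 8885.288, 2171.731, 3733.550, 3546.429, 6511.452, 470.146],
    [3985.461, 2756.354, 6334.831, 5101.220, 3081.706, 2002.317, 5629.776, 3679.898, 4679.701, 428.651],
    [1379.088, 638.395, 3062.888, 2266.347, 673.508, 605.678, 1703.270, 2038.407, 2444.060, 147.877],
])

# Countries are the variables, months are the observations.
R = np.corrcoef(X, rowvar=False)

# Eigen-decomposition of the symmetric correlation matrix, largest eigenvalue first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2  # number of retained components
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])


def varimax(L, max_iter=100, tol=1e-8):
    """Raw varimax rotation (no Kaiser normalization; an assumption of this sketch)."""
    p, m = L.shape
    T = np.eye(m)
    crit = 0.0
    for _ in range(max_iter):
        B = L @ T
        G = L.T @ (B**3 - B @ np.diag((B**2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt  # closest orthogonal matrix to G
        new_crit = s.sum()
        if new_crit < crit * (1 + tol):
            break
        crit = new_crit
    return L @ T


rotated = varimax(loadings)
explained = eigvals[:k].sum() / eigvals.sum()

for name, row in zip(countries, rotated):
    print(name, np.round(row, 3))
print("share of variance in the first", k, "components:", round(float(explained), 3))
```

Because the rotation matrix is orthogonal, the communality of each country (the row sum of squared loadings) is unchanged by the rotation. Only the way that variance is split between C1 and C2 changes. The sign and order of rotated columns are arbitrary.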
Loading matrices for data on the number of COVID-19 infected cases with two components (C1 and C2) for the indicated country and method.
| Country | Varimax C1 | Varimax C2 | DPCA C1 | DPCA C2 |
|---|---|---|---|---|
| ARG | 0.943 | 0.068 | 0.484 | 0.000 |
| BOL | 0.300 | 0.874 | 0.000 | 0.519 |
| BRA | 0.585 | 0.743 | 0.000 | 0.519 |
| CHI | −0.163 | 0.871 | 0.000 | 0.415 |
| COL | 0.827 | 0.401 | 0.466 | 0.000 |
| ECU | 0.649 | 0.554 | 0.417 | 0.000 |
| PER | 0.316 | 0.687 | 0.000 | 0.416 |
| PRY | 0.883 | 0.121 | 0.438 | 0.000 |
| URY | 0.381 | 0.446 | 0.000 | 0.340 |
| VEN | 0.794 | 0.224 | 0.428 | 0.000 |
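The two-component table can be read as a grouping: each country belongs to the component on which it loads most strongly. The following sketch, in plain Python, checks three properties of the reported values under that reading. The hard-assignment rule `dominant` is an illustrative assumption, not a procedure stated in the source.

```python
# Values copied from the table above: (varimax C1, varimax C2, DPCA C1, DPCA C2).
loadings = {
    "ARG": (0.943, 0.068, 0.484, 0.000),
    "BOL": (0.300, 0.874, 0.000, 0.519),
    "BRA": (0.585, 0.743, 0.000, 0.519),
    "CHI": (-0.163, 0.871, 0.000, 0.415),
    "COL": (0.827, 0.401, 0.466, 0.000),
    "ECU": (0.649, 0.554, 0.417, 0.000),
    "PER": (0.316, 0.687, 0.000, 0.416),
    "PRY": (0.883, 0.121, 0.438, 0.000),
    "URY": (0.381, 0.446, 0.000, 0.340),
    "VEN": (0.794, 0.224, 0.428, 0.000),
}


def dominant(c1, c2):
    """Hypothetical rule: assign a country to the component with the larger absolute loading."""
    return 1 if abs(c1) >= abs(c2) else 2


varimax_group = {c: dominant(v[0], v[1]) for c, v in loadings.items()}
dpca_group = {c: dominant(v[2], v[3]) for c, v in loadings.items()}

# Disjoint components: every country has a nonzero loading on at most one component.
disjoint = all(v[2] == 0.0 or v[3] == 0.0 for v in loadings.values())

# Each DPCA loading vector should have unit Euclidean norm, up to table rounding.
norms = [sum(v[j] ** 2 for v in loadings.values()) ** 0.5 for j in (2, 3)]

agree = varimax_group == dpca_group

print("DPCA groups:", dpca_group)
print("disjoint:", disjoint, "| norms:", [round(n, 3) for n in norms])
print("varimax and DPCA assignments agree:", agree)
```

With the values in the table, the dominant varimax loading and the nonzero DPCA loading select the same component for all ten countries. The DPCA columns are disjoint and have norm 1 to within rounding.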
Figure 1. Space of countries for data on the number of COVID-19 (a) infected cases and (b) deaths.
Loading matrix for data on the number of COVID-19 infected cases with three components (C1, C2, and C3) for the indicated country using the DPCA.
| Country | DPCA C1 | DPCA C2 | DPCA C3 |
|---|---|---|---|
| ARG | 0.524 | 0.000 | 0.000 |
| BOL | 0.000 | 0.541 | 0.000 |
| BRA | 0.000 | 0.525 | 0.000 |
| CHI | 0.000 | 0.446 | 0.000 |
| COL | 0.515 | 0.000 | 0.000 |
| ECU | 0.472 | 0.000 | 0.000 |
| PER | 0.000 | 0.483 | 0.000 |
| PRY | 0.000 | 0.000 | 0.707 |
| URY | 0.000 | 0.000 | 0.707 |
| VEN | 0.487 | 0.000 | 0.000 |
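The abstract notes that the grouping changes with the number of components. A short sketch in plain Python makes this concrete for infected cases by comparing the partitions implied by the two-component and three-component DPCA tables. The helper name `partition` is introduced here for illustration only.

```python
# DPCA loadings for infected cases, copied from the two tables above.
dpca_two = {
    "ARG": (0.484, 0.000), "BOL": (0.000, 0.519), "BRA": (0.000, 0.519),
    "CHI": (0.000, 0.415), "COL": (0.466, 0.000), "ECU": (0.417, 0.000),
    "PER": (0.000, 0.416), "PRY": (0.438, 0.000), "URY": (0.000, 0.340),
    "VEN": (0.428, 0.000),
}
dpca_three = {
    "ARG": (0.524, 0.000, 0.000), "BOL": (0.000, 0.541, 0.000),
    "BRA": (0.000, 0.525, 0.000), "CHI": (0.000, 0.446, 0.000),
    "COL": (0.515, 0.000, 0.000), "ECU": (0.472, 0.000, 0.000),
    "PER": (0.000, 0.483, 0.000), "PRY": (0.000, 0.000, 0.707),
    "URY": (0.000, 0.000, 0.707), "VEN": (0.487, 0.000, 0.000),
}


def partition(table):
    """Map each country to the 1-based index of its single nonzero loading."""
    result = {}
    for country, row in table.items():
        nonzero = [j + 1 for j, value in enumerate(row) if value != 0.0]
        if len(nonzero) != 1:
            raise ValueError(f"{country} does not load on exactly one component")
        result[country] = nonzero[0]
    return result


p2 = partition(dpca_two)
p3 = partition(dpca_three)

# Countries whose component index differs between the two solutions.
moved = sorted(c for c in p2 if p2[c] != p3[c])
third_group = sorted(c for c, g in p3.items() if g == 3)

print("moved when a third component is added:", moved)
print("members of C3:", third_group)
```

For these tables, adding a third component pulls Paraguay out of the first group and Uruguay out of the second, and the two countries form the new group C3. The other eight countries keep their assignment.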
Data matrix of the number of COVID-19 deaths per million inhabitants for the indicated month and country.
| Month | ARG | BOL | BRA | CHI | COL | ECU | PER | PRY | URY | VEN |
|---|---|---|---|---|---|---|---|---|---|---|
| 2020-03 | 0.598 | 0.514 | 0.946 | 0.625 | 0.315 | 4.25 | 0.912 | 0.42 | 0.288 | 0.105 |
| 2020-04 | 4.228 | 4.802 | 27.309 | 11.245 | 5.446 | 46.759 | 30.97 | 0.98 | 4.608 | 0.456 |
| 2020-05 | 7.103 | 21.505 | 109.652 | 43.261 | 12.695 | 139.319 | 104.785 | 0.14 | 1.44 | 1.46 |
| 2020-06 | 16.992 | 69.389 | 142.453 | 242.41 | 47.07 | 66.258 | 156.833 | 0.84 | 1.44 | 1.3 |
| 2020-07 | 49.472 | 158.827 | 154.691 | 197.164 | 133.072 | 66.599 | 283.394 | 4.481 | 2.304 | 3.982 |
| 2020-08 | 113.22 | 175.617 | 135.991 | 95.837 | 187.824 | 48.405 | 296.222 | 38.837 | 2.592 | 7.807 |
| 2020-09 | 183.136 | 251.691 | 106.187 | 75.957 | 124.52 | 272.006 | 109.426 | 74.447 | 1.152 | 8.511 |
| 2020-10 | 311.202 | 65.108 | 74.954 | 76.687 | 104.477 | 74.534 | 61.111 | 76.691 | 2.88 | 5.981 |
| 2020-11 | 170.991 | 19.877 | 62.272 | 62.93 | 107.144 | 44.833 | 45.854 | 49.35 | 5.472 | 3.489 |
| 2020-12 | 99.897 | 17.817 | 102.695 | 62.666 | 126.704 | 32.477 | 53.289 | 70.943 | 29.943 | 4.615 |
| 2021-01 | 104.633 | 103.999 | 139.043 | 96.464 | 211.664 | 46.757 | 101.482 | 63.931 | 73.409 | 5.663 |
| 2021-02 | 88.304 | 108.795 | 143.198 | 110.903 | 113.657 | 53.959 | 159.924 | 64.912 | 49.515 | 5.453 |
| 2021-03 | 30.843 | 20.133 | 73.927 | 33.165 | 19.789 | 16.664 | 56.533 | 28.879 | 20.151 | 1.934 |
Loading matrices for data on the number of COVID-19 deaths with two components (C1 and C2) for the indicated country and method.
| Country | Varimax C1 | Varimax C2 | DPCA C1 | DPCA C2 |
|---|---|---|---|---|
| ARG | 0.849 | −0.063 | 0.438 | 0.000 |
| BOL | 0.502 | 0.716 | 0.000 | 0.473 |
| BRA | 0.227 | 0.891 | 0.000 | 0.512 |
| CHI | −0.098 | 0.847 | 0.000 | 0.452 |
| COL | 0.727 | 0.517 | 0.476 | 0.000 |
| ECU | 0.365 | 0.26 | 0.000 | 0.247 |
| PER | 0.011 | 0.935 | 0.000 | 0.499 |
| PRY | 0.964 | −0.074 | 0.509 | 0.000 |
| URY | 0.394 | 0.025 | 0.259 | 0.000 |
| VEN | 0.880 | 0.418 | 0.505 | 0.000 |
Loading matrix for data on the number of COVID-19 deaths with three components (C1, C2, and C3) for the indicated country using the DPCA.
| Country | DPCA C1 | DPCA C2 | DPCA C3 |
|---|---|---|---|
| ARG | 0.438 | 0.000 | 0.000 |
| BOL | 0.000 | 0.000 | 0.707 |
| BRA | 0.000 | 0.594 | 0.000 |
| CHI | 0.000 | 0.566 | 0.000 |
| COL | 0.476 | 0.000 | 0.000 |
| ECU | 0.000 | 0.000 | 0.707 |
| PER | 0.000 | 0.571 | 0.000 |
| PRY | 0.509 | 0.000 | 0.000 |
| URY | 0.259 | 0.000 | 0.000 |
| VEN | 0.505 | 0.000 | 0.000 |
Figure 2. Cumulative variance plots of the number of COVID-19 (a) infected cases and (b) deaths.
Figure 3. Plots of the number of COVID-19 (a) infected cases and (b) deaths for the indicated country and month.
Figure 4. Cluster plots with (a,b) and (c,d) 3 of the number of COVID-19 (a,c) infected cases and (b,d) deaths.
Figure 5. Harmonic components (a,c) 1 and (b,d) 2 of the FPCA for data on the number of COVID-19 (a,b) infected cases and (c,d) deaths.
Loading matrix for data on the number of COVID-19 infected cases with two components (C1 and C2) for the indicated country using the FPCA.
| Country | FPCA C1 | FPCA C2 |
|---|---|---|
| ARG | 7707.767 | −4791.927 |
| BOL | −3319.535 | 2075.921 |
| BRA | 5959.808 | 2033.667 |
| CHI | 1755.468 | 5492.041 |
| COL | 5692.081 | −1991.679 |
| ECU | −4878.641 | −249.683 |
| PER | 2686.448 | 3222.517 |
| PRY | −2023.569 | −3145.071 |
| URY | −5732.866 | −1702.523 |
| VEN | −7846.961 | −943.263 |
Figure 6. FPCA space of countries for data on the number of COVID-19 (a) infected cases and (b) deaths.
Loading matrix for data on the number of COVID-19 deaths with two components (C1 and C2) for the indicated country using the FPCA.
| Country | FPCA C1 | FPCA C2 |
|---|---|---|
| ARG | 64.104 | 266.481 |
| BOL | 100.787 | −6.533 |
| BRA | 91.109 | −47.293 |
| CHI | 86.967 | −101.372 |
| COL | 90.490 | 61.045 |
| ECU | 1.621 | 36.494 |
| PER | 243.162 | −133.709 |
| PRY | −162.396 | 43.453 |
| URY | −252.150 | −55.442 |
| VEN | −263.694 | −63.125 |
Center of the listed cluster for the number of COVID-19 infected cases in the indicated country using the k-means method.
| Cluster | ARG | BOL | BRA | CHI | COL | ECU | PER | PRY | URY | VEN |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 561.375 | 696.837 | 1931.774 | 3376.349 | 519.295 | 760.843 | 2070.694 | 469.959 | 542.704 | 70.595 |
| 2 | 6018.274 | 1910.198 | 4642.314 | 2849.414 | 4791.238 | 1544.550 | 4108.457 | 2248.937 | 283.328 | 679.155 |
| 3 | 5036.105 | 2978.432 | 6610.513 | 4758.301 | 6124.419 | 1765.944 | 3652.946 | 3600.873 | 5002.985 | 430.467 |
Membership of the listed cluster for the number of COVID-19 infected cases in the indicated month using the k-means method.
| Month | 20-03 | 20-04 | 20-05 | 20-06 | 20-07 | 20-08 | 20-09 | 20-10 | 20-11 | 20-12 | 21-01 | 21-02 | 21-03 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 3 | 3 | 1 |
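The center and membership tables together suggest that k-means is applied to the months, with each country as one coordinate. Under that reading, every reported center should equal the mean of the monthly values of the months assigned to its cluster. The sketch below, in plain Python, checks this for two countries (ARG and BRA), using values copied from the infected-cases data matrix. The helper `cluster_means` is a name introduced here.

```python
# Cluster label for each month from 2020-03 to 2021-03 (membership table above).
membership = [1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 1]

# Monthly infected cases per million for two countries, from the data matrix.
cases = {
    "ARG": [23.320, 74.653, 274.872, 1054.943, 2804.951, 5010.048, 7373.831,
            9202.697, 5699.844, 4446.897, 6675.956, 3985.461, 1379.088],
    "BRA": [26.887, 383.281, 2011.963, 4173.853, 5929.842, 5860.889, 4246.639,
            3409.259, 3764.939, 6304.567, 7192.142, 6334.831, 3062.888],
}

# Centers reported in the cluster-center table above.
reported = {
    "ARG": {1: 561.375, 2: 6018.274, 3: 5036.105},
    "BRA": {1: 1931.774, 2: 4642.314, 3: 6610.513},
}


def cluster_means(values, labels):
    """Average the values that share a cluster label."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sorted(sums)}


centers = {country: cluster_means(values, membership) for country, values in cases.items()}

for country, by_cluster in centers.items():
    for label, mean in by_cluster.items():
        gap = abs(mean - reported[country][label])
        print(f"{country} cluster {label}: recomputed {mean:.3f}, gap {gap:.4f}")
```

For these two countries the recomputed means agree with the reported centers to three decimals, which supports reading the tables as a clustering of months rather than of countries.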
Center of the listed cluster for the number of COVID-19 deaths in the indicated country using the k-means method.
| Cluster | ARG | BOL | BRA | CHI | COL | ECU | PER | PRY | URY | VEN |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 247.169 | 158.400 | 90.571 | 76.322 | 114.499 | 173.270 | 85.269 | 75.569 | 2.016 | 7.246 |
| 2 | 52.277 | 14.108 | 62.800 | 35.649 | 45.349 | 47.384 | 48.724 | 25.119 | 10.317 | 2.010 |
| 3 | 74.524 | 123.325 | 143.075 | 148.556 | 138.657 | 56.396 | 199.571 | 34.600 | 25.852 | 4.841 |
Membership of the listed cluster for the number of COVID-19 deaths in the indicated month using the k-means method.
| Month | 20-03 | 20-04 | 20-05 | 20-06 | 20-07 | 20-08 | 20-09 | 20-10 | 20-11 | 20-12 | 21-01 | 21-02 | 21-03 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster | 2 | 2 | 2 | 3 | 3 | 3 | 1 | 1 | 2 | 2 | 3 | 3 | 2 |