Julián Riaño-Moreno, Jhoana P Romero-Leiton, Kernel Prieto.
Abstract
This work aims to explain the behavior of the multi-drug resistance (MDR) percentage of Pseudomonas aeruginosa in Europe through multivariate statistical analysis and machine learning validation, using data from the European Antimicrobial Resistance Surveillance System, the World Health Organization, and the World Bank. We ran a multidimensional panel data regression analysis and used machine learning techniques to validate a pooled panel data case. The results showed that the most important variables explaining the MDR phenomenon across European countries are governance variables, such as control of corruption and the rule of law. The models proposed in this study reflect the complexity of the antibiotic resistance problem. Efforts to control MDR P. aeruginosa, a well-known healthcare-associated infection (HCAI), should focus on solving national governance problems that affect resource distribution, in addition to individual guidelines such as promoting the appropriate use of antibiotics.
Keywords: corruption index; data panel; governance index; machine learning; multi-drug resistance
Year: 2022 PMID: 35203815 PMCID: PMC8868180 DOI: 10.3390/antibiotics11020212
Source DB: PubMed Journal: Antibiotics (Basel) ISSN: 2079-6382
Geographical distribution of the countries of the EU/EEA regions used in this study.
| Region | Countries |
|---|---|
| Northern | Estonia, Finland, Ireland, Iceland, Norway, Sweden, United Kingdom, Lithuania, Latvia, Denmark. |
| Southern | Cyprus, Greece, Spain, Croatia, Malta, Slovenia, Italy, Portugal. |
| Eastern | Bulgaria, Czechia, Hungary, Poland, Romania, Slovakia. |
| Western | Austria, Germany, France, Luxembourg, Netherlands, Belgium. |
Study variables: description.
| Variable Name | Definition |
|---|---|
| R_multi (or MDR-Pa) | Antimicrobial multi-drug resistance (MDR) percentage, defined as combined resistance to at least three of the antibiotic groups reported by EARSS: piperacillin-tazobactam, ceftazidime, fluoroquinolones, aminoglycosides, and carbapenems. |
| Year | Years from 2005 to 2018 (time dimension of the data panel) |
| Country | Country name (cross-sectional geographical unit of the data panel) |
| Region | Region name (eastern, northern, southern, western) |
| GDP_total | Gross domestic product |
| GOV_effect | Government effectiveness index (−2.5 weak; 2.5 strong) |
| GDP_health | Gross domestic product for health |
| CTRL_corrup | Control of corruption index (−2.5 weak; 2.5 strong) |
| Rule_law | Rule of law index (−2.5 weak; 2.5 strong) |
| Per_cap_US | Current health expenditure per capita in US dollars |
| Out_pocket_exp | Out-of-pocket expense |
| HDI | Human development index (0–1) |
Figure 1. Clusters using the k-means method.
Coefficients for the initial panel data for the TWFE of the MDR-Pa model.
| Variable | Coefficient | SE | p-value | 95% CI |
|---|---|---|---|---|
| GDP_health | −0.0259 | 0.0106 | 0.0149 *** | [−0.0468, −0.0468] |
| GDP_total | −1.07 × 10⁻⁶ | 1.59 × 10⁻⁶ | 0.5019 | [−4.18 × 10⁻⁶, 2.05 × 10⁻⁶] |
| HDI | −1.0664 | 0.5871 | 0.0702 * | [−2.2212, 0.0883] |
| GOV_effect | 0.02323 | 0.0272 | 0.3936 | [−0.0302, 0.0767] |
| CTRL_corrup | −0.0818 | 0.0278 | 0.0035 *** | [−0.1365, −0.0271] |
| Rule_law | 0.1139 | 0.034 | 0.0009 *** | [0.0468, 0.1809] |
| Per_cap_US | 4.02 × 10⁻⁵ | 1.79 × 10⁻⁶ | 0.0249 *** | [5.09 × 10⁻⁶, 7.54 × 10⁻⁵] |
| Out_pocket_exp | 0.006 | 0.0019 | 0.0023 *** | [0.0021, 0.0098] |
*** p-value < 0.05, * p-value < 0.1.
Coefficients for the final panel data for the TWFE of the MDR-Pa model.
| Variable | Coefficient | SE | p-value | 95% CI |
|---|---|---|---|---|
| GDP_health | −0.018 | 0.0065 | 0.0061 *** | [−0.0311, −0.0052] |
| CTRL_corrup | −0.079 | 0.0270 | 0.0035 *** | [−0.1325, −0.0263] |
| Rule_law | 0.1177 | 0.0317 | 0.0002 *** | [0.0552, 0.1801] |
| Per_cap_US | 2.58 × 10⁻⁵ | 9.28 × 10⁻⁶ | 0.0056 *** | [7.60 × 10⁻⁶, 4.41 × 10⁻⁵] |
| Out_pocket_exp | 0.0053 | 0.0018 | 0.0052 *** | [0.0015, 0.009] |
*** p-value < 0.05.
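A two-way fixed effects model removes everything that is constant within a country and everything that is common to a year before estimating the slopes. The sketch below illustrates that idea for a single regressor on a balanced panel, using the within transformation (subtract the unit mean and the period mean, then add back the grand mean). It is an illustration only, not the authors' implementation; the function name `twfe_slope` and the tuple layout are assumptions made for this example.

```python
from collections import defaultdict


def twfe_slope(rows):
    """Slope of y on x with unit and period fixed effects.

    rows: iterable of (unit, period, x, y) tuples forming a BALANCED panel
    (hypothetical layout, chosen for this sketch).
    """

    def two_way_demean(values):
        # values maps (unit, period) -> float
        unit_sum, unit_n = defaultdict(float), defaultdict(int)
        time_sum, time_n = defaultdict(float), defaultdict(int)
        for (u, t), v in values.items():
            unit_sum[u] += v
            unit_n[u] += 1
            time_sum[t] += v
            time_n[t] += 1
        grand = sum(values.values()) / len(values)
        # Within transformation: v - unit mean - period mean + grand mean
        return {
            (u, t): v - unit_sum[u] / unit_n[u] - time_sum[t] / time_n[t] + grand
            for (u, t), v in values.items()
        }

    rows = list(rows)
    x = two_way_demean({(u, t): xv for u, t, xv, _ in rows})
    y = two_way_demean({(u, t): yv for u, t, _, yv in rows})

    # OLS slope on the demeaned data (single regressor, no intercept needed)
    sxy = sum(x[k] * y[k] for k in x)
    sxx = sum(x[k] ** 2 for k in x)
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic panel: y = country effect + year effect + 2 * x
    panel = [
        (u, t, (i + 1) * (t + 1), 10 * i - 3 * t + 2.0 * (i + 1) * (t + 1))
        for i, u in enumerate("ABC")
        for t in range(4)
    ]
    print(twfe_slope(panel))
```

With several regressors, the same demeaning is applied to every column and the slopes come from a multiple regression on the demeaned data.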
Two-way fixed effects (TWFE) of the final MDR model validation tests.
| Test | Statistic | p-value |
|---|---|---|
| Durbin–Watson | 1.366244 | 0.0000 *** |
| Breusch–Pagan | 649.0717 | 0.0000 *** |
| Jarque–Bera | 822.4368 | 0.0000 *** |
*** p-value < 0.05.
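Two of these diagnostics have simple closed forms. The Durbin–Watson statistic compares successive residuals, and values well below 2 (such as the 1.37 reported above) point to positive first-order autocorrelation. The Jarque–Bera statistic combines the skewness and kurtosis of the residuals to test normality. The sketch below computes both from a plain list of residuals; it assumes one ordered residual series, whereas panel implementations typically compute the differences within each unit.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squares."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e * e for e in residuals)
    return num / den


def jarque_bera(residuals):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), using population moments."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


if __name__ == "__main__":
    # Toy residuals that alternate in sign (negative autocorrelation)
    toy = [1.0, -1.0, 1.0, -1.0]
    print(durbin_watson(toy), jarque_bera(toy))
```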
R² and adjusted R² for the final TWFE of the MDR-Pa model.
| Metric | Value |
|---|---|
| R² | 0.816533 |
| Adjusted R² | 0.792106 |
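The two values above are consistent with a coefficient of determination and its degrees-of-freedom-adjusted counterpart. The usual relationship between them is shown below, where $n$ (number of observations) and $p$ (number of estimated slope parameters, including any fixed-effect dummies) are symbols introduced here for illustration; the source does not state which degrees-of-freedom convention was used.

```latex
\bar{R}^{2} = 1 - \left(1 - R^{2}\right)\frac{n - 1}{n - p - 1}
```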
Figure 2. TWFE panel MDR-Pa model by geographical unit: country effects.
Hausman test between fixed and random effects.
| χ² | DF | p-value |
|---|---|---|
| 96.50 | 9 | 2.2 × |
*** p-value < 0.05.
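For reference, the textbook form of the Hausman statistic compares the fixed-effects and random-effects coefficient vectors, weighted by the difference of their covariance matrices. Under the null hypothesis that the random-effects estimator is consistent, it follows a chi-squared distribution with $k$ degrees of freedom, where $k$ is the number of compared coefficients (9 in the table above). The notation below is generic and is not taken from the source.

```latex
H = \left(\hat{\beta}_{\mathrm{FE}} - \hat{\beta}_{\mathrm{RE}}\right)^{\top}
    \left[\widehat{\operatorname{Var}}\!\left(\hat{\beta}_{\mathrm{FE}}\right)
        - \widehat{\operatorname{Var}}\!\left(\hat{\beta}_{\mathrm{RE}}\right)\right]^{-1}
    \left(\hat{\beta}_{\mathrm{FE}} - \hat{\beta}_{\mathrm{RE}}\right)
    \;\sim\; \chi^{2}_{k}
```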
F-test between pooled OLS and fixed effects.
| F | DF1 | DF2 | p-value |
|---|---|---|---|
| 1.099 | 29 | 362 | 0.3342 |
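This test is commonly computed from the residual sums of squares of the two fitted models, as in the generic form below. Here DF1 = 29 and DF2 = 362 are the values from the table, and the RSS symbols are introduced only for illustration. Because the reported p-value (0.3342) is above 0.05, the null hypothesis that the country effects are jointly zero is not rejected, which appears consistent with the pooled panel case mentioned in the abstract.

```latex
F = \frac{\left(\mathrm{RSS}_{\text{pooled}} - \mathrm{RSS}_{\mathrm{FE}}\right) / \mathrm{DF1}}
         {\mathrm{RSS}_{\mathrm{FE}} / \mathrm{DF2}}
```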
Performance comparison of polynomial features and the threshold value concerning the covariance value on the target variable MDR-Pa.
| m | Threshold | Selected Vars | R² (Test) | RMSE (Test) | R² (Train) | Adjusted R² (Train) |
|---|---|---|---|---|---|---|
| 3 | 0.5 | 68 | 0.392 | 0.094 | 0.778 | 0.718 |
| 3 | 0.3 | 111 | 0.637 | 0.073 | 0.876 | 0.810 |
| 4 | 0.4 | 83 | 0.222 | 0.107 | 0.805 | 0.737 |
| 4 | 0.6 | 34 | −0.417 | 0.144 | 0.722 | 0.688 |
| 5 | 0.7 | 6 | 0.496 | 0.086 | 0.627 | 0.620 |
| 5 | 0.5 | 207 | −7.278 | 0.349 | 0.931 | 0.803 |
| 2 | 0.3 | 11 | 0.510 | 0.085 | 0.662 | 0.650 |
| 2 | 0.2 | 19 | 0.558 | 0.080 | 0.705 | 0.686 |
| 2 | 0.1 | 33 | 0.567 | 0.079 | 0.730 | 0.699 |
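The filtering step behind this table can be sketched as follows: expand the predictors into polynomial terms, then keep only the terms whose association with the target exceeds a threshold. The sketch assumes that m denotes the polynomial degree and uses the absolute Pearson correlation as the association measure; the source speaks of a "covariance value", so the exact criterion is an assumption, and all function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import prod, sqrt


def polynomial_features(row, degree):
    """All monomials of the inputs from degree 1 up to `degree` (no bias term)."""
    feats = []
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(row)), d):
            feats.append(prod(row[j] for j in combo))
    return feats


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def select_by_threshold(X, y, degree, threshold):
    """Return indices of expanded features with |corr(feature, y)| >= threshold."""
    expanded = [polynomial_features(row, degree) for row in X]
    columns = list(zip(*expanded))
    return [j for j, col in enumerate(columns) if abs(pearson(col, y)) >= threshold]


if __name__ == "__main__":
    X = [[1, 0], [2, 1], [3, 0], [4, 1]]  # toy predictors
    y = [1, 2, 3, 4]                      # toy target
    print(select_by_threshold(X, y, degree=2, threshold=0.5))
```

Raising the threshold keeps fewer terms, which matches the pattern in the table for a fixed m (for example, 111 variables at 0.3 versus 68 at 0.5 when m = 3).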
Performance comparison of polynomial features, type of algorithm, and the threshold value on the target variable MDR-Pa.
| m | Algorithm | Threshold | Selected Vars | R² (Test) | RMSE (Test) |
|---|---|---|---|---|---|
| 3 | LR | 0.01 | 125 | 0.739 | 0.062 |
| 3 | kNN | 0.01 | 125 | 0.734 | 0.062 |
| 3 | DT | 0.01 | 125 | 0.538 | 0.082 |
| 3 | LR | 0.03 | 84 | 0.747 | 0.061 |
| 3 | kNN | 0.03 | 84 | 0.732 | 0.062 |
| 3 | DT | 0.03 | 84 | 0.563 | 0.080 |
| 3 | LR | 0.05 | 42 | 0.603 | 0.076 |
| 3 | kNN | 0.05 | 42 | 0.738 | 0.062 |
| 3 | DT | 0.05 | 42 | 0.381 | 0.095 |
| 2 | LR | 0.01 | 41 | 0.547 | 0.081 |
| 2 | kNN | 0.01 | 41 | 0.733 | 0.062 |
| 2 | DT | 0.01 | 41 | 0.608 | 0.076 |
| 2 | LR | 0.03 | 35 | 0.564 | 0.080 |
| 2 | kNN | 0.03 | 35 | 0.735 | 0.062 |
| 2 | DT | 0.03 | 35 | 0.527 | 0.083 |
| 5 | LR | 0.05 | 77 | 0.576 | 0.079 |
| 5 | kNN | 0.05 | 77 | 0.732 | 0.062 |
| 5 | DT | 0.05 | 77 | 0.406 | 0.093 |
LR: linear regression algorithm; kNN: k-nearest neighbors algorithm; DT: decision tree algorithm.
Performance of the k-best variable selection and the recursive feature elimination methods on the target variable MDR-Pa.
| m | Algorithm | Filter Method | Selected Vars | R² (Test) | RMSE (Test) |
|---|---|---|---|---|---|
| 3 | LR | kBVS | 60 | 0.653 | 0.071 |
| 3 | kNN | kBVS | 60 | 0.749 | 0.060 |
| 3 | DT | kBVS | 60 | 0.547 | 0.081 |
| 3 | LR | kBVS | 70 | 0.672 | 0.069 |
| 3 | kNN | kBVS | 70 | 0.759 | 0.059 |
| 3 | DT | kBVS | 70 | 0.669 | 0.069 |
| 3 | LR | kBVS | 80 | 0.699 | 0.066 |
| 3 | kNN | kBVS | 80 | 0.765 | 0.058 |
| 3 | DT | kBVS | 80 | 0.657 | 0.071 |
| 5 | LR | kBVS | 60 | 0.639 | 0.069 |
| 5 | kNN | kBVS | 60 | 0.713 | 0.065 |
| 5 | DT | kBVS | 60 | 0.671 | 0.077 |
| 3 | LR | RFE | 55 | 0.717 | 0.077 |
| 3 | DT | RFE | 7 | 0.662 | 0.084 |
| 5 | LR | RFE | 36 | 0.715 | 0.077 |
| 5 | DT | RFE | 64 | 0.716 | 0.077 |
LR: linear regression algorithm; kNN: k-nearest neighbors algorithm; DT: decision tree algorithm; kBVS: k-best variable selection; RFE: recursive feature elimination.
R² and root mean square error (RMSE) for the XGBoost and random forest models.
| Metric | Train (XGBoost) | Test (XGBoost) | Random Forest |
|---|---|---|---|
| R² | 0.92738 | 0.76828 | 0.80342 |
| RMSE | 0.03417 | 0.06279 | 0.05564 |
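Both metrics reported in these tables can be computed directly from observed and predicted values. The sketch below uses the standard definitions, with hypothetical function names. R² on held-out data can be negative when a model predicts worse than the mean of the observed values, which is how negative test scores can arise.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    observed = [0.10, 0.25, 0.40]   # toy resistance proportions
    predicted = [0.12, 0.22, 0.41]  # toy model output
    print(r2_score(observed, predicted), rmse(observed, predicted))
```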
Figure 3. Machine learning using the XGBoost method. (a) Feature importance measured by SHAP values in the training dataset for the target variable MDR-Pa. (c) Feature importance measured by SHAP values in the testing dataset for the target variable MDR-Pa. (b,d) Impact of each feature, measured by SHAP values, for the XGBoost method in the training (b) and testing (d) datasets. Every observation is represented by one dot for each feature. The dot’s position on the x-axis represents the impact of that feature on the model’s prediction for the observation, and the dot’s color represents the value of that feature for the observation.
Figure 4. Machine learning using the random forest method. (a) Feature importance measured by SHAP values in the training dataset for the target variable MDR-Pa. (b) Feature importance measured by SHAP values in the testing dataset for the target variable MDR-Pa.