Syed Muhammad Ishraque Osman, Ahmed Sabit.
Abstract
In this study, we examine which state-level features and policies matter most for achieving a threshold-level vaccination rate to curb the effects of the COVID-19 pandemic. We apply CHAID, a decision tree algorithm, to three model specifications, using a dataset that covers all states in the United States. Workplace travel emerges as the most important predictor; however, the governors' political affiliation (PA) replaces it in a more conservative feature set that includes economic features and the growth rate of COVID-19 cases. We also employ several alternative algorithms as a robustness check. Results from these checks confirm our original findings regarding workplace travel and political affiliation. Accuracy under the different model specifications ranges from 80% to 88%, and sensitivity ranges from 92.5% to 100%. Our findings provide actionable policy insights to increase vaccination rates and combat the COVID-19 pandemic.
Keywords: COVID-19; Decision tree; Health policy; Machine learning; Vaccination; Vaccine hesitancy
Year: 2022 PMID: 36128042 PMCID: PMC9479385 DOI: 10.1016/j.mlwa.2022.100408
Source DB: PubMed Journal: Mach Learn Appl ISSN: 2666-8270
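The abstract describes CHAID as the decision tree algorithm used to rank predictors. As a rough illustration of the core idea only, the sketch below picks the split variable with the smallest Pearson chi-square p-value against a binary outcome. It is a simplified sketch, not the authors' implementation: it handles binary predictors only (one degree of freedom), omits CHAID's category merging and Bonferroni adjustment, and uses made-up toy data. The function names and feature values are hypothetical.

```python
import math
from collections import Counter


def chi_square_2x2(xs, ys):
    """Pearson chi-square statistic and p-value for two binary (0/1) variables."""
    n = len(xs)
    observed = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    stat = 0.0
    for x in (0, 1):
        for y in (0, 1):
            expected = row_totals[x] * col_totals[y] / n
            if expected > 0:
                stat += (observed[(x, y)] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def best_split(features, target):
    """Return the feature with the smallest p-value, plus all p-values."""
    p_values = {name: chi_square_2x2(col, target)[1] for name, col in features.items()}
    return min(p_values, key=p_values.get), p_values


# Toy data, invented for illustration (1 = vaccination threshold met).
target = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "political_affiliation": [0, 0, 0, 1, 1, 1, 1, 1],
    "state_emergency": [1, 0, 1, 0, 1, 0, 1, 0],
}
chosen, p_values = best_split(features, target)
print(chosen, p_values)
```

A full CHAID implementation would first bin continuous predictors such as the mobility features, merge categories that do not differ significantly, and then recurse on each child node.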
Definition of the features.
| Features | Description |
|---|---|
| Economic Index | A single statistic that summarizes economic condition of a state |
| Grocery Travels | Change in visits to grocery stores relative to the baseline period |
| Covid Growth Rate | Growth rate of COVID cases |
| Mask Mandate State | State issued mask mandate |
| Mask Mandate School | Requirement of wearing a mask in school |
| Park Visits | Change in visits to parks relative to the baseline |
| Political Affiliation | Political affiliation of the Governor of a state |
| Residential Travels | Change in visits to places of residence relative to the baseline |
| Retail & Recreation | Change in visits to places like restaurants, shopping centers and libraries relative to the baseline |
| Retail Sales | State-level retail sales |
| State Emergency | Whether a state declared emergency or not |
| Transit | Change in visits to transit stations (subway, bus/train stations) relative to the baseline |
| Vaccine Mandate State | Any type of vaccine mandate by a state |
| Vaccine Mandate School | Vaccine mandate for school employees |
| Workplace Travels | Change in visits to workplaces relative to the baseline |
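Several features above are defined as a change in visits relative to a baseline period. A minimal sketch of such a measure follows, assuming the change is expressed as a percentage of the baseline value; the source does not state the exact formula, and the function name and numbers are hypothetical.

```python
def percent_change_from_baseline(visits, baseline):
    """Percent change in visits relative to a baseline value (assumed formula)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (visits - baseline) / baseline


# Hypothetical example: 80 workplace visits against a baseline of 100.
print(percent_change_from_baseline(80, 100))
```

Under this convention, a negative value means fewer visits than in the baseline period.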
Summary statistics of the features (Vaccination threshold not met).
| Features | Type | Mean | Median | Standard dev. |
|---|---|---|---|---|
| Economic Index | Float | 131.23 | 130.87 | 13.80 |
| Grocery Travels | Float | 10.83 | 10.71 | 6.69 |
| Covid Growth Rate | Float | 0.27 | 0.22 | 0.32 |
| Mask Mandate State | Categorical (0/1) | 0.25 | 0.0 | 0.63 |
| Mask Mandate School | Categorical (0/1) | 0.47 | 0.0 | 0.71 |
| Park Visits | Float | 60.56 | 63.82 | 45.85 |
| Political Affiliation (PA) | Categorical (0/1) | 0.62 | 1.0 | 0.49 |
| Residential Travels | Float | 3.39 | 3.43 | 1.34 |
| Retail Recreation | Float | 6.90 | 8.75 | 7.20 |
| Retail Sales | Float | 31.77 | 31.40 | 3.94 |
| State Emergency | Categorical (0/1) | 0.50 | 0.50 | 0.50 |
| Transit | Float | 5.90 | 8.21 | 14.18 |
| Vaccine Mandate State | Categorical (0/1) | 0.82 | 1.0 | 0.84 |
| Vaccine Mandate School | Categorical (0/1) | 0.07 | 0.0 | 0.26 |
| Workplace Travels | Float | −18.55 | −18.42 | 3.05 |
Robustness check: Top features, ranked in order of importance, across ML algorithms.
| Algorithm | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| LCV | Workplace, Economic Index, Retail Sales | Workplace, Economic Index, Retail Sales | PA |
| RCV | Workplace, Vaccine School, Growth Cases | Workplace, Growth Cases, Residential | PA |
| ECV | Workplace, Retail Sales, Economic Index | Workplace, Retail Sales, Economic Index | PA |
| LR | Vaccination School, PA, Workplace | PA, Growth Cases, Workplace | PA |
| SVM | Workplace, Vaccine Mandate, PA | Workplace, Residential, PA | PA |
| XGB | Workplace, Vaccine Schools, Parks | Workplace, Parks, Growth Cases | PA |
Fig. 1. Summary Statistics of Continuous Value Features by Vaccination Threshold.
Summary statistics of the features (Vaccination threshold met).
| Features | Type | Mean | Median | Standard dev. |
|---|---|---|---|---|
| Economic Index | Float | 127.64 | 124.15 | 11.04 |
| Grocery Travels | Float | 6.42 | 6.20 | 5.15 |
| Covid Growth Rate | Float | 0.04 | 0.05 | 0.07 |
| Mask Mandate State | Categorical (0/1) | 0.60 | 0.0 | 0.84 |
| Mask Mandate School | Categorical (0/1) | 0.80 | 0.0 | 0.42 |
| Park Visits | Float | 75.64 | 69.40 | 30.44 |
| Political Affiliation (PA) | Categorical (0/1) | 0.10 | 0.0 | 0.31 |
| Residential Travels | Float | 5.81 | 5.83 | 1.82 |
| Retail Recreation | Float | 2.51 | 2.22 | 6.90 |
| Retail Sales | Float | 37.43 | 33.45 | 7.20 |
| State Emergency | Categorical (0/1) | 0.40 | 0.0 | 0.51 |
| Transit | Float | −15.22 | −17.56 | 15.02 |
| Vaccine Mandate State | Categorical (0/1) | 0.90 | 1.0 | 0.31 |
| Vaccine Mandate School | Categorical (0/1) | 0.50 | 0.50 | 0.52 |
| Workplace Travels | Float | −24.56 | −25.17 | 3.12 |
Fig. 2. CHAID model structure.
Fig. 3. CHAID dendrogram for Model 1.
Fig. 4. CHAID dendrogram for Model 2.
Fig. 5. CHAID dendrogram for Model 3.
Models’ performances.
| Models | In-sample Accuracy | In-sample Sensitivity | Cross-validated Accuracy | Cross-validated Sensitivity |
|---|---|---|---|---|
| Model 1 | 0.88 | 0.925 | 0.88 | 0.92 |
| Model 2 | 0.80 | 1.0 | 0.80 | 1.0 |
| Model 3 | 0.80 | 1.0 | 0.80 | 1.0 |
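The two metrics in the table above can be computed from predicted and true class labels. The sketch below shows the standard definitions, assuming label 1 ("threshold met") is treated as the positive class; the source does not say which class is positive, and the toy labels are invented to show how an accuracy of 0.80 can coexist with a sensitivity of 1.0.

```python
def accuracy_and_sensitivity(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # Sensitivity (recall): share of actual positives that were predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


# Toy labels, invented for illustration: every positive is caught,
# but two negatives are misclassified as positive.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
acc, sens = accuracy_and_sensitivity(y_true, y_pred)
print(acc, sens)
```

In this toy case all errors are false positives, so sensitivity stays at 1.0 while accuracy drops to 0.80.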
Algorithm performance comparison.
| Algorithm | MAE | RMSE | EV | R² | MSE |
|---|---|---|---|---|---|
| LCV | 0.26 | 0.33 | 0.46 | 0.44 | 0.12 |
| RCV | 0.25 | 0.32 | 0.57 | 0.55 | 0.12 |
| ECV | 0.26 | 0.33 | 0.46 | 0.45 | 0.12 |
| LR | 0.18 | 0.35 | 0.50 | 0.72 | 0.18 |
| SVM | 0.18 | 0.37 | 0.35 | 0.52 | 0.18 |
| XGB | 0.20 | 0.36 | 0.70 | 0.84 | 0.20 |
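For readers unfamiliar with the column headings above, the following sketch computes the usual textbook versions of these error metrics on made-up values. It assumes EV stands for explained variance and that the remaining goodness-of-fit column is the coefficient of determination; neither is spelled out in the extracted table, and the data and function name are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, explained variance, and R^2 as a dict."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE by definition
    mean_y = sum(y_true) / n
    mean_e = sum(errors) / n
    var_y = sum((t - mean_y) ** 2 for t in y_true) / n
    var_e = sum((e - mean_e) ** 2 for e in errors) / n
    explained_variance = 1.0 - var_e / var_y
    r2 = 1.0 - (mse * n) / (var_y * n)  # 1 - SS_res / SS_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "EV": explained_variance, "R2": r2}


# Toy values, invented for illustration.
y_true = [0, 1, 1, 0]
y_pred = [0.2, 0.8, 0.6, 0.4]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Explained variance and the coefficient of determination coincide when the prediction errors average to zero, as they do in this toy example, and differ otherwise.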
Fig. 6. Model 1 (Full Model) Feature Importance Scores (X-axis) across Algorithms.
Fig. 7. Model 2 Feature Importance Scores (X-axis) across Algorithms.
Fig. 8. Model 3 Feature Importance Scores (X-axis) across Algorithms.
Statistical significance of features across models.
| Features | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| Growth Case | 0.83 (6) | 0.5 (2) | 1.0 (2) |
| Economic Index | 0.67 (3) | 0.67 (3) | 1.0 (2) |
| 0.67 (3) | 0.67 (3) | ||
| Retail Recreation | 1.0 (9) | 1.0 (8) | – |
| Grocery and Pharmacy | 1.0 (9) | 1.0 (8) | – |
| Parks | 0.83 (6) | 0.83 (7) | – |
| Transit | 1.0 (9) | 1.0 (8) | – |
| – | |||
| Residential | 1.0 (9) | 0.67 (3) | – |
| State Emergency | 1.0 (9) | – | – |
| Mask Mandate | 1.0 (9) | – | – |
| Vaccine Mandate | 0.83 (6) | – | – |
| Mask School | 1.0 (9) | – | – |
| 0.5 | – | – | |
| Retail Sales | 0.67 | 0.67 (3) | 1.0 (2) |
Note: “–” indicates that the variable was not included in the respective model; the number in parentheses after each p-value gives the feature's relative rank in terms of p-value.
Denotes significance at the 1% level.
Fig. 9. Distribution of p-values for features across Model 1, Model 2, and Model 3.