Evan V Goldstein, Fernando A Wilson.
Abstract
Introduction: The federal government legislated supplemental funding to support community health centers (CHCs) in response to the COVID-19 pandemic. Supplemental funding included standard base payments and adjustments for the number of total and uninsured patients served before the pandemic. However, not all CHCs share similar patient population characteristics and health risks. Objective: To use machine learning to identify the most important factors for predicting whether CHCs had a high burden of patients diagnosed with COVID-19 during the first year of the pandemic.
Keywords: COVID-19; community health; community health centers; health promotion; prevention; primary care
Year: 2022 PMID: 35937952 PMCID: PMC9354120 DOI: 10.1177/23333928221115894
Source DB: PubMed Journal: Health Serv Res Manag Epidemiol ISSN: 2333-3928
List of Features Used in the RF Model (n = 82).
| Feature |
|---|
| Percent of patient population non-Hispanic, black |
| Percent of patient population non-Hispanic, white |
| Percent of patient population non-Hispanic, Asian |
| Percent of patient population Hispanic |
| Percent of patient population female |
| Percent of patient population ages 18-64 |
| Percent of patient population ages 65 and older |
| Percent of patient population living at or below the Federal Poverty Line |
| Percent of patient population veterans |
| Percent of patient population without health insurance |
| Percent of patient population covered by Medicaid |
| Percent of patient population covered by Medicare |
| Percent of patient population covered by private health insurance |
| Medicaid managed care enrollment (member months) |
| Percent of patient population diagnosed with obesity |
| Percent of patient population diagnosed with depression or mood disorder |
| Percent of patient population diagnosed with anxiety |
| Percent of patient population diagnosed with diabetes |
| Percent of patient population diagnosed with heart disease |
| Percent of patient population diagnosed with hypertension |
| Percent of patient population diagnosed with asthma |
| Percent of patient population diagnosed with tobacco use disorder |
| Total patients; 1000s |
| Total annual Health Resources & Services Administration funding |
| Indicator if CHC was Health Care for the Homeless “special populations” provider |
| Indicator if CHC was Public Housing Primary Care “special populations” provider |
| Indicator if CHC served predominantly rural (vs urban) patient population |
| Indicator if state enacted any mask-wearing mandate policy in 2020 |
| COVID-19 cases per capita (state level) |
| Unemployment rate (state level) |
| Poverty rate (state level) |
| 51 geographic identifier indicators, one for each U.S. state and D.C. |
Notes: The RF model used the features shown in this table to predict whether CHCs had a high burden of patients diagnosed with COVID-19.
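The feature count in the table can be checked with a short sketch: the 31 non-geographic rows plus one indicator per state and D.C. give 82 inputs. The encoding below is an assumption for illustration (the source does not say how the indicators were built), and the names `STATE_CODES` and `state_indicators` are hypothetical.

```python
# Postal codes for the 50 U.S. states plus D.C. (51 geographic identifiers).
STATE_CODES = (
    "AL AK AZ AR CA CO CT DE DC FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN "
    "MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA "
    "WV WI WY"
).split()

# Number of table rows listed before the geographic indicator row.
N_NON_GEOGRAPHIC_FEATURES = 31


def state_indicators(state: str) -> list[int]:
    """Return 51 zero/one indicators with a single 1 at the CHC's state."""
    if state not in STATE_CODES:
        raise ValueError(f"unknown state code: {state!r}")
    return [1 if code == state else 0 for code in STATE_CODES]


# Total number of model inputs implied by the table.
n_features = N_NON_GEOGRAPHIC_FEATURES + len(STATE_CODES)
```

Tree-based models such as random forests do not require dropping a reference category, which is consistent with the table listing all 51 indicators.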
Performance Metrics of the Tuned Random Forest Model.
| Performance metric | Value |
|---|---|
| Accuracy (%) | 80.9 |
| Precision (%) | 80.1 |
| Sensitivity (%) | 25.0 |
| Specificity (%) | 98.1 |
| AUC-ROC (%) | 78.2 |
Notes: Five-fold cross-validation (CV) was used to validate the model while optimizing its hyperparameters. These performance metrics were calculated by applying the best-fitting model to the held-out test set.
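The four percentage metrics in the table follow from the standard confusion-matrix definitions. The sketch below shows how a classifier can combine high accuracy and specificity with low sensitivity when the positive class is the minority. The counts are hypothetical, chosen only to produce a similar pattern; they are not the study's test-set counts, and `ConfusionCounts` and `classification_metrics` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # high-burden CHCs predicted as high burden
    fp: int  # other CHCs predicted as high burden
    tn: int  # other CHCs predicted as not high burden
    fn: int  # high-burden CHCs predicted as not high burden


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard metrics as fractions between 0 and 1."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": c.tp / (c.tp + c.fp),
        "sensitivity": c.tp / (c.tp + c.fn),  # also called recall
        "specificity": c.tn / (c.tn + c.fp),
    }


# Hypothetical counts (NOT from the study): 100 positives, 300 negatives.
counts = ConfusionCounts(tp=25, fp=5, tn=295, fn=75)
metrics = classification_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

With these counts, three quarters of the positive cases are missed even though most predictions are correct overall, which is why sensitivity is worth reading alongside accuracy.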
Figure 1. Relative importance of the top 20 features in predicting whether CHCs had a high burden of COVID-19 during the first year of the pandemic: 2020. Notes: Feature importance was computed using the Mean Decrease in Impurity method.
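As a rough illustration of Mean Decrease in Impurity, the sketch below computes the weighted impurity decrease for a single split. It assumes Gini impurity as the splitting criterion, which the source does not state, and uses made-up labels; `gini` and `impurity_decrease` are hypothetical helper names.

```python
from collections import Counter


def gini(labels: list[int]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent: list[int], left: list[int],
                      right: list[int], n_total: int) -> float:
    """Impurity decrease of one split, weighted by the node's sample share."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return (n / n_total) * (gini(parent) - weighted_children)


# Made-up labels (1 = high burden, 0 = not) at a root node and its two children.
parent = [1, 1, 1, 1, 0, 0, 0, 0]
left = [1, 1, 1, 0]
right = [1, 0, 0, 0]

decrease = impurity_decrease(parent, left, right, n_total=len(parent))
print(decrease)
```

A feature's importance under this method is the sum of such decreases over every split that uses the feature, averaged across the trees in the forest and usually normalized so that all importances sum to one.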