Eun Kyong Shin, Ruhi Mahajan, Oguz Akbilgic, Arash Shaban-Nejad.
Abstract
The importance of social components of health has been emphasized in both epidemiology and public health. This paper highlights the significant impact of social components on health outcomes in a novel way. Introducing the concept of sociomarkers, which are measurable indicators of the social conditions in which a patient is embedded, we employed a machine learning approach that uses both biomarkers and sociomarkers to identify asthma patients at risk of a hospital revisit after an initial visit, with an accuracy of 66%. The analysis was performed on an integrated dataset consisting of individual-level patient information such as gender, race, insurance type, and age, along with ZIP code-level sociomarkers such as poverty level, blight prevalence, and housing quality. Using this uniquely integrated database, we then compared the traditional biomarker-based risk model with the sociomarker-based risk model. The biomarker-based predictive model yields an accuracy of 65%, and the sociomarker-based model predicts with an accuracy of 61%. Without knowing specific symptom-related features, the sociomarker-based model can correctly predict two out of three patients at risk. We systematically show that sociomarkers play an important role in predicting health outcomes at the individual level in pediatric asthma cases. Additionally, by merging multiple data sources with detailed neighborhood-level data, we directly measure the importance of residential conditions for predicting individual health outcomes.
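The integration step described in the abstract, attaching ZIP code-level sociomarkers to individual patient records, can be sketched as a left join on ZIP code. This is a minimal illustration only, not the authors' pipeline: the record layout, the field names (`patient_id`, `zip`, `poverty_level`, `blight_prevalence`), and all values are hypothetical.

```python
# Hypothetical patient records and ZIP-level sociomarkers; every value here
# is illustrative and not taken from the study data.
patients = [
    {"patient_id": 1, "zip": "00001", "age": 7, "medicaid": 1},
    {"patient_id": 2, "zip": "00002", "age": 12, "medicaid": 0},
    {"patient_id": 3, "zip": "00009", "age": 4, "medicaid": 1},  # ZIP with no sociomarker data
]
sociomarkers_by_zip = {
    "00001": {"poverty_level": 0.31, "blight_prevalence": 0.048},
    "00002": {"poverty_level": 0.12, "blight_prevalence": 0.020},
}


def attach_sociomarkers(patients, sociomarkers_by_zip):
    """Left-join ZIP-level sociomarkers onto patient rows.

    Patients whose ZIP code has no sociomarker entry keep their row and
    receive None for every sociomarker column.
    """
    marker_names = sorted({name for m in sociomarkers_by_zip.values() for name in m})
    merged = []
    for patient in patients:
        markers = sociomarkers_by_zip.get(patient["zip"], {})
        row = dict(patient)  # copy so the input records are not mutated
        for name in marker_names:
            row[name] = markers.get(name)
        merged.append(row)
    return merged


rows = attach_sociomarkers(patients, sociomarkers_by_zip)
```

Every patient in the same ZIP code receives identical sociomarker values, which is what makes these features neighborhood-level rather than individual-level.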
Keywords: Population screening; Risk factors
Year: 2018 PMID: 31304329 PMCID: PMC6550159 DOI: 10.1038/s41746-018-0056-y
Source DB: PubMed Journal: NPJ Digit Med ISSN: 2398-6352
Fig. 1 Analytic framework: Sociomarkers and biomarkers
Variables and operationalization
| Variable | Operationalization | N (%) or mean (SD) |
|---|---|---|
| Revisit dummy | 0: No revisit in 2016 | 2855 (77.62) |
| | 1: Revisit in 2016 | 823 (22.38) |
| Gender | 0: Female | 1442 (39.21) |
| | 1: Male | 2236 (60.79) |
| Race | 0: White | 389 (10.58) |
| | 1: African American | 3289 (89.42) |
| Age | Age of a patient in years | 7.422 (4.66) |
| Length of hospitalization | Days of hospitalization | 0.90 (1.43) |
| Symptom severity | Severity of pediatric asthma (ICD-10 codes) | 1.57 (0.60) |
| Insurance type | 0: Non-Medicaid patient | 533 (14.49) |
| | 1: Medicaid patient | 3145 (85.51) |
| Blight prevalence | The ratio of unoccupied properties within a ZIP code area | 0.048 (0.024) |
| Housing quality | Mean of the quality ratings of properties located within a ZIP code area (1: Excellent; 5: Severely dilapidated) | 1.90 (0.283) |
| Neighborhood inequality | Standard deviation of housing quality data within a ZIP code area | 0.80 (0.18) |
| Poverty level | Percentage of individuals under the federal poverty level within a ZIP code area | 0.31 (0.11) |
| Trash presence | The ratio of properties with dumped trash within a ZIP code area | 0.0050 (0.0047) |
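The percentages reported for the categorical variables can be reproduced from the counts. The short check below uses only counts that appear in the table; the helper name `share` is an illustrative choice.

```python
# Total sample size, taken as the sum of the two revisit categories in the table.
TOTAL = 2855 + 823

# Counts copied from the table for the category coded 1 of each binary variable.
counts = {
    "Revisit in 2016": 823,
    "Male": 2236,
    "African American": 3289,
    "Medicaid": 3145,
}


def share(count, total=TOTAL):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {label: share(n) for label, n in counts.items()}
```

The revisit share of roughly 22% means the outcome classes are imbalanced, which is worth keeping in mind when reading the accuracy figures.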
Classification statistics (in %) for each model with RF and SVM techniques
| Technique | Model | Test acc. | Test spec. | Test sens. | Training acc. | Training spec. | Training sens. |
|---|---|---|---|---|---|---|---|
| RF | Model 1 | 66.05 | 67.63 | 64.82 | 66.11 | 67.67 | 64.82 |
| RF | Model 2 | 65.39 | 67.11 | 64.07 | 65.48 | 67.12 | 64.14 |
| RF | Model 3 | 61.17 | 62.59 | 60.11 | 61.28 | 62.70 | 60.16 |
| SVM | Model 1 | 62.10 | 62.00 | 62.32 | 62.21 | 62.10 | 62.35 |
| SVM | Model 2 | 59.58 | 59.89 | 59.41 | 59.70 | 59.96 | 59.48 |
| SVM | Model 3 | 57.83 | 59.07 | 56.98 | 57.97 | 59.17 | 57.08 |
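The three statistics in this table follow the standard confusion-matrix definitions, which can be sketched as below. The counts passed to the function are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's confusion matrix, and a revisit is assumed to be the positive class.

```python
def classification_stats(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity in percent.

    tp/fn: at-risk patients (revisit) predicted correctly/incorrectly.
    tn/fp: patients without a revisit predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": 100 * (tp + tn) / total,   # all correct predictions
        "sensitivity": 100 * tp / (tp + fn),   # share of revisits detected
        "specificity": 100 * tn / (tn + fp),   # share of non-revisits detected
    }


# Hypothetical counts for illustration only.
stats = classification_stats(tp=60, fp=35, tn=65, fn=40)
```

Accuracy is a weighted average of sensitivity and specificity, so it always lies between the two, as it does in every row of the table.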
Two-tailed t-test results to compare accuracies of models
| Technique | Model 1 vs. Model 2 | Model 1 vs. Model 3 | Model 2 vs. Model 3 |
|---|---|---|---|
| RF | 0.65*** | 4.88*** | 4.22*** |
| SVM | 2.51*** | 4.26*** | 1.75*** |

*** p < 0.001; ** p < 0.01; * p < 0.05
Feature importance results obtained from RF
| Feature | All-inclusive model (Model 1) | Biomarker (Model 2) | Sociomarker (Model 3) |
|---|---|---|---|
| Age | 0.22 | 0.26 | 0.44 |
| Gender | 0.05 | 0.04 | 0.08 |
| Race | 0.02 | 0.03 | 0.04 |
| Duration | 0.34 | 0.61 | NA |
| Severity | 0.07 | 0.05 | NA |
| Blight | 0.05 | NA | 0.07 |
| Broken window | 0.04 | NA | 0.05 |
| Dumping trash | 0.04 | NA | 0.06 |
| Neighborhood quality | 0.05 | NA | 0.06 |
| Neighborhood inequality | 0.04 | NA | 0.08 |
| Poverty | 0.05 | NA | 0.06 |
| Medicaid | 0.03 | NA | 0.06 |
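To read the Model 1 column at a glance, the importances can be summed by feature group. The grouping below is an assumption inferred from the NA pattern in the table: features present in both restricted models are treated as shared, features absent from Model 3 as biomarker-only, and features absent from Model 2 as sociomarker-only. The importance values are copied from the table.

```python
# RF feature importances for the all-inclusive model (Model 1), from the table.
model1_importance = {
    "Age": 0.22, "Gender": 0.05, "Race": 0.02,
    "Duration": 0.34, "Severity": 0.07,
    "Blight": 0.05, "Broken window": 0.04, "Dumping trash": 0.04,
    "Neighborhood quality": 0.05, "Neighborhood inequality": 0.04,
    "Poverty": 0.05, "Medicaid": 0.03,
}

# Assumed grouping, inferred from which columns show NA for each feature.
groups = {
    "shared": ["Age", "Gender", "Race"],
    "biomarker_only": ["Duration", "Severity"],
    "sociomarker_only": [
        "Blight", "Broken window", "Dumping trash", "Neighborhood quality",
        "Neighborhood inequality", "Poverty", "Medicaid",
    ],
}

# Sum importances within each group; round to the table's two-decimal precision.
group_totals = {
    name: round(sum(model1_importance[f] for f in features), 2)
    for name, features in groups.items()
}
```

Under this grouping, the sociomarker-only features together account for about 0.30 of the total importance in Model 1, compared with about 0.41 for hospitalization duration and symptom severity combined.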