Sonya Negriff, Bistra Dilkina, Laksh Matai, Eric Rice.
Abstract
OBJECTIVE: This study used machine learning (ML) to test an empirically derived set of risk factors for marijuana use. Models were built separately for child welfare (CW) and non-CW adolescents to compare which variables each model selected as important features (risk factors).
Year: 2022 PMID: 36129944 PMCID: PMC9491564 DOI: 10.1371/journal.pone.0274998
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Sample characteristics for Time 1 and 4.
| Characteristic | Child welfare, Time 1 | Child welfare, Time 4 | Non-Child welfare, Time 1 | Non-Child welfare, Time 4 |
|---|---|---|---|---|
| N | 303 | 222 | 151 | 128 |
| Age (std deviation) | 10.84 (1.15) | 18.28 (1.41) | 11.11 (1.15) | 18.15 (1.56) |
| Gender (%) | | | | |
| Male | 50 | 47 | 60 | 56 |
| Female | 50 | 53 | 40 | 44 |
| Ethnicity (%) | | | | |
| African American | 40 | 43 | 32 | 35 |
| Latino | 35 | 34 | 47 | 42 |
| White | 12 | 10 | 10 | 10 |
| Multi-racial | 13 | 13 | 11 | 13 |
| Living Arrangement (%) | | | | |
| With Parent | 52 | 56 | 93 | 85 |
| Foster Care or Extended Family | 48 | 24 | 7 | 3 |
| Without caregiver | n/a | 20 | n/a | 12 |
| Marijuana use (% ever used) | — | 48.2 | — | 41.4 |
Individual predictor variables (features), domains, and descriptives.
| Feature Domain | Feature name | Coding | Range | % missingness | CW: Bivariate correlation with MJ use | nonCW: Bivariate correlation with MJ use |
|---|---|---|---|---|---|---|
| Demographics | Race | white = 0 minority = 1 | [0, 1] | none | .096 | .230 |
| Demographics | Gender | female = 0 male = 1 | [0, 1] | none | -.168 | -.166 |
| Demographics | Age | | [14.71, 22.66] | none | .154 | .016 |
| Mental health | Anxiety | continuous | [0, 85] | 0.57% | -.097 | .033 |
| Mental health | Depression | continuous | [2, 40] | 0.57% | .070 | .146 |
| Mental health | PTSD | continuous | [17, 66] | 1.70% | .050 | .317 |
| Parent/family social support | Parental closeness | continuous (low = risk) | [1,3] | none | .023 | -.297 |
| Parent/family social support | Parental monitoring | continuous (low = risk) | [0, 24] | none | -.258 | -.158 |
| Parent/family social support | Social support | continuous (low = risk) | [1,5] | 0.85% | -.092 | -.196 |
| Peer risk behavior | Peer delinquency | continuous | [0, 46] | none | .286 | .250 |
| Peer risk behavior | Peer alcohol use | 0 = none 1 = some 2 = a lot | [0,1,2] | 1.14% | .203 | .129 |
| Peer risk behavior | Peer marijuana use | 0 = none 1 = some 2 = a lot | [0,1,2] | 1.14% | .380 | .232 |
| Risk behavior | Sexual activity | score 0–11 | [0, 11] | 1.99% | .261 | .265 |
| Risk behavior | Delinquency | continuous | [0, 30.5] | none | .085 | .384 |
| Risk behavior | Externalizing | continuous | [0,36] | 1.70% | .165 | .461 |
| Self-esteem | Global self-worth | continuous (low = risk) | [8,20] | none | -.077 | -.215 |
| Self-esteem | Self-image | continuous (low = risk) | [38, 126] | 1.99% | .017 | -.093 |
| Self-report ACEs | Emotional abuse | no = 0; yes = 1 | [0, 1] | 1.42% | .025 | .303 |
| Self-report ACEs | Emotional neglect | no = 0; yes = 1 | [0, 1] | 1.42% | .079 | .167 |
| Self-report ACEs | Household substance use | no = 0; yes = 1 | [0, 1] | 1.42% | .038 | .095 |
| Self-report ACEs | Physical abuse | no = 0; yes = 1 | [0, 1] | 1.42% | .122 | .306 |
| Self-report ACEs | Physical neglect | no = 0; yes = 1 | [0, 1] | 1.42% | -.083 | .207 |
| Self-report ACEs | Sexual abuse | no = 0; yes = 1 | [0, 1] | 1.42% | .131 | .219 |
| Self-report ACEs | Witnessing IPV | no = 0; yes = 1 | [0, 1] | 1.42% | .076 | .098 |
| Self-report ACEs | Witnessing community violence | continuous | [0, 32] | 1.14% | .247 | .330 |
Note: CW = child welfare; ACEs = adverse childhood experiences; MJ = marijuana; IPV = intimate partner violence.
**p < .01, *p < .05
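The bivariate correlations in the table relate each feature to a binary marijuana-use indicator. The source does not state which coefficient was used. As a minimal sketch, assuming a Pearson coefficient (which is the point-biserial correlation when one variable is coded 0/1), the calculation could look like the following. The function name and the toy data are hypothetical and are not values from the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Toy data (hypothetical): a continuous feature and a 0/1 "ever used" indicator.
feature = [1.0, 2.0, 3.0, 4.0]
ever_used = [0, 0, 1, 1]
r = pearson_r(feature, ever_used)
```

A positive `r` means higher feature values go with marijuana use; for the features coded "low = risk", a negative value is the risk direction.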
Performance metrics for the three machine learning approaches.
| Model | CW: AUC | CW: Precision | CW: Recall | Non-CW: AUC | Non-CW: Precision | Non-CW: Recall |
|---|---|---|---|---|---|---|
| Logistic Regression | 0.79 ± 0.004 | 0.72 ± 0.013 | 0.73 ± 0.009 | 0.87 ± 0.028 | 0.72 ± 0.003 | 0.68 ± 0.037 |
| Lasso | 0.80 ± 0.001 | 0.71 ± 0.015 | 0.75 ± 0.018 | 0.85 ± 0.021 | 0.72 ± 0.007 | 0.66 ± 0.014 |
| SVM | 0.80 ± 0.012 | 0.72 ± 0.01 | 0.74 ± 0.025 | 0.84 ± 0.010 | 0.73 ± 0.011 | 0.69 ± 0.057 |
Note: ± indicates the range across the 5 imputed datasets.
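For readers unfamiliar with the three metrics, the sketch below shows how precision, recall, and AUC are defined for a binary outcome, and one way a metric might be summarized across imputed datasets. It is an illustration only: the function names and toy inputs are hypothetical, and the summary as "mean and max-minus-min spread" is an assumption about what the note means by "range", not the authors' stated procedure.

```python
from typing import List, Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    pos: List[float] = [s for t, s in zip(y_true, scores) if t == 1]
    neg: List[float] = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and spread (max minus min) of one metric across imputed datasets.

    Assumption: this is only one possible reading of "range" in the table note.
    """
    return sum(values) / len(values), max(values) - min(values)


# Toy inputs (hypothetical, not study data).
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
scores = [0.9, 0.2, 0.4, 0.8, 0.5]
precision, recall = precision_recall(y_true, y_pred)
area = auc(y_true, scores)
```

The pairwise AUC is quadratic in sample size, which is fine for illustration at the sample sizes reported here.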
Fig 1. Plot of individual predictors selected by model ranked by Permutation Feature Importance value for a) Child Welfare and b) non-Child Welfare groups.
Fig 2. Plot of individual predictors selected by model ranked by coefficient for a) Child Welfare and b) non-Child Welfare groups.
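Fig 1 ranks predictors by Permutation Feature Importance. The general idea is to shuffle one feature at a time and measure how much a fitted model's score drops. The sketch below illustrates that idea in plain Python. It is not the authors' implementation: the use of accuracy as the score, the repeat count, the `toy_model` rule, and the data are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]


def accuracy(predict: Callable[[Row], int], X: Sequence[Row], y: Sequence[int]) -> float:
    """Fraction of rows whose predicted label matches the true label."""
    return sum(1 for row, label in zip(X, y) if predict(row) == label) / len(y)


def permutation_importance(
    predict: Callable[[Row], int],
    X: Sequence[Row],
    y: Sequence[int],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances: List[float] = []
    for j in range(len(X[0])):
        drops: List[float] = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link with the outcome.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row: Row) -> int:
    """Hypothetical fitted model that uses only the first feature."""
    return 1 if row[0] > 0.5 else 0


# Toy data (hypothetical): feature 0 drives the label, feature 1 is noise.
X = [[0.1, 5.0], [0.9, 3.0], [0.2, 4.0], [0.8, 1.0],
     [0.3, 2.0], [0.7, 6.0], [0.4, 8.0], [0.6, 7.0]]
y = [toy_model(row) for row in X]
imp = permutation_importance(toy_model, X, y)
```

Here the unused feature gets an importance of zero because shuffling it cannot change any prediction, while the feature the model relies on shows a positive drop. Unlike the coefficient ranking in Fig 2, this measure does not depend on feature scaling.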