Joshua Cohen, Jennifer Wright-Berryman, Lesley Rohlfs, Donald Wright, Marci Campbell, Debbie Gingrich, Daniel Santel, John Pestian.
Abstract
BACKGROUND: As adolescent suicide rates continue to rise, innovation in risk identification is warranted. Machine learning can identify suicidal individuals based on their language samples. This feasibility pilot was conducted to explore this technology's use in adolescent therapy sessions and assess machine learning model performance.
Keywords: machine learning; mental health; natural language processing; risk assessment; suicidal ideation; suicidal risk; therapy
Year: 2020 PMID: 33167554 PMCID: PMC7663991 DOI: 10.3390/ijerph17218187
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Schematic of study procedure.
Summary of machine learning model training data.
| Site | No. Suicidal (%) | No. Mentally Ill (%) | No. Control (%) | Total (%) |
|---|---|---|---|---|
| ACT Study | ||||
| UCMC | 30 (18.6) | 0 | 30 (19.6) | 60 (13.9) |
| STM Study | ||||
| UCMC | 44 (27.5) | 42 (33.3) | 42 (27.5) | 128 (29.2) |
| CCHMC | 43 (26.9) | 42 (33.3) | 41 (26.8) | 126 (28.7) |
| PCH | 43 (26.9) | 42 (33.3) | 40 (26.1) | 125 (28.5) |
| Total | 160 (36.4) | 126 (28.6) | 153 (34.8) | 439 |
Suicide-related items on the Patient Health Questionnaire 9-Item Modified for Adolescents (PHQ-A).
| PHQ-A Item | Question | Response Options |
|---|---|---|
| Item 9 | How often in the past | Not at all (0), Several days (1), More than half the days (2), and Nearly every day (3) |
| Item 12 | Has there been a time in the | Yes or No |
| Item 13 | Have you | Yes or No |
Adolescent participant demographics and PHQ-A answer summaries.
| | | | Sessions with Clinically Relevant Symptoms N = 249 | | | | | Participants N = 60 |
|---|---|---|---|---|---|---|---|---|
| | Participants | Sessions | PHQ-A | Item 9 | Item 12 | Item 13 | Item 9 \| Item 12 \| Item 13 | Item 9 \| Item 12 \| Item 13 |
| | 60 | 267 | 77 (31) | 68 (27) | 39 (16) | 59 (24) | 96 (39) | 29 (48) |
| | 12.8 (2.4) | 12.5 (2.5) | 13.6 (2.4) | 13.7 (2.5) | 14.7 (2.2) | 13.8 (2.5) | 13.5 (2.5) | 13.5 (2.5) |
| | 50.0 | 41.6 | 28.6 | 33.8 | 35.9 | 59.3 | 39.6 | 37.9 |
| | | | | | | | | |
| | 78.3 | 78.7 | 80.5 | 88.2 | 79.5 | 76.3 | 82.3 | 79.3 |
| | 10.0 | 13.9 | 14.3 | 10.3 | 15.4 | 10.2 | 8.3 | 6.9 |
| | 8.3 | 5.6 | 3.9 | 0.0 | 0.0 | 8.5 | 5.2 | 6.9 |
| | 3.3 | 1.9 | 1.3 | 1.5 | 5.1 | 5.1 | 4.2 | 6.9 |
Note: Total scores ≥ 11 on the PHQ-A have been used for diagnosing depression with the greatest sensitivity and specificity in adolescents [41]. The suicide-related questions on the PHQ-A are broken out on a session and participant basis. The vertical bar | indicates a logical OR statement.
Summary of MHSAFE probe usage.
| No. of Probes Discussed | Zero | One | Two | Three | Four | Five |
|---|---|---|---|---|---|---|
| | 3 (1.1) | 5 (1.9) | 11 (4.1) | 20 (7.5) | 29 (10.9) | 198 (74.2) |
| | 532 (338) | 1737 (1430) | 1866 (1418) | 1469 (947) | 2117 (1430) | 1721 (1182) |
| | N/A | 774 (611) | 1438 (1020) | 941 (690) | 1051 (1079) | 813 (740) |
| | 3 (1.2) | 3 (1.2) | 6 (2.4) | 15 (6.0) | 25 (10.0) | 196 (78.7) |
| | 3 (4.3) | 0 (0) | 2 (2.9) | 2 (2.9) | 10 (14.3) | 53 (75.7) |
Figure 2. Leave-one-site-out results for training data with different machine learning (ML) models using (a) controls without mental illness and suicidal thoughts, and (b) controls with and without mental illness and suicidal thoughts. Error bars indicate a 95% confidence interval. ML models used include logistic regression (LR), support vector machines (SVM), and extreme gradient boosting (XGB). Studies and test sites include the ACT study (collected at UCMC) and the STM study collected at CCHMC, PCH, and UCMC.
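The leave-one-site-out scheme named in the caption can be sketched in plain Python. This is a minimal illustration, not the study's code: the function name, the `site` and `label` keys, and the toy records are all assumptions, and only the site labels are taken from the caption. Each site is held out once as the test set while the remaining sites form the training set.

```python
from collections import defaultdict


def leave_one_site_out(samples):
    """Yield (held_out_site, train, test), holding each site out once.

    `samples` is a list of dicts with a "site" key. The key names are
    illustrative assumptions, not taken from the study.
    """
    by_site = defaultdict(list)
    for sample in samples:
        by_site[sample["site"]].append(sample)

    for held_out in sorted(by_site):
        test = by_site[held_out]
        train = [
            sample
            for site, group in by_site.items()
            if site != held_out
            for sample in group
        ]
        yield held_out, train, test


# Toy records: site labels mirror the caption, labels are made up.
toy = [
    {"site": "ACT-UCMC", "label": 1},
    {"site": "STM-UCMC", "label": 0},
    {"site": "STM-CCHMC", "label": 1},
    {"site": "STM-PCH", "label": 0},
    {"site": "STM-PCH", "label": 1},
]
folds = list(leave_one_site_out(toy))
```

In a real pipeline, a model would be fitted on `train` and scored on `test` inside the loop, giving one performance estimate per held-out site.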
Model performance predicting suicidal risk in pilot language data.
| Model | AUC (95% CI) | Optimal No. of Features | Top 5 Features (Feature Importance or Weight) |
|---|---|---|---|
| | | | |
| | 0.78 (0.72–0.84) | 11 | feel like, me angry, i be angry, no no, depression |
| | 0.76 (0.70–0.82) | 11 | yeah it (+), and i (−), play (+), no no (+), depression (−) |
| | 0.75 (0.69–0.81) | 9 | and (−), yeah it (+), play, no no (+), depression (−) |
| | | | |
| | 0.72 (0.65–0.79) | 22 | and i, anymore, because of, college, depression |
| | 0.72 (0.66–0.79) | 27 | at my (−), you (+), yeah it (+), attempt (−), college and (−) |
| | 0.72 (0.65–0.78) | 27 | you (+), yeah it (+), at my (−), attempt (−), college and (−) |
Note: Feature importance was determined from the training data, and words have been replaced by their root (e.g., “am” is the first-person singular form of the verb “be”). Logistic regression and support vector machine feature weights are positive or negative, indicating whether a feature pushed the model’s prediction towards the case (+) or control (−). Extreme gradient boosting feature importance is always positive and reflects how frequently a feature was used to make decisions.
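For readers unfamiliar with the AUC reported in the table, the sketch below computes it from its standard pairwise definition: the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as half. The source does not state how the AUC or its confidence interval were computed, so the function name and the toy data are assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise case/control comparison.

    labels: 1 for case, 0 for control. scores: model outputs, where a
    higher value means "more likely a case". Ties count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy data (made up): three of the four case/control pairs are ranked correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(toy_labels, toy_scores)  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.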
Summary of therapist suicidal risk scores, participants, and sessions.
| Therapist | No. of Participants | No. of Sessions | No. of Cases | Average Suicidal Risk Score (SD) | Suicidal Risk Score Range (Min–Max) |
|---|---|---|---|---|---|
| A | 15 | 66 | 2 | 14.4 (3.1) | 8–26 |
| B | 14 | 54 | 9 | 6.9 (8.7) | 1–51 |
| C | 9 | 67 | 36 | 11.2 (7.3) | 4–43 |
| D | 6 | 26 | 10 | 12.2 (13.5) | 3–70 |
| E | 5 | 16 | 3 | 10.9 (13.2) | 3–54 |
| F | 4 | 18 | 3 | 10.2 (3.4) | 6–16 |
| G | 3 | 9 | 1 | 4.3 (1.9) | 2–8 |
| H | 2 | 4 | 1 | 9.8 (3.0) | 7–14 |
| I | 1 | 5 | 5 | 16.4 (7.1) | 7–24 |
| J | 1 | 2 | 0 | 13 (7.1) | 8–18 |
Note: Cases are defined as item 9 scores > 0 or answering “yes” to item 12 on the PHQ-A.
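The case definition in the note above maps directly onto a small predicate. This is a minimal sketch: the function name and argument names are hypothetical, and the input encoding (an integer 0 to 3 for item 9, a "Yes"/"No" string for item 12) is an assumption based on the response options listed for the PHQ-A items.

```python
def is_case(item9_score, item12_answer):
    """Return True when a session meets the case definition:
    item 9 score > 0, or a "yes" answer to item 12.

    item9_score: integer 0-3 (assumed encoding of the response options).
    item12_answer: "Yes" or "No" (case-insensitive).
    """
    if item9_score not in (0, 1, 2, 3):
        raise ValueError("item 9 is scored 0-3")
    return item9_score > 0 or item12_answer.strip().lower() == "yes"


examples = [
    is_case(0, "No"),   # neither criterion met
    is_case(1, "No"),   # item 9 endorsed
    is_case(0, "Yes"),  # item 12 endorsed
]
```

Note that item 13 does not enter this definition, even though it is one of the suicide-related PHQ-A items.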
Figure 3. Decision boundaries for (a) logistic regression (LR), (b) support vector machine (SVM), and (c) extreme gradient boosting (XGB) models. Controls without mental illness and suicidal language samples from ACT and STM studies were dimensionally reduced using singular value decomposition. ML models were trained on dimensionally reduced language samples and used to classify coordinate points to create decision boundaries. The red and blue regions indicate coordinates that correspond to case and control classification, respectively. The red and blue points show dimensionally reduced language samples collected in this pilot. The LR model (a) shows the simplest rules used for classification and the XGB model (c) creates the most complex rules. Model performance indicated in these figures does not represent performance on non-dimensionally reduced data.