Cecilia Lao1, Jo Lane2, Hanna Suominen1,3.
Abstract
BACKGROUND: Effective suicide risk assessments and interventions are vital for suicide prevention. Although such risks are best assessed by health care professionals, people experiencing suicidal ideation may not seek help. Machine learning (ML) and computational linguistics can therefore provide analytical tools for understanding and analyzing these risks, which in turn facilitates suicide intervention and prevention.
Keywords: evaluation study; interdisciplinary research; linguistics; machine learning; mental health; natural language processing; social media; suicide risk
Year: 2022 PMID: 36040781 PMCID: PMC9472054 DOI: 10.2196/35563
Source DB: PubMed Journal: JMIR Form Res ISSN: 2561-326X
Example of a typical Reddit post from the data set and the suicide rating.
| Features | Value |
| Post ID | 1a2b3c |
| User ID | 45678 |
| Time stamp (Unix epoch) | 1.4E+09 |
| Subreddit | r/self-harm |
| Post body | “I’ve been feeling depressed for a while. I don’t know how to deal with it anymore...” |
| Label | Severe risk |
Figure 1. Flowchart detailing the data preprocessing stages.
Mann-Whitney U test results for expert-annotated users.
| Feature | Examples | At-risk, mean (SD) | No-risk, mean (SD) | P value | 95% CIs for differences between medians |
| Clout | N/Aa | 36.81 (16.62) | 48.21 (11.48) | .005 | −17.83 to −6.590 |
| Authenticity | N/A | 64.82 (21.31) | 47.35 (20.86) | .005 | 10.04 to 27.09 |
| First-person singular pronouns | I, my, and mine | 7.105 (2.979) | 5.419 (2.194) | .04 | 0.6900 to 2.840 |
| Negation | Not, no, and never | 1.391 (0.8524) | 0.7924 (0.6987) | .01 | 0.2250 to 0.9650 |
aN/A: not applicable.
Mann-Whitney U test results for crowd-annotated users.
| Feature | Examples | At-risk, mean (SD) | No-risk, mean (SD) | P value | 95% CIs for differences between medians |
| Clout | N/Aa | 32.00 (15.71) | 40.48 (16.42) | <.001 | −12.29 to −5.315 |
| Authenticity | N/A | 71.66 (19.57) | 58.73 (20.18) | <.001 | 9.985 to 17.75 |
| First-person singular pronouns | I, my, and mine | 8.346 (2.902) | 6.738 (2.579) | <.001 | 1.120 to 2.195 |
| Negation | Not, no, and never | 1.717 (1.072) | 1.284 (1.031) | .001 | 0.1500 to 0.6000 |
aN/A: not applicable.
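The Mann-Whitney U statistic behind the two tables above can be illustrated with a minimal sketch. It assumes average ranks for ties and a two-sided normal approximation without tie or continuity correction; the source does not state which software or settings were used, and the function names and the scores below are hypothetical, made up purely for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U for x, U for y, approximate two-sided P value)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x
    # Normal approximation (assumption: no tie or continuity correction).
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, u_y, p_value


if __name__ == "__main__":
    # Hypothetical per-user feature scores, NOT taken from the study.
    at_risk = [71.2, 64.8, 80.1, 59.3, 75.5]
    no_risk = [47.4, 52.0, 61.7, 44.9, 50.3]
    print(mann_whitney_u(at_risk, no_risk))
```

For small samples or heavily tied data, an exact or tie-corrected test from a statistics library would be the safer choice; the approximation here is only meant to show how the statistic is built from ranks.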
Figure 2. Box plot for authenticity for at-risk and no-risk users (expert).
Figure 3. Box plot for authenticity for at-risk and no-risk users (crowd).
Figure 4. Box plot for clout for at-risk and no-risk users (expert).
Figure 5. Box plot for clout for at-risk and no-risk users (crowd).
Summary of classification results of various machine learning modelsa.
| Models | AUCb | Accuracy | Precision | Recall | F1-score |
| Gradient boost | 0.67 | 0.62 | 0.61 | 0.67 | 0.62 |
| Random forest | 0.66 | 0.75 | 0.65 | 0.66 | 0.65 |
| Support vector machine | 0.68 | 0.53 | 0.64 | 0.68 | 0.52 |
aThe precision, recall, and F1-scores are the macroaverage of the different classes.
bAUC: area under the receiver operating characteristic curve.
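The macroaveraging mentioned in the table footnote can be sketched as follows: precision, recall, and F1-score are computed per class and then averaged with equal weight per class. This is an illustrative sketch only; the function name `macro_scores` and the labels are hypothetical, and classes with no predictions are assumed to score 0.

```python
def macro_scores(y_true, y_pred):
    """Return accuracy plus macroaveraged precision, recall, and F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Assumption: undefined ratios (zero denominator) are scored as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical labels (1 = at-risk, 0 = no-risk), NOT from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
    print(macro_scores(y_true, y_pred))
```

Because each class counts equally, macroaveraged scores can differ noticeably from accuracy when the classes are imbalanced.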
Permutation importance results for AUCa, precision, and recall.
| Features | Gradient boost, mean (SD) | Random forest, mean (SD) | Support vector machine, mean (SD) |
| AUC | | | |
| Authenticity | 0.071 (0.041) | 0.041 (0.027) | N/Ab |
| Negative emotion | 0.034 (0.024) | 0.017 (0.01) | N/A |
| Clout | 0.02 (0.016) | N/A | N/A |
| Whitespace | N/A | N/A | 0.01 (0.005) |
| Precision | | | |
| Authenticity | 0.057 (0.026) | 0.030 (0.013) | N/A |
| Clout | 0.018 (0.013) | 0.035 (0.012) | N/A |
| Negative emotion | 0.016 (0.014) | 0.020 (0.011) | N/A |
| First-person singular pronouns | N/A | 0.015 (0.008) | N/A |
| Quantitative processes | N/A | N/A | 0.014 (0.010) |
| Informality | N/A | N/A | 0.011 (0.008) |
| Recall | | | |
| Negative emotion | N/A | 0.022 (0.015) | N/A |
| Positive emotion | N/A | 0.021 (0.011) | N/A |
| Question mark | N/A | 0.016 (0.007) | N/A |
| Affect | N/A | 0.013 (0.008) | N/A |
| Function words | N/A | 0.013 (0.008) | N/A |
| Colon | N/A | N/A | 0.011 (0.004) |
| Ingest | N/A | N/A | 0.01 (0.005) |
aAUC: area under the receiver operating characteristic curve.
bN/A: not applicable.
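Permutation importance, as reported in the table above, measures how much a chosen score drops when one feature's values are shuffled across rows. The following is a minimal sketch of that idea, not the authors' implementation: the source does not say which library or settings were used (scikit-learn's `sklearn.inspection.permutation_importance` is one common option), and the toy model, data, and function names here are hypothetical.

```python
import random
import statistics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def permutation_importance(predict, X, y, metric, n_repeats=10, seed=0):
    """Return {column index: (mean score drop, SD of score drop)}.

    predict: callable mapping a list of rows to a list of predictions.
    X: list of rows (each row a list of feature values); y: true labels.
    """
    rng = random.Random(seed)
    baseline = metric(y, predict(X))
    results = {}
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column, leaving every other feature untouched.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [
                row[:col] + [value] + row[col + 1:]
                for row, value in zip(X, shuffled)
            ]
            drops.append(baseline - metric(y, predict(X_perm)))
        results[col] = (statistics.mean(drops), statistics.stdev(drops))
    return results


if __name__ == "__main__":
    # Hypothetical toy data: only feature 0 determines the label.
    X = [[0.9, 3], [0.8, 1], [0.7, 4], [0.6, 2],
         [0.1, 3], [0.2, 1], [0.3, 4], [0.4, 2]]
    y = [1, 1, 1, 1, 0, 0, 0, 0]

    def toy_predict(rows):
        return [1 if row[0] > 0.5 else 0 for row in rows]

    print(permutation_importance(toy_predict, X, y, accuracy))
```

A feature the model ignores gets an importance of exactly 0 under this scheme, which is one reading of why many cells in the table are marked N/A for a given model; the source itself does not spell out the selection rule.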
Figure 6. Screenshot of Reddit Care Resources.
Figure 7. Screenshot of search results for “self-harm”.