Hannah Yao1, Sina Rashidian1, Xinyu Dong1, Hongyi Duanmu1, Richard N Rosenthal2, Fusheng Wang1.
Abstract
BACKGROUND: In recent years, both suicide and overdose rates have been increasing. Many individuals who struggle with opioid use disorder are prone to suicidal ideation; this may often result in overdose. However, these fatal overdoses are difficult to classify as intentional or unintentional. Intentional overdose is difficult to detect, partially due to the lack of predictors and social stigmas that push individuals away from seeking help. These individuals may instead use web-based means to articulate their concerns.
Keywords: deep learning; machine learning; natural language processing; opioid epidemic; opioid-related disorders; social media; suicide
Year: 2020 PMID: 33245287 PMCID: PMC7732714 DOI: 10.2196/15293
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Example posts from subreddits belonging to data groups.
| Subreddit group | Title | Body text |
| Control | It becomes less and less acceptable to cry in public the older you get, despite the reasons for doing so becoming more and more valid | Kids just don’t understand |
| Opioid related | Happiness is…a big, dark shot of Heroin after being sick all week. <3 | And also knowing you have enough for not only the next morning but at least a few more days while you figure out what to do next! |
| Depression | I won’t commit suicide but I wouldn’t mind dying | So much shit has been piling on and on. I feel like I am not making the people I care about proud and the only reason they talk to me is because of pity. I will not take my own life, but if a car hit me, I got terminal illness or if something fell on me. I would not be sad about me being gone |
| SuicideWatch | I’m waiting for the courage to end my own life | I feel like I’m close to making a serious attempt sooner than later and I’m ok with that. My impulsive behaviors have gotten worse other the last few months and some of those ways include bodily harm. In September i impulsively jumped out of a friends window and injured myself and now I am cutting myself at random for the first time in years. No one around me understands how exhausting it is to wake up everyday in my own skin. In my own head. I’m sick of the stomach pangs and guilt and crying and disappointment. Some nights I just pray I’ll have the courage to end it. People die in the world all over, shouldn’t matter when I go |
Figure 1. Overview of the convolutional neural network architecture.
Summary of the 2 models for the 2 cases.
| Case | C1 | C2 |
| Goal | Distinguish between suicidal and nonsuicidal language; predict suicidality among opioid users | Distinguish between the language of opioid users and that of depressed individuals who do not use opioids; predict opioid usage among suicidal individuals |
| Data set | 51,366 posts from r/suicidewatch and control subreddits | 59,940 posts from opioid-relevant subreddits and r/depression |
| Vocabulary size | 70,082 | 64,078 |
| Training | Trained and validated on data from r/suicidewatch and control subreddits | Trained and validated on data from opioid-relevant subreddits and r/depression |
| Predicts on | Data from r/opiates | Data from r/suicidewatch |
| Prediction goal | Predicts posts containing suicide risk in r/opiates | Predicts posts containing opioid abuse in r/suicidewatch |
| Posts in the prediction data that are between 30 and 500 words | 23,740 posts from r/opiates | 21,719 posts from r/suicidewatch |
| Keywords used to sample a maximum of 250 prediction posts for MTURKa | Commit suicide, suicide, suicidality, suicidal, want to die, want to do, want to overdose | Benzodiazepines, benzos, cocaine, codeine, fentanyl, heroin, hydrocodine, hydrocodone, hydromorphone, hydros, kratom, methadone, morphine, narcotic, narcotics, opiates, opioid, oxycodone, oxycontin, oxycottin, oxycotton, oxymorphone, suboxone |
| Sample count containing keywords | 234 | 231 |
aMTURK: Amazon Mechanical Turk.
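The sampling rows of the table (posts between 30 and 500 words, a keyword list, at most 250 samples) can be illustrated with a minimal Python sketch. The function names, the whitespace tokenization, the inclusive word bounds, the whole-word matching, and the fixed random seed are all assumptions made for illustration; the source does not state how the authors implemented these steps.

```python
import random
import re

# Keyword list copied from the C1 column of the table above.
C1_KEYWORDS = [
    "commit suicide", "suicide", "suicidality", "suicidal",
    "want to die", "want to do", "want to overdose",
]


def word_count(text):
    """Count whitespace-separated tokens (an assumed tokenization)."""
    return len(text.split())


def contains_keyword(text, keywords):
    """True if any keyword occurs as a whole word or phrase, ignoring case."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(keyword) + r"\b", lowered)
        for keyword in keywords
    )


def sample_for_annotation(posts, keywords, min_words=30, max_words=500,
                          max_samples=250, seed=0):
    """Keep posts in the word range that mention a keyword, then sample."""
    eligible = [
        post for post in posts
        if min_words <= word_count(post) <= max_words
        and contains_keyword(post, keywords)
    ]
    if len(eligible) <= max_samples:
        return eligible
    # Sample without replacement; the seed only makes the sketch repeatable.
    return random.Random(seed).sample(eligible, max_samples)
```

Called as `sample_for_annotation(posts, C1_KEYWORDS)` on a list of post strings, the function returns every eligible post when 250 or fewer qualify, which is in line with the sample counts of 234 and 231 reported in the table.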
F scores achieved for the given different combinations of input for classification.
| Model | LRa | RFb | SVMc | CNNd |
| | | | | |
| TF-IDFe | 0.902 | 0.904 | 0.915 | 0.685 |
| word2vec | 0.928 | | | 0.961 |
| TF-IDF + word2vec | | 0.921 | 0.941 | |
| TF-IDF + GloVe | 0.927 | 0.829 | 0.886 | 0.923 |
| TF-IDF + word2vec + char2vec | 0.914 | 0.790 | 0.856 | 0.962 |
| | | | | |
| TF-IDF | 0.889 | 0.800 | 0.811 | 0.729 |
| word2vec | 0.852 | | | 0.961 |
| TF-IDF + word2vec | | 0.815 | 0.880 | 0.965 |
| TF-IDF + GloVe | 0.860 | 0.494 | 0.765 | 0.814 |
| TF-IDF + word2vec + char2vec | 0.858 | 0.581 | 0.741 | |
|
aLR: logistic regression.
bRF: random forest.
cSVM: support vector machine.
dCNN: convolutional neural network.
eTF-IDF: term frequency-inverse document frequency.
fThe best results achieved by the model are in italics.
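Every input combination in the table above except plain word2vec includes TF-IDF. As a rough illustration of what that feature represents, the following sketch computes TF-IDF weights in plain Python. It uses raw term counts and one common smoothed inverse document frequency, ln((1 + N) / (1 + df)) + 1, without length normalization; the exact weighting used in the study is not stated in the source, so this variant and the toy tokens are assumptions.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: raw term count times a smoothed inverse document
    frequency, ln((1 + N) / (1 + df)) + 1, with no length normalization.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        term_counts = Counter(doc)
        weights.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in term_counts.items()
        })
    return weights


# Hypothetical toy documents, already tokenized.
toy_docs = [["pain", "pills", "pain"], ["pills", "sleep"]]
toy_weights = tfidf(toy_docs)
```

In the toy example, "pills" appears in both documents and keeps a weight of 1.0 per occurrence, while "pain" appears in only one document and is weighted more heavily.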
Comparison of text classification neural network baselines with word embedding and convolutional neural network.
| Model | FASTa | RNNb | ATTENTIONc | CNNd |
| | | | | |
| Accuracy | 0.950 | 0.944 | 0.939 | |
| Precision | 0.958 | 0.953 | 0.934 | |
| Recall | 0.957 | 0.951 | | 0.953 |
| F1 score | 0.957 | 0.952 | 0.949 | |
| | | | | |
| Accuracy | 0.971 | 0.957 | 0.969 | 0.971 |
| Precision | 0.964 | 0.964 | 0.967 | 0.970 |
| Recall | 0.958 | 0.923 | 0.951 | |
| F1 score | 0.961 | 0.943 | 0.959 | |
aFAST: FastText.
bRNN: recurrent neural network.
cATTENTION: attention-based bidirectional recurrent neural network.
dCNN: convolutional neural network.
eThe best score for each model is italicized.
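The last row of each block in the table above is consistent with the harmonic mean of the precision and recall rows, that is, the F1 score. The following sketch recomputes it for the columns where all three values are present. The function name and the block labels are illustrative choices, not taken from the source.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, last-row value) for columns with all three present.
reported = {
    "FAST, first block": (0.958, 0.957, 0.957),
    "RNN, first block": (0.953, 0.951, 0.952),
    "FAST, second block": (0.964, 0.958, 0.961),
    "RNN, second block": (0.964, 0.923, 0.943),
    "ATTENTION, second block": (0.967, 0.951, 0.959),
}

for name, (precision, recall, last_row) in reported.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed {computed:.4f}, reported {last_row:.3f}")
```

Each computed value agrees with the reported one to within the rounding of the three-decimal inputs.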
Figure 2. Visualization of word importance as determined from the resultant word embeddings learned from scratch by the neural network models for classification of suicidal vs nonsuicidal text. CNN: convolutional neural network; LSTM: long short-term memory; RNN: recurrent neural network.
Count and accuracy for model predictions using Amazon Mechanical Turk labels.
| Model | LRa | RFb | SVMc | FASTd | RNNe | ATTENTIONf | CNNg |
| C1 | | | | | | | |
| Predicted number of suicide risk | 24 | 12 | 11 | 97 | 93 | 98 | 103 |
| All data | 0.768 | 0.750 | 0.744 | 0.59 | 0.608 | 0.576 | 0.536 |
| Suicide risk only | 0.2 | 0.1 | 0.092 | 0.783 | 0.791 | 0.75 | 0.833 |
| Nonsuicide risk only | 0.947 | 0.959 | 0.963 | 0.529 | 0.55 | 0.521 | 0.432 |
| C2 | | | | | | | |
| Predicted number of opioid addiction | 88 | 92 | 105 | 92 | 158 | 110 | 127 |
| All data | 0.524 | 0.518 | 0.538 | 0.54 | 0.588 | 0.552 | 0.562 |
| Opioid addiction | 0.230 | 0.237 | 0.273 | 0.251 | 0.414 | 0.295 | 0.334 |
| Nonopioid addiction | 0.892 | 0.869 | 0.869 | 0.9 | 0.806 | 0.874 | 0.847 |
aLR: logistic regression.
bRF: random forest.
cSVM: support vector machine.
dFAST: FastText.
eRNN: recurrent neural network.
fATTENTION: attention-based bidirectional recurrent neural network.
gCNN: convolutional neural network.
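One reading of the rows in the table above is that "All data" is the accuracy over every annotated post, while the two remaining accuracy rows restrict the calculation to posts that annotators labeled positive or negative. Under that assumption, the breakdown can be sketched as follows; the function name, the dictionary keys, and the toy labels are hypothetical.

```python
def accuracy_breakdown(y_true, y_pred, positive=1):
    """Return overall accuracy, accuracy per label subset, and positive count."""
    pairs = list(zip(y_true, y_pred))
    positives = [(t, p) for t, p in pairs if t == positive]
    negatives = [(t, p) for t, p in pairs if t != positive]

    def accuracy(subset):
        # Fraction of pairs where the prediction matches the label.
        if not subset:
            return float("nan")
        return sum(t == p for t, p in subset) / len(subset)

    return {
        "all": accuracy(pairs),
        "positive_only": accuracy(positives),
        "negative_only": accuracy(negatives),
        "predicted_positive": sum(p == positive for _, p in pairs),
    }


# Toy labels: 2 positive posts out of 10, and a model that never predicts 1.
toy_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
toy_result = accuracy_breakdown(toy_true, toy_pred)
```

The toy case shows why a model that rarely predicts the positive class can still score well on all data when positives are a minority, a pattern similar to the LR, RF, and SVM columns in the first block of the table.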
Figure 3. The top bar graph shows changes in prediction performance in C1 depending on data category ratios. The overlaid line plot shows the accuracy achieved using the four models as weak learners in AdaBoost. The bottom graph shows each weak learner's contribution.
Figure 4. Model predictions for individual posts, each represented as a line. A prediction closer to 1 indicates that no suicidal language was predicted, and a prediction closer to 0 indicates that suicidal language was predicted. Individual posts are color coded by their MTURK label. ATTENTION: attention-based bidirectional recurrent neural network; CNN: convolutional neural network; FAST_TEXT: FastText; LR: logistic regression; RF: random forest; RNN: recurrent neural network; SVM: support vector machine.
Model prediction accuracies with heuristic labels determined by the keyword presence.
| Model | LRa | RFb | SVMc | FASTd | RNNe | ATTENTIONf | CNNg |
| All data | 0.606 | 0.588 | 0.648 | 0.666 | 0.774 | 0.69 | 0.712 |
| Opioid addiction | 0.264 | 0.26 | 0.347 | 0.338 | 0.56 | 0.403 | 0.463 |
| Nonopioid addiction | 0.9 | 0.87 | 0.907 | 0.948 | 0.926 | 0.937 | 0.926 |
aLR: logistic regression.
bRF: random forest.
cSVM: support vector machine.
dFAST: FastText.
eRNN: recurrent neural network.
fATTENTION: attention-based bidirectional recurrent neural network.
gCNN: convolutional neural network.
Top words most similar to "suicidal" in the subset of opiates data that each model predicted to belong to the suicidal category.
| Model | Top words |
| LRa | hour, dead, lack, asleep, wake up, sobriety, self, they are, cause, at least, however, hell, later, group, |
| RFb | easy, living, yet, hit, waiting, probably, cold, the same, such, by, tomorrow, body, constantly, saying, working |
| SVMc | making, everyone, once, pills, without, soon, lol, nothing, around, sorry, thing, withdrawal, start, mental, tolerance |
| FASTd | suicidal thoughts, depressive, extreme, emotional, severe anxiety, existing, depressed, irritable, insomnia, severe, nauseous, mood swings, paranoid, fatigued, having trouble |
| RNNe | severely, depressed, diagnosed with, isolated, unbearable, bipolar, suicidal thoughts, anxious, ptsd, an alcoholic, ocd, overwhelmed, irritable, lethargic, extreme |
| ATTENTIONf | diagnosed with, depressive, suicidal thoughts, social anxiety, bipolar, extreme, crippling, irritable, tremors, emotional, borderline, severe depression, brain fog, existing, gad |
| CNNg | paranoid, unhappy, depressed, isolated, apathetic, irritable, an alcoholic, suicidal thoughts, trauma, severely, diagnosed with, brain fog, anxious, manic, emotionless |
aLR: logistic regression.
bRF: random forest.
cSVM: support vector machine.
dFAST: FastText.
eRNN: recurrent neural network.
fATTENTION: attention-based bidirectional recurrent neural network.
gCNN: convolutional neural network.
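A ranking of the words most similar to a query word is commonly obtained by comparing embedding vectors with cosine similarity. The sketch below shows that approach on made-up 3-dimensional vectors. The similarity measure, the function names, and the toy embeddings are assumptions; the source does not state how the similarity in the table above was computed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query, embeddings, top_k=15):
    """Rank all other words by cosine similarity to the query word."""
    query_vector = embeddings[query]
    scored = [
        (word, cosine(query_vector, vector))
        for word, vector in embeddings.items()
        if word != query
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real ones would come from a trained model.
toy_embeddings = {
    "suicidal": [0.9, 0.1, 0.0],
    "depressed": [0.8, 0.2, 0.1],
    "irritable": [0.7, 0.3, 0.2],
    "tomorrow": [0.0, 0.1, 0.9],
}
top_words = most_similar("suicidal", toy_embeddings, top_k=2)
```

With these toy vectors, "depressed" and "irritable" rank above "tomorrow" because their vectors point in nearly the same direction as the query vector.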