George Gkotsis, Anika Oellrich, Sumithra Velupillai, Maria Liakata, Tim J P Hubbard, Richard J B Dobson, Rina Dutta.
Abstract
The number of people affected by mental illness is increasing, and with it the burden on health and social care use, as well as the loss of both productivity and quality-adjusted life-years. Natural language processing of electronic health records is increasingly used to study mental health conditions and risk behaviours on a large scale. However, narrative notes written by clinicians do not capture first-hand the patients' own experiences, and only record cross-sectional, professional impressions at the point of care. Social media platforms have become a source of 'in the moment' daily exchange, with topics including well-being and mental health. In this study, we analysed posts from the social media platform Reddit and developed classifiers to recognise and classify posts related to mental illness according to 11 disorder themes. Using a neural network and deep learning approach, we could automatically recognise mental illness-related posts in our balanced dataset with an accuracy of 91.08% and select the correct theme with a weighted average accuracy of 71.37%. We believe that these results are a first step in developing methods to characterise large amounts of user-generated content that could support content curation and targeted interventions.
Year: 2017 PMID: 28327593 PMCID: PMC5361083 DOI: 10.1038/srep45141
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Overall workflow of our approach.
Summary and description of the mental health condition themes and their originating subreddits used in this study.
| Theme | #Posts | Description |
|---|---|---|
| BPD | 11,880 | Forum to discuss aspects of Borderline Personality Disorder either as a sufferer, someone closely related to a sufferer, or someone interested in this disorder |
| bipolar ( | 41,636 | Communities to discuss issues surrounding Bipolar Disorder; while bipolar and BipolarReddit focus on sufferers and their support, BipolarSOs invites contributions from people that are in a relationship with someone suffering from Bipolar Disorder |
| schizophrenia | 4,963 | Subreddit to discuss schizophrenia-type disorders and schizophrenia-related issues such as psychosis |
| Anxiety | 57,523 | Forum for anything that is related to an anxiety disorder; does not distinguish between sufferer or someone related to a sufferer |
| depression | 197,436 | A community for helping anyone struggling with depression; posters are not limited to those who have received a diagnosis by their GP/hospital doctor and the emphasis is on supporting others in their struggle with depression |
| selfharm ( | 17,102 | Forums to discuss aspects of people self-harming; while selfharm aims to build a community of sufferers, StopSelfHarm focusses on supporting anyone wanting to stop self-harming even if through a related person |
| SuicideWatch | 90,518 | Forum to support individuals thinking about suicide or people thinking of someone else being at risk of suicide |
| addiction | 4,360 | Community to discuss any physical or psychological dependence, e.g. drugs or video games; encourages self post, but does not exclude non-sufferers |
| cripplingalcoholism | 38,241 | Community for alcohol-dependent people, with an emphasis on the acceptance of the condition, also stretching to embracing their condition |
| Opiates ( | 65,143 | Forums to discuss opiate addiction; while opiates addresses all aspects of the addiction, OpiatesRecovery focusses strongly on supporting everyone wanting to withdraw from opiates; Posting to opiates is restricted to people aged over 18 years |
| autism | 9,470 | Forum for anything related to an Autism Spectrum Disorder; provides information and support to anyone facing a diagnosis whether for themselves or someone else |
| Non-mental health | 476,388 | Control dataset generated using posts from users on subreddits outlined above, who have posted on other subreddits that are not mental health related |
Where appropriate, multiple subreddits participating in one theme are presented in brackets.
Evaluation results for the binary classification of mental health posts using all classifiers.
| Metric | FF | CNN | Linear | SVM |
|---|---|---|---|---|
| Precision | 92.05% | 91.76% | 87.31% | 88.50% |
| Recall | 88.83% | 89.83% | 83.57% | 81.75% |
| FM | 0.90 | 0.91 | 0.85 | 0.85 |
| Accuracy | 90.78% | 91.08% | 86.01% | 85.87% |
FF = Feed Forward, CNN = Convolutional Neural Network, SVM = Support Vector Machine. FM = F-measure (harmonic mean of precision and recall).
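As a quick consistency check on the table above, the F-measure can be recomputed from the reported precision and recall. This is a minimal sketch, assuming the balanced F1 form of the harmonic mean; the function name `f_measure` and the dictionary `reported` are illustrative and not taken from the study.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F1)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the binary classification table,
# converted from percentages to fractions.
reported = {
    "FF": (0.9205, 0.8883),
    "CNN": (0.9176, 0.8983),
    "Linear": (0.8731, 0.8357),
    "SVM": (0.8850, 0.8175),
}

# Recompute the F-measure for each classifier, rounded to two decimals
# as in the table.
recomputed = {name: round(f_measure(p, r), 2) for name, (p, r) in reported.items()}

for name, value in recomputed.items():
    print(f"{name}: FM = {value:.2f}")
```

Under this assumption the recomputed values agree with the FM row of the table to two decimal places.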
Confusion matrix for the binary classification of mental from non-mental health content using a Convolutional Neural Network classifier.
| | Non-mental | Mental |
|---|---|---|
| Non-mental | 87821 | 7361 |
| Mental | 9277 | 81986 |
Results concern the test dataset (20% posts held out from training). Rows = actual labels, Columns = predicted labels.
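The CNN figures in the binary evaluation table can be derived from these four counts. The sketch below treats "Mental" as the positive class, which is an assumption consistent with the reported numbers; the variable and function names are illustrative only.

```python
# Counts from the CNN confusion matrix (rows = actual, columns = predicted).
# Assumption: "Mental" is the positive class.
tn, fp = 87821, 7361    # actual non-mental: predicted non-mental / mental
fn, tp = 9277, 81986    # actual mental: predicted non-mental / mental


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall and accuracy from raw confusion-matrix counts."""
    return {
        "precision": tp / (tp + fp),            # correct among predicted mental
        "recall": tp / (tp + fn),               # correct among actual mental
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


metrics = binary_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these counts the sketch gives a precision of about 91.76%, a recall of about 89.83% and an accuracy of about 91.08%, matching the CNN column reported above.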
Figure 2. Multiclass classification confusion matrix using a Convolutional Neural Network (CNN) classifier.
Multiclass classification evaluation metrics using a Convolutional Neural Network.
| Theme | Precision | Recall | FM |
|---|---|---|---|
| BPD | 0.88 | 0.46 | 0.60 |
| bipolar | 0.77 | 0.60 | 0.67 |
| schizophrenia | 0.75 | 0.48 | 0.58 |
| Anxiety | 0.83 | 0.75 | 0.79 |
| depression | 0.70 | 0.77 | 0.73 |
| selfharm | 0.70 | 0.58 | 0.64 |
| SuicideWatch | 0.62 | 0.59 | 0.61 |
| addiction | 0.72 | 0.41 | 0.52 |
| cripplingalcoholism | 0.68 | 0.76 | 0.72 |
| Opiates | 0.76 | 0.86 | 0.80 |
| autism | 0.84 | 0.71 | 0.77 |
| Weighted average | 0.72 | 0.71 | 0.72 |
Overall evaluation results for four different classification approaches.
| Metric | FF | CNN | Linear | SVM |
|---|---|---|---|---|
| Accuracy | 70.82% | 71.37% | 58.72% | 64.02% |
| MRR | 0.82 | 0.83 | 0.74 | 0.78 |
FF = Feed Forward, CNN = Convolutional Neural Network, SVM = Support Vector Machine. MRR = Mean Reciprocal Rank.
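Mean Reciprocal Rank rewards a classifier for placing the correct theme near the top of its ranked predictions, even when the top choice is wrong. The following is a minimal sketch of the standard definition, assuming each post has exactly one true theme and that the classifier ranks every theme, so the true one always appears in the list. The function name and the toy rankings are hypothetical and are not data from the study.

```python
def mean_reciprocal_rank(ranked_predictions, true_labels):
    """Average of 1/rank of the true label over all posts.

    Assumes the true label appears somewhere in each ranking.
    """
    total = 0.0
    for ranking, truth in zip(ranked_predictions, true_labels):
        rank = ranking.index(truth) + 1   # ranks are 1-based
        total += 1.0 / rank
    return total / len(true_labels)


# Hypothetical toy example: three posts, themes ordered by classifier score.
rankings = [
    ["depression", "Anxiety", "SuicideWatch"],   # true theme at rank 1
    ["Anxiety", "depression", "SuicideWatch"],   # true theme at rank 2
    ["autism", "BPD", "bipolar"],                # true theme at rank 3
]
truths = ["depression", "depression", "bipolar"]

mrr = mean_reciprocal_rank(rankings, truths)
print(f"MRR = {mrr:.2f}")
```

An MRR of 1.0 would mean the correct theme is always ranked first, so a value such as 0.83 indicates that when the top prediction is wrong, the correct theme is usually close behind.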
Figure 3. Architecture for the Feed Forward (a) and Convolutional Neural Network (b) deep learning approaches.