Rafael Salas-Zárate, Giner Alor-Hernández, María Del Pilar Salas-Zárate, Mario Andrés Paredes-Valverde, Maritza Bustos-López, José Luis Sánchez-Cervantes.
Abstract
Among mental health diseases, depression is one of the most severe, as it often leads to suicide; it is therefore important to identify and summarize existing evidence on detecting signs of depression on social media from user-provided data. This review examines primary studies on depression detection from social media submissions published from 2016 to mid-2021. The search for primary studies was conducted in five digital libraries (ACM Digital Library, IEEE Xplore Digital Library, SpringerLink, Science Direct, and PubMed) and, to broaden the results, in the Google Scholar search engine. Extracting and synthesizing the data from each paper was the main activity of this work. Thirty-four primary studies were analyzed and evaluated. Twitter was the most studied social media site for depression sign detection. Word embedding was the most prominent linguistic feature extraction method. Support vector machine (SVM) was the most used machine-learning algorithm. Python libraries were the most popular computing tools. Finally, cross-validation (CV) was the most common statistical analysis method used to evaluate the results. Using social media along with computing tools and classification methods contributes to current efforts in public healthcare to detect signs of depression from sources close to patients.
Keywords: depression; sentiment analysis; social media
Year: 2022 PMID: 35206905 PMCID: PMC8871802 DOI: 10.3390/healthcare10020291
Source DB: PubMed Journal: Healthcare (Basel) ISSN: 2227-9032
Summary of related studies.
| Study Reference | Approach | Year | Studies Reviewed | Years Covered |
|---|---|---|---|---|
| Guntuku et al. | Predictive models | 2017 | 12 | 2013–2017 |
| Wang and Gorenstein | Beck Depression Inventory-II | 2013 | 70 | 1996–2012 |
| Gottlieb et al. | Social contexts | 2011 | 30 | 1997–2008 |
Figure 1. Literature review process.
Research questions.
| Research Question (RQ) | Question |
|---|---|
| RQ1 | Which social media sites and features of datasets are mainly used in depression sign detection research? |
| RQ2 | Which are the main linguistic feature extraction methods used for detecting depression signs on social media? |
| RQ3 | Which are the main machine-learning algorithms used in depression sign detection from social media? |
| RQ4 | Which are the main computing tools applied in detecting depression signs on social media? |
| RQ5 | Which are the main statistical analysis methods used to validate results in detecting depression signs on social media? |
Keywords and related concepts of the literature review.
| Area | Keywords | Related Concepts |
|---|---|---|
| Mental health | Depression | Mental illness |
| | | Mental disorder |
| Social media | Social media | Social networks |
| | | Social web |
| | | Microblogs |
| | | NHANES |
Figure 2. Research papers by digital libraries.
Figure 3. PRISMA flow diagram for the literature search.
Figure 4. Type of publication from 2016 to mid-2021.
Figure 5. Geographical distribution.
Social media and corresponding features of datasets used in depression detection research.
| Social Media | Study | Features of Dataset |
|---|---|---|
| | Leis et al. | 140,946 tweets |
| | Kr | 4000+ tweets |
| | Shen et al. | 36,993 depression-candidate dataset users |
| | Chen et al. | 585 and 6596 unique and valid users with their past tweets |
| | Arora and Arora | 3754 tweets |
| | Biradar and Totad | 60,400 tweets |
| | Ma et al. | 54 million tweets |
| | Nadeem | 1,253,594 documents (tweets) as control variables |
| | Yazdavar et al. | 8770 users, including 3981 depressed users and 4789 control subjects |
| | Titla-Tlatelpa et al. | 7999 users with Twitter submissions |
| | Chiong et al. | 22,191 records |
| | Safa et al. | 570 users from the control group of 16,623,164 tweets |
| | Leiva and Freire | 135 depressive users, 752 control-group users |
| | Rissola et al. | 1,076,582 submissions from 1707 unique users |
| | Sadeque et al. | 531,453 submissions from 892 unique users |
| | Tadesse et al. | 1293 depression-indicative posts, 548 standard posts |
| | Wolohan et al. | Reddit posts from a sample of 12,106 users |
| | Burdisso et al. | 887 subjects with 531,394 submissions |
| | Trotzek et al. | 135 depressed users and a random control group of 752 users |
| | Titla-Tlatelpa et al. | 1707 users, Reddit eRisk 2018 task |
| | Martinez-Castaño et al. | eRisk collections containing up to 1000 posts and 1000 comments |
| | Tai et al. | 3599 diaries |
| | Katchapakirin et al. | 35 Facebook users |
| | Wongkoblap et al. | 509 users in the final dataset |
| | Wu et al. | 1294 students with their data |
| | Yang, McEwen, et al. | 22,043,394 status updates from 153,727 users |
| | Aldarwish and Ahmad | 2287 posts |
| | Ophir et al. | 190 Facebook status updates of at-risk adolescents |
| | Chiong et al. | Facebook, Virahonda, 9178 records |
| | Ricard et al. | Data from 749 participants |
| | Reece and Danforth | 43,950 user photographs and data |
| | Mann et al. | 221 students, mean of 16.73 posts per student (60 days) |
| | Chun et al. | 520 users from Instagram through the data collection method |
| | Li et al. | 15,879 Weibo posts from 10,130 distinct Weibo users |
| | Yu et al. | 7,116,958 posts |
| NHANES, K-NHANES | Oh et al. | Dataset of 28,280 participants with 157 variables for NHANES and 4949 participants with 314 variables for K-NHANES |
Figure 6. Social media sites explored in depression sign detection research.
Linguistic feature extraction methods used for detecting depression signs on social media.
| Model | Study |
|---|---|
| Word embedding | Rissola et al. |
| | Wongkoblap et al. |
| | Wu et al. |
| | Ma et al. |
| | Yazdavar et al. |
| | Trotzek et al. |
| | Mann et al. |
| | Titla-Tlatelpa et al. |
| | Yueh et al. |
| N-grams | Wolohan et al. |
| | Rissola et al. |
| | Sadeque et al. |
| | Arora and Arora |
| | Wolohan et al. |
| | Nadeem |
| | Titla-Tlatelpa et al. |
| | Chiong et al. |
| | Safa et al. |
| Tokenization | Tadesse et al. |
| | Arora and Arora |
| | Biradar and Totad |
| | Aldarwish and Ahmad |
| | Trotzek et al. |
| | Titla-Tlatelpa et al. |
| | Chiong et al. |
| | Safa et al. |
| Bag of words | Ricard et al. |
| | Rissola et al. |
| | Nadeem |
| | Mann et al. |
| | Titla-Tlatelpa et al. |
| | Chiong et al. |
| | Safa et al. |
| Stemming | Tadesse et al. |
| | Arora and Arora |
| Emotion analysis | Leis et al. |
| | Shen et al. |
| | Chen et al. |
| Part-of-Speech (POS) tagging | Wu et al. |
| | Leis et al. |
| | Chiong et al. |
| Behavior features | Wu et al. |
| | Yang, McEwen, et al. |
| Sentiment polarity | Leis et al. |
| | Rissola et al. |
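Three of the methods listed above (tokenization, n-grams, and bag of words) are simple enough to illustrate with the Python standard library. The following is a minimal sketch, not the procedure of any reviewed study: the function names, the regular expression used for tokenizing, and the example sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens):
    """Count how often each token occurs, ignoring word order."""
    return Counter(tokens)


# Made-up example sentence, not taken from any dataset in the review.
post = "I feel tired and I feel alone"
tokens = tokenize(post)
bigrams = ngrams(tokens, 2)
bow = bag_of_words(tokens)
```

Here `bigrams` keeps local word order (for example, the pair `("i", "feel")`), whereas `bow` only records counts, which is the main practical difference between the two representations.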
Machine-learning algorithms.
| Machine-Learning Algorithm | Study |
|---|---|
| Support vector machine (SVM) | Leiva and Freire |
| | Rissola et al. |
| | Katchapakirin et al. |
| | Sadeque et al. |
| | Chen et al. |
| | Tadesse et al. |
| | Arora and Arora |
| | Wolohan et al. |
| | Yang, McEwen, et al. |
| | Burdisso et al. |
| | Li et al. |
| | Nadeem |
| | Yazdavar et al. |
| | Oh et al. |
| | Aldarwish and Ahmad |
| | Mann et al. |
| | Titla-Tlatelpa et al. |
| | Chiong et al. |
| | Safa et al. |
| Logistic regression | Leiva and Freire |
| | Rissola et al. |
| | Chen et al. |
| | Tadesse et al. |
| | Reece and Danforth |
| | Yang, McEwen, et al. |
| | Burdisso et al. |
| | Li et al. |
| | Nadeem |
| | Yazdavar et al. |
| | Oh et al. |
| | Trotzek et al. |
| | Martinez-Castaño et al. |
| | Chiong et al. |
| | Safa et al. |
| Neural networks | Kr |
| | Sadeque et al. |
| | Wongkoblap et al. |
| | Wu et al. |
| | Biradar and Totad |
| | Yang, McEwen, et al. |
| | Li et al. |
| | Yazdavar et al. |
| | Trotzek et al. |
| | Mann et al. |
| | Yueh et al. |
| Random forests | Leiva and Freire |
| | Katchapakirin et al. |
| | Chen et al. |
| | Tadesse et al. |
| | Reece and Danforth |
| | Yang, McEwen, et al. |
| | Li et al. |
| | Yazdavar et al. |
| | Titla-Tlatelpa et al. |
| | Chiong et al. |
| | Safa et al. |
| | Yueh et al. |
| Bayesian statistics | Tai et al. |
| | Chen et al. |
| | Arora and Arora |
| | Reece and Danforth |
| | Yang, McEwen, et al. |
| | Burdisso et al. |
| | Nadeem |
| Decision trees | Yang, McEwen, et al. |
| | Nadeem |
| | Oh et al. |
| | Titla-Tlatelpa et al. |
| | Chiong et al. |
| | Safa et al. |
| K-Nearest Neighbor | Leiva and Freire |
| | Yang, McEwen, et al. |
| | Burdisso et al. |
| | Oh et al. |
| Linear regression | Leiva and Freire |
| | Ricard et al. |
| | Yu et al. |
| Ensemble classifiers | Leiva and Freire |
| | Oh et al. |
| Multilayer Perceptron | Chiong et al. |
| | Safa et al. |
| Boosting | Tadesse et al. |
| K-Means | Ma et al. |
Figure 7. Machine-learning algorithms used for detecting depression signs on social media.
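To make the idea of a text classifier concrete, the sketch below implements a small multinomial Naive Bayes model, one member of the Bayesian family listed in the table. It is a stand-in chosen because it fits in a few lines of standard-library Python; it is not the SVM that the review reports as most used, and it does not reproduce any reviewed study. The function names, the labels `"sign"` and `"control"`, and the toy training documents are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Collect per-class document and word counts from tokenized documents."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log posterior, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, made-up training data; labels are hypothetical.
docs = [
    ["sad", "tired", "alone"],
    ["hopeless", "sad"],
    ["happy", "fun", "friends"],
    ["great", "happy", "day"],
]
labels = ["sign", "sign", "control", "control"]
model = train_naive_bayes(docs, labels)
prediction = predict(model, ["sad", "alone"])
```

With only four training documents the result says nothing about real-world performance; the point is only to show how word counts per class turn into a prediction.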
Computing tools used for detecting depression signs on social media.
| Computing Tool | Study |
|---|---|
| Python libraries | Kr |
| | Leiva and Freire |
| | Rissola et al. |
| | Katchapakirin et al. |
| | Tadesse et al. |
| | Wongkoblap et al. |
| | Biradar and Totad |
| | Ma et al. |
| | Burdisso et al. |
| | Nadeem |
| | Yazdavar et al. |
| | Trotzek et al. |
| | Mann et al. |
| | Martinez-Castaño et al. |
| | Safa et al. |
| | Lu et al. |
| LIWC | Shen et al. |
| | Chen et al. |
| | Tadesse et al. |
| | Wolohan et al. |
| | Yang, McEwen, et al. |
| | Li et al. |
| | Yazdavar et al. |
| | Trotzek et al. |
| | Safa et al. |
| Word2Vec | Shen et al. |
| | Rissola et al. |
| | Wu et al. |
| | Ma et al. |
| | Yueh et al. |
| Twitter APIs | Chen et al. |
| | Biradar and Totad |
| | Leis et al. |
| | Kr |
| WordNet | Shen et al. |
| | Arora and Arora |
| FastText | Rissola et al. |
| | Trotzek et al. |
| Weka | Katchapakirin et al. |
| | Li et al. |
| RapidMiner | Katchapakirin et al. |
| | Aldarwish and Ahmad |
| Google Apps | Katchapakirin et al. |
| | Wu et al. |
| Microsoft Excel | Li et al. |
| | Aldarwish and Ahmad |
Figure 8. Computing tools used for detecting depression signs on social media.
Statistical analysis methods used to validate results in detecting depression signs on social media.
| Statistical Analysis Method | Study |
|---|---|
| Cross-validation | Ricard et al. |
| | Wongkoblap et al. |
| | Oh et al. |
| | Tai et al. |
| | Sadeque et al. |
| | Burdisso et al. |
| | Li et al. |
| | Nadeem |
| | Yazdavar et al. |
| | Mann et al. |
| | Titla-Tlatelpa et al. |
| | Chiong et al. |
| Term frequency/inverse document frequency (TF–IDF) | Leiva and Freire |
| | Tadesse et al. |
| | Wolohan et al. |
| | Yang, McEwen, et al. |
| | Aldarwish and Ahmad |
| | Martinez-Castaño et al. |
| | Titla-Tlatelpa et al. |
| Cohen’s kappa statistic | Rissola et al. |
| | Li et al. |
| | Yazdavar et al. |
| | Yang, McEwen, et al. |
| Mean/standard deviation | Chen et al. |
| | Ricard et al. |
| | Mann et al. |
| Mann–Whitney | Ricard et al. |
| | Ophir et al. |
| Likert scale | Kr |
| | Ophir et al. |
| Softmax function | Wongkoblap et al. |
| Variance | Leis et al. |
| Direction method of multipliers | Shen et al. |
| Adam optimizer | Biradar and Totad |
| Pixel-level averages | Reece and Danforth |
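Two of the methods in the table above, cross-validation and Cohen's kappa, can be sketched in plain Python. The code below is illustrative only: the review does not state which number of folds or which fold assignment the individual studies used, so the contiguous, unshuffled split and the function names `k_fold_indices` and `cohens_kappa` are assumptions, and the two rater label lists are made up. Kappa is computed as $(p_o - p_e) / (1 - p_e)$, where $p_o$ is the observed agreement and $p_e$ the agreement expected by chance.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label lists.

    Undefined (division by zero) when chance agreement p_e equals 1,
    e.g. when both raters assign the same single label to every item.
    """
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Made-up example inputs for illustration.
folds = list(k_fold_indices(10, 3))
rater_1 = [1, 1, 0, 0, 1, 0, 1, 0]
rater_2 = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(rater_1, rater_2)
```

In practice the sample order is usually shuffled (and often stratified by class) before the folds are cut; that step is omitted here to keep the sketch deterministic.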