Sonam Gupta, Lipika Goel, Arjun Singh, Ajay Prasad, Mohammad Aman Ullah.
Abstract
Rapid technological advancements are altering the way people communicate. With the growth of the Internet, social networks (Twitter, Facebook, Telegram, and Instagram) have become popular forums for people to share their thoughts, psychological behavior, and emotions. Psychological analysis examines text and extracts facts, features, and important information from users' opinions. Researchers working on psychological analysis rely on social networks to detect depression-related behavior and activity. Social networks provide abundant data on signals associated with the onset of depression, such as low sociability, undergoing medical treatment, a primary emphasis on oneself, and a high rate of activity during the day and night. In this paper, we used five machine learning classifiers (decision trees, K-nearest neighbor, support vector machines, logistic regression, and LSTM) for depression detection in tweets. The dataset is collected in two forms, balanced and imbalanced, and oversampling techniques are studied. The results show that the LSTM classification model outperforms the other baseline models for depression detection on both balanced and imbalanced data.
Year: 2022 PMID: 35432513 PMCID: PMC9007657 DOI: 10.1155/2022/4395358
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Commonly used human emotions.
Figure 2. Various methods for psychological classification.
Figure 3. Methodology.
Literature survey.
| Author | Dataset | Model | Result |
|---|---|---|---|
| Nguyen et al. | Clinical group and control group | Lasso | Accuracy 93 |
| Pennebaker et al. | | KNN | Accuracy 65 |
| Gaikar et al. | Bipolar disorder and depression illness | SVM | Accuracy 85% |
| Hanai et al. | Audio and text based | LSTM | F1: 0.44 and precision: 0.59% |
| Lang and Cao | | Naive Bayes | Accuracy 86% |
| Trotzek et al. | Reddit message | CNN | Accuracy 87% |
| Lang and Cao | Speech data | Deep CNN | RMSE: 9.0001 and MAE: 7.4211 |
| Islam et al. | Facebook comments | KNN | Accuracy 73% |
| Li et al. | Weibo microblogs | Logistic regression | Accuracy: 77% and precision: 77% |
| Li et al. | Shandong mental health center | Kinetic captured skeleton | Accuracy 96.47% |
| Asare Kennedy et al. | Smart phone dataset | SVM | Accuracy 96.44–98.14% |
| Tadalagi and Joshi | Facial expression | SVM | Accuracy 72.8% |
Figure 4. Proposed methodology for handling the data of tweets.
Results of performance measures with imbalanced dataset.
| Classifier | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| Decision tree | 0.73 | 0.63 | 0.67 | 0.03 |
| SVM | 0.76 | 0.67 | 0.71 | 0.52 |
| KNN | 0.67 | 0.56 | 0.61 | 0.65 |
| LR | 0.75 | 0.68 | 0.71 | 0.71 |
| LSTM | 0.79 | 0.72 | 0.74 | 0.78 |
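
The third numeric column in the table above is consistent with the F1-score, the harmonic mean of precision and recall. The following minimal sketch recomputes it from the precision and recall values listed in the table; the function name `f1_score` and the dictionary layout are illustrative choices, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the imbalanced-dataset table.
rows = {
    "Decision tree": (0.73, 0.63),
    "SVM": (0.76, 0.67),
    "KNN": (0.67, 0.56),
    "LR": (0.75, 0.68),
    "LSTM": (0.79, 0.72),
}

# Recomputed F1 per classifier, rounded to two decimals like the table.
f1_by_model = {name: round(f1_score(p, r), 2) for name, (p, r) in rows.items()}
```

The recomputed values agree with the tabulated column to within about 0.01, which is the slack expected when precision and recall are themselves rounded to two decimals.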
Results of performance measures using SMOTE.
| Classifier | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| Decision tree | 0.76 | 0.64 | 0.69 | 0.68 |
| SVM | 0.79 | 0.69 | 0.73 | 0.62 |
| KNN | 0.69 | 0.59 | 0.63 | 0.71 |
| LR | 0.77 | 0.72 | 0.74 | 0.76 |
| LSTM | 0.84 | 0.75 | 0.79 | 0.83 |
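
SMOTE (Synthetic Minority Oversampling Technique) balances a dataset by creating synthetic minority-class samples on the line segment between an existing minority sample and one of its nearest minority neighbours. The sketch below illustrates that idea in plain Python. It is a simplified illustration under stated assumptions, not the authors' implementation: the function `smote_sketch`, the choice of `k`, and the toy 2-D feature vectors are all hypothetical.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy 2-D feature vectors (hypothetical, not from the paper's dataset).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority_size = 10  # assumed size of the majority class

# Generate enough synthetic points to match the majority class size.
new_points = smote_sketch(minority, n_new=majority_size - len(minority))
balanced_minority = minority + new_points
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the region the minority class already occupies. In practice, resampling of this kind is normally applied to the training split only, so that evaluation data remains untouched.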
Results of performance measures using RUS.
| Classifier | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| DT | 0.72 | 0.64 | 1.80 | 0.67 |
| SVM | 0.77 | 0.67 | 0.71 | 0.59 |
| KNN | 0.67 | 0.57 | 0.61 | 0.68 |
| LR | 0.75 | 0.69 | 0.07 | 0.72 |
| LSTM | 0.82 | 0.73 | 0.77 | 0.80 |
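
Assuming RUS here stands for random undersampling, the balancing step discards randomly chosen samples from the larger class until all classes have the same size. The sketch below shows one way this could look in plain Python; the function `random_undersample` and the placeholder tweet identifiers are hypothetical and are not taken from the paper.

```python
import random


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples until every class is as small as the rarest one."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    # Every class is reduced to the size of the smallest class.
    target = min(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        for x in rng.sample(xs, target):
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy data: 8 items labelled 0 and 3 labelled 1 (hypothetical placeholders).
samples = [f"tweet_{i}" for i in range(11)]
labels = [0] * 8 + [1] * 3
rus_x, rus_y = random_undersample(samples, labels)
```

Unlike oversampling, this approach adds no synthetic data but throws away majority-class examples, which is one plausible reason a model could score slightly differently on RUS-balanced and SMOTE-balanced data.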
Figure 5. The precision value of imbalanced dataset.
Figure 9. The precision value of balanced dataset using SMOTE.
Figure 13. The precision value of balanced dataset using RUS.