Hanjia Lyu1, Yubao Liu1, Yipeng Zhang1, Xiyang Zhang2, Yu Wang1, Jiebo Luo1.
Abstract
BACKGROUND: The COVID-19 pandemic has affected people's daily lives and has caused economic loss worldwide. Anecdotal evidence suggests that the pandemic has increased depression levels among the population. However, systematic studies of depression detection and monitoring during the pandemic are lacking.
Keywords: COVID-19; Twitter; data mining; depression; mental health; natural language processing; social media; transformers
Year: 2021 PMID: 34458682 PMCID: PMC8330892 DOI: 10.2196/26769
Source DB: PubMed Journal: JMIR Infodemiology ISSN: 2564-1891
Figure 1. Density of Twitter coverage regarding “depression,” “ptsd,” “bipolar disorder,” and “autism.” ptsd: posttraumatic stress disorder.
Figure 2. Distributions of positive and negative emotion scores among the depression and nondepression groups. VADER: Valence Aware Dictionary and Sentiment Reasoner.
Figure 3. Linguistic profiles for the depression and nondepression tweets. LIWC: Linguistic Inquiry and Word Count.
Chunk-level performance (%) of all 5 models on the 500-user testing set using training-validation sets of different sizes.a
| Model and training-validation set | Accuracy | F1 | AUCb | Precision | Recall |
| BiLSTMc | | | | | |
| 1000 users | 70.7 | 69.0 | 76.5 | 70.9 | 67.3 |
| 2000 users | 70.3 | 68.3 | 77.4 | 70.7 | 66.1 |
| 4650 users | 72.7 | 71.6 | 79.3 | 72.1 | 71.1 |
| CNNd | | | | | |
| 1000 users | 71.8 | 72.6 | 77.4 | 72.7 | 72.6 |
| 2000 users | 72.8 | 74.5 | 80.3 | 72.2 | 76.9 |
| 4650 users | 74.0 | 70.9 | 81.0 | 77.4 | 68.9 |
| BERTe | | | | | |
| 1000 users | 72.7 | 74.4 | 79.8 | 72.0 | 76.9 |
| 2000 users | 75.7 | 76.3 | 82.9 | 76.1 | 75.7 |
| 4650 users | 76.5 | 77.5 | 83.9 | 76.3 | 78.8 |
| RoBERTaf | | | | | |
| 1000 users | 74.4 | 75.7 | 82.0 | 74.2 | 77.3 |
| 2000 users | 75.9 | 77.9 | 83.2 | 73.8 | |
| 4650 users | 76.2 | | 84.1 | 74.4 | 81.9 |
| XLNet | | | | | |
| 1000 users | 73.7 | 75.1 | 80.7 | 73.2 | 77.2 |
| 2000 users | 74.6 | 76.8 | 82.6 | 72.6 | 81.5 |
| 4650 users | | 77.9 | | | 78.3 |
aWe used 0.5 as the threshold when calculating the scores.
bAUC: area under the receiver operating characteristic curve.
cBiLSTM: bidirectional long short-term memory.
dCNN: convolutional neural network.
eBERT: Bidirectional Encoder Representations from Transformers.
fRoBERTa: Robustly Optimized BERT Pretraining Approach.
gItalics indicate the best performing model in each column.
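Footnote a says the scores were calculated with a threshold of 0.5. The following is a minimal sketch of how thresholded metrics of this kind are typically computed from predicted probabilities; it is not the authors' code. The function name `threshold_metrics` is hypothetical, and treating a probability of exactly 0.5 as positive is an assumption, since the source does not say how ties are handled. AUC is not included because it is computed from the ranking of the scores and does not depend on the threshold.

```python
from typing import Dict, Sequence


def threshold_metrics(
    probs: Sequence[float],
    labels: Sequence[int],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Binarize probabilities at `threshold` and compute standard metrics.

    Assumption: probabilities >= threshold count as the positive class (1).
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not labels:
        raise ValueError("at least one example is required")

    preds = [1 if p >= threshold else 0 for p in probs]

    # Confusion-matrix counts.
    tp = sum(1 for y, yh in zip(labels, preds) if y == 1 and yh == 1)
    fp = sum(1 for y, yh in zip(labels, preds) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(labels, preds) if y == 1 and yh == 0)
    tn = sum(1 for y, yh in zip(labels, preds) if y == 0 and yh == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(labels)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Raising the threshold generally trades recall for precision, which is why a fixed cutoff needs to be stated when comparing models on these columns.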
User-level performance (%) using different features.
| Featuresa | Accuracy | F1 | AUCb |
| VADERc | 54.9 | 61.7 | 54.6 |
| Demographics | 58.7 | 56.0 | 61.4 |
| Engagement | 58.7 | 62.3 | 61.7 |
| Personality | 64.8 | 67.8 | 72.4 |
| LIWCd | 70.6 | 70.8 | 76.0 |
| V + D + E + P + Le | 71.5 | 72.0 | 78.3 |
| XLNet | 78.1 | 77.9 | 84.9 |
| All (random forest) | 78.4 | 78.1 | 84.9 |
| All (logistic regression) | 78.3 | 78.5 | |
| All (SVMg) | | | 86.1 |
aWe used SVM for classifying individual features.
bAUC: area under the receiver operating characteristic curve.
cVADER: Valence Aware Dictionary and Sentiment Reasoner.
dLIWC: Linguistic Inquiry and Word Count.
eV + D + E + P + L: VADER + demographics + engagement + personality + LIWC.
fItalics indicate the best performing model in each column.
gSVM: support vector machine.
Figure 4. Permutation importance of different features. LIWC: Linguistic Inquiry and Word Count; VADER: Valence Aware Dictionary and Sentiment Reasoner.
Figure 5. Aggregated depression level trends of the depression and nondepression groups from January 1 to May 22, 2020. Since users with depression have a substantially higher depression level, we used different y-axes for the 2 groups' depression levels to compare them side by side.
Figure 6. Topic distributions of depression and nondepression groups before and after the announcement of the US National Emergency.
Figure 7. Aggregated depression level trends of the United States, New York, California, and Florida after the announcement of the US National Emergency.
Figure 8. Distributions of the top 5 topics (state level) after the announcement of the US National Emergency.
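Figure 4 reports permutation importance for the feature groups. As a rough illustration of the general technique, and not of the authors' implementation, the sketch below shuffles one feature column at a time and records the resulting drop in accuracy. The names `permutation_importance`, `predict`, `n_repeats`, and `seed` are hypothetical, and accuracy is assumed as the scoring metric.

```python
import random
from typing import Callable, List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def permutation_importance(
    predict: Callable[[List[List[float]]], List[int]],
    X: List[List[float]],
    y: Sequence[int],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled.

    A larger drop means the fitted model relies more on that feature.
    """
    rng = random.Random(seed)
    baseline = accuracy(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [
                list(row[:j]) + [value] + list(row[j + 1:])
                for row, value in zip(X, column)
            ]
            drops.append(baseline - accuracy(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model never uses gets an importance of exactly zero under this scheme, because shuffling it cannot change any prediction.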