Xiaofeng Wang1, Shuai Chen2, Tao Li2, Wanting Li2, Yejie Zhou2, Jie Zheng2, Qingcai Chen2, Jun Yan3, Buzhou Tang2.
Abstract
BACKGROUND: Depression is a serious personal and public mental health problem. Self-reporting is the main method used to diagnose depression and to determine its severity. However, it is not easy to discover patients with depression because many feel ashamed to disclose or discuss their mental health conditions with others. Moreover, self-reporting is time-consuming and usually misses a certain number of cases. Therefore, automatic discovery of patients with depression from other sources, such as social media, has been attracting increasing attention. Social media, as one of the most important daily communication systems, connects large numbers of people, including individuals with depression, and provides a channel for discovering them. In this study, we investigated deep-learning methods for depression risk prediction using data from Chinese microblogs, which have the potential to discover more patients with depression and to trace their mental health conditions.
Keywords: Chinese microblogs; deep learning; depression risk prediction; pretrained language model
Year: 2020 PMID: 32723719 PMCID: PMC7424493 DOI: 10.2196/17958
Source DB: PubMed Journal: JMIR Med Inform
Examples of different depression risk levels in the dataset.
| Depression risk level | Microblog |
| 3 | Weibo: 不出意外的话,我打算死在今年。 (If nothing unexpected happens, I plan to die this year.) |
| 2 | Weibo: 我一直策划着如何自杀,可是放不下的太多了。 (I have been planning how to kill myself, but there is too much I cannot let go of.) |
| 1 | Weibo: 如果我累,真的离开了。 (If I get too tired, I will really leave.) |
| 0 | Weibo: 吃了个早餐应该能维持今天。 (Had breakfast; that should get me through today.) |
Dataset statistics.
| Depression level | Training set (n) | Test set (n) |
| 3 | 103 | 26 |
| 2 | 520 | 130 |
| 1 | 1103 | 276 |
| 0 | 9468 | 2367 |
| All | 11,194 | 2799 |
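As a rough illustration of how such a dataset can be framed for the models evaluated below, the sketch here treats depression risk prediction as 4-class text classification over microblogs with labels 0-3. The file format, column names, and the bert-base-chinese checkpoint are assumptions for illustration and not details taken from the paper; the maximum length of 128 follows the fine-tuning hyperparameters reported later.

```python
# Minimal sketch: depression risk prediction as 4-class text classification.
# Assumptions (not from the paper): a TSV file with "text" and "label" columns
# (labels 0-3) and the "bert-base-chinese" checkpoint.
import csv
import torch
from torch.utils.data import Dataset
from transformers import BertTokenizerFast

class WeiboRiskDataset(Dataset):
    def __init__(self, tsv_path, tokenizer, max_length=128):
        self.samples = []
        with open(tsv_path, encoding="utf-8") as f:
            for row in csv.DictReader(f, delimiter="\t"):
                self.samples.append((row["text"], int(row["label"])))
        self.tokenizer = tokenizer
        self.max_length = max_length

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        text, label = self.samples[idx]
        enc = self.tokenizer(
            text,
            truncation=True,
            max_length=self.max_length,
            padding="max_length",
            return_tensors="pt",
        )
        item = {k: v.squeeze(0) for k, v in enc.items()}
        item["labels"] = torch.tensor(label)
        return item

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
train_set = WeiboRiskDataset("train.tsv", tokenizer)
```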
Hyperparameters for the deep-learning methods.
| Parameter | BERTa | RoBERTab | XLNETc |
| Learning rate | 1e-5 | 1e-5 | 2e-5 |
| Training steps | 7000 | 7000 | 7000 |
| Maximum length | 128 | 128 | 128 |
| Batch size | 16 | 16 | 16 |
| Warm-up steps | 700 | 700 | 700 |
| Dropout rate | 0.3 | 0.3 | 0.3 |
aBERT: bidirectional encoder representations from transformers.
bRoBERTa: robustly optimized bidirectional encoder representations from transformers pretraining approach.
cXLNET: generalized autoregressive pretraining for language understanding.
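A minimal fine-tuning sketch that mirrors the BERT column of the table above (learning rate 1e-5, 7000 training steps with 700 warm-up steps, maximum length 128, batch size 16, dropout rate 0.3). The checkpoint name and the use of the Hugging Face Trainer are assumptions, not the authors' training code; `train_set` is the dataset object from the earlier sketch.

```python
# Sketch: fine-tuning a pretrained Chinese BERT for 4-level depression risk prediction
# with the hyperparameters reported above. Checkpoint and Trainer usage are assumptions.
from transformers import (
    BertConfig,
    BertForSequenceClassification,
    Trainer,
    TrainingArguments,
)

config = BertConfig.from_pretrained(
    "bert-base-chinese",
    num_labels=4,              # risk levels 0-3
    hidden_dropout_prob=0.3,   # dropout rate from the table
)
model = BertForSequenceClassification.from_pretrained("bert-base-chinese", config=config)

args = TrainingArguments(
    output_dir="bert_risk",
    learning_rate=1e-5,              # BERT column; XLNET used 2e-5
    max_steps=7000,                  # training steps
    warmup_steps=700,                # warm-up steps
    per_device_train_batch_size=16,  # batch size
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_set,  # WeiboRiskDataset from the previous sketch
)
trainer.train()
```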
Hyperparameters during further in-domain pretraining for the deep-learning methods.
| Parameter | BERTa | RoBERTab | XLNETc |
| Learning rate | 2e-5 | 2e-5 | 2e-5 |
| Training steps | 100,000 | 100,000 | 100,000 |
| Maximum length | 256 | 256 | 256 |
| Batch size | 16 | 16 | 16 |
| Warm-up steps | 10,000 | 10,000 | 10,000 |
aBERT: bidirectional encoder representations from transformers.
bRoBERTa: robustly optimized bidirectional encoder representations from transformers pretraining approach.
cXLNET: generalized autoregressive pretraining for language understanding.
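The further in-domain pretraining step can be sketched as continued masked-language-model training on an unlabeled Weibo corpus with the settings above (learning rate 2e-5, 100,000 steps, 10,000 warm-up steps, maximum length 256, batch size 16). This sketch covers only the BERT/RoBERTa case; XLNET uses a permutation language-modeling objective instead. The corpus file name, checkpoint, and the default 15% masking rate are assumptions.

```python
# Sketch: further in-domain pretraining (masked language modeling) on unlabeled Weibo text.
# The file name, checkpoint, and masking rate are assumptions; other settings follow the table.
from transformers import (
    BertForMaskedLM,
    BertTokenizerFast,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
mlm_model = BertForMaskedLM.from_pretrained("bert-base-chinese")

# One unlabeled microblog per line; block_size matches the maximum length of 256.
unlabeled = LineByLineTextDataset(
    tokenizer=tokenizer, file_path="weibo_unlabeled.txt", block_size=256
)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

idp_args = TrainingArguments(
    output_dir="bert_idp",
    learning_rate=2e-5,
    max_steps=100_000,
    warmup_steps=10_000,
    per_device_train_batch_size=16,
)

Trainer(
    model=mlm_model, args=idp_args, train_dataset=unlabeled, data_collator=collator
).train()
```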
Performance of deep-learning methods with different language representation models.
| Model | Level-0 Pa | Level-0 Rb | Level-0 F1 | Level-1 P | Level-1 R | Level-1 F1 | Level-2 P | Level-2 R | Level-2 F1 | Level-3 P | Level-3 R | Level-3 F1 | Micro-F1 |
| CNNc | 0.908 | 0.940 | 0.924 | 0.380 | 0.236 | 0.291 | 0.351 | 0.415 | 0.380 | 0.250 | 0.231 | 0.240 | 0.841 |
| LSTMd | 0.896 | 0.936 | 0.916 | 0.294 | 0.288 | 0.257 | 0.324 | 0.262 | 0.289 | 0.714 | 0.192 | 0.303 | 0.832 |
| BERTe | 0.942 | 0.894 | 0.917 | 0.323 | 0.502 | 0.393 | 0.468 | 0.489 | 0.478 | 0.574 | 0.152 | 0.240 | 0.834 |
| BERT_IDPf | 0.929 | 0.938 | *0.934*g | 0.394 | 0.446 | 0.418 | 0.568 | 0.385 | 0.459 | 0.667 | 0.231 | 0.343 | *0.857* |
| RoBERTah | 0.931 | 0.920 | 0.925 | 0.355 | 0.464 | 0.402 | 0.556 | 0.385 | 0.455 | 0.600 | 0.231 | 0.333 | 0.843 |
| RoBERTa_IDP | 0.933 | 0.920 | 0.926 | 0.371 | 0.489 | *0.422* | 0.578 | 0.400 | 0.473 | 0.636 | 0.269 | 0.333 | 0.847 |
| XLNETi | 0.908 | 0.948 | 0.927 | 0.358 | 0.273 | 0.309 | 0.484 | 0.353 | 0.408 | 0.530 | 0.384 | *0.445* | 0.848 |
| XLNET_IDP | 0.933 | 0.920 | 0.926 | 0.361 | 0.471 | 0.409 | 0.577 | 0.431 | *0.493* | 0.625 | 0.192 | 0.294 | 0.846 |
aP: precision.
bR: recall.
cCNN: convolutional neural network.
dLSTM: long short-term memory network.
eBERT: bidirectional encoder representations from transformers.
f_IDP: The model is further trained on the in-domain unlabeled corpus.
gHighest F1 values are indicated in italics.
hRoBERTa: robustly optimized bidirectional encoder representations from transformers pretraining approach.
iXLNET: generalized autoregressive pretraining for language understanding.
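The per-class precision, recall, and F1 scores and the micro-F1 in the table above can be reproduced from model predictions with standard scikit-learn calls, as in this sketch (the gold and predicted label arrays are placeholders):

```python
# Sketch: per-class P/R/F1 and micro-F1 over the four risk levels.
from sklearn.metrics import f1_score, precision_recall_fscore_support

# Placeholder gold and predicted risk levels (0-3).
y_true = [0, 0, 1, 2, 3, 1, 0, 2]
y_pred = [0, 1, 1, 2, 3, 1, 0, 2]

p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, labels=[0, 1, 2, 3])
micro_f1 = f1_score(y_true, y_pred, average="micro")  # equals accuracy for single-label tasks
print(p, r, f1, micro_f1)
```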
Performance of deep-learning methods with different language representation models on levels 1, 2, and 3.
| Model | Macro-F1 | Macro-Pa | Macro-Rb |
| BERTc | 0.370 | 0.455 | 0.381 |
| BERT_IDPd | 0.406 | 0.543 | 0.354 |
| RoBERTaf | 0.396 | 0.503 | 0.360 |
| RoBERTa_IDP | | 0.528 | 0.386 |
| XLNETg | 0.387 | 0.457 | 0.336 |
| XLNET_IDP | 0.398 | 0.521 | 0.364 |
aP: precision.
bR: recall.
cBERT: bidirectional encoder representations from transformers.
d_IDP: The model is further trained on the in-domain unlabeled corpus.
eHighest F1 values are indicated in italics.
fRoBERTa: robustly optimized bidirectional encoder representations from transformers pretraining approach.
gXLNET: generalized autoregressive pretraining for language understanding.
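Because level 0 dominates the test set, the table above macro-averages only over the depressed classes. Restricting the averaging to levels 1-3 looks roughly like this sketch, again with placeholder label arrays:

```python
# Sketch: macro-averaged P/R/F1 restricted to risk levels 1-3 (level 0 excluded).
from sklearn.metrics import precision_recall_fscore_support

# Placeholder gold and predicted risk levels (0-3).
y_true = [0, 0, 1, 2, 3, 1, 0, 2]
y_pred = [0, 1, 1, 2, 3, 1, 0, 2]

macro_p, macro_r, macro_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=[1, 2, 3], average="macro"
)
print(macro_f1, macro_p, macro_r)
```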
Confusion matrices of the deep-learning methods with further in-domain pretraining.
| Model | Gold-standard level | Predicted Level-0 | Predicted Level-1 | Predicted Level-2 | Predicted Level-3 |
| BERT_IDPa | Level-0 | 2221 | 131 | 14 | 1 |
| | Level-1 | 137 | 123 | 16 | 0 |
| | Level-2 | 26 | 52 | 50 | 2 |
| | Level-3 | 6 | 6 | 8 | 6 |
| RoBERTa_IDPb | Level-0 | 2177 | 176 | 13 | 1 |
| | Level-1 | 128 | 135 | 15 | 0 |
| | Level-2 | 26 | 47 | 52 | 3 |
| | Level-3 | 3 | 6 | 10 | 7 |
| XLNET_IDPc | Level-0 | 2177 | 176 | 13 | 1 |
| | Level-1 | 128 | 130 | 18 | 0 |
| | Level-2 | 26 | 46 | 56 | 2 |
| | Level-3 | 3 | 8 | 10 | 5 |
aBERT_IDP: bidirectional encoder representations from transformers further trained on the in-domain unlabeled corpus.
bRoBERTa_IDP: robustly optimized bidirectional encoder representations from transformers pretraining approach further trained on the in-domain unlabeled corpus.
cXLNET_IDP: generalized autoregressive pretraining for language understanding further trained on the in-domain unlabeled corpus.
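Confusion matrices like those above can be generated with scikit-learn, and micro-F1 for a single-label task is simply the diagonal sum divided by the total number of test microblogs; for example, the RoBERTa_IDP matrix gives (2177 + 135 + 52 + 7) / 2799 ≈ 0.847, matching the micro-F1 reported earlier. A minimal sketch with placeholder predictions:

```python
# Sketch: building a 4x4 confusion matrix and recovering micro-F1 (= accuracy) from it.
import numpy as np
from sklearn.metrics import confusion_matrix

# Placeholder gold and predicted risk levels (0-3).
y_true = [0, 0, 1, 2, 3, 1, 0, 2]
y_pred = [0, 1, 1, 2, 3, 1, 0, 2]

cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2, 3])
micro_f1 = np.trace(cm) / cm.sum()

# Cross-check against the RoBERTa_IDP matrix reported above.
roberta_idp = np.array([
    [2177, 176, 13, 1],
    [128, 135, 15, 0],
    [26, 47, 52, 3],
    [3, 6, 10, 7],
])
print(micro_f1, np.trace(roberta_idp) / roberta_idp.sum())  # second value ~0.847
```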