Xiaowei Du, Yunmei Sun.
Abstract
Previous research has mostly used simplistic measures and a limited set of linguistic features (e.g., personal pronouns, absolutist words, and sentiment words) to identify an author's psychological state from a text. In this study, we propose using additional linguistic features, namely sentiment polarities and emotions, to classify texts of various psychological states. Machine-learning algorithms were applied to a large dataset of forum posts that included texts of anxiety, depression, suicide ideation, and normal states. The results showed that the proposed linguistic features, combined with machine-learning algorithms, namely Support Vector Machine and Deep Learning, achieved a high level of performance in the detection of psychological states. The study represents one of the first attempts to use sentiment polarities and emotions to detect texts of psychological states, and the findings may contribute to our understanding of how accuracy may be enhanced in the detection of various psychological states. The significance of the study and suggestions are also offered.
Keywords: classification; linguistic features; machine learning algorithms; mental disorders; psychological states
Year: 2022 PMID: 35936260 PMCID: PMC9355087 DOI: 10.3389/fpsyg.2022.955850
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
A summary of the data.
| Groups | Post numbers | Word counts |
| General | 1,050 | 223,495 |
| Anxiety | 614 | 221,687 |
| Depression | 554 | 206,488 |
| Suicide ideation | 327 | 132,340 |
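The group sizes above differ considerably, which matters when reading accuracy figures later on. As a small illustration, the sketch below derives words per post and each group's share of all posts from the counts in the table. The variable names are illustrative assumptions, and the derived figures are simple arithmetic on the table, not values reported by the authors.

```python
# Post and word counts copied from the data summary table.
groups = {
    "General": (1050, 223495),
    "Anxiety": (614, 221687),
    "Depression": (554, 206488),
    "Suicide ideation": (327, 132340),
}

total_posts = sum(posts for posts, _ in groups.values())

# Derived figures (not reported in the source): mean post length and group share.
summary = {
    name: {
        "words_per_post": words / posts,
        "post_share": posts / total_posts,
    }
    for name, (posts, words) in groups.items()
}

for name, stats in summary.items():
    print(f"{name}: {stats['words_per_post']:.1f} words/post, "
          f"{stats['post_share']:.1%} of posts")
```

On these counts, posts in the general group are noticeably shorter on average than posts in the other three groups.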
Linguistic features.
| Categories | Descriptions |
| Absolutist words | |
| Personal pronouns | First-person singular pronouns ( |
| Sentiments | Sentiment polarities |
| Emotions | Emotions of anger, anticipation, disgust, fear, joy, sadness, surprise, and trust |
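The surface features in the table above (absolutist words and first-person pronouns) can be counted directly from tokenized text. The following is a minimal sketch, not the authors' implementation: the word lists, the `tokenize` and `extract_features` names, and the regex tokenizer are all illustrative assumptions. The sentiment and emotion features are omitted because they need a lexicon that this record does not name.

```python
import re

# Illustrative placeholder lexicons; NOT the word lists used in the study.
ABSOLUTIST = {"always", "never", "completely", "nothing", "everything", "totally"}
FIRST_SINGULAR = {"i", "me", "my", "mine", "myself"}
FIRST_PLURAL = {"we", "us", "our", "ours", "ourselves"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def extract_features(text):
    """Count lexicon hits in one post and return them with the token total."""
    tokens = tokenize(text)
    return {
        "n_tokens": len(tokens),
        "absolutist_count": sum(t in ABSOLUTIST for t in tokens),
        "first_singular_count": sum(t in FIRST_SINGULAR for t in tokens),
        "first_plural_count": sum(t in FIRST_PLURAL for t in tokens),
    }


feats = extract_features("I always feel like nothing I do matters to my family")
print(feats)
```

Feature dictionaries of this kind could then be turned into numeric vectors for any of the classifiers compared in the study. Whether the counts should be normalized by post length is a separate choice that this record does not specify.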
Statistical results of group comparisons.
| Category | Feature | General | | Anxiety | | Depression | | Suicide | | Sig. |
| | | Mean | SD | Mean | SD | Mean | SD | Mean | SD | |
| Absolutist words | | 10.69 | 11.26 | 14.98 | 9.90 | 14.74 | 10.12 | 17.69 | 10.71 | 0.00 |
| Pronouns | Singular | 51.63 | 46.32 | 84.970 | 26.936 | 85.48 | 28.98 | 97.97 | 28.82 | 0.00 |
| Pronouns | Plural | 5.35 | 10.709 | 2.05 | 6.82 | 2.35 | 6.14 | 2.30 | 5.11 | 0.00 |
| Sentiments | | 0.05 | 0.15 | −0.13 | 0.15 | −0.07 | 0.12 | −0.08 | 0.12 | 0.00 |
| Emotion | Anger | 0.01 | 0.02 | 0.03 | 0.04 | 0.02 | 0.03 | 0.02 | 0.03 | 0.00 |
| Emotion | Anticip. | 0.03 | 0.03 | 0.03 | 0.04 | 0.03 | 0.02 | 0.03 | 0.03 | 0.00 |
| Emotion | Disgust | 0.01 | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.00 |
| Emotion | Fear | 0.02 | 0.02 | 0.04 | 0.04 | 0.03 | 0.03 | 0.03 | 0.03 | 0.00 |
| Emotion | Joy | 0.02 | 0.03 | 0.01 | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.00 |
| Emotion | Sadness | 0.02 | 0.02 | 0.04 | 0.04 | 0.03 | 0.03 | 0.04 | 0.03 | 0.00 |
| Emotion | Surprise | 0.01 | 0.02 | 0.01 | 0.02 | 0.01 | 0.02 | 0.01 | 0.02 | 0.01 |
| Emotion | Trust | 0.03 | 0.03 | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.00 |
SD, standard deviation; Anticip., anticipation.
FIGURE 1. Accuracy of different models in classifying texts of anxiety.
FIGURE 3. Accuracy of different models in classifying texts of suicide ideation.
Performance of machine-learning models.
| Groups | Features | Naïve Bayes | | | | GLM | | | | Logistic Regression | | | | Deep Learning | | | | SVM | | | |
| | | Acc. | Prec. | R | F1 | Acc. | Prec. | R | F1 | Acc. | Prec. | R | F1 | Acc. | Prec. | R | F1 | Acc. | Prec. | R | F1 |
| Anxiety | Aw. | 61.3 | 64.5 | 86.3 | 73.8 | 61.5 | 64.5 | 86.7 | 74.0 | 61.5 | 64.5 | 86.7 | 74.0 | 61.5 | 64.1 | 89.0 | 74.4 | 62.7 | 63.2 | 98.0 | 76.9 |
| Anxiety | Pron. | 75.2 | 83.4 | 76.0 | 79.4 | 75.8 | 83.7 | 76.7 | 80.0 | 76.0 | 83.8 | 77.0 | 80.2 | 76.0 | 85.6 | 74.7 | 79.7 | 76.2 | 85.9 | 74.7 | 79.8 |
| Anxiety | Senti. | 77.1 | 78.3 | 88.3 | 83.0 | 78.1 | 75.8 | 96.0 | 84.7 | 78.1 | 75.8 | 96.0 | 84.7 | 78.5 | 76.5 | 95.3 | 84.9 | 67.2 | 65.8 | 100 | 79.4 |
| Anxiety | Aw. + Pron. | 76.8 | 77.2 | 90.3 | 83.1 | 74.9 | 77.3 | 85.7 | 81.2 | 75.2 | 77.4 | 86.0 | 81.4 | 76.6 | 79.3 | 85.3 | 82.2 | 76.8 | 79.0 | 86.3 | 82.5 |
| Anxiety | Senti. + Aw. | 75.6 | 74.6 | 93.3 | 82.9 | 79.4 | 85.3 | 81.7 | 83.3 | 80.2 | 86.4 | 81.7 | 83.3 | 83.8 | 85.4 | 89.7 | 87.5 | 82.7 | 80.4 | 96.3 | 87.6 |
| Anxiety | Senti. + Pron. | 81.5 | 88.0 | 82.0 | 84.8 | 79.2 | 92.2 | 73.3 | 81.6 | 79.4 | 92.2 | 73.7 | 81.8 | 85.3 | 93.3 | 82.7 | 87.6 | 85.9 | 94.3 | 82.7 | 88.1 |
| Anxiety | Senti. + Aw. + Pron. | 82.5 | 82.3 | 92.3 | 87.0 | 83.6 | 83.5 | 92.3 | 87.7 | 83.6 | 83.3 | 92.7 | 87.7 | | | | | 80.6 | 77.4 | 98.0 | 86.5 |
| Depression | Aw. | 65.4 | 65.4 | 100 | 79.1 | 65.8 | 65.9 | 98.7 | 79.0 | 65.8 | 65.9 | 98.7 | 79.0 | 65.4 | 65.4 | 100 | 79.1 | 65.5 | 65.9 | 100 | 79.2 |
| Depression | Pron. | 70.8 | 75.4 | 82.3 | 78.7 | 73.8 | 83.4 | 75.0 | 79.0 | 73.8 | 83.4 | 75.0 | 79.0 | 74.2 | 85.0 | 73.7 | 78.9 | 74.7 | 81.6 | 79.3 | 80.4 |
| Depression | Senti. | 71.4 | 78.5 | 78.0 | 78.1 | 73.4 | 74.8 | 89.7 | 81.5 | 73.2 | 74.8 | 89.3 | 81.4 | 76.7 | 76.4 | 93.3 | 84.0 | 72.7 | 71.8 | 96.3 | 82.2 |
| Depression | Aw. + Pron. | 72.3 | 72.8 | 92.0 | 81.3 | 74.0 | 75.8 | 88.7 | 81.7 | 74.0 | 75.8 | 88.7 | 81.7 | 75.2 | 78.4 | 85.7 | 81.8 | 74.9 | 76.3 | 89.7 | 82.4 |
| Depression | Senti. + Aw. | 73.2 | 74.2 | 90.7 | 81.6 | 75.5 | 83.4 | 78.3 | 80.7 | 75.3 | 83.4 | 78.0 | 80.5 | 77.6 | 80.7 | 86.7 | 83.5 | 74.5 | 73.4 | 96.0 | 83.1 |
| Depression | Senti. + Pron. | 80.4 | 84.1 | 86.7 | 85.3 | 79.7 | 92.8 | 75.0 | 82.8 | 79.7 | 92.1 | 75.7 | 83.0 | 83.0 | 94.8 | 78.3 | 85.7 | | | | |
| Depression | Senti. + Aw. + Pron. | 78.9 | 77.7 | 95.0 | 85.5 | 78.9 | 79.8 | 91.0 | 85.0 | 78.9 | 79.8 | 91.0 | 85.0 | 77.8 | 76.1 | 96.3 | 85.0 | 74.5 | 72.4 | 98.7 | 83.5 |
| Suicide | Aw. | 73.7 | 75.5 | 95.3 | 84.2 | 73.7 | 75.5 | 95.3 | 84.2 | 73.7 | 75.5 | 95.3 | 84.2 | 74.4 | 76.4 | 94.7 | 84.5 | 73.9 | 73.9 | 100 | 85.0 |
| Suicide | Pron. | 73.9 | 73.9 | 100 | 85.0 | 82.6 | 87.9 | 88.7 | 88.3 | 82.6 | 87.9 | 88.7 | 88.3 | 80.8 | 83.9 | 91.7 | 87.6 | 82.3 | 88.1 | 88.0 | 88.0 |
| Suicide | Senti. | 80.5 | 87.0 | 87.0 | 86.8 | 79.3 | 82.0 | 92.3 | 86.8 | 79.6 | 82.0 | 92.6 | 87.0 | 82.1 | 81.1 | 98.0 | 88.7 | 81.6 | 81.1 | 98.0 | 88.7 |
| Suicide | Aw. + Pron. | 77.4 | 77.1 | 98.7 | 86.6 | 79.6 | 80.5 | 95.7 | 87.4 | 79.6 | 80.5 | 95.7 | 87.4 | 79.6 | 79.4 | 97.7 | 87.6 | 81.1 | 81.3 | 96.7 | 88.3 |
| Suicide | Senti. + Aw. | 75.2 | 76.3 | 96.3 | 85.1 | 82.5 | 85.2 | 92.3 | 88.6 | 82.8 | 85.5 | 92.3 | 88.8 | 85.0 | 87.5 | 93.0 | 90.1 | 86.0 | 88.6 | 94.7 | 90.9 |
| Suicide | Senti. + Pron. | 85.8 | 88.6 | 92.7 | 90.6 | 80.6 | 92.1 | 80.7 | 86.0 | 80.8 | 92.8 | 80.3 | 86.1 | | | | | 87.9 | 94.1 | 89.3 | 91.6 |
| Suicide | Senti. + Aw. + Pron. | 86.5 | 86.6 | 96.7 | 91.3 | 84.7 | 84.6 | 97.0 | 90.4 | 84.3 | 84.3 | 96.7 | 90.1 | 86.2 | 85.1 | 98.7 | 91.4 | 84.7 | 83.8 | 98.3 | 90.5 |
Acc., accuracy; Prec., precision; R, recall; GLM, Generalized Linear Model; SVM, Support Vector Machine; Senti., Emotion + Sentiment value; Aw., absolutist word; Pron., first-person pronouns. Best results for detection are in bold.
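The F1 column in the performance table is the harmonic mean of precision and recall, so individual cells can be sanity-checked from the two neighboring values. The sketch below uses the standard F1 definition; the function name `f1_score` is an illustrative choice, and small discrepancies of about 0.1 against other cells are to be expected because the table reports rounded inputs.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale, e.g. percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Anxiety group, absolutist-word features, Naive Bayes: Prec. = 64.5, R = 86.3.
f1 = f1_score(64.5, 86.3)
print(f"{f1:.1f}")  # the table reports 73.8 for this cell
```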