Jun Shen, Leping Jiang.
Abstract
Literary therapy theory opens a new field for contemporary literature research and is of great significance for maintaining the physical and mental health of modern people. The quantitative evaluation of psychotherapy effects in Japanese healing literature is currently a hot research topic. In this study, a text convolutional neural network (Text-CNN) was used to extract psychotherapy features at different levels of granularity through multiple convolution kernels of different sizes, while a bidirectional gated recurrent unit network (BiGRU) characterizes the relationship between the literary text and the psychotherapy effect. On the basis of the Text-CNN and BiGRU models, a parallel hybrid network integrating an attention mechanism was constructed to analyze the correlation between literature and psychotherapy. Experimental verification shows that this model further improves the accuracy of correlation classification and has strong adaptability.
Keywords: BiGRU; Japanese literature; Text-CNN; attention mechanism; equation diagnostic algorithms; psychotherapy
Year: 2022 PMID: 35707672 PMCID: PMC9190781 DOI: 10.3389/fpsyg.2022.906952
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
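The abstract sketches a parallel hybrid of a Text-CNN branch and an attention-equipped BiGRU branch. A minimal sketch of one plausible wiring is given below, assuming tensorflow.keras; the helper name `build_model` and its default hyperparameters are taken from the parameter table later in this record, not from any released code (none is cited).

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(vocab_size=20000, max_len=200, embed_dim=200,
                units=96, dropout=0.5, optimizer="adam"):
    """Parallel Text-CNN / BiGRU branches merged with additive attention."""
    inp = layers.Input(shape=(max_len,))
    emb = layers.Embedding(vocab_size, embed_dim)(inp)

    # Text-CNN branch: kernels of sizes 3, 4, 5 extract n-gram features at
    # different granularities (filters=32, padding="same", ReLU, pool_size=2,
    # per the parameter table).
    convs = []
    for k in (3, 4, 5):
        c = layers.Conv1D(32, k, padding="same", activation="relu")(emb)
        c = layers.MaxPooling1D(pool_size=2)(c)
        c = layers.GlobalMaxPooling1D()(c)
        convs.append(c)
    cnn_out = layers.Concatenate()(convs)

    # BiGRU branch models sequential dependencies; additive self-attention
    # weights its hidden states before pooling them into a context vector.
    gru_seq = layers.Bidirectional(layers.GRU(units, return_sequences=True))(emb)
    scores = layers.Dense(1, activation="tanh")(gru_seq)   # (batch, T, 1)
    weights = layers.Softmax(axis=1)(scores)               # attention weights
    context = layers.Lambda(
        lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([gru_seq, weights])

    # Parallel fusion: concatenate both branches, then classify.
    merged = layers.Dropout(dropout)(layers.Concatenate()([cnn_out, context]))
    dense = layers.Dense(100, activation="relu")(merged)
    out = layers.Dense(1, activation="sigmoid")(dense)

    model = models.Model(inp, out)
    model.compile(optimizer=optimizer, loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Treating the two branches as parallel (concatenated) rather than stacked matches the "Text-CNN+BiGRU" naming in the model-comparison table at the end of this record.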
Figure 1. The essence of the attention mechanism.
Figure 2. The specific calculation process of the attention mechanism.
Figure 3. Model based on the parallel hybrid network integrating the attention mechanism.
Figure 4. The attention mechanism model structure.
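Figures 1 and 2 themselves are not reproduced in this record. A standard formulation of the additive attention they depict, with $h_t$ the BiGRU hidden state at step $t$ and $W$, $b$, $u$ learned parameters (this notation is an assumption, not taken from the paper), is:

$$e_t = \tanh(W h_t + b), \qquad \alpha_t = \frac{\exp(e_t^{\top} u)}{\sum_{k} \exp(e_k^{\top} u)}, \qquad c = \sum_{t} \alpha_t h_t$$

The context vector $c$ is the attention-weighted summary of the sequence that the classifier consumes.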
Datasets used in the experiments.

| Dataset | Training set | Validation set | Test set | Total |
|---|---|---|---|---|
| IMDB | 32,000 | 8,000 | 10,000 | 50,000 |
| SST-2 | 8,544 | 1,101 | 2,210 | 11,855 |
IMDB: This dataset contains 50,000 movie reviews labeled as positive or negative. SST-2: This dataset extends Stanford University's MR movie-review dataset; positive reviews are labeled 1, negative reviews are labeled 0, and it contains 11,855 texts labeled for emotion classification.
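The splits in the table above can be approximated with the IMDB copy bundled in tensorflow.keras; the paper does not describe its preprocessing, so the re-split and the `num_words`/`maxlen` choices below are assumptions.

```python
import numpy as np
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Keras ships IMDB as a 25,000/25,000 split; pool and re-split to match the
# 32,000/8,000/10,000 partition in the table (an assumed procedure).
(x_a, y_a), (x_b, y_b) = imdb.load_data(num_words=20000)
x = np.concatenate([x_a, x_b])
y = np.concatenate([y_a, y_b])

idx = np.random.default_rng(42).permutation(len(x))
train, val, test = idx[:32000], idx[32000:40000], idx[40000:]

x = pad_sequences(x, maxlen=200)        # fixed-length inputs for the model
x_train, y_train = x[train], y[train]   # 32,000 training samples
x_val, y_val = x[val], y[val]           # 8,000 validation samples
x_test, y_test = x[test], y[test]       # 10,000 test samples
```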
Parameter settings of the attention-based parallel hybrid network model.

| Parameter | Description | Value(s) |
|---|---|---|
| Embedding_dim | Word-vector dimension | 50, 100, 200, and 300 |
| Filters | Number of convolution kernels | 32 |
| Kernel_size | Convolution kernel sizes | 3, 4, 5 |
| Padding | Padding mode | same |
| Activation | Activation function | ReLU |
| Pool_size | Pooling layer size | 2 |
| BiGRU (units) | Number of BiGRU hidden-layer nodes | 32, 64, 96, 128, 160, and 192 |
| Dropout | Random drop probability | 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, and 0.7 |
| Dense_dim | Number of units in the fully connected layer | 100 |
| Batch_size | Number of samples per training batch | 16, 32, 64, 128, and 256 |
| Epochs | Number of training epochs | 10 |
| Optimizer | Optimizer | SGD, Adagrad, Adadelta, RMSProp, and Adam |
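The result tables that follow vary one hyperparameter at a time around a recurring best configuration (embedding dimension 200, 96 hidden nodes, dropout 0.5, batch size 32, Adam). A sketch of such a one-at-a-time sweep, reusing the hypothetical `build_model` and the data variables from the earlier sketches; the paper publishes no tuning script, so this loop is illustrative only.

```python
base = dict(embed_dim=200, units=96, dropout=0.5, batch_size=32,
            optimizer="adam")
sweeps = {
    "embed_dim": [50, 100, 200, 300],
    "units": [32, 64, 96, 128, 160, 192],
    "dropout": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
    "batch_size": [16, 32, 64, 128, 256],
    "optimizer": ["sgd", "adagrad", "adadelta", "rmsprop", "adam"],
}

for name, values in sweeps.items():
    for v in values:
        cfg = {**base, name: v}  # vary one parameter, hold the rest fixed
        model = build_model(embed_dim=cfg["embed_dim"], units=cfg["units"],
                            dropout=cfg["dropout"],
                            optimizer=cfg["optimizer"])
        model.fit(x_train, y_train, validation_data=(x_val, y_val),
                  epochs=10, batch_size=cfg["batch_size"], verbose=0)
        _, acc = model.evaluate(x_test, y_test, verbose=0)
        print(f"{name}={v}: test accuracy {acc:.4f}")
```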
Experimental results for different word-vector dimensions (all values in %).

| Word vector dimension | Acc. (IMDB) | Prec. (IMDB) | Rec. (IMDB) | F1 (IMDB) | Acc. (SST-2) | Prec. (SST-2) | Rec. (SST-2) | F1 (SST-2) |
|---|---|---|---|---|---|---|---|---|
| 50 | 88.37 | 88.46 | 88.27 | 88.35 | 88.18 | 88.43 | 87.95 | 88.19 |
| 100 | 89.82 | 90.67 | 88.77 | 89.69 | 88.61 | 88.69 | 88.57 | 88.63 |
| 200 | 91.46 | 92.32 | 90.45 | 91.36 | 89.48 | 88.51 | 90.81 | 89.65 |
| 300 | 91.10 | 93.36 | 88.42 | 90.78 | 88.48 | 89.80 | 86.88 | 88.32 |
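This and the following tables report Accuracy, Precision, Recall, and F1 on both test sets. A minimal sketch of computing these four scores, assuming scikit-learn and the model/data variables from the earlier sketches (the paper does not name its evaluation tooling):

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

# Threshold sigmoid outputs at 0.5 to get hard class predictions.
y_pred = (model.predict(x_test) > 0.5).astype(int).ravel()
print(f"Acc={accuracy_score(y_test, y_pred):.4f}",
      f"P={precision_score(y_test, y_pred):.4f}",
      f"R={recall_score(y_test, y_pred):.4f}",
      f"F1={f1_score(y_test, y_pred):.4f}")
```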
Experimental results for different numbers of BiGRU hidden-layer nodes (all values in %).

| Hidden-layer nodes | Acc. (IMDB) | Prec. (IMDB) | Rec. (IMDB) | F1 (IMDB) | Acc. (SST-2) | Prec. (SST-2) | Rec. (SST-2) | F1 (SST-2) |
|---|---|---|---|---|---|---|---|---|
| 32 | 90.20 | 92.83 | 87.09 | 89.81 | 88.35 | 91.83 | 84.16 | 87.83 |
| 64 | 90.80 | 91.68 | 89.73 | 90.67 | 89.10 | 89.47 | 88.69 | 89.08 |
| 96 | 91.46 | 92.32 | 90.45 | 91.36 | 89.48 | 88.51 | 90.81 | 89.65 |
| 128 | 91.39 | 90.49 | 92.57 | 91.34 | 89.11 | 88.71 | 89.68 | 89.19 |
| 160 | 91.37 | 91.81 | 90.86 | 91.32 | 90.82 | 88.89 | 86.56 | 88.64 |
| 192 | 90.95 | 91.29 | 90.55 | 90.90 | 88.21 | 87.17 | 89.73 | 88.43 |
Experimental results for different dropout values (all values in %).

| Dropout | Acc. (IMDB) | Prec. (IMDB) | Rec. (IMDB) | F1 (IMDB) | Acc. (SST-2) | Prec. (SST-2) | Rec. (SST-2) | F1 (SST-2) |
|---|---|---|---|---|---|---|---|---|
| 0.1 | 90.03 | 90.02 | 90.04 | 90.03 | 87.38 | 87.64 | 85.79 | 86.71 |
| 0.2 | 90.27 | 90.25 | 90.31 | 90.28 | 87.56 | 88.65 | 86.15 | 87.38 |
| 0.3 | 90.93 | 90.87 | 90.97 | 90.92 | 88.60 | 89.42 | 88.03 | 88.72 |
| 0.4 | 91.02 | 90.78 | 90.06 | 90.42 | 89.07 | 89.36 | 88.78 | 89.07 |
| 0.5 | 91.46 | 92.32 | 90.45 | 91.36 | 89.48 | 88.51 | 90.81 | 89.65 |
| 0.6 | 89.81 | 89.88 | 89.71 | 89.79 | 88.46 | 87.87 | 89.28 | 88.57 |
| 0.7 | 88.46 | 88.63 | 88.48 | 88.55 | 88.28 | 87.16 | 89.95 | 88.53 |
Experimental results for different batch sizes (all values in %).

| Batch size | Acc. (IMDB) | Prec. (IMDB) | Rec. (IMDB) | F1 (IMDB) | Acc. (SST-2) | Prec. (SST-2) | Rec. (SST-2) | F1 (SST-2) |
|---|---|---|---|---|---|---|---|---|
| 16 | 91.11 | 90.49 | 90.12 | 91.30 | 88.03 | 87.31 | 89.14 | 88.22 |
| 32 | 91.46 | 92.32 | 90.45 | 91.36 | 89.48 | 88.51 | 90.81 | 89.65 |
| 64 | 90.61 | 91.95 | 88.95 | 90.43 | 87.78 | 89.75 | 85.34 | 87.49 |
| 128 | 90.86 | 92.46 | 88.97 | 90.68 | 86.59 | 92.15 | 79.69 | 85.47 |
| 256 | 88.12 | 90.10 | 85.41 | 87.69 | 85.55 | 86.37 | 84.40 | 85.37 |
Experimental results for different optimizers (all values in %).

| Optimizer | Acc. (IMDB) | Prec. (IMDB) | Rec. (IMDB) | F1 (IMDB) | Acc. (SST-2) | Prec. (SST-2) | Rec. (SST-2) | F1 (SST-2) |
|---|---|---|---|---|---|---|---|---|
| SGD | 78.83 | 79.36 | 77.87 | 78.61 | 87.32 | 87.11 | 87.77 | 87.44 |
| Adagrad | 87.92 | 94.84 | 80.19 | 86.90 | 88.25 | 90.98 | 84.67 | 87.71 |
| Adadelta | 89.63 | 89.65 | 89.61 | 89.63 | 88.42 | 90.42 | 85.93 | 88.12 |
| RMSProp | 91.22 | 89.56 | 90.16 | 89.86 | 88.89 | 87.09 | 90.65 | 88.83 |
| Adam | 91.46 | 92.32 | 90.45 | 91.36 | 89.48 | 88.51 | 90.81 | 89.65 |
Experimental results for different models (all values in %).

| Model | Acc. (IMDB) | Prec. (IMDB) | Rec. (IMDB) | F1 (IMDB) | Acc. (SST-2) | Prec. (SST-2) | Rec. (SST-2) | F1 (SST-2) |
|---|---|---|---|---|---|---|---|---|
| CNN | 86.32 | 86.33 | 86.35 | 86.34 | 85.49 | 85.56 | 85.65 | 85.60 |
| Text-CNN | 87.51 | 87.41 | 87.68 | 87.54 | 86.22 | 86.21 | 87.85 | 87.02 |
| BiLSTM | 90.06 | 89.99 | 90.18 | 90.08 | 87.94 | 89.80 | 85.30 | 87.49 |
| BiGRU | 90.46 | 89.45 | 90.86 | 90.15 | 88.05 | 86.68 | 90.05 | 88.33 |
| Text-CNN-BiGRU | 90.48 | 90.46 | 90.50 | 90.48 | 88.60 | 86.94 | 90.43 | 88.65 |
| Text-CNN+BiGRU | 90.78 | 90.83 | 90.67 | 90.78 | 88.96 | 88.99 | 89.05 | 89.02 |
| Text-CNN+BiGRU+Attention | 91.46 | 92.32 | 90.45 | 91.36 | 89.48 | 88.51 | 90.81 | 89.65 |
Figure 5. Variation in the accuracy on the validation set for different models.
Figure 6. Variation in the loss on the validation set for different models.