Nazish Azam, Tauqir Ahmad, Nazeef Ul Haq.
Abstract
Human feelings are fundamental to understanding the conduct and state of mind of an individual. A healthy emotional state is an important factor in improving personal satisfaction, whereas poor emotional health can lead to social or psychological well-being issues. Detecting feelings in online health care data provides useful information about the emotional state of patients. Recognizing a patient's emotion toward a specific disease from online text is a challenging task. In this paper, we propose a method for the automatic detection of patients' emotions in healthcare data using supervised machine learning approaches. For this purpose, we created a new dataset named EmoHD, comprising 4,202 text samples across eight disease classes and six emotion classes, gathered from different online resources. We used six supervised machine learning models with different feature engineering techniques and performed a detailed comparison of these six algorithms using different feature vectors on our dataset. The highest accuracy, 87%, was achieved with the MultiLayer Perceptron, compared with the other state-of-the-art models. Moreover, we use the emotional guidance scale to show that there is a link between negative emotion and psychological health issues. Our proposed work will be helpful for automatically detecting a patient's emotion during disease and for avoiding extreme outcomes such as suicide, mental disorders, or psychological health issues. The implementation details are publicly available at https://bit.ly/2NQeGET.
©2021 Azam et al.
Keywords: Emotion detection; Emotion guidance scale; Negative emotions mapping; Patient’s emotion; Supervised machine learning
Year: 2021 PMID: 35036528 PMCID: PMC8725656 DOI: 10.7717/peerj-cs.751
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
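The results tables in this record compare four feature representations: count vectors, word-level TF-IDF, n-gram TF-IDF, and character-level TF-IDF. The following is a minimal sketch of how such features can be built, not the authors' implementation. The tokenizer, the n-gram sizes, the toy documents, and all function names are assumptions made for illustration, and the TF-IDF formula is the plain textbook variant (libraries commonly add smoothing and normalization).

```python
import math
from collections import Counter


def word_tokens(text):
    """Lowercase whitespace tokenization (an assumed, simplified tokenizer)."""
    return text.lower().split()


def word_ngrams(text, n=2):
    """Contiguous word n-grams, joined with a space (n=2 is an assumption)."""
    tokens = word_tokens(text)
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n=3):
    """Contiguous character n-grams over the lowercased text (n=3 is an assumption)."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def count_vectors(docs, analyzer):
    """Return (vocabulary, rows), where each row holds raw term counts."""
    counts = [Counter(analyzer(doc)) for doc in docs]
    vocab = sorted(set().union(*counts))
    rows = [[c[term] for term in vocab] for c in counts]
    return vocab, rows


def tfidf_vectors(docs, analyzer):
    """Textbook TF-IDF: term count times log(N / document frequency)."""
    vocab, rows = count_vectors(docs, analyzer)
    n_docs = len(docs)
    df = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    weighted = [
        [row[j] * math.log(n_docs / df[j]) for j in range(len(vocab))]
        for row in rows
    ]
    return vocab, weighted


# Hypothetical toy documents, not taken from EmoHD.
docs = ["patient feels fear", "patient feels hope"]
vocab, counts = count_vectors(docs, word_tokens)
_, word_tfidf = tfidf_vectors(docs, word_tokens)
_, ngram_tfidf = tfidf_vectors(docs, word_ngrams)
_, char_tfidf = tfidf_vectors(docs, char_ngrams)
```

Passing a different `analyzer` is the only change needed to move between the word-level, n-gram, and character-level variants, which is why the four representations can be compared under the same classifier.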
Overall statistics of different features of the EmoHD dataset.

| No. | Feature | Value |
|---|---|---|
| 1 | | 12090922 |
| 2 | | 1634319 |
| 3 | | 64543 |
| 4 | | 91988 |
| 5 | | 827475 |
| 6 | | 1159467 |
Overall distribution of the EmoHD dataset with respect to the emotion class label.

| No. | Emotion class | Samples |
|---|---|---|
| 1 | | 1343 |
| 2 | | 1215 |
| 3 | | 742 |
| 4 | | 522 |
| 5 | | 358 |
| 6 | | 22 |
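The per-class counts above sum to the 4,202 samples stated in the abstract and show a strong class imbalance. The short calculation below makes this explicit. It is only an illustration: the class names are not recoverable from this record, so rows are keyed by their number, and reading the 22-sample class as the "minority emotion class" mentioned in the results tables is an assumption.

```python
# Sample counts per emotion class, keyed by row number in the distribution table.
# Class names are not available in this record.
counts = {1: 1343, 2: 1215, 3: 742, 4: 522, 5: 358, 6: 22}

total = sum(counts.values())
shares = {k: v / total for k, v in counts.items()}
imbalance_ratio = max(counts.values()) / min(counts.values())

for k, v in counts.items():
    print(f"class {k}: {v:5d} samples ({shares[k]:.1%})")
print(f"total: {total}, largest/smallest ratio: {imbalance_ratio:.1f}")
```

The smallest class holds well under 1% of the data, which may explain why results are reported both with and without it.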
Figure 1. Distribution of the EmoHD dataset with respect to the disease class label.
Figure 2. Basic flow diagram for predicting patients' emotions during a specific disease using disease-related news available online.
Figure 3. An example four-layer architecture of the MultiLayer Perceptron.
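To make the four-layer idea concrete, here is a minimal forward pass through a small MultiLayer Perceptron in plain Python. It is a sketch under stated assumptions, not the architecture in the figure: the layer sizes, the ReLU and softmax activations, the random initialization, and the input vector are all hypothetical, with only the six output classes taken from the abstract.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(x, layers):
    """ReLU on the hidden layers, softmax on the output layer."""
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))


rng = random.Random(0)
# Hypothetical sizes: 5 input features, two hidden layers, 6 emotion classes.
sizes = [5, 4, 3, 6]
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
probs = forward([1.0, 0.0, 2.0, 0.0, 1.0], layers)
predicted = probs.index(max(probs))
```

With untrained weights the output is arbitrary. The point is only the data flow: a feature vector goes in, each layer applies a linear map and an activation, and the output is one probability per emotion class.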
Comparison of quantitative results for six machine learning models using four different feature representations, including the minority emotion class.

| No. | Model | Features | Accuracy | | | |
|---|---|---|---|---|---|---|
| 1 | MNB | Count vectors | 72% | 0.73 | 0.72 | 0.72 |
| | | WordLevel TF-IDF | 63% | 0.63 | 0.62 | 0.62 |
| | | N-Gram vectors TF-IDF | 60% | 0.61 | 0.60 | 0.60 |
| | | CharLevel vectors TF-IDF | 54% | 0.55 | 0.55 | 0.55 |
| 2 | MLP | Count vectors | 87% | 0.87 | 0.87 | 0.87 |
| | | WordLevel TF-IDF | 86% | 0.86 | 0.86 | 0.86 |
| | | N-Gram vectors TF-IDF | 85% | 0.85 | 0.85 | 0.85 |
| | | CharLevel vectors TF-IDF | 83% | 0.83 | 0.83 | 0.83 |
| 3 | LR | Count vectors | 86% | 0.86 | 0.86 | 0.86 |
| | | WordLevel TF-IDF | 75% | 0.75 | 0.75 | 0.75 |
| | | N-Gram vectors TF-IDF | 74% | 0.74 | 0.74 | 0.74 |
| | | CharLevel vectors TF-IDF | 59% | 0.60 | 0.60 | 0.60 |
| 4 | SVM | Count vectors | 76% | 0.76 | 0.77 | 0.76 |
| | | WordLevel TF-IDF | 83% | 0.83 | 0.83 | 0.83 |
| | | N-Gram vectors TF-IDF | 82% | 0.82 | 0.83 | 0.82 |
| | | CharLevel vectors TF-IDF | 62% | 0.63 | 0.64 | 0.63 |
| 5 | RF | Count vectors | 86% | 0.86 | 0.87 | 0.86 |
| | | WordLevel TF-IDF | 86% | 0.86 | 0.86 | 0.86 |
| | | N-Gram vectors TF-IDF | 86% | 0.86 | 0.86 | 0.86 |
| | | CharLevel vectors TF-IDF | 86% | 0.86 | 0.87 | 0.86 |
| 6 | XGB | Count vectors | 83% | 0.84 | 0.83 | 0.83 |
| | | WordLevel TF-IDF | 85% | 0.85 | 0.85 | 0.85 |
| | | N-Gram vectors TF-IDF | 84% | 0.85 | 0.84 | 0.85 |
| | | CharLevel vectors TF-IDF | 85% | 0.86 | 0.86 | 0.86 |
Comparison of quantitative results for six machine learning models using four different feature representations, excluding the minority emotion class.

| No. | Model | Features | Accuracy | | | |
|---|---|---|---|---|---|---|
| 1 | MNB | Count vectors | 67% | 0.67 | 0.66 | 0.66 |
| | | WordLevel TF-IDF | 55% | 0.55 | 0.54 | 0.54 |
| | | N-Gram vectors TF-IDF | 54% | 0.54 | 0.53 | 0.53 |
| | | CharLevel vectors TF-IDF | 46% | 0.47 | 0.46 | 0.46 |
| 2 | MLP | Count vectors | 83% | 0.83 | 0.83 | 0.82 |
| | | WordLevel TF-IDF | 83% | 0.82 | 0.82 | 0.82 |
| | | N-Gram vectors TF-IDF | 83% | 0.83 | 0.82 | 0.82 |
| | | CharLevel vectors TF-IDF | 83% | 0.83 | 0.82 | 0.82 |
| 3 | LR | Count vectors | 82% | 0.82 | 0.81 | 0.81 |
| | | WordLevel TF-IDF | 72% | 0.72 | 0.71 | 0.71 |
| | | N-Gram vectors TF-IDF | 72% | 0.72 | 0.71 | 0.71 |
| | | CharLevel vectors TF-IDF | 52% | 0.52 | 0.51 | 0.51 |
| 4 | SVM | Count vectors | 74% | 0.74 | 0.74 | 0.74 |
| | | WordLevel TF-IDF | 81% | 0.81 | 0.81 | 0.81 |
| | | N-Gram vectors TF-IDF | 80% | 0.80 | 0.80 | 0.80 |
| | | CharLevel vectors TF-IDF | 57% | 0.57 | 0.58 | 0.57 |
| 5 | RF | Count vectors | 86% | 0.86 | 0.86 | 0.86 |
| | | WordLevel TF-IDF | 84% | 0.84 | 0.84 | 0.84 |
| | | N-Gram vectors TF-IDF | 84% | 0.84 | 0.84 | 0.84 |
| | | CharLevel vectors TF-IDF | 86% | 0.86 | 0.86 | 0.86 |
| 6 | XGB | Count vectors | 84% | 0.84 | 0.83 | 0.83 |
| | | WordLevel TF-IDF | 84% | 0.84 | 0.84 | 0.84 |
| | | N-Gram vectors TF-IDF | 84% | 0.84 | 0.84 | 0.84 |
| | | CharLevel vectors TF-IDF | 85% | 0.85 | 0.85 | 0.85 |
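The two results tables can be compared directly to see how excluding the minority emotion class shifts accuracy. The snippet below does this for the count-vector rows only, with the values copied from the tables. The variable names are illustrative, and the comparison is a reading aid rather than part of the reported analysis.

```python
# Accuracy (%) with count vectors, copied from the two results tables.
with_minority = {"MNB": 72, "MLP": 87, "LR": 86, "SVM": 76, "RF": 86, "XGB": 83}
without_minority = {"MNB": 67, "MLP": 83, "LR": 82, "SVM": 74, "RF": 86, "XGB": 84}

# Change in accuracy (percentage points) when the minority class is excluded.
delta = {m: without_minority[m] - with_minority[m] for m in with_minority}

best_with = max(with_minority, key=with_minority.get)
best_without = max(without_minority, key=without_minority.get)

for model, change in delta.items():
    print(f"{model}: {with_minority[model]}% -> {without_minority[model]}% ({change:+d})")
print(f"best with minority class: {best_with}; best without: {best_without}")
```

On these rows, MLP leads when the minority class is included, while RF is unchanged and leads once it is excluded.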
Test data results.

| | | |
|---|---|---|
| | 10 | 8 |
| | 10 | 7 |
| | 10 | 8 |
| | 10 | 9 |
| | 10 | 9 |
Training and testing time of models in seconds.

| Model | Features | Training time (s) | Testing time (s) |
|---|---|---|---|
| | Count vectors | 0.10 | 0.04 |
| | WordLevel TF-IDF | 0.05 | 0.007 |
| | N-Gram vectors | 0.01 | 0.000 |
| | CharLevel vectors | 0.10 | 0.02 |
| | Count vectors | 563.3 | 0.04 |
| | WordLevel TF-IDF | 500.1 | 0.03 |
| | N-Gram vectors | 454.2 | 0.02 |
| | CharLevel vectors | 495.1 | 0.08 |
| | Count vectors | 10.6 | 0.007 |
| | WordLevel TF-IDF | 1.92 | 0.002 |
| | N-Gram vectors | 1.98 | 0.001 |
| | CharLevel vectors | 8.99 | 0.01 |
| | Count vectors | 63.0 | 13.1 |
| | WordLevel TF-IDF | 48.8 | 9.2 |
| | N-Gram vectors | 49.8 | 9.4 |
| | CharLevel vectors | 397.5 | 77.8 |
| | Count vectors | 16.0 | 0.23 |
| | WordLevel TF-IDF | 9.62 | 0.09 |
| | N-Gram vectors | 10.2 | 0.10 |
| | CharLevel vectors | 32.2 | 0.35 |
| | Count vectors | 27.7 | 0.25 |
| | WordLevel TF-IDF | 43.7 | 0.10 |
| | N-Gram vectors | 10.2 | 0.12 |
| | CharLevel vectors | 337.9 | 0.59 |
Figure 4. Negative emotions leading to psychological health issues.