Saad Awadh Alanazi, Ayesha Khaliq, Fahad Ahmad, Nasser Alshammari, Iftikhar Hussain, Muhammad Azam Zia, Madallah Alruwaili, Alanazi Rayan, Ahmed Alsayat, Salman Afsar.
Abstract
Public feelings and reactions to financial news are gaining importance because they help individuals, public health bodies, financial and non-financial institutions, and governments understand mental health, the impact of policies, and the public's counter-response. The sentiment of any financial text can be categorized, whether the text is a headline or the detailed content of a newspaper article. The Guardian is considered one of the best-known and largest digital media websites, which makes it a useful platform for tracking the public's mental health and feelings through sentiment analysis of finance-related headlines and article content. A key purpose of this study is to track the public's mental health through sentiment analysis of financial news published primarily on digital media, in order to identify the overall mental health of the public and the impact of national or international financial policies. A dataset was collected using The Guardian application programming interface and processed using a support vector machine, AdaBoost, and a single layer convolutional neural network. Among these techniques, the single layer convolutional neural network, with a classification accuracy of 0.939, is considered the best during the training and testing phases, compared with classification accuracies of 0.677 for the support vector machine and 0.761 for AdaBoost. The findings of this research would also benefit public health, as well as financial and non-financial institutions.
Keywords: AdaBoost; deep learning; financial text; machine learning; mental health; sentiment analysis; single layer convolutional neural network; support vector machine; the Guardian
Year: 2022 PMID: 35955051 PMCID: PMC9368160 DOI: 10.3390/ijerph19159695
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 4.614
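The abstract states that the dataset was collected through The Guardian application programming interface but does not list the request parameters. As a minimal sketch, the snippet below only builds a search URL for the public content endpoint; the query term, page size, requested fields and the helper name `build_search_url` are illustrative assumptions, not the settings used in the study.

```python
from urllib.parse import urlencode

# Public content-search endpoint of The Guardian Open Platform.
GUARDIAN_SEARCH_URL = "https://content.guardianapis.com/search"


def build_search_url(api_key, query="finance", page=1, page_size=50):
    """Build a search URL that asks for headlines and body text.

    The query term, page size and fields are assumptions made for
    illustration; the study does not report the values it used.
    """
    params = {
        "q": query,
        "page": page,
        "page-size": page_size,
        "show-fields": "headline,bodyText",
        "api-key": api_key,
    }
    return GUARDIAN_SEARCH_URL + "?" + urlencode(params)


# "YOUR_API_KEY" is a placeholder; a real key is issued on registration.
url = build_search_url("YOUR_API_KEY")
print(url)
```

The URL can then be passed to any HTTP client; paging through results would mean incrementing `page` until the reported page count is reached.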
Figure 1. Different Emotional States Adapted from Circumplex Model.
Summary of the related work regarding text classification.
| References | Techniques | Challenges | Factors Considered | Advantages |
|---|---|---|---|---|
| […] | Quad Channel Hybrid Long Short-Term Memory (QC-LSTM); Bidirectional Gated Recurrent Unit (BiGRU) | Huge quantity of detailed information is available in the medical domain, and it is quite a considerable challenge to process it efficiently. | A huge quantity of patient information has been collected in an electronic format. | For medical text classification tasks, machine learning techniques seem quite practical. |
| […] | Bidirectional Encoder Representations from Transformers; Distil Bidirectional Encoder Representations from Transformers | Training deep language models, however, is time-consuming and computationally intensive. | Financial domain leverage generic benchmark datasets from the literature and two proprietary datasets in the financial technological industry. | Yielded state-of-the-art performance and offload practitioners from the burden of preparing the adequate resources (time, hardware, and data) to train models. |
| […] | Support Vector Machines; Rocchio classifier | Classification of English text and documents is inefficient. | Size of the feature set. | Better classification of English documents when using more than 4000 features. |
| […] | Financial Bidirectional Encoder Representations from Transformers | A limited number of models understand financial jargon. | Financial data for over ten years for 25 different companies. | Can analyze historical data effectively. |
Figure 2. Text-Based Sentimental Classification through Single Layer Convolutional Neural Network.
Figure 3. Frequency Ranges of Chosen Data.
Proposed attributes & possible data types.
| Name of Attributes | Possible Data Types |
|---|---|
| Neutral | Text |
| Glad | Text |
| Depressed | Text |
| Annoyed | Text |
Figure 4. Process for Text-Based Sentimental Classification.
Preprocessing configuration.
| Step | Setting |
|---|---|
| | Lowercase, Remove URLs |
| | Regexp |
| | Porter Stemmer |
| | Stopwords, Regexp |
| | 1–2 |
| | Averaged Perceptron Tagger |
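The preprocessing settings listed above can be approximated with the standard library alone. The sketch below covers lowercasing, URL removal, regular-expression tokenization, stopword filtering and 1–2 word n-grams; the Porter stemmer and the averaged perceptron tagger are left out because they need an external NLP library. The stopword list, the regular expressions and the function name `preprocess` are assumptions made for illustration, not the study's configuration.

```python
import re

# Tiny illustrative stopword list (assumption); a real run would use a full list.
STOPWORDS = {"the", "a", "an", "and", "of", "on", "in", "to", "is", "my"}
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
TOKEN_PATTERN = re.compile(r"\w+")  # regexp tokenizer: runs of word characters


def preprocess(text, ngram_range=(1, 2)):
    """Lowercase, strip URLs, tokenize, drop stopwords, then build n-grams."""
    text = URL_PATTERN.sub(" ", text.lower())
    tokens = [t for t in TOKEN_PATTERN.findall(text) if t not in STOPWORDS]
    low, high = ngram_range
    features = []
    for n in range(low, high + 1):
        # Slide a window of n tokens over the filtered token list.
        for i in range(len(tokens) - n + 1):
            features.append(" ".join(tokens[i:i + n]))
    return features


print(preprocess("The markets rallied on https://example.com news"))
```

Because stopwords are removed before the n-grams are built, a bigram can join two words that were separated by a stopword in the original sentence.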
Corpus view.
| Tweet Content | Positive Score | Negative Score | Neutral Score | Compound Score |
|---|---|---|---|---|
| On my way home n having. | 0.000 | 0.673 | 0.337 | −0.534 |
| The financial matters and the. | 0.235 | 0.000 | 0.765 | 0.359 |
| Mmm much better day… | 0.848 | 0.000 | 0.152 | 0.445 |
| has work this afternoon… | 0.299 | 0.111 | 0.590 | 0.469 |
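The corpus view reports positive, negative, neutral and compound scores for each text. The source does not name the scorer; assuming the three class scores are proportions that should sum to 1 and that the compound score lies in [-1, 1], as in common lexicon-based scorers, a quick consistency check over the rows shown might look like the sketch below. The tolerance and the helper name `check_row` are illustrative assumptions.

```python
# (positive, negative, neutral, compound) as printed in the corpus view.
ROWS = [
    (0.000, 0.673, 0.337, -0.534),
    (0.235, 0.0, 0.765, 0.359),  # negative score is printed as "0." in the source
    (0.848, 0.000, 0.152, 0.445),
    (0.299, 0.111, 0.590, 0.469),
]


def check_row(pos, neg, neu, compound, tol=0.005):
    """Return a list of problems found in one row of scores."""
    problems = []
    total = pos + neg + neu
    if abs(total - 1.0) > tol:
        problems.append(f"class scores sum to {total:.3f}, expected 1.000")
    if not -1.0 <= compound <= 1.0:
        problems.append(f"compound {compound} outside [-1, 1]")
    return problems


report = {i: check_row(*row) for i, row in enumerate(ROWS, start=1)}
for i, problems in report.items():
    print(i, problems or "ok")
```

Under these assumptions only the first row is flagged, because its three class scores add up to 1.010.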
Figure 5. Visualization of Guardian Dataset Through Heat Map.
Figure 6. Sentiment Representation through Word Cloud.
Figure 7. Detail Flowchart of Proposed Work.
Text based sentimental classification.
| Sentimental Attributes | SVM | AdaBoost | SLCNN |
|---|---|---|---|
| | 99 | 110 | 117 |
| | 471 | 550 | 599 |
| | 109 | 123 | 128 |
| | 232 | 269 | 278 |
Figure 8. Text-Based Sentimental Classification.
Performance measurement during training for text-based sentimental classification, by classifier and optimization technique.
| Performance Measures | SVM (Polynomial) | SVM (Linear) | SVM (Sigmoid) | AdaBoost (SAMME) | AdaBoost (SAMME.R) | SLCNN (L-BFGS-B) | SLCNN (SGD) | SLCNN (Adam) |
|---|---|---|---|---|---|---|---|---|
| AUC | 0.677 | 0.680 | 0.679 | 0.780 | 0.783 | | 0.915 | 0.916 |
| CA | 0.671 | 0.677 | 0.676 | 0.758 | 0.761 | 0.939 | 0.932 | 0.938 |
| F1-Score | 0.664 | 0.666 | 0.665 | 0.762 | 0.763 | | 0.944 | 0.943 |
| Precision | 0.681 | 0.691 | 0.690 | 0.774 | 0.782 | | 0.960 | 0.960 |
| Recall | 0.612 | 0.616 | 0.612 | 0.723 | 0.727 | | 0.924 | 0.925 |
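The classification accuracy (CA), precision, recall and F1-score reported for the classifiers can all be derived from a confusion matrix. The sketch below shows one way to compute them for four sentiment classes. The confusion counts are hypothetical, and macro averaging over classes is an assumption, since the source does not state the averaging scheme; AUC is omitted because it needs predicted scores rather than hard labels.

```python
from statistics import mean

# Hypothetical confusion matrix (NOT data from the study).
# Rows are true classes, columns are predicted classes, in the order
# Neutral, Glad, Depressed, Annoyed.
CONFUSION = [
    [50, 5, 3, 2],
    [4, 80, 2, 4],
    [6, 3, 40, 1],
    [2, 6, 2, 30],
]


def macro_scores(matrix):
    """Return CA plus macro-averaged precision, recall and F1."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = matrix[k][k]
        predicted = sum(matrix[i][k] for i in range(n))  # column sum
        actual = sum(matrix[k])  # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "CA": correct / total,
        "Precision": mean(precisions),
        "Recall": mean(recalls),
        "F1": mean(f1s),
    }


scores = macro_scores(CONFUSION)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

A weighted average, in which each class contributes in proportion to its number of true instances, would give different values when the classes are imbalanced.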
Figure 9. Performance Analysis of SVM, AdaBoost, and SLCNN.
Execution time of the Guardian dataset analysis for text-based sentimental classification.
| Machine Learning Techniques | Execution Time (ms) |
|---|---|
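| SVM (Polynomial) | 1707 |
| SVM (Linear) | 1705 |
| SVM (Sigmoid) | 1715 |
| AdaBoost (SAMME) | 1601 |
| AdaBoost (SAMME.R) | 1608 |
| SLCNN (L-BFGS-B) | 1680 |
| SLCNN (SGD) | 1681 |
| SLCNN (Adam) | 1677 |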
Figure 10. Execution Time Analysis.
Performance measurement during testing for text-based sentimental classification, by classifier and optimization technique.
| Performance Measures | SVM (Polynomial) | SVM (Linear) | SVM (Sigmoid) | AdaBoost (SAMME) | AdaBoost (SAMME.R) | SLCNN (L-BFGS-B) | SLCNN (SGD) | SLCNN (Adam) |
|---|---|---|---|---|---|---|---|---|
| AUC | 0.578 | 0.583 | 0.581 | 0.680 | 0.682 | 0.819 | 0.817 | 0.817 |
| CA | 0.571 | 0.572 | 0.572 | 0.661 | 0.664 | 0.834 | 0.824 | 0.833 |
| F1-Score | 0.540 | 0.546 | 0.541 | 0.659 | 0.669 | 0.840 | 0.831 | 0.837 |
| Precision | 0.579 | 0.583 | 0.582 | 0.675 | 0.680 | 0.863 | 0.861 | 0.858 |
| Recall | 0.512 | 0.519 | 0.518 | 0.627 | 0.629 | 0.837 | 0.829 | 0.831 |
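Comparing the training and testing tables shows how much accuracy each classifier loses on unseen data. The arithmetic can be sketched as follows, using the second data row of each table, which is assumed here to be classification accuracy because its values match the accuracies quoted in the abstract. Only three classifier variants are included, for brevity.

```python
# Second data row of the training and testing tables (assumed to be CA).
TRAIN_CA = {
    "SVM (Linear)": 0.677,
    "AdaBoost (SAMME.R)": 0.761,
    "SLCNN (Adam)": 0.938,
}
TEST_CA = {
    "SVM (Linear)": 0.572,
    "AdaBoost (SAMME.R)": 0.664,
    "SLCNN (Adam)": 0.833,
}

# Generalization gap: training accuracy minus testing accuracy.
gaps = {name: round(TRAIN_CA[name] - TEST_CA[name], 3) for name in TRAIN_CA}

for name, gap in gaps.items():
    print(f"{name}: train {TRAIN_CA[name]:.3f}, test {TEST_CA[name]:.3f}, gap {gap:.3f}")
```

All three variants lose roughly 0.10 in accuracy between training and testing, so the ranking of the classifiers is unchanged.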
Figure 11. Model Comparison by Performance Measures during Testing.
Models’ comparison by performance measurements.
| Model Comparison | Machine Learning Techniques | SVM | AdaBoost | SLCNN |
|---|---|---|---|---|
| | | 0.046 | 0.000 | |
| | | 1.000 | 1.000 | |
| | | 0.000 | 0.954 | |
| | | 0.046 | 0.232 | |
| | | 0.089 | 0.768 | |
| | | 0.911 | 0.954 | |
| | | 0.056 | 0.131 | |
| | | 0.242 | 0.869 | |
| | | 0.758 | 0.944 | |
| | | 0.530 | 0.760 | |
| | | 0.240 | 0.209 | |
| | | 0.911 | 0.954 | |
| | | 0.046 | 0.232 | |
| | | 0.089 | 0.768 | |
| | | 0.470 | 0.791 | |
| | | 0.224 | 0.210 | |
| | | 0.514 | 0.790 | |
| | | 0.486 | 0.776 | |
Figure 12. Models’ Comparison for Identified Classifiers.
Performance comparison of the anticipated text classification models.
| Source | Classifier | AUC | CA | F1-Score | Precision | Recall |
|---|---|---|---|---|---|---|
| […] | Logistic Model Trees | - | 0.8550 | - | - | - |
| […] | SGD Modified Huber | - | 0.8200 | 0.6435 | - | 1.00 |
| […] | SVM | - | 0.784 | - | 0.548 | 1.00 |
| […] | BiLSTM | - | 0.8766 | 0.8766 | 0.8766 | 0.8766 |
| […] | ODCNN | 0.990 | 0.998 | - | - | - |
| […] | BiLSTMATTN | - | 0.9286 | 0.93 | 0.93 | 0.93 |
| This study | SVM | 0.583 | 0.572 | 0.546 | 0.583 | 0.519 |
| This study | AdaBoost | 0.682 | 0.664 | 0.669 | 0.680 | 0.629 |
| This study | SLCNN | 0.819 | 0.834 | 0.840 | 0.863 | 0.837 |