Ibrahim Eldesouky Fattoh, Fahad Kamal Alsheref, Waleed M Ead, Ahmed Mohamed Youssef.
Abstract
The volume of data on the web has grown over the last twenty years, driven in part by the rise of social media. The data on social sites describe many real-life events in our daily lives. During the COVID-19 pandemic, many people and media organizations documented their health status and the latest news about the coronavirus on social media. Analyzing the sentiment of these tweets about the coronavirus in a computational model can help decision makers measure public opinion and can yield useful findings. In this research article, we introduce a deep learning sentiment analysis model based on the Universal Sentence Encoder. The dataset used in this research was collected from Twitter, and each tweet was classified as positive, neutral, or negative. The sentence embedding model captures the meaning of word sequences rather than individual words. The model divides the dataset into training and testing sets and relies on sentence similarity to detect the sentiment class. The obtained accuracy reached 78.062%, which outperforms many traditional ML classifiers based on TF-IDF applied to the same dataset, as well as another model based on a CNN classifier.
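The abstract states only that the model relies on sentence similarity to pick a sentiment class; it does not give the classifier details. As a rough illustration of that idea, the sketch below labels a query with the class of its most similar training sentence under cosine similarity. The `embed` function is a hypothetical bag-of-words stand-in for the Universal Sentence Encoder, and the three training sentences are made-up examples, not rows from the dataset.

```python
import math
from collections import Counter

def embed(sentence):
    """Hypothetical stand-in for a sentence encoder: a sparse bag-of-words
    vector. A real pipeline would return a dense sentence embedding here."""
    return Counter(sentence.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict(query, labeled):
    """Return the label of the training sentence most similar to the query."""
    q = embed(query)
    best_label, best_score = None, -1.0
    for text, label in labeled:
        score = cosine(q, embed(text))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training sentences; codes follow the paper: 0 negative, 1 neutral, 2 positive.
train = [
    ("prices are crazy and it keeps getting worse", 0),
    ("store hours changed this week", 1),
    ("free online activities for the whole family", 2),
]

positive_guess = predict("free activities for the family", train)
negative_guess = predict("prices keep getting worse", train)
```

With a real sentence encoder, only `embed` would change; the similarity and label-selection steps would stay the same under this assumed design.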
Year: 2022 PMID: 36248924 PMCID: PMC9556213 DOI: 10.1155/2022/6354543
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Two sentence encoder models: (a) transformer encoder and (b) DAN encoder.
Figure 2. Sentiment classification framework.
Figure 3. Dataset distribution.
Dataset sample.
| Screen name | Tweets | Sentiment | Label |
|---|---|---|---|
| | @TartiiCat Well new/used Rift S are going for $700.00 on Amazon rn although the normal market price is usually $400.00. Prices are really crazy right now for VR headsets since HL Alex was announced and it's only been worse with COVID-19 | 0 | Negative |
| | @MeNyrbie @Phil_Gahan @Chrisitv | 1 | Neutral |
| | We've updated our amp The Essentials page What s new Latest NDIS updates changes to community shopping hours amp a list of free online activities for the whole family new videos coming soon | 2 | Positive |
Demographic representation of tweet classification.
| Sentiment label | Code | Data type | Tweets |
|---|---|---|---|
| Negative | 0 | Training | 3500 |
| Negative | 0 | Testing | 500 |
| Neutral | 1 | Training | 1750 |
| Neutral | 1 | Testing | 250 |
| Positive | 2 | Training | 3500 |
| Positive | 2 | Testing | 500 |
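The split sizes follow directly from the counts in the table above. The short calculation below is a convenience sketch rather than part of the paper: it totals the training and testing sets, checks that each class holds out the same share for testing, and computes the accuracy a trivial majority-class predictor would reach on the test set. Variable names are illustrative.

```python
# Counts copied from the table above, keyed by sentiment label.
split = {
    "Negative": {"Training": 3500, "Testing": 500},
    "Neutral": {"Training": 1750, "Testing": 250},
    "Positive": {"Training": 3500, "Testing": 500},
}

train_total = sum(c["Training"] for c in split.values())
test_total = sum(c["Testing"] for c in split.values())
total = train_total + test_total

# Share of the whole dataset held out for testing.
test_fraction = test_total / total

# Share of each class held out for testing.
per_class_test_fraction = {
    label: c["Testing"] / (c["Training"] + c["Testing"])
    for label, c in split.items()
}

# Accuracy of always predicting the largest test class.
majority_baseline = max(c["Testing"] for c in split.values()) / test_total

print(train_total, test_total, total)
print(test_fraction, per_class_test_fraction, majority_baseline)
```

On these counts the dataset has 10,000 tweets, 12.5% of every class is held out, and the majority-class baseline on the test set is 40%.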
Accuracy of the proposed model for each epoch experiment.
| Epochs | Accuracy |
|---|---|
| | 77.742% |
| | 77.982% |
| | 77.902% |
| | 77.982% |
| | 77.9823% |
| | 78.0624% |
Accuracy of ML classifiers with TF-IDF.
| Classifiers | Accuracy |
|---|---|
| Fast tree | 57.59% |
| Fast forest | 63.69% |
| Light GBM | 50.74% |
| Maximum entropy | 78.90% |
| Logistic regression | 78.83% |
Architecture of CNN.
| Layer (type) | Output shape | Parameter # |
|---|---|---|
| Dense (dense) | (None, 1024) | 61669376 |
| Activation (activation) | (None, 1024) | 0 |
| Dropout (dropout) | (None, 1024) | 0 |
| Dense_1 (dense) | (None, 3) | 3075 |
| Activation_1 (activation) | (None, 3) | 0 |
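The parameter counts in the table above can be checked with the usual formula for a fully connected layer with a bias term, `inputs * units + units`. The sketch below assumes both dense layers use biases, which the table does not state explicitly. Under that assumption it recovers the input width implied by the first layer, presumably the size of the TF-IDF feature vector. The helper `dense_params` is illustrative.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

hidden_units = 1024
n_classes = 3
first_layer_params = 61_669_376   # from the table
second_layer_params = 3_075       # from the table

# Solve n_in * 1024 + 1024 = 61,669,376 for n_in.
input_dim = first_layer_params // hidden_units - 1

# Sum of both dense layers; the other layers have no parameters.
total_params = first_layer_params + second_layer_params

print(input_dim, total_params)
```

Both table entries are consistent with this formula, and the implied input width is 60,223 features.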
Proposed model vs. CNN classifier based on TF-IDF.
| Epochs | Proposed model accuracy | Proposed model loss | CNN model accuracy | CNN model loss |
|---|---|---|---|---|
| | 77.742% | 0.8005 | 72.16% | 1.1315 |
| | 77.982% | 0.8006 | 70.70% | 1.4414 |
| | 77.902% | 0.7993 | 69.87% | 1.6880 |
| | 77.982% | 0.7994 | 69.09% | 2.2059 |
| | 77.9823% | 0.7987 | 69.55% | 2.4477 |
| | 78.0624% | 0.7991 | 69.23% | 2.8105 |
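To make the comparison in the table above easier to read, the sketch below computes the accuracy gap between the two models for each row, in table order. It also checks how the losses move down the table. The values are copied from the table; the list names are illustrative, and the interpretation is an observation about these numbers rather than a claim made in the source.

```python
# Rows in table order.
proposed_acc = [77.742, 77.982, 77.902, 77.982, 77.9823, 78.0624]
proposed_loss = [0.8005, 0.8006, 0.7993, 0.7994, 0.7987, 0.7991]
cnn_acc = [72.16, 70.70, 69.87, 69.09, 69.55, 69.23]
cnn_loss = [1.1315, 1.4414, 1.6880, 2.2059, 2.4477, 2.8105]

# Accuracy advantage of the proposed model, in percentage points.
gaps = [round(p - c, 4) for p, c in zip(proposed_acc, cnn_acc)]

# True if the CNN loss grows from every row to the next.
cnn_loss_rising = all(a < b for a, b in zip(cnn_loss, cnn_loss[1:]))

# Spread of the proposed model's loss across rows.
proposed_loss_spread = round(max(proposed_loss) - min(proposed_loss), 4)

print(gaps, cnn_loss_rising, proposed_loss_spread)
```

The proposed model leads by more than five percentage points in every row. Its loss stays within about 0.002, while the CNN loss rises in every row as its accuracy drifts down, a pattern often associated with overfitting.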
Figure 4. Accuracy comparison between the proposed model and the CNN-based model for epochs from 1 to 100.