D Sunitha, Raj Kumar Patra, N V Babu, A Suresh, Suresh Chand Gupta.
Abstract
As of November 2021, more than 24.80 crore (248 million) people had been diagnosed with the coronavirus, and around 50.20 lakh (5.02 million) of them had lost their lives to this infectious disease. Understanding the sentiments that people express on social media (Facebook, Twitter, Instagram, etc.) helps governments control, monitor, and eradicate the coronavirus. Compared to other social media platforms, Twitter data are indispensable for extracting useful awareness information related to any crisis. In this article, a sentiment analysis model is proposed to analyze real-time tweets related to the coronavirus. Initially, around 3100 tweets from Indian and European users are collected between 23.03.2020 and 01.11.2021. Next, data pre-processing and exploratory investigation are carried out for a better understanding of the collected data. Feature extraction is then performed using Term Frequency-Inverse Document Frequency (TF-IDF), GloVe, pre-trained Word2Vec, and fastText embeddings. The obtained feature vectors are fed to an ensemble classifier (Gated Recurrent Unit (GRU) and Capsule Neural Network (CapsNet)) that classifies the users' sentiments as anger, sadness, joy, or fear. The experimental outcomes showed that the proposed model achieved prediction accuracies of 97.28% and 95.20% in classifying the sentiments of Indian and European users, respectively.
Year: 2022 PMID: 35464347 PMCID: PMC9014659 DOI: 10.1016/j.patrec.2022.04.027
Source DB: PubMed Journal: Pattern Recognit Lett ISSN: 0167-8655 Impact factor: 4.757
Fig. 1. Flowchart of the proposed model.
Sample pre-processed tweets.
| Pre-processed tweets |
|---|
| India's Vaccination Drive crosses crore. |
| When scares end. |
| A gentle kind honest man has passed away. A very sweet man full of life love and joy. My condolences to his family. Rest in peace. |
| My head is like a prison cell. |
| Hackers uses fake coronavirus maps to infect visitors with malware. |
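
The abstract lists TF-IDF among the feature extraction techniques. As a rough illustration only, the sample tweets above can be turned into TF-IDF weights as sketched below. The tokenizer, the unsmoothed `log(N / df)` weighting, and the function names `tokenize` and `tfidf` are assumptions made for this sketch; the source does not state which TF-IDF variant or library was used.

```python
import math
from collections import Counter

# The five sample pre-processed tweets from the table above.
tweets = [
    "India's Vaccination Drive crosses crore.",
    "When scares end.",
    "A gentle kind honest man has passed away. A very sweet man full of "
    "life love and joy. My condolences to his family. Rest in peace.",
    "My head is like a prison cell.",
    "Hackers uses fake coronavirus maps to infect visitors with malware.",
]


def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace,
    # strip trailing/leading sentence punctuation.
    words = (w.strip(".,!?") for w in text.lower().split())
    return [w for w in words if w]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length,
    idf = log(N / document frequency), with no smoothing.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


vectors = tfidf(tweets)
# Print the three highest-weighted terms of the fourth tweet.
top = sorted(vectors[3].items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top)
```

Terms that occur in several tweets (here "a" and "my") receive a lower weight than terms unique to one tweet, which is the property that makes TF-IDF useful as a sparse feature alongside dense embeddings.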
Performance analysis of the ensemble based deep learning model with different feature extraction techniques on the European tweets.
| Features | Classifiers | Accuracy (%) | Recall (%) | MCC (%) | Precision (%) | F-score (%) |
|---|---|---|---|---|---|---|
| GloVe | CapsNet | 89.02 | 90.91 | 87.87 | 88.34 | 79.82 |
| Word2Vec | CapsNet | 89.27 | 89.27 | 86.02 | 87.72 | 82.20 |
| fastText embedding | CapsNet | 90.38 | 88.90 | 83.20 | 90.86 | 74.02 |
| Hybrid | CapsNet | 92.82 | 90.67 | 89.09 | 93.09 | 88.88 |
| GloVe | GRU | 90.21 | 92.91 | 92.18 | 93.20 | 83.49 |
| Word2Vec | GRU | 90.28 | 93.07 | 87.02 | 93.44 | 84.65 |
| fastText embedding | GRU | 91.20 | 94.04 | 93.92 | 95.90 | 89.90 |
| Hybrid | GRU | 92.03 | 95.92 | 93.36 | 95.55 | 91.91 |
| GloVe | Ensemble | 94.44 | 96.64 | 95.01 | 96.30 | 92.04 |
| Word2Vec | Ensemble | 94.80 | 96.22 | 95.98 | 97.02 | 93.30 |
| fastText embedding | Ensemble | 94.92 | 97.30 | 96.11 | 97.90 | 94.43 |
| Hybrid | Ensemble | 95.20 | 97.78 | 97.70 | 98.32 | 96.65 |
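
For readers unfamiliar with the columns above, the following is a minimal sketch of how accuracy, recall, precision, F-score, and the Matthews correlation coefficient (MCC) can be computed for a four-class problem. The label lists are invented toy data, not results from the paper, and macro-averaging over classes is an assumption: the source does not say how the per-class scores were aggregated. The MCC uses the standard multiclass generalization.

```python
import math
from collections import Counter

# The four sentiment classes named in the abstract.
LABELS = ["anger", "sad", "joy", "fear"]


def classification_metrics(y_true, y_pred, labels=LABELS):
    """Accuracy, macro precision/recall/F-score, and multiclass MCC."""
    n = len(y_true)
    true_counts = Counter(y_true)   # samples per true class
    pred_counts = Counter(y_pred)   # samples per predicted class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    correct = sum(hits.values())

    precisions, recalls, f_scores = [], [], []
    for label in labels:
        tp = hits[label]
        prec = tp / pred_counts[label] if pred_counts[label] else 0.0
        rec = tp / true_counts[label] if true_counts[label] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f_scores.append(f1)

    # Multiclass MCC computed from class totals.
    numerator = correct * n - sum(true_counts[l] * pred_counts[l] for l in labels)
    var_true = n * n - sum(true_counts[l] ** 2 for l in labels)
    var_pred = n * n - sum(pred_counts[l] ** 2 for l in labels)
    denom = math.sqrt(var_true * var_pred)

    return {
        "accuracy": correct / n,
        "precision": sum(precisions) / len(labels),
        "recall": sum(recalls) / len(labels),
        "f_score": sum(f_scores) / len(labels),
        "mcc": numerator / denom if denom else 0.0,
    }


# Hypothetical toy labels for illustration only.
y_true = ["anger", "anger", "sad", "sad", "joy", "joy", "fear", "fear"]
y_pred = ["anger", "sad", "sad", "sad", "joy", "joy", "fear", "anger"]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Note that MCC ranges from -1 to 1; the tables report it as a percentage, which this sketch mirrors by multiplying by 100 when printing.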
Performance of the ensemble based deep learning model with different cross-folds on the European tweets.
| Cross-folds | Accuracy (%) | Recall (%) | MCC (%) | Precision (%) | F-score (%) |
|---|---|---|---|---|---|
| 3-folds | 92.39 | 94.20 | 94.09 | 95.34 | 95.27 |
| 5-folds | 94.34 | 95.37 | 96.96 | 96.02 | 96.11 |
| 10-folds | 95.20 | 97.78 | 97.70 | 98.32 | 96.65 |
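
The cross-fold settings above (3, 5, and 10 folds) refer to k-fold cross-validation. A minimal sketch of the index splitting follows; the helper name `k_fold_indices` and the sample count are hypothetical, and the source does not say whether the data were shuffled or stratified before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Each sample lands in exactly one test fold. When n_samples is not
    divisible by k, the first folds get one extra sample.
    """
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


# Hypothetical sample count, chosen only to keep the output small.
n_samples = 20
for k in (3, 5, 10):
    sizes = [len(test) for _, test in k_fold_indices(n_samples, k)]
    print(f"{k}-fold test set sizes: {sizes}")
```

In practice the model would be trained on each `train_idx` subset and scored on the matching `test_idx` subset, with the reported figure being the average over the k folds. More folds leave more data for training in each round, which is one plausible reason the 10-fold scores are the highest.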
Performance analysis of the ensemble based deep learning model with different feature extraction techniques on the Indian tweets.
| Features | Classifiers | Accuracy (%) | Recall (%) | MCC (%) | Precision (%) | F-score (%) |
|---|---|---|---|---|---|---|
| GloVe | CapsNet | 86.12 | 89.94 | 81.68 | 87.38 | 81.85 |
| Word2Vec | CapsNet | 87.29 | 88.28 | 84.07 | 88.78 | 86.27 |
| fastText embedding | CapsNet | 91.88 | 87.95 | 83.40 | 91.87 | 84.42 |
| Hybrid | CapsNet | 93.42 | 92.68 | 85.82 | 92.12 | 86.89 |
| GloVe | GRU | 88.93 | 91.95 | 90.17 | 92.23 | 89.99 |
| Word2Vec | GRU | 89.23 | 92.72 | 90.02 | 92.45 | 89.95 |
| fastText embedding | GRU | 91.90 | 94.78 | 92.99 | 93.60 | 90.87 |
| Hybrid | GRU | 94.45 | 94.82 | 93.96 | 94.15 | 91.97 |
| GloVe | Ensemble | 95.46 | 95.55 | 94.09 | 95.80 | 92.54 |
| Word2Vec | Ensemble | 95.88 | 95.27 | 94.99 | 95.92 | 93.80 |
| fastText embedding | Ensemble | 95.98 | 96.39 | 95.19 | 96.80 | 95.93 |
| Hybrid | Ensemble | 97.28 | 96.98 | 95.90 | 97.77 | 96.29 |
Performance of the ensemble based deep learning model with different cross-folds on the Indian tweets.
| Cross-folds | Accuracy (%) | Recall (%) | MCC (%) | Precision (%) | F-score (%) |
|---|---|---|---|---|---|
| 3-folds | 94.55 | 94.92 | 94.92 | 95.39 | 95.68 |
| 5-folds | 96.78 | 95.97 | 95.48 | 96.78 | 96.10 |
| 10-folds | 97.28 | 96.98 | 95.90 | 97.77 | 96.29 |