Camelia Delcea, Liviu-Adrian Cotfas, Liliana Crăciun, Anca Gabriela Molănescu
Abstract
Vaccination has been proposed as one of the most effective methods to combat the COVID-19 pandemic. Since the day the first vaccine, with an efficacy of more than 90%, was announced, the vaccination process and its possible consequences in large populations have generated a series of discussions on social media. Whereas the opinions triggered by the administration of the initial COVID-19 vaccine doses have been discussed in depth in the scientific literature, the approval of the so-called 3rd booster dose has only been analyzed in country-specific studies, primarily using questionnaires. In this context, the present paper conducts a stance analysis using a transformer-based deep learning model on a dataset containing 3,841,594 tweets in English collected between 12 July 2021 and 11 August 2021 (the month in which the 3rd dose arrived) and compares the opinions (in favor, neutral and against) with those extracted at the beginning of the vaccination process. To examine COVID-19 vaccination hesitancy, an analysis based on hashtags, n-grams and latent Dirichlet allocation highlights the main reasons behind the reluctance to vaccinate. The proposed approach can support COVID-19 vaccination campaigns, as it provides insights into public opinion and can help in creating communication messages.
Keywords: COVID-19 vaccination; deep learning; natural language processing; opinion mining; stance analysis
Year: 2022 PMID: 35746490 PMCID: PMC9228932 DOI: 10.3390/vaccines10060881
Source DB: PubMed Journal: Vaccines (Basel) ISSN: 2076-393X
Figure 1. Stance detection steps.
The distribution of the number of tweets published in the analyzed period.
| Date | Entire dataset | Cleaned dataset | Date | Entire dataset | Cleaned dataset | Date | Entire dataset | Cleaned dataset |
|---|---|---|---|---|---|---|---|---|
| July 12 | 95,877 | 21,899 | July 23 | 158,257 | 32,204 | August 3 | 132,002 | 31,384 |
| July 13 | 128,208 | 25,935 | July 24 | 123,736 | 22,974 | August 4 | 137,450 | 32,835 |
| July 14 | 119,361 | 25,158 | July 25 | 88,270 | 19,193 | August 5 | 142,740 | 35,537 |
| July 15 | 114,203 | 23,085 | July 26 | 122,915 | 29,234 | August 6 | 152,467 | 32,887 |
| July 16 | 99,482 | 24,838 | July 27 | 137,744 | 32,036 | August 7 | 121,262 | 26,787 |
| July 17 | 113,643 | 24,291 | July 28 | 147,068 | 34,692 | August 8 | 110,653 | 24,929 |
| July 18 | 82,749 | 17,603 | July 29 | 138,891 | 35,125 | August 9 | 158,312 | 35,770 |
| July 19 | 111,832 | 26,275 | July 30 | 153,311 | 36,667 | August 10 | 128,561 | 31,463 |
| July 20 | 117,093 | 29,349 | July 31 | 110,618 | 24,883 | August 11 | 131,879 | 27,946 |
| July 21 | 114,847 | 27,641 | August 1 | 89,678 | 21,433 | TOTAL | 3,841,594 | 876,151 |
| July 22 | 121,548 | 31,068 | August 2 | 136,937 | 31,030 | | | |
Figure 2. The evolution of the number of tweets in the entire and cleaned datasets.
Statistics for the classified dataset.
| Class | | | | TOTAL |
|---|---|---|---|---|
| Number of tweets | 1164 | 3127 | 345 | 4636 |
| Percentage | 25.11% | 67.45% | 7.44% | 100.00% |
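The percentage row follows directly from the counts. As a quick check, the short Python sketch below recomputes the shares and the ratio between the largest and smallest class; the variable names are illustrative, and the counts are copied from the table in column order.

```python
# Tweet counts per class, copied from the table above in column order.
counts = [1164, 3127, 345]
total = sum(counts)

# Share of each class in percent, rounded to two decimals as in the table.
shares = [round(100 * c / total, 2) for c in counts]

# Ratio between the largest and the smallest class, a simple imbalance indicator.
imbalance_ratio = round(max(counts) / min(counts), 1)

print(total)            # 4636
print(shares)           # [25.11, 67.45, 7.44]
print(imbalance_ratio)  # 9.1
```

The largest class is roughly nine times the size of the smallest one, which is worth keeping in mind when reading per-class scores.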
Sample tweets.
| Stance | Tweet |
|---|---|
| In favor | @Shonib4u @Barbara56532914 @mmaltaisLA @USATODAY Not odd. Medical intervention has only been used for dangerously low oxygen levels. As has been pointed out, 98% of people who Covid-19 survive w/o medical treatment. I understand that 2% is small but chances of vaccine averse events is 0.00001% which is much, much less. |
| In favor | YES, it’s worse because of selfish people not willing to get Covid-19 vaccine. #GetVaccinated |
| In favor | Birth control pills have a higher risk of blood clots than the covid-19 vaccine. Go get the vaccine guys |
| Neutral | 💉🏫 The California State University system announced it will require students, faculty and staff on-campus this fall to be vaccinated against COVID-19. 🏫💉 |
| Neutral | Stanford University reported at least seven confirmed cases of COVID-19 among fully vaccinated students this week. |
| Neutral | United Airlines will require U.S. employees to be vaccinated against COVID-19, joining a growing list of corporations responding to a surge in virus cases. |
| Against | More and more story’s like this are being exposed. These injections are pure poison. Autopsy on Dead Body of COVID-19 Vaccinated Individual Reveals Spike Proteins in Every Organ— |
| Against | Arrest me if the covid 19 vaccine becomes mandatory because i refuse to inject an aborted fetus into my body, and not know what my future holds. If i have covid i wont be getting tested i refuse to inhale ethylene oxide. do your research or be sheep 🐑👌 |
| Against | One shot, two shots, three shots… FIVE SHOTS!!! The more shots you take, the more infected you get ! The more infected you get, the more doctors get the HOTS!!! |
Classifiers’ performance in terms of precision.
| Code | Classifier | Parameters | Class | | |
|---|---|---|---|---|---|
| ML1 | | n-gram: (1, 2), features: 3000 | 68.62% | 75.50% | 68.08% |
| ML2 | | n-gram: (1, 3), features: 3000 | 67.47% | 73.55% | 67.29% |
| ML3 | | n-gram: (1, 2), features: all | 69.74% | 68.72% | 65.63% |
| ML4 | | n-gram: (1, 3), features: 3000 | 68.86% | 69.60% | 65.48% |
| ML5 | | n-gram: (1, 2), features: all | 69.52% | 77.21% | 72.50% |
| ML6 | | n-gram: (1, 3), features: 3000 | 67.60% | 72.93% | 71.87% |
| DL1 | | cased: no | 73.58% | 82.71% | 77.90% |
| DL2 | | cased: yes | 72.77% | 79.35% | 76.80% |
| DL3 | | | | | |
| DL4 | | | 70.71% | 83.45% | 73.12% |
Classifiers’ performance in terms of recall.
| Code | Classifier | Parameters | Class | | |
|---|---|---|---|---|---|
| ML1 | | n-gram: (1, 2), features: 3000 | 62.39% | 69.20% | 79.74% |
| ML2 | | n-gram: (1, 3), features: 3000 | 59.65% | 69.59% | 78.57% |
| ML3 | | n-gram: (1, 2), features: all | 53.40% | 73.38% | 76.28% |
| ML4 | | n-gram: (1, 3), features: 3000 | 55.69% | 71.93% | 75.67% |
| ML5 | | n-gram: (1, 2), features: all | 67.92% | 73.60% | 77.17% |
| ML6 | | n-gram: (1, 3), features: 3000 | 64.90% | 74.05% | 73.27% |
| DL1 | | cased: no | 77.58% | 75.90% | 79.45% |
| DL2 | | cased: yes | 74.16% | | 77.42% |
| DL3 | | | | 75.84% | |
| DL4 | | | 74.34% | 73.16% | 77.67% |
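Per-class precision and recall, as reported in the two tables above, are conventionally derived from a confusion matrix. The Python sketch below shows that standard computation. The matrix values are hypothetical and chosen only for illustration, and the function names are not taken from the paper.

```python
# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
# These counts are illustrative only and are NOT taken from the paper.
confusion = [
    [80, 15, 5],
    [10, 70, 20],
    [5, 10, 85],
]


def per_class_precision_recall(matrix):
    """Return a list of (precision, recall) pairs, one per class."""
    n = len(matrix)
    results = []
    for k in range(n):
        true_positive = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision = true_positive / predicted_k if predicted_k else 0.0
        recall = true_positive / actual_k if actual_k else 0.0
        results.append((precision, recall))
    return results


def accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


results = per_class_precision_recall(confusion)
for k, (p, r) in enumerate(results):
    print(f"class {k}: precision={p:.4f} recall={r:.4f}")
print(f"accuracy={accuracy(confusion):.4f}")
```

Precision divides the diagonal entry by its column sum (everything predicted as that class), while recall divides it by its row sum (everything that truly belongs to that class), which is why a classifier can rank differently in the two tables.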
Classifiers’ performance in terms of F-score.
| Code | Classifier | Parameters | Class | | |
|---|---|---|---|---|---|
| ML1 | | n-gram: (1, 2), features: 3000 | 65.31 | 72.13 | 73.39 |
| ML2 | | n-gram: (1, 3), features: 3000 | 63.28 | 71.46 | 72.44 |
| ML3 | | n-gram: (1, 2), features: all | 60.30 | 70.85 | 70.46 |
| ML4 | | n-gram: (1, 3), features: 3000 | 61.49 | 70.57 | 70.05 |
| ML5 | | n-gram: (1, 2), features: all | 68.61 | 75.29 | 74.59 |
| ML6 | | n-gram: (1, 3), features: 3000 | 66.11 | 73.45 | 72.40 |
| DL1 | | cased: no | 75.43 | 79.01 | 78.35 |
| DL2 | | cased: yes | 73.32 | 77.81 | 76.98 |
| DL3 | | | | | |
| DL4 | | | 72.30 | 77.74 | 75.07 |
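The F-score is commonly defined as the harmonic mean of precision and recall. The exact formula used for this table is not shown here, so the Python sketch below assumes the standard F1 definition and applies it to the ML1 precision and recall values from the tables above. The function and variable names are illustrative.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# ML1 values in percent, copied from the tables above in column order.
ml1_precision = [68.62, 75.50, 68.08]
ml1_recall = [62.39, 69.20, 79.74]
ml1_reported_f = [65.31, 72.13, 73.39]

# Largest absolute gap between the recomputed and the reported F-scores.
gaps = [
    abs(f_score(p, r) - reported)
    for p, r, reported in zip(ml1_precision, ml1_recall, ml1_reported_f)
]

for p, r, reported in zip(ml1_precision, ml1_recall, ml1_reported_f):
    print(f"computed {f_score(p, r):.2f} vs reported {reported:.2f}")
print(f"largest gap: {max(gaps):.3f}")
```

The recomputed values agree with the reported ones to within about 0.1 percentage points. One possible explanation for the small gap, not confirmed by the text here, is that the reported scores were averaged over several runs or folds before being tabulated.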
Classifiers’ performance in terms of accuracy.
| Code | Classifier | Parameters | Accuracy |
|---|---|---|---|
| ML1 | | n-gram: (1, 2), features: 3000 | 70.44% |
| ML2 | | n-gram: (1, 3), features: 3000 | 69.27% |
| ML3 | | n-gram: (1, 2), features: all | 67.68% |
| ML4 | | n-gram: (1, 3), features: 3000 | 67.76% |
| ML5 | | n-gram: (1, 2), features: all | 72.89% |
| ML6 | | n-gram: (1, 3), features: 3000 | 70.73% |
| DL1 | | cased: no | 77.63% |
| DL2 | | cased: yes | 76.07% |
| DL3 | | | |
| DL4 | | | 75.05% |
Classifiers’ performance in terms of AUC.
| Code | Classifier | Parameters | AUC |
|---|---|---|---|
| ML1 | | n-gram: (1, 2), features: 3000 | 86.83% |
| ML2 | | n-gram: (1, 3), features: 3000 | 86.21% |
| ML3 | | n-gram: (1, 2), features: all | 85.19% |
| ML4 | | n-gram: (1, 3), features: 3000 | 84.50% |
| ML5 | | n-gram: (1, 2), features: all | 87.67% |
| ML6 | | n-gram: (1, 3), features: 3000 | 83.48% |
| DL1 | | cased: no | 92.07% |
| DL2 | | cased: yes | 91.17% |
| DL3 | | | |
| DL4 | | | 90.45% |
Figure 3. The learning curves for the best-performing classifier.
Figure 4. The evolution of the number of in favor, neutral and against tweets in the entire dataset.
Figure 5. The evolution of the numbers of in favor, neutral and against tweets in the cleaned dataset.
Figure 6. The evolution of the number of in favor tweets.
Figure 7. The evolution of the number of neutral tweets.
Figure 8. The evolution of the number of against tweets.
Hashtags excluded from the analysis.
| Hashtag | Occurrences |
|---|---|
| #pcrscam | 202 |
| #casedemic | 210 |
| #huntdownmonsters | 177 |
| #folksriseup | 230 |
Top 4 specific hashtags for against tweets.
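| Hashtag (cleaned dataset) | Occurrences | Hashtag (entire dataset) | Occurrences |
|---|---|---|---|
| #covidiots | 282 | #novaccinepassports | 4058 |
| #novaccinepassports | 225 | #novaccinepassport | 2198 |
| #clotshot | 194 | #covidiots | 433 |
| #novaccinepassport | 151 | #enoughisenough | 392 |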
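Hashtag rankings like the one above can be produced with a simple frequency count. The Python sketch below is a minimal illustration under stated assumptions: hashtags are lowercased, each hashtag is counted at most once per tweet, and an `excluded` set plays the role of the excluded-hashtag list. None of these choices, nor the function name `count_hashtags`, is confirmed by the paper, and the sample tweets are invented.

```python
import re
from collections import Counter

# A hashtag is taken to be '#' followed by word characters (an assumption).
HASHTAG_PATTERN = re.compile(r"#\w+")


def count_hashtags(tweets, excluded=()):
    """Count lowercased hashtags, at most once per tweet, skipping excluded ones."""
    excluded = {tag.lower() for tag in excluded}
    counts = Counter()
    for text in tweets:
        tags = {tag.lower() for tag in HASHTAG_PATTERN.findall(text)}
        counts.update(tags - excluded)
    return counts


# Invented toy tweets, used only to exercise the function.
sample = [
    "No thanks #NoVaccinePassports #clotshot",
    "Say no #novaccinepassports",
    "This is a #casedemic #NoVaccinePassports",
]

hashtag_counts = count_hashtags(sample, excluded={"#casedemic"})
print(hashtag_counts.most_common(1))  # [('#novaccinepassports', 3)]
```

Lowercasing merges spelling variants such as `#NoVaccinePassports` and `#novaccinepassports`, while singular and plural forms remain separate entries, as they do in the table.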
Figure 9. The evolution of the numbers of the top 4 against hashtags in the cleaned dataset.
Figure 10. The evolution of the numbers of the top 4 against hashtags in the entire dataset.
Top 15 selected unigrams.
| Unigrams | Number of Appearances |
|---|---|
| immunity | 4271 |
| risk | 3161 |
| experimental | 2907 |
| delta | 2898 |
| fda | 2704 |
| effective | 2500 |
| effects | 2214 |
| approved | 2207 |
| disease | 2103 |
| deaths | 2060 |
| flu | 1913 |
| death | 1767 |
| placebo | 1500 |
| die | 1341 |
| dangerous | 1180 |
The top 15 selected bigrams.
| Bigrams | Number of Appearances |
|---|---|
| fully vaccinated | 2033 |
| natural immunity | 1499 |
| delta variant | 1487 |
| side effects | 1356 |
| long term | 1223 |
| fda approved | 983 |
| vaccine passport | 954 |
| experimental vaccine | 835 |
| immune system | 769 |
| placebo vaccine | 716 |
| big pharma | 712 |
| vaccine passports | 659 |
| placebo substance | 670 |
| informed consent | 604 |
| herd immunity | 586 |
Top 15 selected trigrams.
| Trigrams | Number of Appearances |
|---|---|
| placebo vaccine passport | 700 |
| placebo substance treatment | 670 |
| natural immunity vs | 515 |
| risk covid 19 | 467 |
| fully vaccinated people | 434 |
| fda approved vaccine | 380 |
| covid 19 delta | 369 |
| beast natural immunity | 349 |
| pfizer biontech covid | 347 |
| vaccines worsening clinical | 319 |
| trial subjects risk | 318 |
| consent disclosure vaccine | 316 |
| long term effects | 306 |
| let big pharma | 295 |
| vaccine recipients severe | 270 |
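The unigram, bigram and trigram tables above are frequency counts over word sequences. The Python sketch below shows one common way to obtain such counts. The tokenization rule (lowercase alphanumeric runs, so that "covid-19" becomes "covid 19") and the small stop-word list are assumptions made for illustration, not the paper's documented preprocessing, and the sample texts are invented.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "is", "of", "to", "and", "are", "in"}


def tokenize(text):
    """Lowercase the text, keep alphanumeric runs, and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return all contiguous n-token sequences joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(texts, n):
    """Count n-grams over a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(ngrams(tokenize(text), n))
    return counts


# Invented toy texts, used only to exercise the functions.
sample = [
    "Natural immunity is effective",
    "The side effects of the experimental vaccine",
    "Natural immunity and long term side effects",
]

bigram_counts = count_ngrams(sample, 2)
print(bigram_counts.most_common(2))
# [('natural immunity', 2), ('side effects', 2)]
```

Because stop words are removed before the n-grams are built, words that were not adjacent in the original text can end up next to each other, which may explain sequences such as "risk covid 19" in the trigram table.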
The LDA topics, keywords and discussion topics.
| Topic Extracted Using LDA | Keywords Included | Discussion Topic |
|---|---|---|
| Topic 1 | risk, trial, mandate, shot, virus, vaccinate, disease, health, dose, clinical | Side Effects |
| Topic 2 | immunity, vaccinate, way, natural, infection, study, mrna, effective, delta, big | Existence of Alternatives |
| Topic 3 | approve, fda, effect, long, drug, experimental, placebo, treatment, term, passport | Hiding Relevant Information |
| Topic 4 | people, vaccinate, virus, get, vaccinated, die, variant, work, mask, fully | Mistrust |
| Topic 5 | test, know, vaccinate, come, pandemic, stop, corona, pcr, positive, die | Scam |
| Topic 6 | death, like, force, cause, require, life, read, virus, say, rate | Side Effects |
Figure 11. The LDA topics and salient terms.