Pedro Brum1, Matheus Cândido Teixeira1, Renato Vimieiro1, Eric Araújo2, Wagner Meira1, Gisele Lobo Pappa1.
Abstract
The debate over the COVID-19 pandemic has been constantly trending in online conversations since its beginning in 2019. Discussions on many social media platforms relate not only to health aspects of the disease, but also to public policies and non-pharmacological measures to mitigate the spread of the virus, and to proposed alternative treatments. Divergent opinions regarding these measures are leading to heated discussions and polarization. Particularly in highly politically polarized countries, users tend to be divided into those in favor of or against government policies. In this work we present a computational method to analyze Twitter data and: (i) identify users with a high probability of being bots using only COVID-19-related messages; (ii) quantify the political polarization of the Brazilian general public in the context of the COVID-19 pandemic; (iii) analyze how bots tweet and affect political polarization. We collected over 100 million tweets from 26 April 2020 to 3 January 2021 and observed, in general, a highly polarized population (with the polarization index varying from 0.57 to 0.86), which focused on very different topics of discussion over the most polarized weeks, all of them related to government and health-related events.
Keywords: Bots; Covid-19; Political polarization; Twitter
Year: 2022 PMID: 36187717 PMCID: PMC9510292 DOI: 10.1007/s13278-022-00949-x
Source DB: PubMed Journal: Soc Netw Anal Min
Main statistics of the dataset collected
| Time period | April 2020 to January 2021 |
|---|---|
| Number of tweets | 104,113,713 |
| Number of retweets | 66,099,002 |
| Number of hashtags | 7,289,188 |
| Number of URLs | 18,247,641 |
| Number of mentions | 86,061,269 |
| Number of unique users | 7,146,271 |
Fig. 1 Main statistics of the tweets collected over the 36-week period
Fig. 2 Number of geo-tagged tweets per Brazilian geographical region
Fig. 3 Number of tweets related to COVID-19 (left y axis) over time in Brazil
Features used to flag a user as a potential bot, followed and ordered by the percentage of users in the dataset that present the indicator
| Self-declared bots by screen name (0.01%) | |
| Proportion of symbols per tweet is greater than 0.95 (0.01%) | |
| Description has a url pointing to a Github page (0.03%) | |
| Self-declared bots by name (0.04%) | |
| Screen name contains terms such as “bot”, “robot”, “robo”, “conta-reserva” (Portuguese for backup account) | |
| Description contains expressions like “automatically retweet ...” (0.08%) | |
| Proportion of tweets related to COVID-19 is greater than 0.95 (2.61%) | |
| Has predominantly more followers than follows, i.e. | |
| Number of followers or friends is zero (3.50%) | |
| Proportion of urls per tweet is greater than 0.95 (3.60%) | |
| Proportion of urls per tweet is greater than 0.95 (7.11%) | |
| Age of the account (calculated by | |
| Proportion of mentions per tweet is greater than 0.95 (41.40%) | |
| More than 95% of account tweets are retweets (46.07%) | |
| Uses default profile, i.e. not customized (67.95%) | |
| Has no coordinates or places (89.39%) | |
| Unverified account (92.88%) | |
| Unprotected account (93.03%) |
a 0.046 gives a score of 0.25 if the account age is 30 days
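Most of the indicators in the table above are simple threshold checks on per-account statistics. The following is a minimal Python sketch of how a few of them could be evaluated. The `Account` fields, the `flag_indicators` helper, and the substring match on the screen name are illustrative assumptions, not the authors' implementation. How the indicators are combined into a ranking is not described here, so the sketch only counts how many are raised.

```python
from dataclasses import dataclass

# Terms listed in the table for screen names that suggest automation.
BOT_TERMS = ("bot", "robot", "robo", "conta-reserva")
THRESHOLD = 0.95  # cut-off used by the proportion-based indicators


@dataclass
class Account:
    """Hypothetical per-account summary; the field names are assumptions."""
    screen_name: str
    n_tweets: int
    n_retweets: int
    n_urls: int
    n_mentions: int
    followers: int
    friends: int
    verified: bool


def flag_indicators(acc: Account) -> dict:
    """Return a mapping from indicator name to True/False for one account."""
    n = max(acc.n_tweets, 1)  # avoid division by zero for empty accounts
    name = acc.screen_name.lower()
    return {
        "bot_term_in_screen_name": any(term in name for term in BOT_TERMS),
        "mostly_retweets": acc.n_retweets / n > THRESHOLD,
        "urls_per_tweet_high": acc.n_urls / n > THRESHOLD,
        "mentions_per_tweet_high": acc.n_mentions / n > THRESHOLD,
        "no_followers_or_friends": acc.followers == 0 or acc.friends == 0,
        "unverified": not acc.verified,
    }


# Made-up example account, for illustration only.
example = Account("news_robo_br", n_tweets=200, n_retweets=198, n_urls=40,
                  n_mentions=199, followers=0, friends=12, verified=False)
flags = flag_indicators(example)
print(sum(flags.values()), "of", len(flags), "indicators raised")
```

Indicators that apply to most accounts (for example, being unverified) carry little information on their own, which is presumably why the table reports how common each one is.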
Chi-squared test measuring the difference between ranking users and testing users at random
| | Bots | Genuine | Total |
|---|---|---|---|
| Rank | 2,548 | 21,396 | 23,944 |
| Random | 23 | 1,424 | 1,447 |
| Total | 2,571 | 22,820 | 25,391 |
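The comparison in this table can be reproduced from the four cell counts alone. The sketch below assumes the test named in the caption is Pearson's chi-squared test on the 2×2 table without continuity correction. The source does not confirm either detail, and `chi_squared` is an illustrative helper.

```python
# Observed counts from the table: rows are selection strategies,
# columns are (bots, genuine).
observed = {
    "rank": (2548, 21396),
    "random": (23, 1424),
}


def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table of counts."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, obs in enumerate(row):
            # Expected count under independence of strategy and label.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat


stat = chi_squared(observed)
bot_rate = {name: bots / (bots + genuine)
            for name, (bots, genuine) in observed.items()}

# 3.841 is the 5% critical value of chi-squared with 1 degree of freedom.
print(f"chi2 = {stat:.1f}, significant at 5%: {stat > 3.841}")
print({name: round(rate, 4) for name, rate in bot_rate.items()})
```

With these counts, about 10.6% of the accounts checked after ranking were bots, against about 1.6% of the accounts checked at random.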
Comparison between classifiers used to identify users as bots
| Classifier | F1-score | Precision | Recall | ROC AUC |
|---|---|---|---|---|
| AB | 0.6157 | 0.7367 | 0.5337 | 0.9179 |
| DT | 0.5678 | 0.6131 | 0.5374 | 0.7386 |
| KNN | 0.5917 | 0.7442 | 0.5079 | 0.8346 |
| LR | 0.5551 | 0.7550 | 0.4429 | 0.8877 |
| RF | 0.6536 | 0.8225 | 0.5521 | 0.9331 |
| SVC | 0.5476 | 0.7940 | 0.4208 | 0.8658 |
AB AdaBoost, DT decision tree, KNN k-nearest neighbours, LR linear regression, RF random forest, SVC support vector classification
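As a consistency check on the table, F1 is the harmonic mean of precision and recall, so it can be recomputed from the other two columns. The sketch below does this for the reported values. The recomputed figures differ slightly from the reported F1-scores, which would be expected if each metric was averaged separately over several runs or folds. That explanation is an assumption, since the evaluation protocol is not given here.

```python
# Reported (F1, precision, recall) per classifier, copied from the table.
reported = {
    "AB": (0.6157, 0.7367, 0.5337),
    "DT": (0.5678, 0.6131, 0.5374),
    "KNN": (0.5917, 0.7442, 0.5079),
    "LR": (0.5551, 0.7550, 0.4429),
    "RF": (0.6536, 0.8225, 0.5521),
    "SVC": (0.5476, 0.7940, 0.4208),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for name, (f1, p, r) in reported.items():
    print(f"{name}: reported F1 = {f1:.4f}, from P and R = {f1_score(p, r):.4f}")

# Classifier with the highest reported F1-score.
best = max(reported, key=lambda name: reported[name][0])
print("highest reported F1:", best)
```

For every classifier the gap between the two values stays below 0.02, and the ordering by F1 is led by the random forest in both cases.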
Hashtags related to the public political opinion regarding the government in the tweets dataset
| Anti-government Hashtags | |
| #ForaBolsonaro( | |
| #BolsonaroGenocida ( | |
| #ImpeachmentDoBolsonaroUrgente ( | |
| #StopBolsonaroMundial ( | |
| #BolsonaroAcabou ( | |
| Pro-government Hashtags | |
| #BolsonaroTemRazao( | |
| #BrasilComBolsonaro ( | |
| #DireitaComBolsonaro ( | |
| #EuApoioBolsonaro ( | |
| #BolsonaroReeleito ( |
Fig. 4 Probability density function for Twitter political polarities in Brazil during Weeks 1 (a) and 24 (b). Panel 4b shows the variables for the two populations of opposing opinions (against government and pro-government), the center of gravity of each population, and the distance between the centers of gravity (d)
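The quantities named in this caption (two opposing populations, their centers of gravity, and the distance d between them) are the ingredients of the polarization index proposed by Morales et al. (2015), μ = (1 − |A⁺ − A⁻|) · d. The sketch below assumes that definition, with opinion scores in [−1, 1], negative values taken as anti-government and positive as pro-government. The formula, the sign convention, the exclusion of exactly neutral scores, and the function name are all assumptions and are not confirmed by the text above.

```python
from statistics import mean


def polarization_index(opinions):
    """Polarization index for opinion scores in [-1, 1] (assumed definition).

    mu = (1 - |A_plus - A_minus|) * d, where A_plus and A_minus are the
    fractions of the population on each side and d is half the distance
    between the two centers of gravity.
    """
    positive = [x for x in opinions if x > 0]
    negative = [x for x in opinions if x < 0]
    if not positive or not negative:
        return 0.0  # only one side present: no polarization under this definition
    total = len(positive) + len(negative)
    a_plus = len(positive) / total
    a_minus = len(negative) / total
    gc_plus = mean(positive)    # center of gravity, positive side
    gc_minus = mean(negative)   # center of gravity, negative side
    d = abs(gc_plus - gc_minus) / 2  # normalised by the width of [-1, 1]
    return (1 - abs(a_plus - a_minus)) * d


# Made-up opinion scores, for illustration only.
balanced_extremes = [-0.9, -0.8, -1.0, 0.9, 0.8, 1.0]
one_sided = [-0.9, -0.8, -1.0, -0.7, 0.1]
print(polarization_index(balanced_extremes))
print(polarization_index(one_sided))
```

Under this definition the index approaches 1 only when the two sides are equally sized and sit at opposite extremes. It drops when one side dominates or when the centers of gravity move closer together.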
Network metrics for retweet graphs
| Property | Max | Min | Avg |
|---|---|---|---|
| Vertices | 2,277,658 | 438,512 | 1,060,410.19 |
| Edges | 4,691,621 | 377,321 | 1,640,000.64 |
| # Isolated vertices (%) | 869,826 (43.2) | 174,239 (24.7) | 328,435.42 (32.3) |
| Avg. degree | 4.63 | 1.72 | 2.78 |
| Avg in-degree | 2.32 | 0.86 | 1.39 |
| Largest connected component | 25,533.00 | 61 | 4,777.53 |
| Density | 2.15E-06 | 6.83E-07 | 1.47E-06 |
| # components | 2,256,438 | 438,178 | 1,054,625.28 |
| # maximal cliques | 6,573,259 | 538,957 | 1,985,270 |
| Largest clique | 21 | 6 | 8.08 |
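Several rows of this table are simple functions of the vertex and edge counts. The sketch below computes them for a toy retweet graph, assuming a directed graph without self-loops, density defined as E / (V · (V − 1)), and components taken as weakly connected. The toy data and the `retweet_graph_metrics` helper are hypothetical, and every edge endpoint is assumed to appear in `users`.

```python
from collections import defaultdict

# Toy data (hypothetical): users seen in one week and directed retweet
# edges of the form (retweeter, original author).
users = ["a", "b", "c", "d", "e", "f"]
retweets = [("a", "b"), ("c", "b"), ("d", "b"), ("d", "c")]


def retweet_graph_metrics(users, edges):
    """Basic metrics of a directed retweet graph given as an edge list."""
    n, m = len(users), len(edges)
    degree = defaultdict(int)
    parent = {u: u for u in users}  # union-find forest for weak components

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
        parent[find(src)] = find(dst)  # merge the two components

    component_sizes = defaultdict(int)
    for u in users:
        component_sizes[find(u)] += 1

    isolated = sum(1 for u in users if degree[u] == 0)
    return {
        "vertices": n,
        "edges": m,
        "isolated": isolated,
        "isolated_pct": 100 * isolated / n,
        "avg_in_degree": m / n,        # every edge adds one in-degree
        "avg_degree": 2 * m / n,       # in-degree plus out-degree
        "density": m / (n * (n - 1)),  # directed, no self-loops
        "components": len(component_sizes),
        "largest_component": max(component_sizes.values()),
    }


metrics = retweet_graph_metrics(users, retweets)
for key, value in metrics.items():
    print(key, value)
```

The identity used here, average degree equal to twice the average in-degree, matches the reported values (4.63 and 2.32, 1.72 and 0.86, 2.78 and 1.39).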
Fig. 5 Time evolution of the polarization index and its related variables, including the distance d between the centers of gravity
Fig. 6 Word clouds for genuine users and bots pro- and anti-government in the least polarized week (week 1)
Fig. 7 Word clouds for genuine users and bots pro- and anti-government in the most polarized weeks (weeks 28 and 29)
Words describing the top-5 most relevant topics discussed by users in week 29 (the most polarized week)
| Id | Anti-government | Pro-government |
|---|---|---|
| 1 | Days, people, chloroquine, quarantine, now | Acai, hair, unemployment, shakira, mobile |
| 2 | Cases, Brazil, deaths, raise, pandemics | Quarantine, wave, home, day, second, Brazil, people |
| 3 | Vaccine, health, ministry, efficacy, pfizer, coronavac | WHO, against, pandemics, president, airplane |
| 4 | Hair, green, shakira, unemployment, joao, mobile, acai | Brasil, Argentina, millions, causes, cases, deaths |
| 5 | Treatment, disease, social, precocious, masks, distance | Treatment, precocious, doctor, symptoms, air, feel |
10 most relevant words describing the topics discussed by pro-government users over time
| Id | Terms |
|---|---|
| | Governors, deaths, hospital, corona, Nise, combat, WHO, doctors, government, now |
| | WHO, Lancet, deaths, science, patients, saves, Raoult, symptoms, notice, azithromycin |
| | Quarantine, ramalho, watch, lives, broom, Recife, joao, sorry, peak, weak flu |
| | Uai, mobile, acai, caraio, hair, joao, iphone, shakira, igor, smile |
| | Covid19, positive, result, chloroquine, exam, STF, Jair, test, president, negative |
10 most relevant words describing the topics discussed by anti-government users over time
| Id | Terms |
|---|---|
| | President, latin, Pazuello, hydroxychloroquine, brand, workers, Brazil, vaccine, variant, ozone |
| | Recife, quarantine, cachorra, weed, watch, joao, cadela, computer, weak flu, vassoura |
| | Roberto, followtrick, corno, keyboard, computer, paulo, fuck, iphone, cachorra |
| | Do, video, against, hydroxychloroquine, Bolsonaro, hospital, chloroquine, medicine, use, covid19 |
| | Advertisement, positive, PF, Bolsonaro, covid, chloroquine, now, fake, covid19 |
10 most relevant words describing the topics posted by bots over time
| Id | Terms |
|---|---|
| | quarantine, covid19, left, STF, york, States, total, stayhome, year, anvisa |
| | iphone, bus, cachorra, green, naruto, quarantine, shakira, sorry, sensible, bots |
| | yes, hospital, quarantine, days, torcidas, covid19, Rio, take, world, social |
| | porn, unemployed, spree, quarantine, sorry, finished, pop, followtrick, bts, gay |
| | broom, shit, smile, jaehyo, naruto, eat, all, trump, quarantine, fuck |