Konrad Krawczyk1, Tadeusz Chelkowski2, Daniel J Laydon3, Swapnil Mishra3, Denise Xifara4, Benjamin Gibert4, Seth Flaxman5, Thomas Mellan3, Veit Schwämmle6, Richard Röttger1, Johannes T Hadsund1, Samir Bhatt3,7.
Abstract
BACKGROUND: Before the advent of an effective vaccine, nonpharmaceutical interventions, such as mask-wearing, social distancing, and lockdowns, have been the primary measures to combat the COVID-19 pandemic. Such measures are highly effective when there is high population-wide adherence, which requires information on current risks posed by the pandemic alongside a clear exposition of the rules and guidelines in place.
Keywords: COVID-19; infoveillance; public health; sentiment analysis; text mining
Year: 2021 PMID: 33900934 PMCID: PMC8174556 DOI: 10.2196/28253
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Number of online news sources and collected articles per country.
| Country | Online news sources (N=172), n (%) | Collected articles (N=26,077,939), n (%) |
| Canada | 13 (7.5) | 1,269,200 (4.8) |
| Australia | 8 (4.6) | 1,124,859 (4.3) |
| Italy | 13 (7.5) | 1,526,521 (5.8) |
| The United Kingdom | 21 (12.2) | 4,977,792 (19.0) |
| The United States | 33 (19.1) | 4,388,383 (16.8) |
| France | 9 (5.2) | 1,951,608 (7.4) |
| Germany | 18 (10.4) | 2,348,403 (9.0) |
| Ireland | 8 (4.6) | 905,598 (3.4) |
| International | 6 (3.4) | 462,989 (1.7) |
| New Zealand | 5 (2.9) | 651,050 (2.4) |
| Russia | 19 (11.0) | 3,348,825 (12.8) |
| Spain | 19 (11.0) | 3,122,711 (11.9) |
Keywords employed for topic detection.
| Topic | English | German | French | Spanish | Italian | Russian |
| COVID-19 | coronavirus | coronavirus | coronavirus | coronavirus | coronavirus | коронавирус |
| | covid | covid | covid | covid | covid | covid |
| | lockdown | lockdown | lockdown | lockdown | lockdown | lockdown |
| | quarantine | quarantäne | quarantaine | cuarantena | quarantena | карантин |
| | pandemic | pandemie | pandémie | pandemia | pandemia | пандемиа |
| | N/A [a] | corona- | N/A | N/A | N/A | N/A |
| Merkel | merkel | merkel | merkel | merkel | merkel | merkel |
| Trump | trump | trump | trump | trump | trump | trump |
| Biden | biden | biden | biden | biden | biden | biden |
| Johnson | boris johnson | boris johnson | boris johnson | boris johnson | boris johnson | boris johnson |
| Putin | putin | putin | putin | putin | putin | putin |
| Climate | global warming | — [b] | — | — | — | — |
| | climate change | — | — | — | — | — |
| | climate crisis | — | — | — | — | — |
| Cat | cat | — | — | — | — | — |
| | kitten | — | — | — | — | — |
| Sport | baseball | — | — | — | — | — |
| | major league | — | — | — | — | — |
| | champion's league | — | — | — | — | — |
| | football | — | — | — | — | — |
| | nfl | — | — | — | — | — |
| | premier league | — | — | — | — | — |
| | basketball | — | — | — | — | — |
| | soccer | — | — | — | — | — |
| | nba | — | — | — | — | — |
| Cancer | cancer | — | — | — | — | — |
[a] N/A: not applicable; this keyword, which is specific to the German language because of its compound nature, was only found in German news sources.
[b] The topics climate, cat, sport, and cancer were not identified in non–English-language online news sources, as these were solely employed for sentiment analysis.
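A minimal sketch of how keyword-based topic detection of this kind could work, using a few English entries from the keyword table. The function name `detect_topics`, the dictionary layout, and the whole-word matching rule are assumptions for illustration; the source does not say how word boundaries were handled.

```python
import re

# A few English keywords from the table above (illustrative subset only).
TOPIC_KEYWORDS = {
    "COVID-19": ["coronavirus", "covid", "lockdown", "quarantine", "pandemic"],
    "Johnson": ["boris johnson"],
    "Climate": ["global warming", "climate change", "climate crisis"],
    "Cat": ["cat", "kitten"],
}


def detect_topics(text, topic_keywords=TOPIC_KEYWORDS):
    """Return the sorted topics that have at least one keyword in the text."""
    lowered = text.lower()
    found = set()
    for topic, keywords in topic_keywords.items():
        for keyword in keywords:
            # Assumption: match whole words only, so "cat" does not
            # match inside "category".
            if re.search(r"\b" + re.escape(keyword) + r"\b", lowered):
                found.add(topic)
                break
    return sorted(found)


print(detect_topics("Boris Johnson announces new COVID-19 lockdown"))
```

A prefix keyword such as the German "corona-" would need a looser rule than whole-word matching, which is one reason the boundary handling is left as an assumption here.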
COVID-19 news subtopics.
| Subtopic [a] | Stemmed keywords |
| Case | case |
| Crisis | crisi |
| Death | die, death |
| Disease | diseas |
| Distancing | distanc |
| Fear | fear |
| Health | health |
| Home | home |
| Hospital | hospit |
| Infection | infect |
| Isolation | isol |
| Lockdown | lockdown |
| Mask | mask |
| Outbreak | outbreak |
| Quarantine | quarantin |
| Spread | spread |
| Symptom | symptom |
| Test | test |
| Treatment | treatment |
| Vaccine | vaccin |
[a] Each subtopic was identified by matching its stemmed keywords.
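As a rough illustration of stem-based subtopic matching, the sketch below assumes that article text has already been tokenized and stemmed by some stemmer (the source does not name one). The name `subtopics_for` and the input format are hypothetical; only the stems themselves come from the table.

```python
# Stemmed keywords for a few subtopics, copied from the table above.
SUBTOPIC_STEMS = {
    "Death": {"die", "death"},
    "Hospital": {"hospit"},
    "Test": {"test"},
    "Vaccine": {"vaccin"},
}


def subtopics_for(stemmed_tokens, subtopic_stems=SUBTOPIC_STEMS):
    """Return the sorted subtopics whose stems appear among the tokens.

    stemmed_tokens: tokens of one article after stemming (assumed to be
    produced upstream; the stemmer itself is not part of this sketch).
    """
    tokens = set(stemmed_tokens)
    return sorted(
        name for name, stems in subtopic_stems.items() if stems & tokens
    )


print(subtopics_for(["covid", "vaccin", "test", "begin"]))
```

Matching on stems rather than surface forms lets one keyword such as "vaccin" cover "vaccine", "vaccines", and "vaccination" without listing each variant.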
Figure 1. The extent of coronavirus coverage in 2020. We calculated the number of COVID-19 articles as a proportion of all front-page articles. Proportions were calculated for each online news source separately and then aggregated at the national level. The green points represent the individual coverage of each online news source. The yellow line in each box represents the median; the upper and lower whiskers represent the 75th and 25th percentiles, respectively. The red dotted line indicates the mean proportion across all online news sources.
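The per-source proportion and national aggregation described in this caption could be computed along the following lines. This is a sketch under stated assumptions: the function name, the tuple layout, and the example counts are all hypothetical and are not data from the study.

```python
from collections import defaultdict
from statistics import mean, median


def coverage_by_country(sources):
    """Aggregate per-source COVID-19 coverage at the national level.

    sources: iterable of (country, covid_articles, front_page_articles),
    one tuple per online news source.
    """
    per_country = defaultdict(list)
    for country, covid, total in sources:
        if total > 0:  # skip sources with no collected articles
            per_country[country].append(covid / total)
    return {
        country: {
            "median": median(props),
            "mean": mean(props),
            "n_sources": len(props),
        }
        for country, props in per_country.items()
    }


# Made-up counts, for illustration only.
example = [("A", 25, 100), ("A", 50, 100), ("A", 75, 100), ("B", 10, 40)]
print(coverage_by_country(example))
```

Computing the proportion per source first, and only then aggregating, keeps a single high-volume source from dominating its country's figure.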
English-language online news sources [a], with positive (≥0) or negative (<0) relative sentiment skew of 2020 articles on a given topic.
| Topic | Positive online news sources, n (%) | Negative online news sources, n (%) | Relative sentiment skew, mean (SD) | Total articles [b], n |
| Cat (n=87) | 64 (74) | 23 (26) | 0.12 (0.23) | 2746 |
| Sport (N=91) | 84 (92) | 7 (8) | 0.12 (0.08) | 63,155 |
| Biden (n=90) | 75 (83) | 15 (17) | 0.09 (0.11) | 38,949 |
| Johnson (n=90) | 57 (63) | 33 (37) | 0.04 (0.17) | 22,613 |
| Merkel (n=79) | 38 (48) | 41 (52) | –0.01 (0.25) | 2011 |
| COVID-19 (N=91) | 17 (19) | 74 (81) | –0.04 (0.07) | 589,701 |
| Climate (N=91) | 38 (42) | 53 (58) | –0.04 (0.11) | 7195 |
| Putin (n=88) | 33 (38) | 55 (63) | –0.05 (0.23) | 5179 |
| Trump (N=91) | 24 (26) | 67 (74) | –0.06 (0.09) | 157,702 |
| Cancer (N=91) | 0 (0) | 91 (100) | –0.53 (0.12) | 9548 |
[a] We had 91 English-language online news sources in total; however, in cases where it was impossible to identify a certain topic in a given source, it was left out.
[b] The total number of articles we identified as being associated with a given topic across all online news sources.
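The columns of this table can be derived from one relative sentiment skew value per online news source. The sketch below is illustrative: `summarize_skew` is a hypothetical name, the half-up rounding is inferred from the Putin row (33 and 55 of 88 sources shown as 38% and 63%), and the use of the sample standard deviation is an assumption, since the source does not state which SD was reported.

```python
from statistics import mean, stdev


def summarize_skew(skews):
    """Summarize per-source relative sentiment skew values for one topic."""
    if not skews:
        raise ValueError("at least one source is required")
    n = len(skews)
    positive = sum(1 for s in skews if s >= 0)  # skew >= 0 counts as positive
    negative = n - positive                     # skew < 0 counts as negative
    return {
        "positive_n": positive,
        # Half-up rounding (assumption inferred from the table).
        "positive_pct": int(100 * positive / n + 0.5),
        "negative_n": negative,
        "negative_pct": int(100 * negative / n + 0.5),
        "mean": mean(skews),
        # Sample SD is an assumption; population SD is equally plausible.
        "sd": stdev(skews) if n > 1 else 0.0,
    }


# Synthetic values with the same sign split as the Putin row (33 vs 55).
print(summarize_skew([0.1] * 33 + [-0.1] * 55))
```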
Figure 2. Relative sentiment skew (rsskew) of COVID-19 coverage. Each article title and description from each English-language online news source (ONS) received a Vader sentiment compound score between –1 and 1 (most negative and most positive, respectively). For each online news source, we computed the difference between the mean sentiment for a specific topic and the mean sentiment for all other 2020 articles in that source (rsskew; see Methods section). The density of the relative sentiment skew is plotted for each topic. Distributions are colored green if their relative sentiment skew was predominantly positive or red if predominantly negative (Table 4). Intensity of the color is scaled by the distance from the red dotted line at 0, which indicates a lack of difference between topic sentiment and all other articles in a given online news source.
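The relative sentiment skew described in this caption is a difference of two means within one online news source. A minimal sketch, assuming each article already carries a compound score and a set of detected topics (the function name and input layout are hypothetical, and the scores in the example are made up):

```python
from statistics import mean


def relative_sentiment_skew(articles, topic):
    """Mean sentiment of on-topic articles minus mean of all other articles.

    articles: list of (topics, compound_score) pairs from ONE news source,
    where topics is a set of topic labels and compound_score lies in [-1, 1].
    Returns None when either group is empty, mirroring the idea that a
    source is left out if the topic cannot be identified in it.
    """
    on_topic = [score for topics, score in articles if topic in topics]
    others = [score for topics, score in articles if topic not in topics]
    if not on_topic or not others:
        return None
    return mean(on_topic) - mean(others)


# Made-up scores, for illustration only.
source_articles = [
    ({"COVID-19"}, -0.5),
    ({"COVID-19"}, -0.25),
    ({"Sport"}, 0.5),
    (set(), 0.0),
]
print(relative_sentiment_skew(source_articles, "COVID-19"))
```

Subtracting the source's own baseline means a generally gloomy or upbeat outlet does not by itself push a topic's skew away from 0.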
Top words and bigrams in English-language countries.
| Rank | Words and bigrams by polarization [a] |
| | Negative | All | Positive |
| Words | | | |
| 1 | coronavirus | coronavirus | coronavirus |
| 2 | covid | covid | covid |
| 3 | pandem | pandem | pandem |
| 4 | new | new | new |
| 5 | peopl | peopl | help [b] |
| 6 | say | case | say |
| 7 | crisi [b] | say | peopl |
| 8 | health | health | test |
| 9 | case | test | health |
| 10 | death [b] | outbreak | case |
| 11 | outbreak | week | us |
| 12 | virus | us | home |
| 13 | test | virus | week |
| 14 | could | could | one |
| 15 | govern | day | time |
| 16 | us | one | day |
| 17 | countri | govern | govern |
| 18 | one | home | could |
| 19 | week | countri | work [b] |
| 20 | fear [b] | time | outbreak |
| Bigrams | | | |
| 1 | coronavirus_pandem | coronavirus_pandem | case_covid |
| 2 | coronavirus_crisi | case_coronavirus | coronavirus_pandem |
| 3 | coronavirus_outbreak | case_covid | posit_test |
| 4 | health_public | coronavirus_spread | health_public |
| 5 | posit_test | distanc_social | coronavirus_outbreak |
| 6 | case_coronavirus | health_public | case_coronavirus |
| 7 | coronavirus_spread | covid_test | covid_pandem |
| 8 | coronavirus_new | coronavirus_outbreak | distanc_social |
| 9 | case_covid | covid_pandem | coronavirus_lockdown [b] |
| 10 | distanc_social | coronavirus_new | coronavirus_spread |
| 11 | covid_crisi [b] | coronavirus_crisi | covid_test |
| 12 | covid_test | case_new [b] | covid_vaccin [b] |
| 13 | covid_pandem | covid_outbreak | home_stay |
| 14 | coronavirus_due [b] | posit_test | coronavirus_test [b] |
| 15 | covid_death [b] | minist_prime | covid_posit [b] |
| 16 | second_wave [b] | home_stay | minist_prime |
| 17 | death_toll [b] | first_time [b] | like_look [b] |
| 18 | coronavirus_death [b] | around_world [b] | covid_outbreak |
| 19 | amid_coronavirus [b] | covid_spread [b] | coronavirus_new |
| 20 | two_week | two_week | coronavirus_vaccin [b] |
[a] For each of the 91 English-language online news sources, we calculated the most common words and bigrams and grouped these by Vader scores: >0.2 for positive, <–0.2 for negative, and any score for all. We averaged the ranks of words and bigrams across all online news sources, and here we present the top 20 for each subdivision. The words in the table are stemmed.
[b] These entries appear in the top 20 of only one subdivision (positive, all, or negative).
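The ranking procedure in the first footnote (rank terms within each source, then average the ranks across sources) might be sketched as follows. Several details are assumptions rather than the authors' method: the names `bigrams` and `top_terms`, averaging a term's rank only over the sources in which it occurs, and the alphabetical ordering inside each bigram, which is inferred from entries such as distanc_social and minist_prime.

```python
from collections import Counter, defaultdict
from statistics import mean


def bigrams(tokens):
    """Adjacent token pairs, alphabetically ordered within each pair."""
    return ["_".join(sorted(pair)) for pair in zip(tokens, tokens[1:])]


def top_terms(sources, keep=lambda score: True, n=20):
    """Rank terms within each source, then average ranks across sources.

    sources: {source_name: [(terms, compound_score), ...]}, where terms
    are stemmed words or bigrams of one article.
    keep: predicate on the compound score selecting the subdivision,
    for example `lambda s: s > 0.2` for positive articles.
    """
    ranks = defaultdict(list)
    for articles in sources.values():
        counts = Counter()
        for terms, score in articles:
            if keep(score):
                counts.update(terms)
        # most_common() sorts by descending count; rank 1 is most frequent.
        for rank, (term, _count) in enumerate(counts.most_common(), start=1):
            ranks[term].append(rank)
    # Simplification: average only over sources where the term occurs.
    averaged = {term: mean(r) for term, r in ranks.items()}
    return sorted(averaged, key=lambda t: (averaged[t], t))[:n]


# Made-up articles, for illustration only.
demo = {
    "s1": [(["covid", "covid", "help"], 0.5), (["covid", "death", "death"], -0.6)],
    "s2": [(["covid", "help", "help"], 0.4), (["death", "death", "covid"], -0.5)],
}
print(top_terms(demo, keep=lambda s: s < -0.2))
```

Averaging ranks rather than raw counts gives each online news source equal weight regardless of how many articles it published.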
Figure 3. COVID-19 subtopic coverage and sentiment means. We calculated the mean coverage and mean sentiment of each subtopic. Coverage is expressed as the mean of ratios of subtopics in a given online news source against all COVID-19 articles in the same online news source. Sentiment is the mean of the subtopic relative sentiment skew for all online news sources. The shaded areas illustrate regions with relative sentiment skew above 0.2 (green), between 0.2 and –0.2 (white), and below –0.2 (red).