Association of the COVID-19 pandemic with Internet Search Volumes: A Google Trends™ Analysis.

Maria Effenberger, Andreas Kronbichler, Jae Il Shin, Gert Mayer, Herbert Tilg, Paul Perco.

Abstract

OBJECTIVES: To assess the association of public interest in coronavirus infections with the actual number of infected cases for selected countries across the globe.
METHODS: We performed a Google Trends™ search for "Coronavirus" and compared Relative Search Volume (RSV) indices with the number of COVID-19 cases reported by the European Centre for Disease Prevention and Control (ECDC) using time-lag correlation analysis.
RESULTS: Worldwide public interest in coronavirus reached its first peak at the end of January, when the number of newly infected patients started to increase exponentially in China. The worldwide Google Trends™ index reached its peak on the 12th of March 2020, at a time when the number of infected patients started to increase in Europe and COVID-19 was declared a pandemic. By then, general interest in China and in the Republic of Korea had already decreased significantly compared with the end of January. Correlations between RSV indices and the number of new COVID-19 cases were observed across all investigated countries, with the highest correlations observed at a time lag of -11.5 days, i.e. the highest interest in coronavirus was observed 11.5 days before the peak of newly infected cases. This pattern was very consistent across European countries and also holds true for the US. In Brazil and Australia, the highest correlations were observed at a time lag of -7 days. In Egypt, the highest correlation was found at a time lag of 0, potentially indicating that in this country the number of newly infected patients will increase exponentially in the course of April.
CONCLUSIONS: Public interest indicated by RSV indices can help to monitor the progression of an outbreak such as the current COVID-19 pandemic. Public interest is on average highest 11.5 days before the peak of newly infected cases.
Copyright © 2020 The Author(s). Published by Elsevier Ltd. All rights reserved.

Keywords:  COVID-19; Coronavirus; Google Trends; Public awareness; SARS-CoV-2

Year:  2020        PMID: 32305520      PMCID: PMC7162745          DOI: 10.1016/j.ijid.2020.04.033

Source DB:  PubMed          Journal:  Int J Infect Dis        ISSN: 1201-9712            Impact factor:   3.623


Introduction

A novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), causes a new disease named coronavirus disease 2019 (COVID-19). It was first detected in December 2019 in Wuhan (Hubei, China) (Wang et al., 2020). Owing to high virulence and a high proportion of asymptomatic cases, the outbreak has spread all over the world. On April 5th 2020 the World Health Organization (WHO) reported 1 133 758 confirmed cases and 62 784 deaths, corresponding to a cumulative mortality rate of 5.5%. The internet is increasingly used as a source of health care information. Infodemiology and infoveillance are essential public health informatics methods used to analyze search behavior on the internet. Infodemiology is defined as the “science of distribution and determinants of information in an electronic medium, specifically the internet, or in a population, with the ultimate aim to inform public health and public policy”, while the primary aim of infoveillance is surveillance (Eysenbach, 2009). Infodemiology and infoveillance of epidemiological data are important to increase situational awareness and to guide suitable interventions (Rivers et al., 2019). The analysis of relative internet search volumes (RSV) gives information on the extent of public attention (Arora et al., 2019, Kaleem et al., 2019, Ling and Lee, 2016), with Google Trends™ being one of the most widely used tools for this purpose. RSV are used for real-time analyses of the transmissibility, severity, and natural history of an emerging pathogen, as observed with severe acute respiratory syndrome (SARS), the 2009 influenza pandemic, and Ebola (Chowell et al., 2009, Cleaton et al., 2016). Analyses of confirmed cases are particularly useful to infer key epidemiological parameters, such as the incubation and infectious periods, and to assess ongoing outbreaks or the probability of an outbreak. In addition, Google Trends™ data might be used to forecast an increase in infected cases.
For dengue, Google Trends™ data showed a linear time series pattern with official case reports, indicating a potential use to monitor public interest before an increase in cases and during an outbreak (Husnayain et al., 2019). Besides infectious diseases, Google Trends™ has been used successfully to forecast increases in suicide risk (Barros et al., 2019). In this study, we investigated public interest in COVID-19 since December 31st 2019 by comparing Google Trends™ data with data on newly infected COVID-19 cases.

Methods

Retrieving outbreak and confirmed case numbers from the ECDC

Data on confirmed COVID-19 cases were retrieved on the 5th of April from the European Centre for Disease Prevention and Control (ECDC) for the period from the 31st of December 2019 until the 1st of April 2020 (https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide). Worldwide data were retrieved, as well as data for the following countries: China, Republic of Korea, Japan, Iran, Italy, Austria, Germany, the United Kingdom (UK), the United States (US), Egypt, Australia, and Brazil.

Retrieving Google Trends™ data on COVID-19

The Google Trends™ tool was used to retrieve data on internet user search activities in the context of COVID-19. Google Trends™ enables researchers to study trends and patterns of Google™ search queries (Arora et al., 2019). It was implemented in 2004 and data on internet search queries are available on a daily basis. Google Trends™ expresses the absolute number of searches relative to the total number of searches over the defined period of interest (Arora et al., 2019). The retrieved Google Trends™ index ranges from 0 to 100, with 100 being the highest relative search term activity for the specified search query in the time period of interest. Further information on Google Trends™ can be found on the respective help page (https://support.google.com/trends/). Nation-specific Google Trends™ indices were retrieved from December 31st 2019 to April 1st 2020 using the search term “Coronavirus (Virus)”. We retrieved Google Trends™ indices for China, Republic of Korea, Japan, Iran, Italy, Austria, Germany, the UK, the US, Egypt, Australia, and Brazil. The search was performed in English.
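The 0 to 100 scaling described above can be illustrated with a minimal Python sketch. It assumes that the index is the daily share of searches matching the term, rescaled so that the busiest day equals 100. The function name `relative_search_volume` and the counts are hypothetical, and Google's actual procedure (sampling, rounding, privacy thresholds) may differ.

```python
def relative_search_volume(term_counts, total_counts):
    """Scale daily search shares so that the busiest day equals 100.

    Assumption: the index is (term searches / all searches) per day,
    divided by the maximum share in the period and multiplied by 100.
    """
    # Share of all searches that matched the term on each day.
    shares = [
        (term / total) if total else 0.0
        for term, total in zip(term_counts, total_counts)
    ]
    peak = max(shares, default=0.0)
    if peak == 0:
        return [0 for _ in shares]
    return [round(100 * share / peak) for share in shares]


# Hypothetical counts for three days (not real Google data).
example_index = relative_search_volume([10, 50, 25], [1000, 1000, 1000])
print(example_index)  # [20, 100, 50]
```

Because every value is expressed relative to the peak within the chosen period and region, indices from different countries describe relative interest and are not directly comparable as absolute search counts.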

Analysis

Daily Google Trends™ indices were used for generation of the line plots with the ggplot2 package of the statistical software R (version 3.4.1). The ccf function of the tseries R package was used to calculate time-lag correlations between Google Trends™ indices and new cases. A time lag between −35 and +35 days was used in the analysis with the Pearson correlation coefficient serving as correlation measure.
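The time-lag correlation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the R code used in the study: the helper names (`pearson`, `time_lag_correlations`, `best_lag`) are hypothetical, the Pearson coefficient is computed on the overlapping part of the two series at each lag (R's ccf estimator normalizes slightly differently), and the sign convention is chosen so that a negative lag means search interest precedes case numbers, matching how the lags are read in this paper. The demo data are synthetic.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def time_lag_correlations(rsv, cases, max_lag=35):
    """Correlate rsv[t + lag] with cases[t] for each lag in [-max_lag, max_lag].

    A negative lag pairs earlier search values with later case counts.
    """
    if len(rsv) != len(cases):
        raise ValueError("series must have the same length")
    n = len(rsv)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag < 0:
            x, y = rsv[: n + lag], cases[-lag:]
        else:
            x, y = rsv[lag:], cases[: n - lag]
        if len(x) >= 3:
            r = pearson(x, y)
            if not math.isnan(r):
                result[lag] = r
    return result


def best_lag(correlations):
    """Return the lag with the highest correlation."""
    return max(correlations, key=correlations.get)


# Synthetic demo: 93 days, search interest peaks 10 days before cases.
days = range(93)
cases = [math.exp(-((d - 50) ** 2) / 50) for d in days]
rsv = [100 * math.exp(-((d - 40) ** 2) / 50) for d in days]
correlations = time_lag_correlations(rsv, cases)
print(best_lag(correlations))  # -10
```

In the synthetic example the search curve is an exact copy of the case curve shifted 10 days earlier, so the correlation is maximal at a lag of −10 days. Real series are noisier, and the lag with the highest coefficient should be read together with the full correlation profile.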

Results

Worldwide interest in coronavirus started on January 20th and reached its first peak on January 31st, a few days after news of the outbreak in Wuhan, China, had spread. The increasing number of cases around the globe prompted the WHO to declare the coronavirus outbreak a pandemic on March 11th, leading to an increase in public interest that currently peaks on March 12th 2020 (Figure 1). The data on newly confirmed cases, overall confirmed cases, and overall deaths worldwide as well as for the aforementioned countries under study are summarized in Table 1. Newly confirmed cases show two peaks: a sharp increase in China when cases were counted on the basis of clinical diagnosis rather than a confirmatory laboratory test, and a second peak on March 16th driven by cases around the globe.
Figure 1

Worldwide Relative Search Volume (RSV) for “Coronavirus” and newly confirmed COVID-19 cases.

Table 1

Overall cases (O), newly diagnosed cases per day (N), and deaths (D) are shown at four-day intervals to give a comprehensive view. Notably, in some countries (e.g. Italy) case numbers remained steady before a rapid increase was observed.

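| Country | 20-Jan-20 (O/N/D) | 24-Jan-20 (O/N/D) | 28-Jan-20 (O/N/D) | 01-Feb-20 (O/N/D) | 05-Feb-20 (O/N/D) | 09-Feb-20 (O/N/D) | 13-Feb-20 (O/N/D) | 17-Feb-20 (O/N/D) | 21-Feb-20 (O/N/D) |
|---|---|---|---|---|---|---|---|---|---|
| China | 280/74/3 | 830/265/25 | 4537/1796/106 | 11,821/2102/259 | 24,363/3893/491 | 37,251/2657/812 | 46,550/1820/1368 | 70,635/2051/1772 | 75,569/894/2239 |
| Republic of Korea | 1/0/0 | 2/1/0 | 4/0/0 | 12/1/0 | 18/2/0 | 27/3/0 | 28/0/0 | 30/1/0 | 204/100/1 |
| Japan | 1/0/0 | 1/0/0 | 6/1/0 | 17/3/0 | 33/13/0 | 26/1/0 | 29/0/0 | 59/6/1 | 93/8/1 |
| Iran | | | | | | | | | 5/3/2 |
| Italy | | | | 2/0/0 | 2/0/0 | 3/0/0 | 3/0/0 | 3/0/0 | 3/0/0 |
| Austria | | | | | | | | | |
| Germany | | | 1/0/0 | 7/2/0 | 12/0/0 | 14/0/0 | 16/0/0 | 16/0/0 | 16/0/0 |
| UK | | | | | 2/0/0 | 3/0/0 | 9/1/0 | 9/0/0 | 9/0/0 |
| US | | 1/1/0 | 5/0/0 | 7/1/0 | 11/0/0 | 12/0/0 | 14/1/0 | 15/0/0 | 15/0/0 |
| Egypt | | | | | | | | 1/0/0 | 1/0/0 |
| Australia | | | 5/1/0 | 12/3/0 | 13/1/0 | 15/0/0 | 15/0/0 | 15/0/0 | 17/2/0 |
| Brazil | | | | | | | | | |
| Worldwide | 282/74/3 | 846/265/25 | 4593/1795/106 | 11,953/2128/259 | 24,554/3925/492 | 37,558/2676/813 | 46,997/1826/1369 | 71,429/2162/1775 | 76,769/1021/2247 |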
The worldwide initial peak is associated with a strong increase in confirmed cases in China. In China, a maximum of Google Trends™ RSV was observed at the end of January, with a 5.47-fold increase in cases between January 24th and January 28th. Afterwards, with rigorous measures in place, the relative increase in new cases slowed, and a decrease in new cases was first reported to the WHO on February 6th, with the exception of the sharp increase mentioned above. The RSV trend followed a similar path, with a steady number of search enquiries at around 25% of maximal interest during the last weeks (Figure 2). Correlation analysis indicates that public interest in COVID-19 was highest on average around 11.5 days before the maximum of newly infected cases was reported (Figure 3). In countries close to China, such as the Republic of Korea or Japan, a high volume of search queries was observed during or shortly after the peak in China. A considerably smaller peak was observed in countries of the European Union and in the US (Figure 2).
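The 5.47-fold increase reported for China can be reproduced from the overall case counts in Table 1 (830 on January 24th and 4537 on January 28th). The short Python sketch below shows the arithmetic; the function name `fold_increase` is a hypothetical helper introduced only for illustration.

```python
def fold_increase(earlier, later):
    """Ratio of a later cumulative count to an earlier one."""
    if earlier <= 0:
        raise ValueError("earlier count must be positive")
    return later / earlier


# Overall confirmed cases in China, taken from Table 1.
china_overall_24_jan = 830
china_overall_28_jan = 4537

china_fold = fold_increase(china_overall_24_jan, china_overall_28_jan)
print(round(china_fold, 2))  # 5.47
```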
Figure 2

Relative Search Volume (RSV) for “Coronavirus” and newly confirmed COVID-19 cases for selected countries under study.

Figure 3

Time-lag correlations of Relative Search Volume (RSV) for “Coronavirus” and newly confirmed COVID-19 cases for selected countries under study.

In the Republic of Korea, a first Google Trends™ index peak was observed at the end of January, only slightly shifted compared with the peak in China, with a second peak observed on February 23rd (Figure 2). This second peak in Korea preceded the peak in newly infected cases by 7 days (Figure 3). Japan's RSV started to increase on February 24th, with a peak on February 27th, also followed by an increase in confirmed COVID-19 cases. In Iran, the most affected country in the Middle East, a strong increase in RSV was observed on February 18th, with a peak between the 20th and 22nd of February. The increase in RSV in Iran occurred five days before the first confirmed cases in the country and was strongly associated with, and predictive of, the outbreak that followed five to seven days later. Egypt, the first country on the African continent with a confirmed COVID-19 case, showed a small RSV peak during the outbreak in China. Its RSV has increased steadily since February 20th, with a leap in interest observed on April 1st. Australia showed a similar pattern, with an increase in RSV during the first outbreak in China, a subsequent decrease, and a renewed increase from February 23rd, followed by increasing numbers of new COVID-19 cases 10 days later (Figure 2). In European countries, especially in Italy, a small peak in the Google Trends™ analysis was found during the outbreak in China, and a climax was reached on February 23rd 2020, a few days before the number of new COVID-19 cases started to increase exponentially. Similar trends were observed in Austria, Germany and the UK, with a delay of several days and a second peak that was accompanied by an increase in case numbers in the following days.
The highest RSV peak was reached in mid-March, in line with rigorous government policies addressing the rapid spread. The UK and Australia show very similar patterns, with the highest correlations between RSV indices and newly diagnosed cases found at time lags of −12 and −7 days, respectively (Figure 3). In the US, a steady increase in Google™ search queries was observed from February 27th, followed by an outbreak from March 2nd. The peak of search queries was reached on March 12th, while the number of newly infected patients is still increasing (Figure 2). The high correlation at a time lag of −12 days might indicate that the curve of newly infected cases could flatten out soon if the measures taken by the government are comparable to those in European countries (Figure 3). Google Trends™ analysis in Brazil showed one peak during the COVID-19 outbreak in China and another on February 24th, when the first case of COVID-19 was confirmed in Brazil. Since March 3rd a renewed increase in RSV has been found in Brazil, followed by increasing numbers of newly confirmed COVID-19 cases (Figure 2).

Discussion

In our study, we found a significant increase in RSV for COVID-19 worldwide using Google Trends™, with a peak in RSV around 11.5 days before the peak in newly diagnosed cases in different countries all over the world. As such, Google Trends™ can be used to track and predict outbreaks worldwide and provides a valuable picture of the COVID-19 outbreak in real time. Close monitoring and continued evolution of enhanced communication strategies are needed to provide general populations and the vulnerable populations most at risk with actionable information for self-protection, including identification of symptoms (Heymann et al., 2020). The application of internet data in health care research, also known as infodemiology, is a promising new field that may complement and extend current data sources and foundations (Mavragani and Ochoa, 2019). Attention to COVID-19 increased days to weeks before the actual peak of the outbreak, not only worldwide but also in most of the countries investigated in this study. This strongly supports our finding that RSV is a useful tool to monitor local and global outbreaks of infectious diseases. The internet is the biggest platform for search engines and social media providing real-time data on outbreaks. RSV has been used before to detect outbreaks, such as the severe influenza outbreak in 2009 (Cook et al., 2011). Most countries and the WHO provide awareness-raising and educational programs on COVID-19 via the internet. The strong association between RSV and increasing outbreak numbers may be due to the implementation of such programs in the different countries. The impact of web-based research has grown continuously over the past decade (Jun et al., 2018).
Analyzing Google Trends™ data makes use of millions of user searches and has been widely used in the context of health issues. Analyses of public attention in different fields have been published recently (e.g. osteoarthritis, breast cancer or COPD) (Boehm et al., 2019, Jellison et al., 2018, Kaleem et al., 2019). Furthermore, infodemiology and Google Trends™ are used to generate awareness profiles and are a suitable substitute for classical data collection, such as surveys (Jun et al., 2018). Most commonly, Google Trends™ is used to monitor disease control and awareness in cancer, HIV or stroke, but also in rare diseases such as antiphospholipid syndrome or systemic lupus erythematosus (Ling and Lee, 2016, Mahroum et al., 2019, Sciascia and Radin, 2017, Sciascia et al., 2018). Google Trends™ can be used to assess the success of awareness programs and to predict infectious outbreaks worldwide (McLean et al., 2019, Patel et al., 2020). This study has limitations. There is no information about the individual searches for the analyzed topics. The selection of spellings and terms might affect the results and conclusions. The importance of accuracy in defining the search queries is exemplified by searching Google Trends™ for the topic “pneumonia”. Pneumonia is associated with COVID-19 but is not specific to it. Thus, using the query “pneumonia” may be useful to analyze symptom-related curiosity, but it does not sufficiently represent COVID-19 outbreaks. The number of studies based on Google Trends™ is increasing, but so far there is no standardized procedure for data collection. More guidance from Google™ is warranted to help researchers establish an optimal search strategy (Nuti et al., 2014). Although Google search is accessible worldwide, data from other search tools that are used in certain countries, for example Baidu in China, might lead to more accurate estimates of public interest.
It was shown, for example, that a high Baidu Search Index (BSI) predicted dengue fever outbreaks in Guangzhou and, to a lesser degree, in Zhongshan, indicating that BSI might complement traditional dengue fever surveillance in China (Liu et al., 2016). In our study, however, we decided to use data from one common framework. In conclusion, infodemiology and RSV can help to monitor the progression of an outbreak such as the current COVID-19 pandemic.

Authors’ contributions

Literature search: ME, AK, JIS, GM, HT, PP; Figures: PP; Study design: ME, AK, JIS, GM, HT, PP; Data collection: ME, AK, PP; Data analysis: PP; Data interpretation: ME, AK, JIS, GM, HT, PP; Writing: ME, AK, JIS, GM, HT, PP.

Conflict of interest

The authors declare no conflicts of interest.

Funding source

None declared.

Ethical approval

Not applicable.
References

1.  Severe respiratory disease concurrent with the circulation of H1N1 influenza.

Authors:  Gerardo Chowell; Stefano M Bertozzi; M Arantxa Colchero; Hugo Lopez-Gatell; Celia Alpuche-Aranda; Mauricio Hernandez; Mark A Miller
Journal:  N Engl J Med       Date:  2009-06-29       Impact factor: 91.245

2.  Internet search query analysis can be used to demonstrate the rapidly increasing public awareness of palliative care in the USA.

Authors:  Sarah McLean; Paul Lennon; Paul Glare
Journal:  BMJ Support Palliat Care       Date:  2017-01-27       Impact factor: 3.568

3.  Google Trends: Opportunities and limitations in health and health policy research.

Authors:  Vishal S Arora; Martin McKee; David Stuckler
Journal:  Health Policy       Date:  2019-01-11       Impact factor: 2.980

4.  Infodemiology and infoveillance: framework for an emerging set of public health informatics methods to analyze search, communication and publication behavior on the Internet.

Authors:  Gunther Eysenbach
Journal:  J Med Internet Res       Date:  2009-03-27       Impact factor: 5.428

5.  Using Baidu Search Index to Predict Dengue Outbreak in China.

Authors:  Kangkang Liu; Tao Wang; Zhicong Yang; Xiaodong Huang; Gabriel J Milinovich; Yi Lu; Qinlong Jing; Yao Xia; Zhengyang Zhao; Yang Yang; Shilu Tong; Wenbiao Hu; Jiahai Lu
Journal:  Sci Rep       Date:  2016-12-01       Impact factor: 4.379

6.  Google Trends in Infodemiology and Infoveillance: Methodology Framework.

Authors:  Amaryllis Mavragani; Gabriela Ochoa
Journal:  JMIR Public Health Surveill       Date:  2019-05-29

7.  Using "outbreak science" to strengthen the use of models during epidemics.

Authors:  Caitlin Rivers; Jean-Paul Chretien; Steven Riley; Julie A Pavlin; Alexandra Woodward; David Brett-Major; Irina Maljkovic Berry; Lindsay Morton; Richard G Jarman; Matthew Biggerstaff; Michael A Johansson; Nicholas G Reich; Diane Meyer; Michael R Snyder; Simon Pollett
Journal:  Nat Commun       Date:  2019-07-15       Impact factor: 14.919

8.  Google Search Trends in Oncology and the Impact of Celebrity Cancer Awareness.

Authors:  Tasneem Kaleem; Timothy D Malouff; William C Stross; Mark R Waddle; Daniel H Miller; Audrey L Seymour; Nicholas G Zaorsky; Robert C Miller; Daniel M Trifiletti; Laura Vallow
Journal:  Cureus       Date:  2019-08-10

9.  A novel coronavirus outbreak of global health concern.

Authors:  Chen Wang; Peter W Horby; Frederick G Hayden; George F Gao
Journal:  Lancet       Date:  2020-01-24       Impact factor: 79.321

10.  COVID-19: what is next for public health?

Authors:  David L Heymann; Nahoko Shindo
Journal:  Lancet       Date:  2020-02-13       Impact factor: 79.321
