Infodemiology of flu: Google trends-based analysis of Italians' digital behavior and a focus on SARS-CoV-2, Italy.

Omar Enzo Santangelo, Sandro Provenzano, Vincenza Gianfredi.

Abstract

INTRODUCTION: The aim of the current study was to assess whether the frequency of internet searches for influenza is aligned with the influenza cases and deaths reported by the Italian National Institute of Health (ISS). We also evaluated the distribution over time of, and the correlation between, the search volume for flu and flu symptoms and the reported new cases of SARS-CoV-2.
MATERIALS AND METHODS: The reported cases and deaths of flu and the reported cases of SARS-CoV-2 were taken from ISS reports, and the data were aggregated by week. The search volume provided by Google Trends (GT) is relative: it is calculated as the percentage of queries related to a specific term for a given place and time frame.
RESULTS: The strongest correlation between GT searches and influenza cases was found at a lag of +1 week, particularly for the period 2015-2019. A strong correlation was also found at a lag of +1 week between influenza deaths and GT searches. For SARS-CoV-2 new cases, the strongest correlation with GT searches was found at a lag of +3 weeks for the term flu.
CONCLUSION: In recent years, health care research has used GT data to explore public interest in various fields of medicine. Caution should be used when interpreting the findings of digital surveillance. ©2021 Pacini Editore SRL, Pisa, Italy.

Keywords:  Big data; Flu; Google trends; Italy; Medical informatics computing; SARS-CoV-2

Year:  2021        PMID: 34909483      PMCID: PMC8639123          DOI: 10.15167/2421-4248/jpmh2021.62.3.1704

Source DB:  PubMed          Journal:  J Prev Med Hyg        ISSN: 1121-2233


Introduction

Influenza (also called flu) is a viral infectious disease of the respiratory tract that carries a high burden of direct and indirect costs, and it therefore remains a public health concern [1]. Influenza viruses are characterized by antigenic drift, which is responsible for the annual variability of the virus genome and is, in turn, the reason why people can get the flu more than once in their life [2]. Another characteristic of flu viruses is seasonality: they are most common during the fall and winter, with peak activity between December and February [2]. During the 2019-2020 season, an elevated number of influenza-like illnesses was detected. This excess of cases is due to a novel coronavirus, SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), which was first identified in Wuhan, in the Chinese province of Hubei, in December 2019 [3]. SARS-CoV-2 is responsible for a disease named COVID-19 (where “CO” means corona, “VI” means virus, “D” means disease and “19” indicates the year in which it occurred), previously known as “2019 novel coronavirus”. Flu and the new influenza-like illness are both respiratory illnesses, but they are caused by different viruses: influenza viruses and the new SARS-CoV-2 [4]. Both infectious diseases spread from person to person via respiratory droplets emitted when people cough, sneeze or talk (close contact increases the risk of transmission), which land in the upper respiratory tract of people nearby [5]. Moreover, the two illnesses have similar symptoms, making the differential diagnosis quite complex.
In most cases, symptoms vary in intensity and include runny or stuffy nose, fever and cough, with more serious complications such as pneumonia, bacterial infections or hospitalization; the new SARS-CoV-2 infection, however, can range from a completely asymptomatic presentation to highly complicated pulmonary and multi-organ failure, showing a more severe manifestation and causing thousands of deaths [6]. The first Italian patient to test positive for SARS-CoV-2 was detected in February in the Lombardy region [7, 8]. Since the 23rd of February 2020, 225,435 total cases and 31,908 deaths have been recorded in Italy [9]. In the weeks that followed, new cases of and deaths from COVID-19 increased exponentially, and the number of affected countries climbed even higher. However, these numbers might be underestimated, since they are collected through classical surveillance systems, which are largely affected by under-diagnosis and under-reporting [10]. A growing body of evidence now focuses on potential novel surveillance systems based on disease-related internet activity traces [11-16], which could monitor the spread of emerging and established infectious diseases quickly and cheaply. Therefore, the aim of the current study was to assess whether the frequency of searches for influenza by the Italian general population, measured with Google Trends, is aligned with the influenza cases and deaths reported by the Italian National Institute of Health (ISS - Istituto Superiore di Sanità). Moreover, we assessed whether there was a correlation between the search volume for flu symptoms and influenza cases and deaths. Lastly, because of the overlap with the spread of the new SARS-CoV-2, we evaluated the distribution over time of, and the correlation between, the Google search volume for flu and flu symptoms and the reported cases of SARS-CoV-2 in Italy.

Materials and methods

A cross-sectional study design was used. The reported cases of flu were selected from October 2015 to April 2020. The reported deaths from flu were selected from October 2016 to April 2019. Every week, from the 42nd week of one year to the 17th week of the following year, the ISS issues a bulletin with the flu cases reported in the previous week [17]. The reported cases of SARS-CoV-2 were selected from 24 February 2020 (9th week of 2020) to the end of the 17th week of 2020 [9]; the data were aggregated by week. Data on internet searches were obtained from Google Trends (GT), a tool based on Google Search, the most widely used internet search engine. GT analyzes the popularity of search topics in Google and uses graphs to compare the search volume of different queries over time and across geographical locations [18]. We used the following Italian search terms in the “Health” category: “Influenza” (“Flu” in English) and “sintomi influenza” (“Symptoms of Flu” in English). Three partly overlapping time frames were extracted: the first from October 12, 2015 to April 28, 2019, named the “2015-2019 period”; the second from October 12, 2015 to April 26, 2020, named the “2015-2020 period”; and the third from October 17, 2016 to April 28, 2019, named the “2016-2019 period”. The data were aggregated by week and downloaded as a file in “.CSV” format. GT produces a relative search volume (RSV), computed as the percentage of queries concerning a particular term for a specific location and time period and scaled to the week with the highest search proportion, so that 100 is the maximum value and 0 is the minimum value. Because RSV is a relative index, its values change with the selected period. RSV thus allows search volume to be compared directly across search terms.
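To illustrate the scaling described above, the following minimal Python sketch computes an RSV-like index from weekly query counts. Google does not publish raw query counts, so the input numbers, the function name relative_search_volume and the rounding to integers are assumptions made purely for illustration; this is not Google's implementation.

```python
def relative_search_volume(term_counts, total_counts):
    """Scale weekly search proportions so that the busiest week equals 100.

    term_counts  -- hypothetical weekly number of queries for the term
    total_counts -- hypothetical weekly number of all queries (same weeks)
    """
    if len(term_counts) != len(total_counts):
        raise ValueError("both series must cover the same weeks")
    # Share of all queries that concern the term, week by week.
    proportions = [t / n for t, n in zip(term_counts, total_counts)]
    peak = max(proportions)
    if peak == 0:
        return [0 for _ in proportions]
    # Rescale so that the week with the highest proportion gets 100.
    return [round(100 * p / peak) for p in proportions]


# Invented toy data: four consecutive weeks.
term = [120, 300, 600, 150]
total = [10000, 10000, 12000, 10000]

full_window = relative_search_volume(term, total)            # all four weeks
short_window = relative_search_volume(term[:2], total[:2])   # first two weeks only
print(full_window, short_window)
```

In this toy example the first two weeks score 24 and 60 in the four-week window but 40 and 100 in the two-week window, illustrating how the same weeks receive different RSV values when the selected period changes.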
The GT data coincide temporally with the weekly incidence reported in the epidemiological bulletins of the ISS; the data extracted from GT were then shifted in time (lag), one week into the future and one week into the past. Cross-correlation results were obtained as product-moment correlations between the two time series. The advantage of cross-correlation is that it accounts for time dependence between two time-series variables. Statistical analyses were performed using Spearman’s rank correlation coefficient (rho). The statistical significance level was set at 0.05. The data were analyzed using the STATA statistical software, version 14 [19]. In the Tables, “+1” means that the data extracted from Google were moved one week into the future; in other words, Google anticipated by one week the data it is compared with (for example, the number of new cases of flu). The reverse applies to a lag of -1.
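The lag convention can be sketched in a few lines of Python. The sketch below is not the STATA procedure used in the study: it is a standard-library reimplementation, written under the assumption that a lag of +k pairs the GT value of week t with the cases of week t + k, and the two series are invented toy data. Spearman's rho is computed as the product-moment (Pearson) correlation of the ranks, with tied values given their average rank.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def lagged_spearman(rsv, cases, lag):
    """Pair RSV in week t with cases in week t + lag (assumed convention)."""
    if lag >= 0:
        x, y = rsv[:len(rsv) - lag], cases[lag:]
    else:
        x, y = rsv[-lag:], cases[:len(cases) + lag]
    return spearman(x, y)


# Invented toy series: cases follow the search curve one week later.
rsv = [10, 30, 60, 100, 70, 40, 20, 5]
cases = [1, 12, 33, 64, 105, 71, 44, 22]

rho_by_lag = {lag: lagged_spearman(rsv, cases, lag) for lag in (-1, 0, 1)}
best_lag = max(rho_by_lag, key=rho_by_lag.get)
print(rho_by_lag, best_lag)
```

Because the toy cases were built to trail the searches by one week, the correlation is highest at a lag of +1 here; with real data the best lag has to be read off the computed coefficients, as in the Tables.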

Results

Influenza-related digital behavior showed an increasing trend throughout the study period (from 2015 to 2019), with a peak in the epidemic year 2017 for the influenza search term and in 2019 for the influenza symptoms search term. The temporal correlation between influenza cases reported by the ISS and GT-based RSV was very strong (rho > 0.70, highly statistically significant with p-values < 0.001) for the two study periods 2015-2019 and 2015-2020. The strongest correlation between Google Trends searches (for both flu and symptoms of flu) and the influenza cases reported by the ISS was found at a lag of +1 week, particularly for the period 2015-2019 (rho = 0.92 for flu and rho = 0.87 for symptoms), as shown in Table I. The correlation between influenza cases and Google Trends searches was still strong for the period 2015-2020, even if slightly attenuated compared with 2015-2019 (rho = 0.77 for flu and rho = 0.82 for symptoms, p-values < 0.001), as reported in Table I. In addition, a strong correlation was also found at a lag of +1 week between influenza deaths and Google Trends searches (rho = 0.84 for flu and rho = 0.81 for symptoms, p-values < 0.001), as described in Table II. These statistically significant patterns are depicted in Figure 1 (2015-2019 period) and Figure 2 (2015-2020 period) for influenza cases and in Figure 3 for deaths (2016-2019 period). When examining the correlation between Google Trends searches and the new SARS-CoV-2 cases reported by the Ministry of Health, the strongest correlation was found at a lag of +3 weeks for the search term flu (rho = 0.80, p-value < 0.01), as shown in Table III. This statistical pattern is confirmed in Figure 4, where the Google search volumes for flu and flu symptoms are plotted against both influenza cases and new SARS-CoV-2 cases. In this figure, the search volume for flu and flu symptoms shows a double peak: the first coincides with the peak of influenza cases, and the second precedes the reported new SARS-CoV-2 cases.
Tab. I.

Focus on flu (2015-2019 and 2015-2020 periods). Bi-directional time-series cross-correlation coefficients at a lag of 1 week between Google Trends terms (“Flu” and “Symptoms of Flu”) and cases reported by the ISS. Spearman’s rank correlation coefficient was used.

Lag in weeks compared to cases reported by ISS
                      -1         0          +1
2015-2019 period
Flu                   0.8257*    0.8966*    0.9211*
Symptoms of Flu       0.7657*    0.8380*    0.8722*
2015-2020 period
Flu                   0.7521*    0.7755*    0.7704*
Symptoms of Flu       0.7991*    0.8377*    0.8212*

* p-value < 0.001.

Tab. II.

Focus on flu (2016-2019 period). Bi-directional time-series cross-correlation coefficients at a lag of 1 week between Google Trends terms (“Flu” and “Symptoms of Flu”) and deaths reported by the ISS. Spearman’s rank correlation coefficient was used.

Lag in weeks compared to deaths reported by ISS
                      -1         0          +1
2016-2019 period
Flu                   0.6015*    0.7545*    0.8366*
Symptoms of Flu       0.6177*    0.7439*    0.8056*

* p-value < 0.001.

Fig. 1.

Google Trends curve as RSVs (Relative Search Volumes) for symptoms of Flu and Flu vs epidemiological cases of Flu in Italy at Lag 0. 2015-2019 period.

Fig. 2.

Google Trends curve as RSVs (Relative Search Volumes) for symptoms of Flu and Flu vs epidemiological cases of Flu in Italy at Lag 0. 2015-2020 period.

Fig. 3.

Google Trends curve as RSVs (Relative Search Volumes) for symptoms of Flu and Flu vs epidemiological deaths of Flu in Italy at Lag 0. 2016-2019 period.

Tab. III.

Focus on SARS-CoV-2. Time-series cross-correlation coefficients at lags of 1, 2, 3 and 4 weeks between Google Trends terms (“Flu” and “Symptoms of Flu”) and SARS-CoV-2 new cases. Spearman’s rank correlation coefficient was used.

Lag in weeks compared to SARS-CoV-2 new cases
                      0          +1         +2         +3         +4
Flu                   -0.4167    -0.0500    0.3833     0.8000*    0.6500
Symptoms of Flu       -0.4333    -0.2000    0.1000     0.4435     0.0084

*: p-value < 0.01.

Fig. 4.

Focus on SARS-CoV-2 new cases. Google Trends curve as RSVs (Relative Search Volumes) for symptoms of Flu and Flu vs epidemiological SARS-CoV-2 new cases in Italy at Lag 0. 2019-2020 period.

Discussion

In this study we found a strong correlation between flu cases and deaths that occurred in Italy, as reported by the ISS, and GT searches for both flu and flu symptoms. This result remained consistent across different time lags and became stronger when a time lag of +1 week was adopted. Because of the overlap in clinical symptomatology and in the season during which flu and SARS-CoV-2 spread among the population (in Italy), we further assessed the correlation between Google Trends searches and the new SARS-CoV-2 cases reported by the Ministry of Health. A strong correlation was found in this analysis as well, with the strongest correlation at a lag of +3 weeks. This means that, at the beginning of the SARS-CoV-2 pandemic, people affected by COVID-19 searched the internet for information related to flu, probably confusing the two diseases. Moreover, it confirms the hypothesis that people frequently use the internet to search for health-related information. On Feb 22, 2020, an Editorial in the scientific journal The Lancet entitled “COVID-19: fighting panic with information” focused on the real risk of a health emergency, saying that there might be no way to prevent a COVID-19 pandemic in this globalized time, but that verified information is the most effective prevention against the disease of panic [20]. Thus, from the first moment it became clear that we were struggling not only with an epidemic, but also with an infodemic [21]. A global epidemic of misinformation, spreading mainly through social media platforms and fake news, poses a serious problem for public health, although the WHO is leading the effort to stem this public emergency. As a public health emergency of international concern, COVID-19 has drawn global attention and response. In the scenario of the COVID-19 pandemic [22], it is extremely important to promote flu vaccination during the next campaign, improving the opinions, knowledge and attitudes of health workers and the population through dedicated health policies [23-25].
This is true for several reasons. Firstly, it could directly reduce the burden of flu (lowering and limiting the number of patients hospitalized because of flu). Secondly, by reducing the number of patients hospitalized because of flu, it would ease the hospital management of patients who may be positive for SARS-CoV-2. Thirdly, in patients immunized against flu, the differential diagnosis between flu and SARS-CoV-2 could be facilitated, improving the clinical management of these patients [26]. In planning these measures, consideration should be given to minimizing the excess risk of morbidity and mortality from vaccine-preventable diseases (VPDs). Outbreaks of VPDs may result in VPD-related deaths and an increased burden on health systems already strained by the response to the COVID-19 outbreak [27]. In this context, the big data generated by web searches become increasingly important in the search for new surveillance systems based on digital epidemiology. According to Marcel Salathé, digital epidemiology is a field of study that uses data generated outside the public health system, i.e. data that were not generated with the primary purpose of doing epidemiology [28]. In line with the scientific literature, our study showed that digital epidemiology, integrated with modern infectious disease surveillance systems, aims to employ the speed and scope of big data in an attempt to provide global health security [29]. Our study has strengths and limitations. Google Trends data help identify developing interests in different public health topics, including known and emerging infectious diseases (e.g. flu and SARS-CoV-2) or related clinical and diagnostic aspects and screening tests.
Internet searches can be an important source for generating hypotheses about knowledge, attitudes and practices in public health topics, and for evaluating changes in information seeking after targeted interventions to prevent the spread of emerging infectious diseases or to stem vaccine-preventable diseases. In this field, public health interventions could be evaluated almost immediately and at minimal expense. However, the mass media (TV, radio and social networks) may influence the online population’s searches [30]. Indeed, a spike in internet searches for, say, “Flu” or “symptoms of Flu” may be attributed to various factors, such as an increased number of cases in the community or increased attention from the mass media. Moreover, the data are available only for States and selected metropolitan areas, which limits comparability with rural areas or with areas with a low search volume, that is, areas where the internet is less widespread among the population. Finally, Google Trends data are anonymous, which limits their utility in examining subgroups or disparities among populations. Thus, even considering the potential intrinsic limits of this analysis, our results show how these data might be extremely useful, encouraging further research at the level of each country. The results of this study suggest that Google Trends-based surveillance systems might be relevant for public health and for public health workers [31], because these novel systems have the potential to show how interested the public is in searching for health-related information [32]. Info surveillance systems, thanks to their intrinsically dynamic nature, can provide near real-time data that are useful for planning public health interventions [33]. The public health workforce should strengthen its communication and internet-based skills in order to make fruitful use of a new and cheap technology able to support the design and implementation of interventions [34].

Conclusions

How key information is communicated to the public during the next phase of the pandemic is critical. In recent years, health care research has used GT data to explore public interest and trends in various fields of medicine. It is evident that caution should be used when interpreting the findings of Google Trends digital surveillance.
References

1.  Digital epidemiology: assessment of measles infection through Google Trends mechanism in Italy.

Authors:  O E Santangelo; S Provenzano; D Piazza; D Giordano; G Calamusa; A Firenze
Journal:  Ann Ig       Date:  2019 Jul-Aug

Review 2.  [Communication in health.]

Authors:  Vincenza Gianfredi; Chiara Grisci; Daniele Nucci; Valeria Parisi; Massimo Moretti
Journal:  Recenti Prog Med       Date:  2018 Jul-Aug

3.  How often people google for vaccination: Qualitative and quantitative insights from a systematic search of the web-based activities using Google Trends.

Authors:  Nicola Luigi Bragazzi; Ilaria Barberis; Roberto Rosselli; Vincenza Gianfredi; Daniele Nucci; Massimo Moretti; Tania Salvatori; Gianfranco Martucci; Mariano Martini
Journal:  Hum Vaccin Immunother       Date:  2016-12-16       Impact factor: 3.452

4.  Clinical characteristics and drug therapies in patients with the common-type coronavirus disease 2019 in Hunan, China.

Authors:  Qiong Huang; Xuanyu Deng; Yongzhong Li; Xuexiong Sun; Qiong Chen; Mingxuan Xie; Shao Liu; Hui Qu; Shouxian Liu; Ling Wang; Gefei He; Zhicheng Gong
Journal:  Int J Clin Pharm       Date:  2020-05-14

5.  Digital epidemiology: what is it, and where is it going?

Authors:  Marcel Salathé
Journal:  Life Sci Soc Policy       Date:  2018-01-04

6.  Correlation between flu and Wikipedia's pages visualization.

Authors:  Vincenza Gianfredi; Omar Enzo Santangelo; Sandro Provenzano
Journal:  Acta Biomed       Date:  2021-02-08

Review 7.  Challenges and Opportunities of Mass Vaccination Centers in COVID-19 Times: A Rapid Review of Literature.

Authors:  Vincenza Gianfredi; Flavia Pennisi; Alessandra Lume; Giovanni Emanuele Ricciardi; Massimo Minerva; Matteo Riccò; Anna Odone; Carlo Signorelli
Journal:  Vaccines (Basel)       Date:  2021-06-01

8.  COVID-19: fighting panic with information.

Authors: 
Journal:  Lancet       Date:  2020-02-22       Impact factor: 79.321

Review 9.  What can internet users' behaviours reveal about the mental health impacts of the COVID-19 pandemic? A systematic review.

Authors:  Vincenza Gianfredi; Sandro Provenzano; Omar Enzo Santangelo
Journal:  Public Health       Date:  2021-07-05       Impact factor: 2.427

