Ups and Downs of COVID-19: Can We Predict the Future? Local Analysis with Google Trends for Forecasting the Burden of COVID-19 in Pakistan.

Sibtain Ahmed1, Muhammad Abbas Abid2, Maria Helena Santos de Oliveira3, Zeeshan Ansar Ahmed1, Ayra Siddiqui4, Imran Siddiqui1, Lena Jafri1, Giuseppe Lippi5.   

Abstract

BACKGROUND: We aimed to study the utility of Google Trends search history data for determining whether web-based information correlates with actual coronavirus disease 2019 (COVID-19) cases, and whether such data can be used to forecast patterns of disease spikes.
PATIENTS & METHODS: Weekly data on COVID-19 cases in Pakistan were retrieved from online COVID-19 data banks for a period of 60 weeks. Search history related to COVID-19, coronavirus and the most common symptoms of the disease was retrieved from Google Trends for the same period. Statistical analysis was performed to assess the correlation between the two data sets. Search terms were adjusted for time lag in weeks, to find the highest cross-correlation for each search term.
RESULTS: 'Fever' and 'cough' were the most commonly searched terms online, followed by coronavirus and COVID. The highest peak correlations with the weekly case series, at a 1-week lag, were noted for loss of smell and loss of taste. The combined model yielded a modest performance for forecasting positive cases. The linear regression model revealed loss of smell (adjusted R² of 0.7), with significant 1-week, 2-week and 3-week lagged time series, as the best predictor of weekly positive case counts.
CONCLUSIONS: Our local analysis of Pakistan-based data seemingly confirms that Google Trends can be used as an important tool for anticipating and predicting pandemic patterns, and for advance preparedness in such an unprecedented pandemic crisis.
Copyright © 2021 International Federation of Clinical Chemistry and Laboratory Medicine (IFCC). All rights reserved.

Keywords:  COVID-19; Google Trends; Pakistan; pandemic; prediction

Year:  2021        PMID: 35046760      PMCID: PMC8751396     

Source DB:  PubMed          Journal:  EJIFCC        ISSN: 1650-3414


INTRODUCTION

Pakistan is a low-income country in the subcontinent and one of the most overpopulated countries in the world, with a high prevalence of communicable diseases. Pakistan ranks 154th out of 189 countries, with a Human Development Index value of 0.557 (1). Due to high population density, a lack of skilled medical personnel and resources, a low literacy rate and budgetary constraints, among other reasons, Pakistan’s healthcare system is especially vulnerable to epidemic infectious diseases, and coronavirus disease 2019 (COVID-19) is unfortunately no exception. Pakistan has previously struggled to control other life-threatening infectious diseases such as dengue, hepatitis and acquired immunodeficiency syndrome (AIDS), and has lost thousands of lives as a result (2,3,4). This has led researchers to believe that the risk of COVID-19 morbidity and mortality may be higher in Pakistan than in other countries (5). The lack of resources, such as test assays, has made it imperative for low-income countries like Pakistan to identify reliable alternatives to mass COVID-19 testing, so that the spread of disease can be curbed before it becomes unmanageable for society, government and the healthcare system. Since COVID-19 has put more pressure on an already overburdened and underfunded healthcare system, diagnostic testing for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has been severely limited. This has potentially led to an underestimation of COVID-19 prevalence in the country (6). However, this problem is not limited to Pakistan and other developing countries. Up to 80% of clinical laboratories worldwide have reported difficulties with SARS-CoV-2 testing, and more than half (52%) reported shortages of the supplies needed for routine molecular testing (7). Hence, a substitute for diagnostics is needed that can accurately mirror COVID-19 epidemiology in specific geographical areas.
In a study conducted by Ginsberg et al., internet search-engine query data were used to predict the course of influenza in the United States (8). This surveillance method has provided positive results in non-English-speaking countries as well. Studies have correlated data from Europe, China, Korea and Taiwan with Google Trends for COVID-19 and other epidemic diseases, such as influenza (9, 10). Hence, the application of Google Trends to track disease progression has far-reaching implications for countries across the globe, regardless of their location or language. It is also cost-effective and time-saving, and does not place any substantial economic or organizational burden on the healthcare system. We therefore hypothesized that Google Trends may have the potential to assess the prevalence of COVID-19 in Pakistan and anticipate new waves of infection. A survey conducted in Pakistan showed that nearly half (45%) of patients search the Internet about their health-related concerns (11). Hence, epidemiological data from Google Trends could be reasonably representative of the Pakistani population. Google search terms regarding key symptoms, such as loss of taste and loss of smell, may help in predicting the epidemiological trajectory of COVID-19 (12,13). Olfactory and gustatory dysfunctions are strongly associated with COVID-19, with anosmia showing the highest correlation (14). Tracking these symptoms can be a feasible and viable means of assessing the prevalence of COVID-19 and effectively targeting the government response. This study was hence designed to assess the potential utility of Google search trends focused on COVID-19 symptoms (including anosmia and dysgeusia) in projecting the trajectory of the local pandemic outbreak in Pakistan through correlation with ongoing COVID-19 statistics.

MATERIAL AND METHODS

In order to avoid daily variations in the positivity rate, the weekly number of confirmed COVID-19 cases in Pakistan was retrieved from OurWorldInData.org, powered by the Johns Hopkins Coronavirus Resource Center (15). The data were further confirmed against the official website of the Ministry of National Health, Pakistan (16,17). This information was retrieved for the period between March 15, 2020 and June 15, 2021. Data were acquired from Google Trends (Google Inc., Mountain View, CA), using the following search terms, encompassing the most representative symptoms of COVID-19: fever, cough, headache, shortness of breath, taste loss and smell loss, along with other virus-related keywords such as ‘COVID-19’, ‘coronavirus’, ‘virus’ and ‘COVID’. A weekly Google Trends score was obtained for each keyword on a scale of 100 points, reflecting the cumulative number of Google searches during the previous week. The maximum attainable score of 100 was defined as the highest search volume during the study period for a particular search term. The study was conducted in accordance with the Declaration of Helsinki, under the terms of relevant local legislation. This analysis was based on electronic searches in unrestricted, publicly available repositories, such that no informed consent or ethical committee approval was needed.
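The 100-point scale described above can be illustrated with a minimal Python sketch. It assumes raw weekly search counts are available, which Google Trends does not expose; the function name `trends_index` and the sample counts are hypothetical, and Google's own normalization may differ in detail.

```python
def trends_index(raw_volumes):
    """Rescale non-negative weekly search volumes so the busiest week scores 100.

    Illustrative only: mirrors the description of a 0-100 score whose maximum
    marks the highest search volume in the study period.
    """
    peak = max(raw_volumes)
    if peak <= 0:
        # No searches at all: every week scores 0.
        return [0 for _ in raw_volumes]
    return [round(100 * v / peak) for v in raw_volumes]


# Hypothetical raw counts for five consecutive weeks.
weekly_volumes = [120, 300, 450, 900, 600]
print(trends_index(weekly_volumes))  # [13, 33, 50, 100, 67]
```

Because each term is scaled to its own peak, scores of 100 for two different terms do not imply equal absolute search volumes.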

STATISTICAL ANALYSIS

Statistical analysis was carried out using Microsoft Excel for Windows (2016) and R Software (version 4.0.2; R Foundation for Statistical Computing). Correlation analysis for individual search terms was used to identify the time lags that generated the maximum achievable correlation between the weekly positive cases and the Google Trends timeline. The corresponding P values and 95% confidence intervals (CIs) were also calculated. A P value <0.05 was considered statistically significant. To quantify the effect of an increase in Google Trends score on the subsequent rise in weekly cases, time-series linear regression analysis was performed, and the time lag with maximum predictive value was computed. Adjusted R² values and graphic analysis were used to assess the performance of the combined model in forecasting the positivity rate against national surveillance data.
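The lag search described above can be sketched as follows. This is an illustration in Python, not the R code used in the study; the function names and the toy series are hypothetical. A lag of k here means that searches in week t - k are compared with cases in week t, and the sketch ranks lags by the signed Pearson coefficient (ranking by absolute value would be a one-line change).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(searches, cases, lag):
    """Correlate cases in week t with searches in week t - lag (lag >= 0)."""
    n = len(searches)
    return pearson(searches[: n - lag], cases[lag:])


def best_lag(searches, cases, max_lag):
    """Return the lag in 0..max_lag with the largest correlation, plus all scores."""
    scores = {k: lagged_correlation(searches, cases, k) for k in range(max_lag + 1)}
    return max(scores, key=lambda k: scores[k]), scores


# Toy data: cases follow searches two weeks later, scaled by 10.
searches = [1, 3, 7, 4, 2, 1, 5, 9, 6, 3, 2, 1]
cases = [0, 0] + [10 * s for s in searches[:-2]]
lag, scores = best_lag(searches, cases, max_lag=4)
print(lag)  # 2, by construction of the toy data
```

Each extra week of lag drops one observation from the comparison, so `max_lag` should stay small relative to the length of the series.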

RESULTS

The highest overall trend value for the study duration was achieved for fever (n=3036), followed by cough (n=2120), Coronavirus (n=1669), COVID (n=1417), headache (n=1284), COVID-19 (n=333), virus (n=329), shortness of breath (n=257), loss of smell (n=129) and loss of taste (n=106). Among all search terms, fever and cough attained the highest Google Trends value of 100, during the second week of June and the last week of May 2021. Time-series linear regression analysis is provided in Figures 1(a-e) and 2, summarizing the effects of the Google Trends search series when adjusted for the monthly trend of an increase in positive cases.
Figure 1

(a-e) Time-series linear regression analysis of the weekly number of COVID-19 cases in Pakistan with Google Trends scores for suggestive symptoms

Figure 2

Time-series linear regression analysis for the combined model of the weekly number of COVID-19 cases in Pakistan with Google Trends scores

The linear regression model identified loss of smell (adjusted R² of 0.7), with significant 1-week, 2-week and 3-week lagged time series, as the best predictor of weekly positive case counts, as further detailed in Table 3 and Figure 1(a-e). The combined model yielded an excellent performance for forecasting positive cases, with an adjusted R² value of 0.83, as shown in Figure 2 and Table 2.
Table 3

Adjusted R² based on time-series linear regression analysis of the weekly number of COVID-19 cases in Pakistan with Google Trends scores for suggestive symptoms

Symptom          Adjusted R²
Fever            0.36
Cough            0.44
Headache         0.60
Loss of smell    0.70
Loss of taste    0.70
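
As an illustration of how an adjusted R² of this kind is obtained, the sketch below regresses weekly cases on a single lagged search series by ordinary least squares. It is a simplification under stated assumptions: the models reported here include several lags and a monthly trend term, whereas this example uses one predictor; the function name `fit_lagged_ols` and the toy data are hypothetical.

```python
def fit_lagged_ols(searches, cases, lag):
    """Regress cases[t] on searches[t - lag]; return (slope, intercept, adjusted R^2).

    Single-predictor ordinary least squares. Needs at least three paired
    observations after lagging, and some variation in both series.
    """
    n_total = len(searches)
    x = searches[: n_total - lag]
    y = cases[lag:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    r2 = 1 - ss_res / ss_tot
    p = 1  # number of predictors in this simplified model
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return slope, intercept, adj_r2


# Toy data: cases roughly track the previous week's search score, with noise.
searches = [10, 20, 35, 30, 50, 70, 65, 40]
cases = [0, 260, 440, 760, 640, 1070, 1430, 1360]
slope, intercept, adj_r2 = fit_lagged_ols(searches, cases, lag=1)
print(round(slope, 1), round(intercept, 1), round(adj_r2, 3))
```

The adjustment factor (n - 1) / (n - p - 1) penalizes additional predictors, which matters when comparing a single-symptom model with a combined model that has more terms.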
Table 2

Adjusted R² based on time-series linear regression analysis for the combined model of the weekly number of COVID-19 cases in Pakistan with Google Trends scores

Predictor                 Estimate   Std. Error   p-value
Month                     798.4      204.1        <0.001
Fever (Lagged -2)         219.1      66.5         0.002
Taste Loss (Lagged -1)    1478.9     480.2        0.003
Taste Loss (Lagged -3)    1717.6     507.2        0.001
Covid (Lagged -1)         237.8      50.0         <0.001
Covid-19 (Lagged -7)      -1725.8    420.5        <0.001
Adjusted R²               0.83

DISCUSSION

The results of our study demonstrate the existence of a statistically significant positive correlation between Google search terms and the overall COVID-19 positivity rate in Pakistan. This was especially evident for search terms such as ‘fever’, ‘smell loss’, ‘taste loss’ and ‘shortness of breath’, with a time lag of 2 weeks, while for ‘cough’ and ‘coronavirus’ the time lag was 3 weeks. This model can hence predict an increase in COVID-19 cases 2-3 weeks ahead of official diagnosis, thus allowing the government and healthcare system to adapt and prepare for the oncoming burden. Google Trends is a useful tool for forecasting both healthcare and non-healthcare related epidemiological trends. The freely available data can help health authorities anticipate increases in demand for testing capacity and treatment facilities, including hospital beds, oxygen supply, ventilators and an adequate number of physicians and ancillary staff. Although this study is the first of its kind based in Pakistan, other countries have successfully used Google Trends to predict changes in the ongoing COVID-19 pandemic outbreak. Cherry et al. studied Google Trends data for 137 regions from 5 different countries, reporting that pathognomonic symptoms such as anosmia and dysgeusia can accurately predict the future incidence patterns of COVID-19 (12). Henry et al. reported similar results in Poland (18). In another study, Lippi et al. described significant associations of fever, fatigue and dyspnea with the COVID-19 outbreaks in Italy (19). The same group reported that the correlation between Google searches and COVID-19 cases became stronger with a lag of 2 weeks, compared with the same week (13). The results of these studies are hence in keeping with the findings of our Pakistan-based analysis.
The use of Google Trends in health policy making and pandemic management could be especially useful for low- and middle-income countries (LMICs) like Pakistan, where resources are limited and strict, timely management of these resources can help curb the growing pandemic. The model we developed could hence be used in other LMICs to direct resources where they are most required. Although our study utilized data from the whole country, region-specific data can also be used to focus resources on the regions that will require them the most in the near future. The limitations of our study include the limited use of the internet and the low literacy rate in developing countries. Also, internet use behavior can be influenced by media communications, which may act as a confounder. Increased knowledge about self-reported symptoms can also decrease the use of the internet for searching COVID-related information.

CONCLUSION

Google Trends is an effective tool for forecasting trends of the ongoing COVID-19 pandemic outbreak. We found a high correlation between Google searches for COVID-19 symptoms and diagnoses of SARS-CoV-2 infection, which can be used to direct resources where they are needed. Such data can help government authorities and health policy-making agencies make well-informed decisions on the imposition of lockdowns and the provision of resources. Utilizing such data can help developing countries like Pakistan streamline their efforts against the pandemic and possibly prepare for outbreaks before they actually happen, minimizing morbidity and mortality as well as the financial losses that may ensue.
Table 1

Cross-correlation analysis between the weekly number of COVID-19 cases in Pakistan and Google Trends scores for suggestive symptoms

Search term            Optimal lag   Correlation   p-value
Fever                  -2            0.437         <0.001
Headache               -8            0.349         0.005
Smell loss             -2            0.561         <0.001
Cough                  -3            0.260         0.035
Taste loss             -2            0.618         <0.001
Shortness of breath    -2            0.289         0.019
Coronavirus            -3            -0.326        0.008
COVID                  -1            0.501         <0.001
COVID-19               -7            -0.353        0.004
Virus                  -4            0.322         0.009
  12 in total

1.  HIV outbreaks in Pakistan.

Authors:  Ali Ahmed; Furqan Khurshid Hashmi; Gul Majid Khan
Journal:  Lancet HIV       Date:  2019-06-13       Impact factor: 12.767

Review 2.  Pakistan's health system: performance and prospects after the 18th Constitutional Amendment.

Authors:  Sania Nishtar; Ties Boerma; Sohail Amjad; Ali Yawar Alam; Faraz Khalid; Ihsan ul Haq; Yasir A Mirza
Journal:  Lancet       Date:  2013-05-17       Impact factor: 79.321

3.  Olfactory and gustatory dysfunctions in COVID-19 patients: A systematic review and meta-analysis.

Authors:  Minh P Hoang; Jesada Kanjanaumporn; Songklot Aeumjaturapat; Supinda Chusakul; Kachorn Seresirikachorn; Kornkiat Snidvongs
Journal:  Asian Pac J Allergy Immunol       Date:  2020-09       Impact factor: 2.310

4.  Utility of Google Trends in anticipating COVID-19 outbreaks in Poland.

Authors:  Brandon Michael Henry; Ivan Szergyuk; Maria Helena Santos de Oliveira; Giuseppe Lippi; Grzegorz Juszczyk; Marcin Mikos
Journal:  Pol Arch Intern Med       Date:  2021-03-26

5.  Detecting influenza epidemics using search engine query data.

Authors:  Jeremy Ginsberg; Matthew H Mohebbi; Rajan S Patel; Lynnette Brammer; Mark S Smolinski; Larry Brilliant
Journal:  Nature       Date:  2009-02-19       Impact factor: 49.962

6.  Google Trends-based non-English language query data and epidemic diseases: a cross-sectional study of the popular search behaviour in Taiwan.

Authors:  Yu-Wei Chang; Wei-Lun Chiang; Wen-Hung Wang; Chun-Yu Lin; Ling-Chien Hung; Yi-Chang Tsai; Jau-Ling Suen; Yen-Hsu Chen
Journal:  BMJ Open       Date:  2020-07-05       Impact factor: 2.692

7.  Association of the COVID-19 pandemic with Internet Search Volumes: A Google TrendsTM Analysis.

Authors:  Maria Effenberger; Andreas Kronbichler; Jae Il Shin; Gert Mayer; Herbert Tilg; Paul Perco
Journal:  Int J Infect Dis       Date:  2020-04-17       Impact factor: 3.623

Review 8.  Why is Pakistan vulnerable to COVID-19 associated morbidity and mortality? A scoping review.

Authors:  Muhammad Atif; Iram Malik
Journal:  Int J Health Plann Manage       Date:  2020-07-22

9.  Prevalence of dengue virus serotypes in the 2017 outbreak in Peshawar, KP, Pakistan.

Authors:  Najeeb Ullah Khan; Lubna Danish; Hydayat Ullah Khan; Maryam Shah; Muhammad Ismail; Ijaz Ali; Arnolfo Petruzziello; Rocco Sabatino; Annunziata Guzzo; Gerardo Botti; Aqib Iqbal
Journal:  J Clin Lab Anal       Date:  2020-07-22       Impact factor: 2.352

10.  Google search volume predicts the emergence of COVID-19 outbreaks.

Authors:  Giuseppe Lippi; Camilla Mattiuzzi; Gianfranco Cervellin
Journal:  Acta Biomed       Date:  2020-09-07