
Anosmia-related internet search and the course of the first wave of the COVID-19 pandemic in the United States.

Kenneth M Madden, Boris Feldman.

Abstract

BACKGROUND: The current pandemic of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) was first reported in Wuhan, China. Although the first case in the United States was reported on Jan 20, 2020 in Washington, the early pandemic time course is uncertain. One approach with the potential to provide more insight into this time course is the examination of search activity. This study analyzed US search data prior to the first press release of anosmia as an early symptom (March 20, 2020).
METHODS: Daily internet search query data was obtained from Google Trends (September 20th to March 20th for 2015 to 2020) both for the United States and on a state-by-state basis. Normalized anosmia-related search activity for the years prior to the pandemic was averaged to obtain a baseline level. Cross-correlations were performed to determine the time-lag between changes in search activity and SARS-CoV-2 cases/deaths.
RESULTS: Only New York showed both significant increases in anosmia-related terms during the pandemic year as well as a significant lag (6 days) between increases in search activity and the number of cases/deaths attributed to SARS-CoV-2.
CONCLUSIONS: There is no evidence from search activity to suggest earlier spread of SARS-CoV-2 than has been previously reported. The increase in anosmia-related searches preceded increases in SARS-CoV-2 cases/deaths by 6 days, but this was only significant over the background noise of searches for other reasons in the setting of a very large outbreak (New York in the spring of 2020).
© 2021 The Authors.

Keywords:  COVID-19; Infodemiology; Internet search

Year:  2021        PMID: 34869935      PMCID: PMC8629775          DOI: 10.1016/j.heliyon.2021.e08499

Source DB:  PubMed          Journal:  Heliyon        ISSN: 2405-8440


Introduction

The current pandemic of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was first reported in Wuhan, China [1], and rapidly spread throughout the world. Although the first case in the United States was reported on Jan 20, 2020 in Washington [2], there is some doubt about the pandemic's early time course. Work done by the Seattle Flu Study indicated several generations of community spread between January 1 and March 9, 2020, and a death in California was identified retrospectively as due to COVID-19 three weeks prior to the first known SARS-CoV-2 case in the United States [3]. The symptoms of SARS-CoV-2 are myriad, but one of the very early symptoms has been shown to be anosmia, manifested as a loss of both smell and taste [4, 5]. An analysis of the time course of an increase in anosmia symptoms (over normal seasonal levels) could provide valuable insight into the time course of this pandemic. One approach with the potential to provide more insight into the time course of the spread of SARS-CoV-2 in the United States is an infodemiological one [6]. Infodemiology is defined as the “science of distribution and determinants of information in an electronic medium, specifically the Internet, or in a population, with the ultimate aim to inform public health and public policy” [6]. This approach has been used to predict influenza outbreaks in the past [7], and several previous studies have attempted to use search data to detect the course of the pandemic [8, 9, 10, 11].
Unfortunately, this previous work on SARS-CoV-2 has several methodological issues: 1) the search terms used tended to involve medical jargon, which is unlikely to arise in a layperson's vernacular [8,11]; 2) the search terms were generated by the authors, not from the raw search data itself, leaving open the possibility of “data dredging” [8,10,11]; 3) the datasets contained data from after March 20, the date that anosmia and loss of taste were published in a press release as an early symptom of SARS-CoV-2 [5], potentially contaminating search data after this date [8, 9, 10]; and 4) previous work has used regressions between search terms and cases/deaths, which cannot be used to infer that loss of smell/taste preceded SARS-CoV-2 cases (as opposed to increased cases/deaths resulting in more interest in symptoms) [8, 9, 10, 11]. Since several investigators have suggested [8, 9, 10, 11] that internet search data can be used to predict/follow the course of pandemics, the object of the current study is to determine the relationship between search activity for a characteristic symptom of a novel virus and subsequent cases/deaths, outside of the contaminating influence of intense media coverage. Our study addressed these issues by analysing Google Trends search data in the United States prior to the first press release of anosmia as a SARS-CoV-2 symptom on March 20, 2020 [5]. In addition, instead of using a priori search terms we generated the search terms using unsupervised machine learning methods, avoiding both the possibility of data dredging and medical jargon. We also used cross-correlation methods to examine whether changes in US search data surrounding anosmia preceded changes in SARS-CoV-2 cases and deaths, prior to anosmia being known as a unique symptom.

Materials and methods

Normalized search data

Google Trends is a web-based tool that can be used to determine the number of searches that have been performed over a certain time period. Search data is normalized by the database for overall daily search activity, to correct for the fact that overall search activity varies on different days of the week (weekend versus non-weekend days, for example). Google Trends reports overall search activity as a score between 0 and 100 [12]. Search data can be narrowed down to a specific country, or a state within a country. In keeping with current guidelines for reporting Google Trends data in medical studies [13], search data was obtained from September 20th to March 20th for the years 2015–2020, and it was downloaded as a .csv file accessed on September 15th, 2020. March 20th was chosen as the last date in 2020 because it precedes the first reports of anosmia as an early symptom of SARS-CoV-2 [14]. In order to reduce non-health related inquiries, all searches were limited to those classified by Google as in the ‘Health’ subcategory. As in previous studies of this type [13, 15], this paper used publicly available, openly accessible aggregate data and the Clinical Research Ethics Board at the University of British Columbia deemed that review was unnecessary [16]. In order to avoid issues around “data dredging” and following procedures documented in previous studies using search data for public health purposes [15, 17, 18, 19], all search terms were chosen systematically prior to starting any data analysis. We selected an a priori initial search term that seemed “reasonable” and then used the Keyword Search Tool [12]. The Keyword Search Tool is an online application that uses unsupervised machine learning methods to generate keywords based on the entire search dataset and provides the normalized search activity associated with each suggestion.
We started with the initial term “I can't smell” and all related keywords with a higher search volume were added to our list. Each new keyword was then in turn entered into the Keyword Search Tool until no new search terms were suggested. Our final keyword list was then entered into Google Trends using the Boolean logical operator “OR”, which allowed us to collect all available search data containing any of our search terms; our aggregate search term consisted of the following: Can't smell OR can't taste or smell OR why can't i smell or taste OR why can't i taste or smell anything
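The iterative keyword expansion described above can be sketched as a simple closure over keyword suggestions. This is a minimal illustration only, not the authors' code: the `suggest` callable and the `FAKE_SUGGESTIONS` table are hypothetical stand-ins for the Keyword Search Tool, and `suggest` is assumed to return only suggestions that pass the search-volume criterion.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def expand_keywords(seed: str,
                    suggest: Callable[[str], Iterable[str]]) -> List[str]:
    """Feed each new keyword back into `suggest` until nothing new appears."""
    seen = [seed]            # keeps discovery order
    queue = deque([seed])    # keywords still waiting to be queried
    while queue:
        term = queue.popleft()
        for candidate in suggest(term):
            if candidate not in seen:
                seen.append(candidate)
                queue.append(candidate)
    return seen


def build_or_query(terms: Iterable[str]) -> str:
    """Join keywords with the Boolean OR operator."""
    return " OR ".join(terms)


# Hypothetical suggestion table standing in for the keyword tool's output.
FAKE_SUGGESTIONS: Dict[str, List[str]] = {
    "i can't smell": ["can't smell", "can't taste or smell"],
    "can't smell": ["can't taste or smell"],
    "can't taste or smell": ["why can't i smell or taste"],
    "why can't i smell or taste": [],
}

terms = expand_keywords("i can't smell",
                        lambda t: FAKE_SUGGESTIONS.get(t, []))
query = build_or_query(terms)
```

The loop stops once every discovered keyword has been queried and no unseen suggestion is returned, which mirrors the stopping rule stated in the text.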

Baseline and delta search data

Search activity from September 20th to March 20th for the years 2015–2019 was averaged to obtain a normalized baseline level of search activity for our final keyword list (Baseline). The change in search activity (Delta) was determined by subtracting Baseline from the normalized level of search activity for our final keyword list from September 20th to March 20th in 2019/2020 (PandemicSearch).
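As a rough illustration of the Baseline and Delta definitions, the sketch below averages day-aligned seasonal series and subtracts the result from the pandemic-season series. The function name, season labels, and toy values are assumptions made for the example; the source does not describe how days were aligned across seasons (for instance, around leap days), so the sketch simply requires series of equal length.

```python
from statistics import mean
from typing import Dict, List, Tuple


def baseline_and_delta(prior_seasons: Dict[str, List[float]],
                       pandemic_season: List[float]
                       ) -> Tuple[List[float], List[float]]:
    """Return (Baseline, Delta) for day-aligned normalized search series."""
    n_days = len(pandemic_season)
    if any(len(series) != n_days for series in prior_seasons.values()):
        raise ValueError("all seasons must cover the same number of days")
    # Baseline: mean of the pre-pandemic seasons for each day of the season.
    baseline = [mean(series[day] for series in prior_seasons.values())
                for day in range(n_days)]
    # Delta: PandemicSearch minus Baseline, day by day.
    delta = [p - b for p, b in zip(pandemic_season, baseline)]
    return baseline, delta


# Toy values only; these are not data from the study.
prior = {"2015/16": [2.0, 4.0, 6.0], "2016/17": [4.0, 4.0, 2.0]}
pandemic = [5.0, 4.0, 10.0]
baseline, delta = baseline_and_delta(prior, pandemic)
```

A positive Delta on a given day means search activity was above the multi-year average for that day of the season.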

SARS-CoV-2 case and death counts

Case and mortality data [20] were downloaded as a .csv file accessed on September 15th, 2020.

Statistical analysis

All averages are reported as mean ± standard error. A two-way Analysis of Variance with repeated measures (time) was performed to determine the main effect between Baseline and PandemicSearch at both the entire country and state-by-state levels. For every state that showed a significant difference between Baseline and PandemicSearch, local regression (LOESS) curves were plotted using Delta data to visualize the time course of the increase in anosmia-related search activity. In order to examine the time-lag in any association between changes in the search activity and the SARS-CoV-2 cases/deaths time series a cross-correlation was performed between Delta and both the normalized SARS-CoV-2 cases and the normalized SARS-CoV-2 deaths. The R core software package version 4.0.0 was used for statistical analysis with a significance level of p < 0.05 [21].
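The cross-correlation step can be illustrated with a small lagged-correlation sketch. It is not the authors' R code: the function names are hypothetical, the lag convention (positive lag means search activity leads cases) is a choice made for this example, and the correlation is computed as a plain Pearson coefficient on the overlapping portion of the two series, whereas statistical packages may normalize differently.

```python
from math import sqrt
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cross_correlation(search: Sequence[float], cases: Sequence[float],
                      max_lag: int) -> Dict[int, float]:
    """Correlate search[t] with cases[t + lag] for each lag in the window.

    A positive lag means search activity precedes cases by `lag` days.
    """
    if len(search) != len(cases):
        raise ValueError("series must have the same length")
    n = len(search)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = search[:n - lag], cases[lag:]
        else:
            x, y = search[-lag:], cases[:n + lag]
        result[lag] = pearson(x, y)
    return result


def best_lag(ccf: Dict[int, float]) -> int:
    """Lag with the largest correlation."""
    return max(ccf, key=ccf.get)


# Toy series: cases repeat the search curve three days later.
search = [0, 0, 1, 3, 7, 3, 1, 0, 0, 0, 0, 0]
cases = [0, 0, 0, 0, 0, 1, 3, 7, 3, 1, 0, 0]
ccf = cross_correlation(search, cases, max_lag=5)
lag = best_lag(ccf)
```

In this toy example the peak falls at a lag of three days because the second series was constructed as a three-day shift of the first; with real data the peak would also need to be assessed for statistical significance.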

Results

Anosmia-related search activity: entire US and individual states (Figure 1, Table 1)

Prior to the recognition of anosmia as a symptom of SARS-CoV-2, normalized anosmia-related search activity for the entire United States during the pandemic year did not differ from baseline (F = 2.863, p = 0.103). When search activity was examined on a state-by-state level, the only state that showed a significant increase in anosmia-related search terms during the year of the pandemic was New York (see Table 1). As seen in Figure 1, anosmia-related searches in New York were higher during this time than the average of the previous years (F = 4.711, p = 0.039).
Table 1

State-by-state analysis of anosmia-related search activity.

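| State | Mean Baseline Search Activity (4 previous years) | Pandemic Year Search Activity | F-statistic | p value |
|---|---|---|---|---|
| Alabama | 4.3 | 10.2 | 0.74 | 0.788 |
| Alaska | NA | NA | NA | NA |
| Arizona | 3.4 | 6.6 | 1.877 | 0.182 |
| Arkansas | 3.3 | 0.0 | 13.492 | 0.001∗ |
| California | 23.9 | 19.3 | 0.746 | 0.396 |
| Colorado | 8.4 | 5.1 | 0.617 | 0.443 |
| Connecticut | 6.0 | 11.9 | 0.924 | 0.345 |
| Delaware | NA | NA | NA | NA |
| Florida | 12.5 | 15.1 | 0.269 | 0.608 |
| Georgia | 7.6 | 8.6 | 0.112 | 0.741 |
| Hawaii | NA | NA | NA | NA |
| Idaho | NA | NA | NA | NA |
| Illinois | 13.8 | 15.7 | 0.103 | 0.751 |
| Indiana | 3.9 | 7.3 | 0.682 | 0.417 |
| Iowa | NA | NA | NA | NA |
| Kansas | NA | NA | NA | NA |
| Kentucky | 2.9 | 3.3 | 0.016 | 0.899 |
| Louisiana | 3.8 | 1.9 | 1.563 | 0.222 |
| Maine | NA | NA | NA | NA |
| Maryland | 7.9 | 3.1 | 3.277 | 0.084 |
| Massachusetts | 6.4 | 6.2 | 2.203 | 0.15 |
| Michigan | 10.4 | 11.6 | 0.065 | 0.801 |
| Minnesota | 11.0 | 7.8 | 0.524 | 0.475 |
| Mississippi | 2.0 | 0.0 | 2.979 | 0.96 |
| Missouri | 6.6 | 12.0 | 1.09 | 0.306 |
| Montana | NA | NA | NA | NA |
| Nebraska | 0.0 | 2.7 | 1 | 0.327 |
| Nevada | 5.1 | 2.4 | 0.673 | 0.42 |
| New Hampshire | NA | NA | NA | NA |
| New Jersey | 6.6 | 8.1 | 0.183 | 0.672 |
| New Mexico | NA | NA | NA | NA |
| New York | 3.7 | 7.9 | 4.711 | 0.039∗ |
| North Carolina | 9.2 | 6.4 | 3.524 | 0.066 |
| North Dakota | NA | NA | NA | NA |
| Ohio | 8.0 | 8.6 | 0.016 | 0.899 |
| Oklahoma | NA | NA | NA | NA |
| Oregon | 2.2 | 4.9 | 0.52 | 0.477 |
| Pennsylvania | 8.0 | 14.6 | 1.532 | 0.221 |
| Rhode Island | NA | NA | NA | NA |
| South Carolina | 3.9 | 1.1 | 1.776 | 0.194 |
| South Dakota | NA | NA | NA | NA |
| Tennessee | 6.9 | 6.8 | 0 | 0.992 |
| Texas | 18.0 | 22.3 | 1.406 | 0.246 |
| Utah | 1.7 | 1.7 | 0.001 | 0.971 |
| Vermont | NA | NA | NA | NA |
| Virginia | 5.7 | 4.3 | 0.298 | 0.59 |
| Washington | 6.6 | 5.7 | 0.069 | 0.795 |
| West Virginia | NA | NA | NA | NA |
| Wisconsin | 4.5 | 10.1 | 1.255 | 0.273 |
| Wyoming | NA | NA | NA | NA |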

State-by-state results of 2-way analysis of variance with repeated measures. ∗, significant at p < 0.05; NA, insufficient data for Google Trends report.

Figure 1

Delta Search Activity from September 20th, 2019 to March 20th, 2020: Change in anosmia-related search activity during the pandemic first wave (Delta, Normalized Units) for both national data, and the state of New York.

Cross-correlation analysis with case and death counts (Figures 2 and 3)

When data was examined for the United States as a whole, there was no time lag observed between anosmia-related search activity and either SARS-CoV-2 case numbers or deaths (Figure 2). However, when New York data was examined, there was an approximately 6 day lag noted between changes in search activity for anosmia-related terms and changes in both the number of detected cases and the number of deaths attributed to SARS-CoV-2 (Figure 3).
Figure 2

Cross Correlation Between National Search Activity and SARS-CoV-2 Cases and Deaths: Cross correlation between normalized anosmia-related search activity and SARS-CoV-2 cases and deaths for the entire United States.

Figure 3

Cross Correlation Between New York Search Activity and SARS-CoV-2 Cases and Deaths: Cross correlation between normalized anosmia-related search activity and SARS-CoV-2 cases and deaths for the state of New York shows approximately a 6-day time lag.


Discussion

Principal findings

Overall US search activity for anosmia-related search terms was no different during the pandemic first wave as compared to average baseline activity when the dataset only included searches prior to the discovery of anosmia as an early symptom of SARS-CoV-2. In addition, the changes in national search activity showed no significant time lag between alterations in search activity and national changes in cases and deaths due to SARS-CoV-2. On a state-by-state level, the only geographic location that showed significantly elevated anosmia-related searches that preceded changes in cases and deaths due to SARS-CoV-2 was the state of New York.

Previous work

The use of internet search data to monitor seasonal patterns in other health conditions [22, 23] and to monitor yearly influenza pandemics has been published previously [24]. Several recent studies have shown correlations between Google anosmia-related searches and SARS-CoV-2 cases/deaths [8, 9, 10, 11], suggesting that monitoring such search activity might provide a way to monitor the progression of a pandemic as it unfolds. However, there are significant issues with how previous work generated the initial list of search terms. All previous work generated an initial list through an uncertain a priori process that leaves open the possibility of data dredging [8, 9, 10, 11]; our study started with a single obvious term (“I can't smell”) and used the Google Keyword generator to generate alternatives based on the underlying raw search data. In addition, none of the previous studies used the Google Trends algorithm to limit search data to “Health related” inquiries [8, 9, 10, 11], which leaves the search data open to searches for a host of other reasons. Most previous investigations into this issue also used medical search terms (such as “anosmia” and “hypoxia”) that are more likely to measure increasing interest from health professionals about an evolving worldwide crisis, a phenomenon that seems unlikely to provide insight into the early development of symptoms in a lay population. Loss of smell was first noted as an early symptom on March 20th, 2020 [4,5] and was widely reported in the American media soon afterwards [25]. Most of the previous studies included data after this date, making the observed correlations between cases/deaths and search activity [8, 9, 10] likely due to increasing interest in the second highest search term [26] at the same time as SARS-CoV-2 spread across the US. One exception was Walker et al., who showed a significant correlation between anosmia-related searches and SARS-CoV-2 cases/deaths in a second analysis limiting their search data to prior to March 20th, 2020, but this second analysis was not performed on an American data set [11].
One way to demonstrate that changes in anosmia-related search activity preceded changes in SARS-CoV-2 cases/deaths is to perform a cross-correlation analysis demonstrating a significant time lag between searches and cases/deaths. Most of the previous studies [9, 10, 11] did not perform such an analysis. Given the natural history of SARS-CoV-2 related illness, one would expect increases in search activity for anosmia-related symptoms to precede increases in cases/deaths, and Higgins et al. showed strong correlations with a 12 day preceding time lag [8]. Unfortunately, this dataset contained search data well past the date when anosmia was widely reported in the news as a symptom, and the cross-correlation analysis was performed on the worldwide dataset as a whole [8]. When search data was limited to the period prior to March 20th, the only region of the US to show a strong correlation with a significant time lag was the state of New York (see Figure 3). The fact that the time lag with both cases and deaths was similar was an unexpected result; we expected a longer time lag for SARS-CoV-2 deaths due to the natural course of the illness. This unexpected result is likely explained by delays in cause-of-death reporting, a data collection issue that is well documented in the current pandemic and was also demonstrated in other work [8].

Clinical implications

The results of our study have several clinical implications with respect to the use of digital epidemiological techniques in the setting of a pandemic and with respect to the SARS-CoV-2 pandemic specifically. Specific to SARS-CoV-2, some investigators have postulated that COVID-19 might have been spreading widely throughout the community much earlier than was first thought [3]. Neither our national nor our state-by-state analysis of anosmia-related search activity supports this hypothesis (see Figure 1). Our results also show that digital epidemiology-related techniques might have some usefulness in monitoring pandemic spread, as shown by our strong correlations with a 6-day time lag between searches and cases/deaths in the state of New York in early 2020 (see Figure 3). The regional New York outbreak in early 2020 was enormous, with a positive test rate peaking at 65 percent during the week of March 22nd [27], suggesting that digital surveillance of search activity during such a high-interest event requires an extremely large signal to overcome the noise introduced by searches for other reasons. Contamination of search data by searches for other reasons makes search data difficult to interpret, as shown in other investigations, such as the decreasing usefulness of Google Trends monitoring for influenza once the study was widely reported in the media [28]. However, the results of our investigation suggest that monitoring search terms for disease symptoms might have some utility in following the course of a disease outbreak, but only if the number of health-related searches is quite large (such as in a large population environment like New York state), and/or contamination from non-health related searches is limited (such as in the setting of less media coverage).

Limitations and future research

Although our study is suggestive with respect to the strengths and weaknesses of using epidemiological approaches to pandemic monitoring, further research needs to be done to determine if these approaches can provide real-time monitoring of the time course of viral spread. In addition, Google Trends only provides normalized results of search data as opposed to absolute numbers of searches. Offsetting this, however, is the fact that keyword searches number in the billions [29], so any changes in these normalized results likely represent millions of additional searches. Also, although we limited our search data to those surrounding the topic of “Health”, this subsetting of search data relies on proprietary algorithms that are known only to Alphabet Inc and are unlikely to be fully accurate in a once-in-a-century event such as the COVID-19 crisis. Our study also only considered searches in the English language and from the United States; linguistic and national differences might conceivably have changed search behaviour and are a potential future avenue of research.

Conclusions

When we analyzed Google Trends search data for anosmia-related queries, there was no evidence for spread of SARS-CoV-2 in the United States earlier than previously reported [30]. Changes in search activity preceded increases in COVID-19 cases/deaths when our data was examined regionally, but only in the site of the largest outbreak, the state of New York.

Declarations

Author contribution statement

Kenneth M. Madden: Conceived and designed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper. Boris Feldman: Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data.

Funding statement

This work was supported by Allan M. McGavin Foundation.

Data availability statement

Data will be made available on request.

Declaration of interests statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.

References

1.  What's the healthiest day?: Circaseptan (weekly) rhythms in healthy considerations.

Authors:  John W Ayers; Benjamin M Althouse; Morgan Johnson; Mark Dredze; Joanna E Cohen
Journal:  Am J Prev Med       Date:  2014-04-18       Impact factor: 5.043

2.  Seasonal trends in restless legs symptomatology: evidence from Internet search query data.

Authors:  David G Ingram; David T Plante
Journal:  Sleep Med       Date:  2013-09-14       Impact factor: 3.492

3.  Infodemiology and infoveillance: framework for an emerging set of public health informatics methods to analyze search, communication and publication behavior on the Internet.

Authors:  Gunther Eysenbach
Journal:  J Med Internet Res       Date:  2009-03-27       Impact factor: 5.428

4.  Seasonality in seeking mental health information on Google.

Authors:  John W Ayers; Benjamin M Althouse; Jon-Patrick Allem; J Niels Rosenquist; Daniel E Ford
Journal:  Am J Prev Med       Date:  2013-05       Impact factor: 5.043

5.  Circaseptan (weekly) rhythms in smoking cessation considerations.

Authors:  John W Ayers; Benjamin M Althouse; Morgan Johnson; Joanna E Cohen
Journal:  JAMA Intern Med       Date:  2014-01       Impact factor: 21.873

6.  The use of google trends in health care research: a systematic review.

Authors:  Sudhakar V Nuti; Brian Wayda; Isuru Ranasinghe; Sisi Wang; Rachel P Dreyer; Serene I Chen; Karthik Murugiah
Journal:  PLoS One       Date:  2014-10-22       Impact factor: 3.240

7.  The Seasonal Periodicity of Healthy Contemplations About Exercise and Weight Loss: Ecological Correlational Study.

Authors:  Kenneth Michael Madden
Journal:  JMIR Public Health Surveill       Date:  2017-12-13

8.  Correlations of Online Search Engine Trends With Coronavirus Disease (COVID-19) Incidence: Infodemiology Study.

Authors:  Thomas S Higgins; Arthur W Wu; Dhruv Sharma; Elisa A Illing; Kolin Rubel; Jonathan Y Ting
Journal:  JMIR Public Health Surveill       Date:  2020-05-21

9.  Influenza forecasting with Google Flu Trends.

Authors:  Andrea Freyer Dugas; Mehdi Jalalpour; Yulia Gel; Scott Levin; Fred Torcaso; Takeru Igusa; Richard E Rothman
Journal:  PLoS One       Date:  2013-02-14       Impact factor: 3.240

10.  Use of Google Trends to investigate loss-of-smell-related searches during the COVID-19 outbreak.

Authors:  Abigail Walker; Claire Hopkins; Pavol Surda
Journal:  Int Forum Allergy Rhinol       Date:  2020-06-15       Impact factor: 5.426
