Máté Kapitány-Fövény (1, 2), Tamás Ferenci (3), Zita Sulyok (4), Josua Kegele (5), Hardy Richter (6), István Vályi-Nagy (7), Mihály Sulyok (6)
1. Faculty of Health Sciences, Semmelweis University, Budapest, Hungary.
2. Nyírő Gyula National Institute of Psychiatry and Addictions, Budapest, Hungary.
3. John von Neumann Faculty of Informatics, Physiological Controls Group, Óbuda University, Budapest, Hungary.
4. Institute of Tropical Medicine, Eberhard Karls University, Tübingen, Germany.
5. Department of Neurology and Epileptology, Neurology Clinics, Eberhard Karls University, Tübingen, Germany.
6. Department of Neurology and Stroke, Neurology Clinics, Eberhard Karls University, Tübingen, Germany.
7. South-Pest Central Hospital, National Institute of Hematology and Infectious Diseases, Budapest, Hungary.
Abstract
BACKGROUND: Epidemiological surveillance and forecasting based on online activity are receiving increasing attention. To date, Google search volumes have not been assessed for forecasting tick-borne diseases. We therefore analyzed whether forecasts of Lyme disease incidence based on traditional data improve when extended with Google Trends data.
METHODS: Data on the weekly incidence of Lyme disease in Germany from 16 June 2013 to 27 May 2018 were obtained from the database of the Robert Koch Institute. Internet search data were obtained from Google Trends for the search term "Borreliose" in Germany, using the "last 5 years" timespan category. The data were split into a training set (16 June 2013 to 11 June 2017) and a validation set (12 June 2017 to 27 May 2018). A seasonal autoregressive integrated moving average model, SARIMA (0,1,1) (0,1,1) [52], was selected to describe the time series of weekly Lyme incidence. We then added the Google Trends data as an external regressor and again identified SARIMA (0,1,1) (0,1,1) [52] as the optimal model. We made predictions for the validation interval with both models and compared them with the values of the validation data set.
RESULTS: The two models produced similar forecasts for the validation timespan. Comparing the forecasted values with the reported ones gave a root mean squared error (RMSE) of 0.3763 and a mean absolute percentage error (MAPE) of 8.233 for the model without Google searches, and an RMSE of 0.3732 and a MAPE of 8.17495 for the model expanded with Google Trends values. The difference in predictive performance was not statistically significant (Diebold-Mariano test, p-value = 0.4152).
CONCLUSION: Google Trends data are a good correlate of the reported incidence of Lyme disease in Germany, but they failed to significantly improve forecasting accuracy in models based on traditional data.
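The comparison described in the abstract (RMSE, MAPE, and a Diebold-Mariano test on two sets of forecasts) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code. The function names and the toy numbers are hypothetical, and the Diebold-Mariano variant shown here is a simplified assumption: squared-error loss, one-step horizon, no autocovariance correction, and a normal approximation for the p-value. The abstract does not state which loss or horizon the authors used.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and forecasted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def diebold_mariano(actual, pred_a, pred_b):
    """Simplified two-sided Diebold-Mariano test (assumption: squared-error
    loss, horizon 1, normal approximation). Returns (statistic, p_value).
    Raises ZeroDivisionError if all loss differentials are identical."""
    # Loss differential: squared error of model A minus squared error of model B.
    d = [(a - pa) ** 2 - (a - pb) ** 2 for a, pa, pb in zip(actual, pred_a, pred_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n  # lag-0 autocovariance only
    stat = mean_d / math.sqrt(var_d / n)
    # Two-sided p-value under N(0, 1): 2 * (1 - Phi(|stat|)) = erfc(|stat| / sqrt(2)).
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value


# Hypothetical toy data, NOT values from the study.
observed = [2.0, 4.0, 5.0, 8.0]
forecast_plain = [2.5, 3.5, 5.5, 7.0]    # stands in for the model without search data
forecast_trends = [2.0, 4.5, 4.5, 8.5]   # stands in for the model with the extra regressor

print("RMSE:", rmse(observed, forecast_plain), rmse(observed, forecast_trends))
print("MAPE:", mape(observed, forecast_plain), mape(observed, forecast_trends))
print("DM test:", diebold_mariano(observed, forecast_plain, forecast_trends))
```

A positive statistic in this sketch means the first forecast has the larger average squared error; a large p-value, as reported in the abstract, means the accuracy difference cannot be distinguished from chance.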