Good times bad times: Automated forecasting of seasonal cryptosporidiosis in Ontario using machine learning.

Olaf Berke, Lise Trotz-Williams, Simon de Montigny.

Abstract

BACKGROUND: The rise of big data and related predictive modelling based on machine learning algorithms over the last two decades has provided new opportunities for disease surveillance and public health preparedness. Big data come with the promise of faster generation of, and access to, more precise information, potentially facilitating predictive precision in public health ("precision public health"). As an example, we considered forecasting the future course of monthly cryptosporidiosis incidence in Ontario.
METHODS: The traditional statistical approach to forecasting is the seasonal autoregressive integrated moving-average (SARIMA) model. We applied SARIMA and an artificial neural network (ANN) approach, specifically a feed-forward neural network, to predict monthly cryptosporidiosis incidence in Ontario in 2017 using 2005-2016 data as a training set. Both forecasting approaches are automated to make them relevant in a disease surveillance context. We compared the resulting forecasts using the root mean squared error (RMSE) and mean absolute error (MAE) as measures of predictive accuracy.
RESULTS: Cryptosporidiosis is a seasonal disease, which peaks in Ontario in late summer. In this study, the SARIMA model and ANN forecasting approaches captured the seasonal pattern of cryptosporidiosis well. Contrary to similar studies reported in the literature, the ANN forecasts of cryptosporidiosis were slightly less accurate than the SARIMA model forecasts.
CONCLUSION: The ANN and SARIMA approaches are suitable for automated forecasting of public health time series data from surveillance systems. Future studies should employ additional algorithms (e.g. random forests) and assess accuracy by using alternative diseases for case studies and conducting rigorous simulation studies. Differences between the forecasts from the machine learning algorithm (the ANN) and the statistical learning model (the SARIMA model) should be considered with respect to the philosophical differences between the two approaches.
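The two accuracy measures named in METHODS, RMSE and MAE, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the monthly values below are hypothetical placeholders chosen only to show the arithmetic, and they are not data or results from the study.

```python
import math


def _check(actual, forecast):
    # Both series must cover the same, non-empty set of months.
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal in length")


def rmse(actual, forecast):
    """Root mean squared error: square root of the mean squared forecast error."""
    _check(actual, forecast)
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, forecast):
    """Mean absolute error: mean of the absolute forecast errors."""
    _check(actual, forecast)
    absolute = [abs(a - f) for a, f in zip(actual, forecast)]
    return sum(absolute) / len(absolute)


# Hypothetical 12-month hold-out year (illustrative numbers only, NOT study data).
observed = [2.0, 1.0, 3.0, 2.0, 4.0, 5.0, 8.0, 12.0, 9.0, 5.0, 3.0, 2.0]
model_a = [2.5, 1.5, 2.5, 2.0, 3.5, 5.5, 7.0, 11.0, 9.5, 5.0, 3.5, 2.0]
model_b = [3.0, 2.0, 2.0, 3.0, 3.0, 6.0, 6.0, 10.0, 10.0, 6.0, 4.0, 3.0]

for name, forecast in (("model_a", model_a), ("model_b", model_b)):
    print(f"{name}: RMSE={rmse(observed, forecast):.3f}  MAE={mae(observed, forecast):.3f}")
```

Lower values of either measure indicate forecasts closer to the observed counts. Because RMSE squares the errors before averaging, it penalizes large misses (for example, at a seasonal peak) more heavily than MAE does, which is one reason to report both.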

Keywords:  SARIMA; artificial neural network; cryptosporidiosis; disease surveillance; forecasting; machine learning; seasonal time series; statistical learning

Year:  2020        PMID: 32673377      PMCID: PMC7343056          DOI: 10.14745/ccdr.v46i06a07

Source DB:  PubMed          Journal:  Can Commun Dis Rep        ISSN: 1188-4169


