Marcelo Benedeti Palermo, Lucas Micol Policarpo, Cristiano André da Costa, Rodrigo da Rosa Righi.
Abstract
This systematic review studies and classifies machine learning models that predict how a pandemic evolves within affected regions or countries. Its advantage is that it allows health authorities to decide which prediction model fits best given a region's criticality, and to optimize hospitals' approaches to preparing for and anticipating patient care. We searched ACM Digital Library, Biomed Central, BioRxiv+MedRxiv, BMJ, Computers and Applied Sciences, IEEEXplore, JMIR Medical Informatics, Medline Daily Updates, Nature, Oxford Academic, PubMed, Sage Online, ScienceDirect, Scopus, SpringerLink, Web of Science, and Wiley Online Library between 1 January 2020 and 31 July 2022. We divided the interventions into similarities between cumulative COVID-19 real cases and machine learning prediction models' ability to track pandemic trends. We included 45 studies, rated from low to high risk of bias. The standardized mean difference (SMD) between the two groups was 0.18 (95% CI [0.01, 0.35]), with I² = 0 and p = 0.04. We built a taxonomic analysis of the included studies and determined two domains: pandemic trend prediction models and geolocation tracking models. We performed the meta-analysis and data synthesis and found low publication bias due to missing results. The level of certainty varied from very low to high. By assessing the 45 studies for risk of bias, level of certainty, and summary of findings, and by analyzing them statistically via forest and funnel plots, we could determine satisfactory statistical significance and homogeneity across the included studies, which supports simulating the progress of pandemics and helps healthcare authorities take preventive decisions.
Keywords: Machine learning models; Meta-analysis; Pandemic prediction; Systematic review
Year: 2022 PMID: 36249862 PMCID: PMC9553296 DOI: 10.1007/s13721-022-00384-0
Source DB: PubMed Journal: Netw Model Anal Health Inform Bioinform ISSN: 2192-6670
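As a rough consistency check on the pooled estimate reported in the abstract, the standard error and two-sided p value can be recovered from the SMD and its 95% confidence interval. The sketch below assumes a normal approximation and a symmetric interval; the function name `p_from_ci` is illustrative and does not come from the study.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover SE, z, and a two-sided p value from a symmetric 95% CI.

    Assumes the estimate is approximately normally distributed; this is
    an illustrative check, not the authors' analysis pipeline.
    """
    se = (upper - lower) / (2 * z_crit)   # half-width divided by the critical value
    z = estimate / se                     # Wald z statistic against a null of 0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return se, z, p


# Values taken from the abstract: SMD = 0.18, 95% CI [0.01, 0.35]
se, z, p = p_from_ci(0.18, 0.01, 0.35)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the implied p value rounds to 0.04, in line with the reported figure.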
All databases covered in this systematic review
| Database | Coverage |
|---|---|
| ACM Digital Library | January 2020 to July 2022 |
| BMJ Journals | January 2020 to July 2022 |
| BioRxiv + MedRxiv | January 2020 to July 2022 |
| Computers and Applied Sciences (EBSCOhost) | January 2020 to July 2022 |
| IEEE Xplore | January 2020 to July 2022 |
| ScienceDirect (Elsevier) | January 2020 to July 2022 |
| Scopus (Elsevier) | January 2020 to July 2022 |
| JMIR Medical Informatics (Journal of Medical Internet Research) | January 2020 to July 2022 |
| MEDLINE(R) Daily Update (Ovid) | January 2020 to July 2022 |
| Oxford Academic (all journals) | January 2020 to July 2022 |
| NCBI PubMed PMC | January 2020 to July 2022 |
| Sage Journals Online | January 2020 to July 2022 |
| Biomed Central Journals (Springer Nature) | January 2020 to July 2022 |
| Nature (Springer Nature) | January 2020 to July 2022 |
| Springer Link (Springer Nature) | January 2020 to July 2022 |
| Web of Science | January 2020 to July 2022 |
| Wiley Online Library | January 2020 to July 2022 |
Base search string used during database record retrieval
| Search String |
|---|
| (COVID-19 OR SARS-Cov-2) AND (pandemic scenario) AND ((epidemiological model) OR (predictive model)) AND (machine learning) |
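The base search string is a conjunction of four term groups, two of which list alternatives. A minimal sketch of how such a boolean query could be assembled is shown here; the helper names `any_of` and `all_of` are hypothetical, and individual databases may need their own syntax adaptations.

```python
def any_of(*terms):
    """Join alternatives with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*groups):
    """Join already-parenthesized groups with AND."""
    return " AND ".join(groups)


# Term groups copied from the base search string in the table above.
query = all_of(
    any_of("COVID-19", "SARS-Cov-2"),
    "(pandemic scenario)",
    any_of("(epidemiological model)", "(predictive model)"),
    "(machine learning)",
)
print(query)
```

Keeping the groups as separate arguments makes it easier to add or drop a synonym without breaking the parenthesization.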
Fig. 1 Steps performed to execute the data collection process
Fig. 2 Taxonomic analysis and definition of (a) Domain 1: COVID-19 pandemics’ trending prediction and (b) Domain 2: COVID-19 geolocation tracking and prediction
All databases used for this systematic review
| Database | Records retrieved |
|---|---|
| The ACM Guide to Computing Literature | 638 |
| Biomed Central Journals | 61 |
| BioRxiv + MedRxiv | 93 |
| BMJ Journals | 25 |
| Computers and Applied Sciences (EBSCOhost) | 965 |
| IEEE Xplore | 16 |
| JMIR Medical Informatics | 168 |
| Medline Daily Updates (Ovid) | 165 |
| Nature | 114 |
| Oxford Academic | 119 |
| PubMed PMC | 1795 |
| Sage Journals Online | 370 |
| ScienceDirect (Elsevier) | 1289 |
| Scopus (Elsevier) | 21 |
| SpringerLink | 458 |
| Web of Science | 17 |
| Wiley Online Library | 717 |
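The per-database counts can be tallied to see how many records enter the screening stage and which sources dominate. The sketch below simply sums the table; the total is a gross count before any deduplication or screening and is not a figure stated in this record.

```python
# Records retrieved per database, copied from the table above.
records = {
    "The ACM Guide to Computing Literature": 638,
    "Biomed Central Journals": 61,
    "BioRxiv + MedRxiv": 93,
    "BMJ Journals": 25,
    "Computers and Applied Sciences (EBSCOhost)": 965,
    "IEEE Xplore": 16,
    "JMIR Medical Informatics": 168,
    "Medline Daily Updates (Ovid)": 165,
    "Nature": 114,
    "Oxford Academic": 119,
    "PubMed PMC": 1795,
    "Sage Journals Online": 370,
    "ScienceDirect (Elsevier)": 1289,
    "Scopus (Elsevier)": 21,
    "SpringerLink": 458,
    "Web of Science": 17,
    "Wiley Online Library": 717,
}

total = sum(records.values())

# Rank sources by their share of all retrieved records.
ranked = sorted(records.items(), key=lambda item: item[1], reverse=True)
for name, count in ranked[:3]:
    print(f"{name}: {count} ({count / total:.1%})")
print(f"Total retrieved: {total}")
```

The three largest sources (PubMed PMC, ScienceDirect, and Computers and Applied Sciences) account for more than half of the 7031 retrieved records.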
Fig. 3 PRISMA 2020 workflow
Fig. 4 Taxonomy of the studies included in this systematic review
Answers to research questions
| Author | Country | Proposed methodology | Questions |
|---|---|---|---|
| Abbasimehr et al. | Iran, USA, Italy | Use time-series augmentation and an ensemble bagging ML model to predict the incidence of COVID-19. | GQ1, GQ2, SQ1, SQ3 |
| Abdallah et al. | Tunisia | Propose a new strategy based on reinforcement learning (RL) to study the dynamics of COVID-19 evolution. | GQ1, GQ2, SQ1, SQ3 |
| Ahouz and Golabpour | Iran | Predict the incidence of COVID-19 within two weeks to better manage the disease using the Least-Square Boosting Classification algorithm. | GQ1, SQ1, SQ4 |
| Alali et al. | Saudi Arabia | Present an efficient machine learning method to forecast future trends of COVID-19 spread through Bayesian optimization. | GQ1, GQ2, SQ1, SQ3 |
| Alamrouni et al. | Saudi Arabia, Turkey, Nigeria, Cyprus | Predict the cumulative COVID-19 cases through ARIMA and ensemble ML models. | GQ1, GQ2, SQ1, SQ3 |
| Alassafi et al. | Saudi Arabia | Predict the cumulative COVID-19 positive cases by comparing the efficiency of RNN and LSTM neural networks. | GQ1, GQ2, SQ3 |
| Alanazi et al. | Saudi Arabia, Canada | A model for predicting COVID-19 using the SIR model and ML for innovative health care. | GQ1, SQ3, SQ4 |
| Amaral et al. | Brazil | A data-driven SIR model whose parameters are fully calibrated by temporal functions learned from individual regressors and trained on different data sources. | GQ1, SQ1 |
| Barraza et al. | Argentina | Model the epidemic spread as births in a population, where every birth corresponds to a new infection case. The paper introduces a new Markovian stochastic model that can be described as a non-homogeneous Pure Birth process. | GQ1, GQ2 |
| Basu and Campbell | USA | A long short-term memory (LSTM) based prediction model that can be trained on more than four months of cumulative COVID-19 cases and deaths. | GQ1, GQ2, SQ1 |
| Bedi et al. | India | A modified SEIRD model coupled with a deep learning-based long short-term memory (LSTM) model to predict COVID-19 case trends. | GQ1, SQ1, SQ3, SQ4 |
| Bi et al. | USA | Present a GRU-based hybrid model to predict the spread of COVID-19 across the USA. | GQ1, GQ2, SQ1, SQ3 |
| Bushira and Ongala | Namibia | Using geospatial technologies, identify and map COVID-19 risk zones and model future COVID-19 responses in Namibia. | GQ1, GQ2, SQ1, SQ4 |
| Casini and Roccetti | Italy | Present three computational models of increasing complexity (linear, negative binomial regression, and cognitive) to identify the one that best captures the relationship between Italian tourist flows during the summer of 2020 and the resurgence of COVID-19 cases across the country. | GQ1, SQ1 |
| Chandra et al. | Australia, India | Employ LSTM models to forecast the spread of COVID-19 infections among selected states in India. | GQ1, GQ2, SQ1, SQ3 |
| Chyon et al. | Bangladesh | Predict the spread of COVID-19 infections by employing ARIMA and time-series machine learning. | GQ1, SQ1, SQ3 |
| de Araújo Morais and da Silva Gomes | Brazil | Employ ARIMA with neural network models to forecast the spread of COVID-19 infections. | GQ1, SQ1, SQ3 |
| Doornik et al. | UK | Build a short-term forecasting ML model for the Coronavirus pandemic. | GQ1, GQ2, SQ3 |
| Fang et al. | China | Propose a data-driven XGBoost model for the prediction of COVID-19. | GQ1, GQ2, SQ1, SQ3 |
| Garetto et al. | Italy | A time-modulated Hawkes process to model the spread of COVID-19, compared with traditional compartmental models. | GQ1, GQ2 |
| Giacopelli | Italy | A full-scale individual-based model of the COVID-19 outbreak to test various scenarios about the pandemic and achieve novel performance metrics. | GQ1, SQ1, SQ4 |
| Haghighat | Iran | A combined multilayer perceptron (MLP) neural network and Markov chain (MC) model to predict two indicators, the numbers of discharged and death cases, according to their relationship with the number of hospitalized patients. | GQ1, GQ2, SQ1, SQ3 |
| Kolozsvári et al. | Hungary | Use the official epidemiological data to forecast the epidemic curves (daily new cases) of COVID-19 using Artificial Intelligence (AI)-based Recurrent Neural Networks (RNNs), then compare and validate the predicted models with the observed data. | GQ1, GQ2, SQ1, SQ3 |
| Kou et al. | China | A multi-scale agent-based model (MSABM) for disease spread between cities and in a town. The model investigates the infectious disease propagation between cities and within a city using the knowledge from person-to-person transmission. | GQ1, GQ2, SQ1 |
| Kuo and Fu | USA | Develop a COVID-19 case prediction model based on county-level data with ML techniques, using the next-1-day (N1D), 4-day (N4D), and 7-day (N7D) averages of daily cases and cumulative cases as the respective response variables. | GQ1, GQ2, SQ1, SQ4 |
| Majhi et al. | USA, India | Build predictive models that can predict the number of positive cases with higher accuracy. Regression-based, decision tree-based, and random forest-based models were built on the data from China to validate India’s sample. | GQ1, SQ3 |
| Malakar | India | Model the COVID-19 vulnerability using an integrated fuzzy multi-criteria decision-making (MCDM) approach, namely fuzzy-analytical hierarchy process (AHP) and fuzzy-technique for order preference by similarity to ideal solution (TOPSIS) for West Bengal, India, through geographic information system (GIS). | GQ1, GQ2, SQ1, SQ4 |
| Mallick et al. | India | Predict the COVID-19 trajectories of the infected population of Indian states using the SEIR mathematical model and LASSO ML. | GQ1, SQ1, SQ3 |
| Marzouk et al. | Egypt | Apply artificial intelligence-based models to predict the prevalence of the COVID-19 outbreak in Egypt. These models are long short-term memory networks (LSTM), convolutional neural networks, and multilayer perceptron neural networks. | GQ1, GQ2, SQ1, SQ3 |
| Senthilkumar et al. | India, Australia, Pakistan, UK, USA, United Arab Emirates | A multimodel ML technique called ensemble learning, autoregressive, and moving regressive (EAMA). The multimodel forecasts COVID-19 parameters in the long term within India and globally. | GQ1, SQ1 |
| Namasudra et al. | India | Present a novel Nonlinear Autoregressive (NAR) Neural Network Time-Series (NAR-NNTS) model to forecast COVID-19 cases. The authors train the NAR-NNTS model with the Scaled Conjugate Gradient (SCG), Levenberg–Marquardt (LM), and Bayesian Regularization (BR) training algorithms. | GQ1, GQ2, SQ1, SQ3 |
| Nobi et al. | Bangladesh, South Korea | Apply principal component analysis (PCA) to the correlations between changes in the cumulative time series of COVID-19 deaths and confirmed cases across different countries. | GQ1, GQ2, SQ1, SQ3 |
| Ohi et al. | Bangladesh, Saudi Arabia | Demonstrate what actions an agent (trained using reinforcement learning) may take in different possible pandemic scenarios depending on the spread of disease and economic factors. | GQ1, SQ1 |
| Ozik et al. | USA, France | Present CityCOVID, a detailed agent-based model that enables the creation of efficient, urban-scale MPI-distributed agent-based models (ABMs). | GQ1, GQ2, SQ1, SQ3, SQ4 |
| Pan et al. | Singapore, Taiwan, USA, Poland | A spatiotemporal analysis framework that combines an ensemble model (random forest regression) and a multi-objective optimization algorithm (NSGA-II). | GQ1, GQ2, SQ1 |
| Pang et al. | China | Collect five indicators of COVID-19 data and the associated losses in Hubei Province. By applying PCA, the authors established the disaster loss index model of COVID-19 in Hubei Province. | GQ1, SQ1 |
| Pourghasemi et al. | Iran, USA | Predict mortality trends using regression modeling, spatial modeling, and risk mapping. The authors performed change detection using the random forest (RF) ML technique and validated the modeled risk map. | GQ1, GQ2, SQ1, SQ3, SQ4 |
| Raheja et al. | India, UK | Present a novel ML diffusion prediction model for the number of Coronavirus cases in four countries: India, France, China, and Nepal. | GQ1, GQ2, SQ1 |
| Rashed and Hirata | Japan, Egypt | Analyze daily positive cases (DPC) data using an ML model to understand the effect of new viral variants on morbidity rates. | GQ2, SQ2 |
| Sah et al. | Turkey, India, Saudi Arabia | Forecast the COVID-19 pandemic in India by combining ARIMA and LSTM-GRU models. | GQ1, SQ1, SQ3 |
| Shoaib et al. | Taiwan, Pakistan, Thailand | Use NAR-RBFs neural network paradigms, designed to construct the SITR epidemic differential equation (DE) model, to ascertain the different features of the spread of COVID-19. | GQ1, GQ2, SQ1 |
| Swaraj et al. | India, Brazil | Present an ensemble model that integrates an autoregressive integrated moving average (ARIMA) model and a nonlinear autoregressive neural network (NAR) to predict future cases of COVID-19. | GQ1, GQ2, SQ1 |
| Ullah et al. | USA, Canada | Apply nonlinear modal regression and a Monte Carlo algorithm to predict COVID-19 confirmed cases. | GQ2, SQ1, SQ3 |
| Vasconcelos et al. | Brazil | A generalized logistic growth model with time-dependent parameters to describe the fatality curves of the COVID-19 disease. Such an approach applies to several countries that exhibit multiple waves of infections. | GQ1, GQ2, SQ1 |
| Zivkovic et al. | Serbia, India, Vietnam, Turkey | Improve the current time-series prediction (forecasting) algorithms based on hybrid ML and nature-inspired algorithms to help predict future COVID-19 cases. | GQ1, GQ2, SQ1 |
Fig. 5 Risk of bias from RoB 2.0 tool for each included study
Fig. 6 Risk-of-bias summary from RoB 2.0 tool
Fig. 7 Forest plot for effect sizes. The figure displays the meta-analysis summary statistics [standardized mean difference (SMD), standard deviation (SD), sample size, and weight proportion] for the comparison between cases predicted by the ML model and actual COVID-19 cases. It also displays the mean difference and its 95% confidence interval (CI) for the continuous outcome
Fig. 8 Funnel plot based on non-zero studies from the forest plot (Fig. 7). The x-axis shows the SMD, while the y-axis shows the standard error (SE)
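The quantities shown in the forest and funnel plots (per-study SMD, standard error, weight, pooled estimate, and heterogeneity) can be illustrated with a small inverse-variance pooling sketch. It assumes a fixed-effect model, which this record does not confirm the authors used, and the three input studies are hypothetical values chosen only for illustration.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2.

    effects: per-study standardized mean differences
    std_errors: per-study standard errors (the funnel plot's y-axis)
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    w_sum = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)

    # Cochran's Q measures dispersion of study effects around the pooled value.
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    # I^2 is the share of variation beyond chance, truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    shares = [w / w_sum for w in weights]  # weight proportions per study
    return pooled, ci, q, i2, shares


# Hypothetical studies, NOT data from the review.
effects = [0.10, 0.25, 0.20]
std_errors = [0.15, 0.20, 0.10]

pooled, ci, q, i2, shares = pool_fixed_effect(effects, std_errors)
print(f"Pooled SMD = {pooled:.2f}, 95% CI [{ci[0]:.2f}, {ci[1]:.2f}]")
print(f"Q = {q:.2f}, I^2 = {i2:.0f}%")
```

When Q falls below its degrees of freedom, as it does for these illustrative inputs, I² is truncated to zero, which is how an I² of 0 can arise even when study estimates are not identical.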
Fig. 9 Question for domain 1 GRADE assessment: ML prediction compared to real cumulative cases for tracking future COVID-19 pandemic trends. Explanations: (a) the Mohan et al. 2021 study should compare real and predicted cases; (b) the Raheja et al. 2021 study presents an abnormal standard error, which increases the risk of publication bias; (c) the Marzouk et al. 2021 study should compare actual and predicted cases; (d) Basu et al. 2020 appears as an outlier in the funnel plot, indicating possible imprecision; (e) Vasconcelos et al. 2021 report results, but with low-resolution graphs, which may cause imprecision; (f) the Pang et al. 2021 study should report predicted vs. real cases, as this is the goal of this study; (g) Shoaib et al. 2021 report no outcomes; (h) Haghighat 2021 reports no outcomes; (i) the Pan et al. 2020 graphs of predicted vs. real cases have low resolution, which affects reading exact values from the study; (j) Ohi et al. 2020 and Kou et al. 2021 do not report results on predicted vs. real cumulative cases; (k, l) Sah 2022 and Bi 2022 should report predicted vs. real cases, as this is the goal of this study; there is only continuity from actual to predicted cases (Senthilkumar et al. 2021; Raheja et al. 2021; Marzouk et al. 2021; Basu and Campbell 2020; Vasconcelos et al. 2021; Pang et al. 2021; Shoaib et al. 2021; Haghighat 2021; Pan et al. 2021; Ohi et al. 2020; Kou et al. 2021; Sah et al. 2022; Bi et al. 2022)
Fig. 10 Question for domain 2 GRADE assessment: ML prediction compared to real geographic location for tracking critical geolocations of the COVID-19 pandemic. Explanations: (a) Ozik et al. 2021 have no outcomes for cumulative COVID-19 case prediction, and Giacopelli 2021 reports very few observations; (b) Malakar et al. 2021 report imprecise outcomes for cumulative COVID-19 cases during modeling; (c) Bushira et al. 2021 have no outcomes for cumulative COVID-19 case prediction; (d) Kuo et al. 2021 have no outcomes for cumulative COVID-19 case prediction (Ozik et al. 2021; Giacopelli 2021; Malakar 2021; Bushira and Ongala 2021; Kuo and Fu 2021)