E Camargo, J Aguilar, Y Quintero, F Rivas, D Ardila.
Abstract
Several works have proposed predictive models of the SEIRD (Susceptible, Exposed, Infected, Recovered, and Dead) variables to characterize the COVID-19 pandemic. One challenge for these models is tracking the dynamics of the disease closely enough to make more precise predictions. In this paper, we propose an approach based on incremental learning to build predictive models of the SEIRD variables for the COVID-19 pandemic. Our incremental learning approach is a dynamic ensemble method based on a bagging scheme that allows new models to be added or existing incremental models to be updated. The proposed architecture has two components: the first analyses the interdependencies among the SEIRD variables, and the second is an incremental learning model that builds and updates the predictive models. The paper evaluates the quality of the predictive models produced by our incremental learning approach using COVID-19 data from Colombia and reports the resulting predictions of the SEIRD variables. These results are compared with an incremental learning approach based on random forests.
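The abstract describes a bagging-based dynamic ensemble whose members can be updated or extended as new data arrive. The following is only a minimal sketch of that general idea, not the authors' implementation: the class names `RunningLinearModel` and `IncrementalBaggingEnsemble`, the one-feature least-squares base learner, the `max_models` cap, and the toy data are all assumptions made for illustration.

```python
import random


class RunningLinearModel:
    """Hypothetical base learner: one-feature least squares kept as running
    sums, so new samples can be absorbed without revisiting old ones."""

    def __init__(self):
        self.n = self.sx = self.sy = self.sxx = self.sxy = 0.0

    def partial_fit(self, xs, ys):
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def predict(self, x):
        denom = self.n * self.sxx - self.sx ** 2
        if abs(denom) < 1e-12:
            # All x values seen so far are identical: fall back to the mean.
            return self.sy / self.n
        w = (self.n * self.sxy - self.sx * self.sy) / denom
        b = (self.sy - w * self.sx) / self.n
        return w * x + b


class IncrementalBaggingEnsemble:
    """Assumed scheme: each new batch updates every existing member on its
    own bootstrap resample, then adds one new member while under the cap."""

    def __init__(self, max_models=5, seed=0):
        self.models = []
        self.max_models = max_models
        self.rng = random.Random(seed)

    def _bootstrap(self, xs, ys):
        # Sample len(xs) indices with replacement (the bagging step).
        idx = [self.rng.randrange(len(xs)) for _ in xs]
        return [xs[i] for i in idx], [ys[i] for i in idx]

    def update(self, xs, ys):
        for model in self.models:
            model.partial_fit(*self._bootstrap(xs, ys))
        if len(self.models) < self.max_models:
            new_model = RunningLinearModel()
            new_model.partial_fit(*self._bootstrap(xs, ys))
            self.models.append(new_model)

    def predict(self, x):
        if not self.models:
            raise ValueError("ensemble has no models yet")
        return sum(m.predict(x) for m in self.models) / len(self.models)


# Toy, made-up data stream (y = 2x + 1) delivered in three batches.
ensemble = IncrementalBaggingEnsemble(max_models=5, seed=0)
for start in (0, 8, 16):
    xs = [float(t) for t in range(start, start + 8)]
    ys = [2.0 * x + 1.0 for x in xs]
    ensemble.update(xs, ys)
print(len(ensemble.models), ensemble.predict(30.0))
```

Averaging the members' predictions is the usual bagging aggregation for regression. How the paper decides between adding and updating a model is not stated in this record, so the fixed rule above is only a placeholder.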
Keywords: COVID-19; Machine learning; Prediction model
Year: 2022 PMID: 35499039 PMCID: PMC9035346 DOI: 10.1007/s12553-022-00668-5
Source DB: PubMed Journal: Health Technol (Berl) ISSN: 2190-7196
Fig. 1 Incremental Learning Architecture
Fig. 2 General procedure for selection/extraction of the best features
Fig. 3 Our incremental learning approach
Variables statistics
| Average | 16.097 | 3.929 | 3.284 | 135 |
| min | 258 | 5 | 5 | 0 |
| max | 41.434 | 13.056 | 12.295 | 438 |
| sd | 12.328 | 4.010 | 3.670 | 132 |
Average MAPE predictions for August 2020
| Susceptible | 0.025 | 0.038 | 0.097 | 0.009 |
| Exposed | 5.484 | 7.775 | 7.703 | 7.931 |
| Infected | 6.816 | 6.529 | 7.162 | 8.738 |
| Recovered | 6.306 | 6.891 | 7.714 | 7.956 |
| Dead | 11.651 | 17.566 | 18.348 | 16.565 |
Average MAPE predictions for September 2020
| Susceptible | 0.763 | 1.046 | 0.726 | 1.020 |
| Exposed | 6.932 | 6.766 | 6.546 | 7.077 |
| Infected | 6.613 | 7.700 | 7.478 | 9.128 |
| Recovered | 8.005 | 8.146 | 8.554 | 7.585 |
| Dead | 13.619 | 12.925 | 14.210 | 13.469 |
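For reference, the mean absolute percentage error (MAPE) reported in the two tables above is commonly computed as shown in this short sketch. The function name `mape`, the choice to return a percentage, and the decision to skip zero-valued observations are assumptions. This record does not state which convention the authors used.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a percentage.

    Assumption: observations with a true value of zero are skipped,
    because the relative error is undefined there.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Made-up numbers, not data from the paper: both forecasts are off by 10%.
example_error = mape([100.0, 200.0], [110.0, 180.0])
print(example_error)
```

Because MAPE divides by the observed value, series that stay near zero early in an outbreak can produce very large errors, which is worth keeping in mind when comparing rows.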
Target and Predictor variables for machine learning models
| Susceptible | exposed(t-5), exposed(t-6), infected(t-5), recovered(t-5), recovered(t-7) |
| Exposed | susceptible(t-6), infected(t-7), dead(t-5), dead(t-7) |
| Infected | exposed(t-6), exposed(t-7), dead(t-5), dead(t-6) |
| Recovered | susceptible(t-6), susceptible(t-7), exposed(t-5), exposed(t-6), exposed(t-7), infected(t-5), infected(t-7), dead(t-5) |
| Dead | susceptible(t-5), exposed(t-5), exposed(t-6), infected(t-5), recovered(t-5), recovered(t-7) |
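The predictor table above pairs each target with lagged values of the other variables. The following is a minimal sketch of how such lagged training rows could be assembled. The helper `build_lagged_rows` and the toy series are hypothetical. The lag pairs mirror the Infected row, with every death-count lag mapped to a single assumed `dead` series.

```python
def build_lagged_rows(series, target, lags):
    """Build (X, y) where each row of X holds the lagged predictor values
    for one day t, and y holds the target value on that same day.

    series: dict mapping variable name -> list of daily values
    lags:   list of (variable name, lag in days) pairs
    """
    max_lag = max(lag for _, lag in lags)
    X, y = [], []
    for t in range(max_lag, len(series[target])):
        X.append([series[name][t - lag] for name, lag in lags])
        y.append(series[target][t])
    return X, y


# Toy, made-up series (10 days), only to show the indexing.
series = {
    "exposed": list(range(10)),
    "dead": [10 * i for i in range(10)],
    "infected": [100 + i for i in range(10)],
}
infected_lags = [("exposed", 6), ("exposed", 7), ("dead", 5), ("dead", 6)]
X, y = build_lagged_rows(series, "infected", infected_lags)
print(X[0], y[0])
```

The first usable day is t = 7, the largest lag, so a series of 10 days yields only three training rows. Any model trained this way can forecast at most five days ahead, the smallest lag in the table.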
Quality of the models used by the incremental learning approach to predict the SEIRD variables for Colombia
| S | Gradient boosting | 0.990 | 0.001 | 71.554 | 0.999 | 3.1e-05 | 70.271 | 0.993 | 0.001 | 70.770 |
| S | Random forest | 0.922 | 0.003 | 71.581 | 0.822 | 0.009 | 71.415 | 0.892 | 0.008 | 72.180 |
| S | Linear regressor | 0.998 | 0.001 | 76.664 | 0.996 | 0.001 | 76.761 | 0.911 | 0.002 | 75.890 |
| S | LM-BFGS | 0.933 | 0.001 | 77.105 | 0.994 | 0.001 | 77.054 | 0.922 | 0.001 | 78.200 |
| E | Gradient boosting | 0.788 | 0.006 | 77.424 | 0.481 | 0.015 | 70.271 | 0.651 | 0.021 | 76.770 |
| E | Random forest | 0.842 | 0.006 | 71.786 | 0.987 | 0.001 | 71.502 | 0.887 | 0.07 | 70.118 |
| E | Linear regressor | 0.788 | 0.006 | 76.064 | 0.899 | 0.003 | 76.681 | 0.809 | 0.003 | 75.889 |
| E | LM-BFGS | 0.950 | 0.005 | 77.905 | 0.846 | 0.006 | 77.054 | 0.801 | 0.011 | 78.020 |
| I | Gradient boosting | 0.872 | 0.004 | 74.564 | 0.990 | 0.001 | 70.071 | 0.810 | 0.002 | 70.077 |
| I | Random forest | 0.858 | 0.005 | 75.681 | 0.443 | 0.010 | 71.102 | 0.343 | 0.021 | 71.618 |
| I | Linear regressor | 0.860 | 0.005 | 76.664 | 0.824 | 0.004 | 76.781 | 0.840 | 0.004 | 75.989 |
| I | LM-BFGS | 0.833 | 0.006 | 77.803 | 0.863 | 0.004 | 77.154 | 0.791 | 0.007 | 78.120 |
| R | Gradient boosting | 0.979 | 0.001 | 71.554 | 0.995 | 0.001 | 70.971 | 0.845 | 0.013 | 70.177 |
| R | Random forest | 0.955 | 0.005 | 70.923 | 0.229 | 0.030 | 71.102 | 0.246 | 0.031 | 71.118 |
| R | Linear regressor | 0.884 | 0.006 | 76.664 | 0.980 | 0.001 | 76.782 | 0.980 | 0.001 | 75.189 |
| R | LM-BFGS | 0.938 | 0.002 | 78.302 | 0.991 | 0.001 | 77.154 | 0.981 | 0.004 | 78.120 |
| D | Gradient boosting | 0.919 | 0.007 | 70.432 | 0.968 | 0.001 | 70.871 | 0.955 | 0.003 | 70.177 |
| D | Random forest | 0.933 | 0.008 | 70.581 | 0.188 | 0.036 | 71.502 | 0.231 | 0.030 | 71.118 |
| D | Linear regressor | 0.829 | 0.009 | 75.342 | 0.959 | 0.002 | 76.681 | 0.940 | 0.003 | 75.189 |
| D | LM-BFGS | 0.863 | 0.006 | 76.903 | 0.935 | 0.002 | 77.054 | 0.912 | 0.003 | 78.120 |
Fig. 4 Prediction of the Susceptible variable with our Incremental learning approach
Fig. 5 Prediction of the Exposed variable with our Incremental learning approach
Fig. 6 Prediction of the Infectious variable with our Incremental learning approach
Fig. 7 Prediction of the Recovered variable with our Incremental learning approach
Fig. 8 Prediction of the Death variable with our Incremental learning approach