
Short-term forecasting of daily COVID-19 cases in Brazil by using the Holt's model.

Edson Zangiacomi Martinez, Davi Casale Aragon, Altacílio Aparecido Nunes.

Abstract

INTRODUCTION: We evaluated the performance of Holt's model in forecasting the daily reported COVID-19 cases in Brazil and three Brazilian states.
METHODS: We used the period from the date of the first COVID-19 case to April 25, 2020, as the training period, and April 26 to May 3, 2020, as the test period.
RESULTS: Holt's model performed well in forecasting the cases in Brazil and in the states of São Paulo and Minas Gerais, but it underestimated the cases in the state of Rio de Janeiro.
CONCLUSIONS: Holt's model can be an adequate short-term forecasting method if its assumptions are adequately verified and validated by experts.

Year:  2020        PMID: 32520235      PMCID: PMC7269522          DOI: 10.1590/0037-8682-0283-2020

Source DB:  PubMed          Journal:  Rev Soc Bras Med Trop        ISSN: 0037-8682            Impact factor:   1.581


Coronavirus disease (COVID-19) is caused by SARS-CoV-2 (or 2019-nCoV), a pathogen that primarily targets the human respiratory system. The most common symptoms at the onset of the illness are cough, fever, and fatigue. The first cases were reported in December 2019 in Wuhan, Hubei Province, China, and the disease rapidly spread throughout the country and then the world. In January 2020, the World Health Organization (WHO) declared COVID-19 a “public-health emergency of international concern”. To help address the challenge of predicting the spread of the disease and obtaining short-term predictions, different types of mathematical and statistical models can be used.
Accordingly, let us consider an epidemic curve as a time series of the daily number of cases of a disease, and let $Y_t$ be the cumulative number of confirmed cases on day t. This curve is expected to grow exponentially at first and then, at a given moment, to slow and approach a limit. Therefore, the simple exponential model is commonly used to describe the initial phase of an outbreak, and S-shaped models such as the logistic, Gompertz, log-normal, and Richards models are widely used to model all the reported cumulative cases of a disease. In the present communication, we alternatively propose the use of double exponential smoothing for short-term forecasting of the daily COVID-19 cases in Brazil, before the peak of the cases.
Methods based on exponential smoothing are often used for forecasting. These methods are based on a moving average of past values only, so that the smoothed value at the present time is used as the forecast of the next value. The Holt-Winters exponential model is a more general method for smoothing the data when trend and seasonality are present. Double exponential smoothing (also called Holt's method) is a special case in which seasonality is absent. Finally, single exponential smoothing is used when no trend or seasonal components are present. In Holt's method, the forecasted value of the series at time t is given by $\hat{Y}_t = L_{t-1} + T_{t-1}$, where $L_t$ is the estimated level given by $L_t = \alpha Y_t + (1 - \alpha)(L_{t-1} + T_{t-1})$, $T_t$ is the estimated slope given by $T_t = \beta (L_t - L_{t-1}) + (1 - \beta) T_{t-1}$, and $\alpha$ and $\beta$ are the smoothing parameters.
To apply Holt's model, we used the holt function in the forecast package of the R language (version 3.6.2). Data on daily COVID-19 cases were obtained from the Brazilian Health Ministry (available at https://covid.saude.gov.br/). Our analysis included data from the whole country and from the Brazilian states of São Paulo, Minas Gerais, and Rio de Janeiro. These are the three most populous Brazilian states, and together they have more than 80 million inhabitants (approximately 40% of the Brazilian population). We considered the daily reports from the date on which the first case was notified in Brazil and in each state up to April 25, 2020, as the training period. The values of the validation period were the corresponding observations from April 26 to May 3, 2020. We compared the forecast accuracy of Holt's method with that obtained by fitting the traditional logistic, Gompertz, log-normal, and Richards growth curves. These comparisons were based on the mean absolute percent error (MAPE), a measure based on the differences between the forecasted and the actual values. Theil's U entropy coefficient was used as a measure of out-of-sample forecasting accuracy. When this coefficient is higher than 1, the forecasts under consideration are less accurate than those offered by a naïve approach, i.e., a simple method in which the forecasts are equal to the last observed value.
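The level and slope recursions of Holt's method can be illustrated with a minimal Python sketch. It is not the authors' implementation, which relied on the holt function of the R forecast package. The function names holt_fit and holt_forecast, the fixed smoothing parameters, the toy series, and the initialization (level set to the first observation, slope set to the first difference) are assumptions made here for illustration; statistical packages typically estimate these quantities from the data.

```python
from typing import List, Sequence, Tuple


def holt_fit(y: Sequence[float], alpha: float, beta: float) -> Tuple[float, float]:
    """Run Holt's recursions over y and return the final (level, slope).

    Assumed initialization: level = y[0], slope = y[1] - y[0].
    """
    if len(y) < 2:
        raise ValueError("at least two observations are required")
    level = float(y[0])
    slope = float(y[1]) - float(y[0])
    for obs in y[1:]:
        prev_level = level
        # L_t = alpha * Y_t + (1 - alpha) * (L_{t-1} + T_{t-1})
        level = alpha * obs + (1.0 - alpha) * (prev_level + slope)
        # T_t = beta * (L_t - L_{t-1}) + (1 - beta) * T_{t-1}
        slope = beta * (level - prev_level) + (1.0 - beta) * slope
    return level, slope


def holt_forecast(y: Sequence[float], alpha: float, beta: float, horizon: int) -> List[float]:
    """Return h-step-ahead point forecasts L_n + h * T_n for h = 1..horizon."""
    level, slope = holt_fit(y, alpha, beta)
    return [level + h * slope for h in range(1, horizon + 1)]


if __name__ == "__main__":
    # Hypothetical cumulative counts, not data from the study.
    toy_series = [100.0, 130.0, 170.0, 220.0, 280.0, 350.0]
    print(holt_forecast(toy_series, alpha=0.8, beta=0.2, horizon=3))
```

Because the h-step-ahead forecast is the last level plus h times the last slope, the point forecasts lie on a straight line. This agrees with the near-constant day-to-day increments of the forecasted values reported in Table 1.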
Figure 1 shows the cumulative number of reports of COVID-19 until April 25, 2020, in Brazil and in the states of São Paulo, Minas Gerais, and Rio de Janeiro, and the forecasted values from Holt's method with their corresponding prediction intervals. These values are detailed in Table 1, which also compares the actual and forecasted daily values from April 26 to May 3, 2020. Theil's U coefficients are lower than 1 for the forecasts based on the data from Brazil and the states of São Paulo and Minas Gerais, but higher than 1 when the data from the state of Rio de Janeiro are considered (values are shown in Figure 1). In addition, as observed in Table 1, almost all the actual daily reports of COVID-19 fall within the corresponding 95% prediction intervals, except for the forecasts based on the data from the state of Rio de Janeiro. For this state, the forecasts tend to underestimate the actual reports of COVID-19 from April 27 onward, owing to a sudden increase in notifications that started on this date.
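As a rough check of the two statements above, the following Python sketch computes a Theil's U coefficient and the share of observations inside the 95% prediction intervals, using values transcribed from Table 1 for Brazil and Rio de Janeiro. The function names theil_u and coverage are assumptions, and the U2 variant used here (forecast errors relative to a no-change naïve forecast, both scaled by the previous observation) is one common definition; the source does not state which formula produced the coefficients shown in Figure 1, so the numbers need not coincide.

```python
from math import sqrt
from typing import Sequence


def theil_u(observed: Sequence[float], forecast: Sequence[float]) -> float:
    """Theil's U (assumed U2 variant): values above 1 mean worse than a no-change forecast."""
    num = 0.0
    den = 0.0
    for t in range(len(observed) - 1):
        fpe = forecast[t + 1] / observed[t] - 1.0  # forecasted relative change
        ape = observed[t + 1] / observed[t] - 1.0  # actual relative change
        num += (fpe - ape) ** 2
        den += ape ** 2
    return sqrt(num / den)


def coverage(observed: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> float:
    """Fraction of observations lying inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi)
    return inside / len(observed)


# Values transcribed from Table 1 (April 26 to May 3, 2020).
brazil_obs = [61888, 66501, 71886, 78162, 85380, 91589, 96559, 101147]
brazil_fc = [63598.77, 68898.82, 74198.86, 79498.91,
             84798.95, 90099.00, 95399.05, 100699.09]

rio_obs = [7111, 7944, 8504, 8869, 9453, 10166, 10546, 11139]
rio_fc = [7169.33, 7570.99, 7972.66, 8374.33, 8775.99, 9177.66, 9579.32, 9980.99]
rio_lo = [7001.87, 7345.19, 7665.42, 7968.45, 8257.85, 8535.73, 8803.43, 9061.89]
rio_hi = [7336.78, 7796.80, 8279.90, 8780.20, 9294.13, 9819.59, 10355.22, 10900.10]

if __name__ == "__main__":
    print("Theil's U, Brazil:", theil_u(brazil_obs, brazil_fc))
    print("Theil's U, Rio de Janeiro:", theil_u(rio_obs, rio_fc))
    print("95% interval coverage, Rio de Janeiro:", coverage(rio_obs, rio_lo, rio_hi))
```

With the Table 1 values, this variant stays below 1 for Brazil and exceeds 1 for Rio de Janeiro, and only the April 26 observation for Rio de Janeiro lies inside its 95% interval.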
FIGURE 1:

Time series for (a) Brazil and the states of (b) São Paulo, (c) Minas Gerais, and (d) Rio de Janeiro, showing point forecasts and 80% and 95% prediction intervals obtained using Holt's model (represented by the dark gray and the light gray areas, respectively). The red points represent the actual number of notified cases.

TABLE 1:

Daily COVID-19 cases and the corresponding forecasts from Holt's method (with 95% prediction intervals), from April 26 to May 3, 2020.

Region                  Day        Observed values   Forecasted values   95% prediction interval
Brazil                  April 26             61888            63598.77   62684.76-64512.78
                        April 27             66501            68898.82   67035.24-70762.39
                        April 28             71886            74198.86   71131.73-77265.99
                        April 29             78162            79498.91   75031.55-83966.26
                        April 30             85380            84798.95   78762.27-90835.64
                        May 1                91589            90099.00   82341.48-97856.51
                        May 2                96559            95399.05   85781.96-105016.13
                        May 3               101147           100699.09   89093.56-112304.62
São Paulo state         April 26             20715            21288.91   20630.67-21947.16
                        April 27             21696            22573.96   21473.87-23674.05
                        April 28             24041            23859.01   22300.00-25418.02
                        April 29             26158            25144.06   23096.16-27191.96
                        April 30             28698            26429.11   23860.05-28998.16
                        May 1                30374            27714.16   24591.88-30836.44
                        May 2                31174            28999.21   25292.55-32705.87
                        May 3                31772            30284.26   25963.17-34605.35
Minas Gerais state      April 26              1548             1537.80   1501.70-1573.89
                        April 27              1586             1600.34   1550.50-1650.19
                        April 28              1649             1662.89   1597.30-1728.48
                        April 29              1758             1725.44   1642.42-1808.47
                        April 30              1827             1787.99   1686.04-1889.94
                        May 1                 1935             1850.54   1728.31-1972.77
                        May 2                 2023             1913.09   1769.32-2056.86
                        May 3                 2118             1975.64   1809.15-2142.13
Rio de Janeiro state    April 26              7111             7169.33   7001.87-7336.78
                        April 27              7944             7570.99   7345.19-7796.80
                        April 28              8504             7972.66   7665.42-8279.90
                        April 29              8869             8374.33   7968.45-8780.20
                        April 30              9453             8775.99   8257.85-9294.13
                        May 1                10166             9177.66   8535.73-9819.59
                        May 2                10546             9579.32   8803.43-10355.22
                        May 3                11139             9980.99   9061.89-10900.10
For the data from Brazil, the MAPE values for the forecasting methods based on the logistic, Gompertz, log-normal, and Richards curves are 17.09, 10.84, 9.05, and 10.84, respectively. The corresponding values are 21.81, 15.70, 14.37, and 15.70 for the state of São Paulo; 14.63, 8.52, 5.13, and 8.52 for the state of Minas Gerais; and 18.00, 10.54, 8.18, and 10.55 for the state of Rio de Janeiro. In all cases, the MAPE values for the forecasts based on Holt's method (shown in Figure 1) are smaller than those obtained from the fit of the traditional growth curves, indicating that Holt's method performed better than the others (even for the forecasts using data from the state of Rio de Janeiro). Figure 2 provides a visual comparison between the actual daily reports of COVID-19 from April 26 to May 3, 2020, and the forecasts from the different methods. Exponential models were not used in this analysis, as they performed poorly in describing the epidemic curves based on the training period.
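The MAPE is the mean of the absolute forecast errors, each expressed as a percentage of the corresponding actual value. As a sketch, the Python snippet below applies this definition to the Brazil and Rio de Janeiro rows transcribed from Table 1. The function name mape is an assumption, and the results need not reproduce the values shown in Figure 1 exactly, since the table entries are rounded.

```python
from typing import Sequence


def mape(observed: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percent error, in percent."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(f - y) / abs(y) for y, f in zip(observed, forecast))
    return 100.0 * total / len(observed)


# Observed and Holt-forecasted values transcribed from Table 1
# (April 26 to May 3, 2020).
brazil_observed = [61888, 66501, 71886, 78162, 85380, 91589, 96559, 101147]
brazil_holt = [63598.77, 68898.82, 74198.86, 79498.91,
               84798.95, 90099.00, 95399.05, 100699.09]

rio_observed = [7111, 7944, 8504, 8869, 9453, 10166, 10546, 11139]
rio_holt = [7169.33, 7570.99, 7972.66, 8374.33,
            8775.99, 9177.66, 9579.32, 9980.99]

if __name__ == "__main__":
    print("MAPE, Brazil (%):", round(mape(brazil_observed, brazil_holt), 2))
    print("MAPE, Rio de Janeiro (%):", round(mape(rio_observed, rio_holt), 2))
```

On these rows, both results fall below the smallest growth-curve MAPE quoted above for the same region (9.05 for Brazil and 8.18 for Rio de Janeiro), in line with the comparison described in the text.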
FIGURE 2:

Comparison between the actual number of notified cases of COVID-19 and the forecasted values obtained from the Holt, logistic, Gompertz, log-normal, and Richards models, for the period from April 26 to May 3, 2020, considering (a) Brazil and the states of (b) São Paulo, (c) Minas Gerais, and (d) Rio de Janeiro.

To interpret the results of these statistical models correctly, we should keep in mind an important quote from Saffo: “The goal of forecasting is not to predict the future but to tell you what you need to know to take meaningful action in the present”. In this sense, the out-of-sample predicted values should be seen primarily as the daily number of cases of COVID-19 that we would expect to find if the epidemic curve continued to grow with the same behavior observed during the training period. The volatility of the time series of reported cases is highly dependent on extrinsic factors (such as the availability of tests for essential screening) as well as on the speed of updating, the availability of results, and changes in the mitigation measures. In turn, these factors are affected by the incubation period of the virus of approximately 14 days (interquartile range, 8-17 days), with variations according to the age of the patient and the status of the patient's immune system. Therefore, we can conclude that Holt's model showed good forecast performance for the data from Brazil and the states of São Paulo and Minas Gerais, probably because the behavior of the epidemic curves did not change significantly at the beginning of the validation period. This was not the case for the data from the state of Rio de Janeiro. However, we do not believe that this is a defect of the method but rather a failure to comply with its assumptions. These observations apply to any mathematical or statistical model used for obtaining predictions of cases of COVID-19, and for that reason, every forecasting model should be accompanied by the expertise of trained individuals familiar with the dynamics of infectious diseases.
In addition, we emphasize that the results of this study can be generalized only to the objective of obtaining short-term forecasts for the cumulative number of cases of COVID-19 in a given population, as Holt's model has a low sensitivity for predicting the peak of the outbreak or for providing long-term forecasts. An important and obvious limitation of this study is that it was conducted using only the number of COVID-19 cases that have been officially notified. Considering the insufficient number of screening tests and the consequent low effectiveness in confirming cases of COVID-19 in Brazil, the actual number of cases of the disease is expected to be much greater than that presented here. Nevertheless, while these data are biased, they are the only source of information available that can guide our efforts to understand the outbreak dynamics. Because of the urgency for information that can be useful for the decision-making processes during the course of an epidemic, we consider these data to be “what we have for today,” and that they can be properly used when their potential limits are well discussed. As an additional comment, the models presented in this study only represent the cumulative number of cases of a disease, while other more complex models can provide more accurate predictions by also taking into account the number of susceptible and recovered individuals (called susceptible-infected-recovered [SIR] models and their extensions). In conclusion, despite all the problems described herein that make the prediction of cases of COVID-19 a challenging task, Holt's model can be an adequate alternative to the traditional S-shaped curves if its assumptions are adequately verified and validated by experts.
  11 in total

1.  Six rules for accurate effective forecasting.

Authors:  Paul Saffo
Journal:  Harv Bus Rev       Date:  2007 Jul-Aug

2.  Estimating initial epidemic growth rates.

Authors:  Junling Ma; Jonathan Dushoff; Benjamin M Bolker; David J D Earn
Journal:  Bull Math Biol       Date:  2013-11-23       Impact factor: 1.758

3.  The COVID-19 pandemic in Brazil: chronicle of a health crisis foretold.

Authors:  Guilherme Loureiro Werneck; Marilia Sá Carvalho
Journal:  Cad Saude Publica       Date:  2020-05-08       Impact factor: 1.632

4.  Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China.

Authors:  Chaolin Huang; Yeming Wang; Xingwang Li; Lili Ren; Jianping Zhao; Yi Hu; Li Zhang; Guohui Fan; Jiuyang Xu; Xiaoying Gu; Zhenshun Cheng; Ting Yu; Jiaan Xia; Yuan Wei; Wenjuan Wu; Xuelei Xie; Wen Yin; Hui Li; Min Liu; Yan Xiao; Hong Gao; Li Guo; Jungang Xie; Guangfa Wang; Rongmeng Jiang; Zhancheng Gao; Qi Jin; Jianwei Wang; Bin Cao
Journal:  Lancet       Date:  2020-01-24       Impact factor: 79.321

5.  How will country-based mitigation measures influence the course of the COVID-19 epidemic?

Authors:  Roy M Anderson; Hans Heesterbeek; Don Klinkenberg; T Déirdre Hollingsworth
Journal:  Lancet       Date:  2020-03-09       Impact factor: 79.321

6.  Reporting, Epidemic Growth, and Reproduction Numbers for the 2019 Novel Coronavirus (2019-nCoV) Epidemic.

Authors:  Ashleigh R Tuite; David N Fisman
Journal:  Ann Intern Med       Date:  2020-02-05       Impact factor: 25.391

7.  The epidemiology and pathogenesis of coronavirus disease (COVID-19) outbreak.

Authors:  Hussin A Rothan; Siddappa N Byrareddy
Journal:  J Autoimmun       Date:  2020-02-26       Impact factor: 7.094

8.  The SARS-CoV-2 outbreak: What we know.

Authors:  Di Wu; Tiantian Wu; Qun Liu; Zhicong Yang
Journal:  Int J Infect Dis       Date:  2020-03-12       Impact factor: 3.623

9.  Early dynamics of transmission and control of COVID-19: a mathematical modelling study.

Authors:  Adam J Kucharski; Timothy W Russell; Charlie Diamond; Yang Liu; John Edmunds; Sebastian Funk; Rosalind M Eggo
Journal:  Lancet Infect Dis       Date:  2020-03-11       Impact factor: 25.071

10.  Level of underreporting including underdiagnosis before the first peak of COVID-19 in various countries: Preliminary retrospective results based on wavelets and deterministic modeling.

Authors:  Steven G Krantz; Arni S R Srinivasa Rao
Journal:  Infect Control Hosp Epidemiol       Date:  2020-04-09       Impact factor: 3.254

