Shabnam Naher1,2, Fazle Rabbi3, Md Moyazzem Hossain4,5, Rajon Banik1, Sabbir Pervez6,4, Anika Bushra Boitchi1. 1. Department of Public Health and Informatics Jahangirnagar University Dhaka Bangladesh. 2. Department of Health Science University of Alabama Tuscaloosa Alabama USA. 3. Palli Daridro Bimichon Foundation (PDBF) Dhaka Bangladesh. 4. School of Mathematics, Statistics and Physics Newcastle University Newcastle upon Tyne UK. 5. Department of Statistics Jahangirnagar University Dhaka Bangladesh. 6. Heller School for Social Policy and Management Brandeis University Massachusetts USA.
Abstract
Background: Dengue is an alarming public health concern in Bangladesh in terms of both prevention and treatment, and its sudden outbreak in 2018 caused considerable suffering. Given the greater disease burden in large epidemic years and the difficulty of anticipating current and future needs, early warning systems are urgently needed to control epidemics at the earliest possible stage. Objective: The objective of this study was to select the most appropriate model for dengue incidence and to use the selected model to forecast future dengue outbreaks in Bangladesh. Methods and Materials: This study used a secondary data set of monthly dengue occurrences from January 2008 to January 2020. The authors first identified the most suitable model among the Autoregressive Integrated Moving Average (ARIMA); Error, Trend, Seasonal (ETS); and Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend and Seasonal (TBATS) models using selected model selection criteria, and then used the selected model to forecast dengue incidence in Bangladesh. Results: Among the ARIMA, ETS, and TBATS models, the ARIMA model performed best. The Box–Jenkins procedure is applicable here, and the best model for forecasting dengue outbreaks in Bangladesh was found to be ARIMA (2,1,2). Conclusion: Before establishing a comprehensive plan for future control strategies, it is vital to understand the future scenario of dengue occurrence. With this in mind, the authors aimed to select an appropriate model for predicting dengue fever outbreaks in Bangladesh. The findings indicate that dengue fever is expected to become more frequent in the future. The authors believe that the study findings will help in taking early initiatives to combat future dengue outbreaks.
Dengue cases are increasing dramatically worldwide, especially in countries located in the tropics and subtropics, such as Bangladesh.
This disease threatens about 50% of the global population (3.9 billion people). Dengue fever cases have risen over 15 times in the last two decades, according to the World Health Organization, from 505,430 cases in 2000 to over 2,400,138 cases in 2010 and 3,312,040 cases in 2015.
Annually, an estimated 390 million dengue virus infections occur, of which 67–136 million manifest clinically.
Dengue fever is a mosquito-borne viral infection caused by four distinct virus serotypes (DEN-1, DEN-2, DEN-3, and DEN-4) that are genetically related but antigenically distinct.
Infection with any one of these serotypes does not confer cross-protective immunity against the others and raises the chance of more severe outcomes such as dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS).
Dengue is spread by the bite of the Aedes mosquito. Dengue fever first appeared in Asia, Africa, as well as North America in the years 1779–1780.
Dengue fever is prevalent in Southeast Asia and the Western Pacific, which together account for roughly three-quarters of the global burden. In Asia, massive dengue outbreaks struck the Philippines and Thailand in the 1950s.
The first episode of dengue fever in Bangladesh was officially recorded in 1964 in Dhaka.
The first dengue outbreak in the country occurred during the monsoon of 2000, resulting in 5551 officially confirmed cases and 93 deaths, primarily affecting infants, pregnant women, and the elderly.
A sudden rise in dengue cases was observed in 2018, with 10,148 confirmed cases, and the 2019 outbreak surpassed all previous records, with 100,107 confirmed cases as of November 30, 2019, and more than 112,000 cases expected by the end of 2019.
Factors including the density of infected mosquitoes, people's immunity to dengue serotypes, poor housing conditions with inadequate waste management, sanitation, drainage systems, and water availability, as well as the use of unsafe water reservoirs, all contribute to dengue vector habitats.
In addition, the tropical monsoon climate, heavy rainfall, and suitable temperatures contribute to mosquito density, which in turn raises the frequency of dengue cases in Bangladesh.
Increases in internal and cross-border dengue transmission are linked to rising foreign and domestic commerce and the movement of people. As a consequence, dengue epidemics are becoming more common and severe in the area. Furthermore, no specific treatment or vaccine is available for dengue.
Besides, dengue prevention initiatives have been inadequate because of a scarcity of appropriate vector control methods and weak disease monitoring, which limits the possibilities for preventing transmission and managing epidemics. Vector control, moreover, can be resource- and labor-intensive, putting financial pressure on resource-constrained settings like Bangladesh.
Climate variables such as temperature and precipitation have been extensively researched in recent decades for their potential as early warning tools against climate-sensitive communicable diseases like dengue and malaria.
Climatic factors like temperature, humidity, and precipitation have been found to be significant determinants of mosquito propagation and lifespan in numerous studies.
Temperature has a significant impact on mosquitoes' potential to spread dengue viruses as well as on the extent of dengue spread.
Considering the high disease burden in larger outbreak years and the challenge of predicting present and future needs, considerable effort has gone into designing early warning systems to anticipate or detect major epidemics as quickly as possible in the hopes of controlling epidemics in their initial phases.
In Bangladesh, several steps have been implemented to combat dengue outbreaks, including addressing waterlogging and cleaning canals, water tanks, rainwater collection tanks, sump pits, downpipes, and gutters.
National guidelines have been issued by the Directorate General of Health Services in Bangladesh. Moreover, mass media have played a vital role in recent years by disseminating information on dengue phases, the use of bed nets and mosquito repellents, and the wearing of light-colored, loose-fitting, long-sleeved, and breathable clothes. However, in addition to existing policies, government officials, nongovernmental organizations, and policymakers must undertake nationwide activities, and public awareness should be raised through community education campaigns to tackle the upcoming challenges posed by dengue.
Statistical modeling is one of the useful approaches for predicting dengue outbreaks.
Time series techniques have been used extensively in epidemiological studies of infectious diseases conducted in China, India, Thailand, the West Indies, Colombia, and Australia.
Several previous studies used the Autoregressive Integrated Moving Average (ARIMA) model for forecasting.
Moreover, ARIMA models have been widely applied in dengue prediction.
These are often combined with Seasonal Autoregressive Integrated Moving Average (SARIMA) models, which are suitable for analyzing time series data with ordinary or seasonal patterns when building statistical forecasting models.
A predictive model of dengue incidence based on earlier outbreak evidence and climate variables can be a valuable tool for predicting the frequency and severity of possible epidemics.
The key objective of this study is to develop the most appropriate time series model for forecasting dengue incidence in Bangladesh using monthly data from January 2008 to January 2020. More specifically, this study aims to understand the trend of dengue with its seasonal variations, to fit time series models to the data set for forecasting future dengue epidemics, and to determine the most appropriate model for forecasting the number of dengue-affected people in Bangladesh. Such a predictive model can support better resource distribution for improved health care interventions and help healthcare facilities prepare for outbreaks.
METHODS
Data sources
The secondary data on the number of dengue-infected people were extracted from the websites of the Institute of Epidemiology, Disease Control and Research (IEDCR) and the Directorate General of Health Services (DGHS) for the period from January 2008 to January 2020. The original data format was not suitable for time series analysis, so the researchers processed the data into an appropriate format for further analysis using R.
ARIMA model
This model is one of the statistical models that is used for analyzing as well as forecasting time series data.
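If $\{\varepsilon_t\}$ denotes white noise with zero mean and variance $\sigma^2$, then $\{X_t\}$ is known as a moving average (MA) process of order $q$ and is defined by
$$X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}.$$
The process $\{X_t\}$ is called an autoregressive (AR) process of order $p$ when it is defined as
$$X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + \varepsilon_t.$$
The combined AR($p$) and MA($q$) models are known as ARMA models. An ARMA($p$, $q$) model is defined as
$$X_t = \phi_1 X_{t-1} + \cdots + \phi_p X_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q},$$
where, for every $t$, it is assumed that $\varepsilon_t$ is independent of $X_{t-1}, X_{t-2}, \ldots$. If the $d$th difference of a time series is a stationary ARMA process, the series is said to follow an ARIMA model. If $\nabla^d X_t$ follows an ARMA($p$, $q$) model, $X_t$ is called an ARIMA($p$, $d$, $q$) process. For example, an ARIMA(1,1,1) model is described as
$$\nabla X_t = \phi_1 \nabla X_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1},$$
where $\nabla X_t = X_t - X_{t-1}$.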
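The relation between an ARIMA series and its differenced ARMA series can be illustrated with a minimal Python sketch. It is not the authors' R code; the function names, the parameter values (`phi=0.5`, `theta=0.3`), and the seed are hypothetical choices made only for illustration.

```python
import random

def difference(series, d=1):
    """Apply first differencing d times: each pass maps x_t to x_t - x_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def simulate_arima_111(n, phi, theta, sigma=1.0, seed=0):
    """Simulate X_t whose first difference W_t follows an ARMA(1,1) process:
    W_t = phi * W_{t-1} + e_t + theta * e_{t-1}, with Gaussian white noise e_t."""
    rng = random.Random(seed)
    x, w_prev, e_prev = [0.0], 0.0, 0.0
    for _ in range(n):
        e = rng.gauss(0.0, sigma)
        w = phi * w_prev + e + theta * e_prev
        x.append(x[-1] + w)  # integrate once: X_t = X_{t-1} + W_t
        w_prev, e_prev = w, e
    return x

# Hypothetical parameter values, chosen only for illustration.
series = simulate_arima_111(n=200, phi=0.5, theta=0.3)
diffed = difference(series, d=1)  # recovers the ARMA(1,1) increments W_t
print(len(series), len(diffed))
```

Each pass of differencing shortens the series by one observation, which is why a monthly series loses one data point when d = 1.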
Error, Trend, Seasonal (ETS) model
The ETS model is a technique for forecasting univariate time series that focuses on the trend and seasonal components.
The model can represent trend and seasonal components of different types, and the resulting models are nonstationary.
The model has three components: Error, Trend, and Seasonal. Each component can take one of four values: A (Additive), M (Multiplicative), N (None), or Z (automatically selected); for Trend, Ad (Additive damped) is also available. Therefore, ETS (A, M, N) denotes additive error, multiplicative trend, and no seasonal component.
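As a small illustration of the ETS notation, the simplest member of the family, ETS (A, N, N) (additive error, no trend, no seasonality), can be sketched in Python as below. This is simple exponential smoothing written out by hand, not the fitting routine used in the study; the smoothing parameter `alpha`, the starting level `level0`, and the counts are hypothetical values.

```python
def ets_ann(y, alpha, level0):
    """ETS(A,N,N): the one-step forecast is the previous level, and the level
    is updated by l_t = l_{t-1} + alpha * (y_t - l_{t-1})."""
    level = level0
    fitted = []
    for obs in y:
        fitted.append(level)           # one-step-ahead forecast for this period
        error = obs - level            # additive error term
        level = level + alpha * error  # level update
    return fitted, level

# Hypothetical monthly counts and parameters, for illustration only.
counts = [10.0, 12.0, 11.0, 15.0]
fitted, final_level = ets_ann(counts, alpha=0.5, level0=10.0)
print(fitted)       # [10.0, 10.0, 11.0, 11.0]
print(final_level)  # 13.0
```

In practice `alpha` and the initial level are estimated from the data rather than fixed by hand, and the trend and seasonal components add further state equations of the same form.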
Trigonometric seasonality, Box‐Cox transformation, ARMA errors, Trend and Seasonal (TBATS) models
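TBATS is a forecasting procedure for time series with complex seasonal patterns that is based on exponential smoothing.
The model can be defined by the following equations:
$$y_t^{(\lambda)} = l_{t-1} + \phi b_{t-1} + \sum_{i=1}^{T} s_{t-m_i}^{(i)} + d_t,$$
$$l_t = l_{t-1} + \phi b_{t-1} + \alpha d_t,$$
$$b_t = \phi b_{t-1} + \beta d_t,$$
$$d_t = \sum_{i=1}^{p} \varphi_i d_{t-i} + \sum_{i=1}^{q} \theta_i e_{t-i} + e_t,$$
and the seasonal part can be written as:
$$s_t^{(i)} = \sum_{j=1}^{k_i} s_{j,t}^{(i)},$$
$$s_{j,t}^{(i)} = s_{j,t-1}^{(i)} \cos \omega_j^{(i)} + s_{j,t-1}^{*(i)} \sin \omega_j^{(i)} + \gamma_1^{(i)} d_t,$$
$$s_{j,t}^{*(i)} = -s_{j,t-1}^{(i)} \sin \omega_j^{(i)} + s_{j,t-1}^{*(i)} \cos \omega_j^{(i)} + \gamma_2^{(i)} d_t,$$
$$\omega_j^{(i)} = 2\pi j / m_i,$$
where:
$y_t^{(\lambda)}$ - time series at moment $t$ (Box-Cox transformed)
$s_t^{(i)}$ - $i$th seasonal component
$l_t$ - local level
$b_t$ - trend with damping
$d_t$ - ARMA($p$, $q$) process for residuals
$e_t$ - Gaussian white noise
$T$ - number of seasonalities
$m_i$ - length of the $i$th seasonal period
$k_i$ - number of harmonics for the $i$th seasonal period
$\lambda$ - Box-Cox transformation parameter
$\alpha, \beta$ - smoothing parameters
$\phi$ - trend damping parameter
$\varphi_i, \theta_i$ - ARMA($p$, $q$) coefficients
$\gamma_1^{(i)}, \gamma_2^{(i)}$ - seasonal smoothing parameters (two for each period)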
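To show how the trigonometric seasonal states of a TBATS model generate a repeating pattern, the sketch below iterates the standard seasonal recursion with the innovation term set to zero. It assumes the usual TBATS formulation with frequencies 2*pi*j/m; the function name and the starting state values are hypothetical, and this is an illustration rather than the fitted model from the study.

```python
import math

def trig_seasonal_path(m, s0, steps):
    """Iterate the trigonometric seasonal states with the innovation d_t = 0.
    m is the seasonal period; s0 lists one (s_j, s_star_j) starting pair per
    harmonic j = 1..k. Returns the seasonal component at each step."""
    states = list(s0)
    path = []
    for _ in range(steps):
        new_states = []
        for j, (s, s_star) in enumerate(states, start=1):
            w = 2.0 * math.pi * j / m  # frequency of the j-th harmonic
            s_new = s * math.cos(w) + s_star * math.sin(w)
            s_star_new = -s * math.sin(w) + s_star * math.cos(w)
            new_states.append((s_new, s_star_new))
        states = new_states
        path.append(sum(s for s, _ in states))  # sum over harmonics
    return path

# Hypothetical starting states for two harmonics of a 12-month season.
path = trig_seasonal_path(m=12, s0=[(1.0, 0.0), (0.5, 0.2)], steps=24)
print([round(v, 3) for v in path[:12]])
```

Without innovations each harmonic is a pure rotation, so the pattern repeats every m steps; the smoothing terms in the full model let the seasonal shape evolve over time.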
Model selection
It is better to select an appropriate model before forecasting. With this in mind, Akaike's Information Criterion (AIC), the corrected Akaike's Information Criterion (AICc), and the Bayesian Information Criterion (BIC) were used here. Moreover, the RMSPE, MAPE, and TIC were used to check the accuracy of the fitted model.
AIC: The AIC estimates the relative quality of each model and helps to choose the appropriate one. It is calculated using the following equation:
$$\mathrm{AIC} = -2\log L + 2k,$$
where AIC = Akaike Information Criterion, $\log L$ = log likelihood, and $k$ = number of total parameters.
AICc: In small samples, AIC tends to select an overfitted model; AICc addresses this issue and is given by
$$\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1},$$
where $\frac{2k(k+1)}{n-k-1}$ is a bias correction.
BIC: The BIC is analogous to the AIC but has a different penalty for the number of parameters. It is calculated as
$$\mathrm{BIC} = -2\log L + k\log n,$$
where $\log L$ = log likelihood, $k$ = number of total parameters, and $n$ = number of observations.
Root mean square percentage error (RMSPE): RMSPE is specified as
$$\mathrm{RMSPE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(\frac{\hat{y}_t - y_t}{y_t}\right)^2}.$$
Mean percent forecast error (MPFE): The MPFE is measured as
$$\mathrm{MPFE} = \frac{1}{n}\sum_{t=1}^{n}\frac{y_t - \hat{y}_t}{y_t}.$$
Theil inequality coefficient (TIC): TIC is defined as
$$\mathrm{TIC} = \frac{\sqrt{\frac{1}{n}\sum_{t=1}^{n}(\hat{y}_t - y_t)^2}}{\sqrt{\frac{1}{n}\sum_{t=1}^{n}\hat{y}_t^2} + \sqrt{\frac{1}{n}\sum_{t=1}^{n}y_t^2}},$$
where $\hat{y}_t$ is the predicted/forecast value and $y_t$ is the actual value at time $t$, and $n$ = number of observations.
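The accuracy measures named above can be sketched in Python as follows. The sketch assumes the usual textbook definitions (errors expressed as fractions of the actual value, not multiplied by 100), drops zero actual values before computing, and uses hypothetical data; it is not the authors' implementation.

```python
import math

def drop_zero_actuals(actual, forecast):
    """Remove pairs whose actual value is zero, since they cannot be divided by."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return [a for a, _ in pairs], [f for _, f in pairs]

def rmspe(actual, forecast):
    """Root mean square percentage error, as a fraction of the actual values."""
    n = len(actual)
    return math.sqrt(sum(((f - a) / a) ** 2 for a, f in zip(actual, forecast)) / n)

def mpfe(actual, forecast):
    """Mean percent forecast error: signed, so over- and under-forecasts cancel."""
    n = len(actual)
    return sum((a - f) / a for a, f in zip(actual, forecast)) / n

def tic(actual, forecast):
    """Theil inequality coefficient: 0 is a perfect fit, 1 the worst case."""
    n = len(actual)
    num = math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / n)
    den = (math.sqrt(sum(f * f for f in forecast) / n)
           + math.sqrt(sum(a * a for a in actual) / n))
    return num / den

# Hypothetical monthly counts, including one zero month that gets excluded.
actual, forecast = drop_zero_actuals([100, 200, 0, 400], [110, 180, 5, 400])
print(rmspe(actual, forecast), mpfe(actual, forecast), tic(actual, forecast))
```

Because MPFE keeps the sign of each error, it measures bias rather than overall accuracy, which is why it is usually reported next to RMSPE and TIC.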
Box–Jenkins method
The influential Box–Jenkins method has gained popularity because it is applicable to any series, stationary or not, with or without seasonal components. The steps of this method are presented in the following diagram (Figure 1).
Figure 1
Flowchart of Box–Jenkins method.
RESULTS
The variable "dengue" is the total number of dengue-infected patients across Bangladesh. The authors plotted the number of dengue-infected patients by month over the study period (Figure 2). The results show that dengue incidence was higher in August, September, and October than in the other months, which indicates that seasonality may be present in the data set. Thus, the authors also considered a seasonal ARIMA model and compared it with the ARIMA model.
Figure 2
Boxplot of the number of dengue infected patients by months over study period.
From Figure 3, it is seen that a dengue epidemic occurred in 2019. The number of dengue-infected people increased each year with slight variation, but the 2019 value was extreme. First, the stationarity of the data series has to be verified, because the ARIMA model (which is a linear regression-type model) works best when the predictors are uncorrelated and independent of each other. To test stationarity, this paper used the augmented Dickey–Fuller (ADF) test. For the original series, the test gave Dickey–Fuller = −2.0291, p = 0.5643, so the series is nonstationary. It is therefore necessary to make the series stationary by differencing. After first differencing, the test gave Dickey–Fuller = −5.2484, p = 0.01, which means that the series is now stationary and ready for further analysis.
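For reference, the ADF test is commonly based on a regression of the following form. This is the textbook formulation, given here as a sketch; the lag order $k$ and the inclusion of the drift term $\alpha$ and the trend term $\beta t$ are assumptions, since the paper does not report them.

```latex
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1}
           + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
\qquad
H_0 : \gamma = 0 \ \text{(unit root)}, \quad
H_1 : \gamma < 0 \ \text{(stationary)}.
```

Under this formulation, p = 0.5643 for the original series means that the unit-root null hypothesis cannot be rejected, whereas p = 0.01 after first differencing leads to its rejection at the 5% level.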
Figure 3
Time series plot of dengue‐infected people in Bangladesh.
Next, we select the most suitable model among the ARIMA, SARIMA, ETS, and TBATS models, using the AIC value as the indicator. ARIMA (2,1,2) shows the lowest AIC value, which means that ARIMA (2,1,2) is better than the ARIMA (2,1,2)(0,0,1)$_{12}$, ETS, and TBATS models (Figure 4).
Figure 4
Comparison among ARIMA, SARIMA, ETS, and TBATS models. AIC, Akaike's Information Criterion; ARIMA, Autoregressive Integrated Moving Average; ETS, Error, Trend, Seasonal; SARIMA, Seasonal Autoregressive Integrated Moving Average; TBATS, Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend and Seasonal.
Since ARIMA performs better than the other models considered in this study, we use the ARIMA model for the subsequent analysis. For instance, the TBATS model is intended for forecasting time series with complex seasonal patterns using exponential smoothing, which is not applicable here. The tentative order of the ARIMA model was identified by inspecting the ACF and PACF plots. There were two significant spikes in the ACF plot and one significant spike in the PACF plot. The ACF and PACF are presented in Figure 5.
Figure 5
Plots of ACF and PACF of the observed data.
However, the final order of the model was identified using model selection criteria, and the parameters of the model were then estimated. On the basis of the AIC, AICc, and BIC, ARIMA (2,1,2) is the most suitable model for this data set. The estimates are presented below (Table 1).
Table 1
Estimates of the parameters of ARIMA (2,1,2) model.
| | ar1 | ar2 | ma1 | ma2 |
| --- | --- | --- | --- | --- |
| Coefficients | 1.4652 | −0.6756 | −1.8680 | 0.9349 |
| s.e. | 0.0846 | 0.1033 | 0.0733 | 0.0675 |
AIC = 2802.86, AICc = 2803.3, BIC = 2817.71
Abbreviations: AIC, Akaike's Information Criterion; AICc, Akaike's Information Criterion correction; ARIMA, Autoregressive Integrated Moving Average; BIC, Bayesian Information Criterion.
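As a consistency check on the three criteria reported with Table 1, AICc and BIC can be recomputed from the reported AIC. The sketch below assumes k = 5 estimated parameters (the four coefficients plus the innovation variance) and n = 144 effective observations (145 months from January 2008 to January 2020, minus one lost to differencing); both counts are assumptions, not values stated in the paper.

```python
import math

def aicc_from_aic(aic, k, n):
    """AICc = AIC + 2k(k+1)/(n-k-1), the small-sample bias correction."""
    return aic + 2.0 * k * (k + 1) / (n - k - 1)

def bic_from_aic(aic, k, n):
    """BIC = AIC - 2k + k*log(n), since both share the -2 log L term."""
    return aic - 2.0 * k + k * math.log(n)

aic = 2802.86  # reported AIC of the ARIMA (2,1,2) model
k = 5          # assumed: ar1, ar2, ma1, ma2 plus the innovation variance
n = 144        # assumed: 145 monthly observations minus 1 for differencing

aicc = aicc_from_aic(aic, k, n)
bic = bic_from_aic(aic, k, n)
print(round(aicc, 2))  # 2803.29
print(round(bic, 2))   # 2817.71
```

Under these assumptions the recomputed values agree with the reported AICc = 2803.3 and BIC = 2817.71.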
The accuracy of the fitted model was checked with the well-known accuracy measures RMSPE, MAPE, and TIC. After excluding the zero values, which make these measures impossible to calculate, the results are RMSPE = 10.813, MAPE = 3.584, and TIC = 0.693, suggesting that the fitted model performs well. Before forecasting, the residuals of the fitted model have to be diagnosed. First, we check for autocorrelation among the residuals with the Ljung-Box test. At the 5% level of significance, the Ljung-Box test indicates that the residuals of the fitted ARIMA (2,1,2) model are not autocorrelated. Moreover, to test the normality of the residuals, we performed the Jarque-Bera test and found χ² = 52,694, p < 2.2e-16. Because the null hypothesis of the Jarque-Bera test is normality, this very small p-value indicates that the residuals depart from the normality assumption. The ACF and PACF plots of the residuals of the fitted ARIMA (2,1,2) model are presented in Figure 6. No significant spike was observed in either the ACF or the PACF plot, suggesting that no autocorrelation is present among the residuals of the fitted ARIMA model, which is consistent with the Ljung-Box test.
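The two residual diagnostics can be sketched in Python as follows, assuming their standard definitions. The Ljung-Box statistic Q is compared with a chi-square distribution (p-value omitted here), and the Jarque-Bera statistic is compared with a chi-square distribution with 2 degrees of freedom under the null hypothesis of normality. The residuals are simulated, hypothetical values with one large outlier added; they are not the residuals of the fitted model.

```python
import math
import random

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom

def ljung_box_q(resid, max_lag):
    """Ljung-Box Q = n(n+2) * sum_{k=1..h} r_k^2 / (n-k)."""
    n = len(resid)
    return n * (n + 2) * sum(
        autocorr(resid, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

def jarque_bera(resid):
    """Jarque-Bera statistic and p-value (chi-square, 2 degrees of freedom)."""
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((v - mean) ** 2 for v in resid) / n
    m3 = sum((v - mean) ** 3 for v in resid) / n
    m4 = sum((v - mean) ** 4 for v in resid) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    stat = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = math.exp(-stat / 2.0)  # exact survival function for 2 df
    return stat, p_value

# Hypothetical residuals: Gaussian noise plus one extreme value.
rng = random.Random(42)
resid = [rng.gauss(0.0, 1.0) for _ in range(100)] + [40.0]
q_stat = ljung_box_q(resid, max_lag=10)
jb_stat, jb_p = jarque_bera(resid)
print(q_stat, jb_stat, jb_p)
```

A single extreme residual is enough to inflate the kurtosis term and hence the Jarque-Bera statistic, which is one way an epidemic year can dominate this diagnostic.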
Figure 6
Plots of ACF and PACF of the residuals of the fitted ARIMA (2,1,2) model. ARIMA, Autoregressive Integrated Moving Average.
The actual, fitted, and forecasted dengue cases in Bangladesh are presented in Figure 7 to show how well the model fits the actual data points. Inspecting the fitted line is a basic check to make before measuring accuracy. Figure 7 shows that the model fits the data closely except for 2019, when the epidemic occurred.
Figure 7
Comparison of the original and forecasted number of dengue‐infected Bangladeshi people. ARIMA, Autoregressive Integrated Moving Average.
Finally, we forecasted the next 12 months (February 2020 to January 2021) with the selected ARIMA (2,1,2) model. Each point forecast is accompanied by lower and upper limits for the 80% and 95% intervals, meaning that the actual value is expected to lie between the lower and upper limits at the 80% or 95% confidence level. The intervals are plotted as dark and light shaded bands in Figure 7. Therefore, forecasting dengue with this model gives a better representation in the context of Bangladesh. The findings indicate that the number of dengue-infected people will increase over the forecasting period of this study.
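How the 80% and 95% interval widths grow with the forecast horizon can be sketched from the Table 1 coefficients using the MA(infinity) weights of the model. The sketch assumes Gaussian errors and the sign convention in which MA terms are added (as in R's `arima`); the innovation standard deviation `sigma` is not reported in the paper, so a hypothetical value of 1.0 is used and only the relative widths are meaningful.

```python
import math

def psi_weights(phi, theta, d, horizon):
    """MA(infinity) weights psi_0..psi_{horizon-1} of an ARIMA(p,d,q) model."""
    ar_poly = [1.0] + [-c for c in phi]   # 1 - phi_1 B - ... - phi_p B^p
    for _ in range(d):                    # multiply by (1 - B), d times
        ar_poly = [a - b for a, b in zip(ar_poly + [0.0], [0.0] + ar_poly)]
    phi_star = [-c for c in ar_poly[1:]]  # AR coefficients of the undifferenced series
    psi = [1.0]
    for j in range(1, horizon):
        value = theta[j - 1] if j <= len(theta) else 0.0
        for i, c in enumerate(phi_star, start=1):
            if i <= j:
                value += c * psi[j - i]
        psi.append(value)
    return psi

def interval_half_widths(psi, sigma, z):
    """Half-width at horizon h: z * sigma * sqrt(psi_0^2 + ... + psi_{h-1}^2)."""
    widths, cumulative = [], 0.0
    for w in psi:
        cumulative += w * w
        widths.append(z * sigma * math.sqrt(cumulative))
    return widths

# Coefficients from Table 1; sigma = 1.0 is a hypothetical placeholder.
psi = psi_weights(phi=[1.4652, -0.6756], theta=[-1.8680, 0.9349], d=1, horizon=12)
w80 = interval_half_widths(psi, sigma=1.0, z=1.2816)  # 80% interval
w95 = interval_half_widths(psi, sigma=1.0, z=1.96)    # 95% interval
print([round(v, 3) for v in w95])
```

The half-widths never shrink as the horizon increases, which is why the shaded bands in a forecast plot fan out over the 12 forecast months.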
DISCUSSION
The goal of this study was to determine the trend of dengue outbreaks and make a short-term forecast. The prevalence of dengue was not given serious consideration before the 2019 outbreak; however, the outbreak poses a potential risk to people in Bangladesh. For forecasting, a statistical approach such as the Box–Jenkins technique is normally preferred over the system dynamics (SD) model because it captures seasonal and periodic fluctuations.
However, these models have some limitations in terms of the feedback process.
For this reason, this study forecasts dengue incidence using the ARIMA model, which is based on the assumption that previous values of a time series can be used to forecast future values; other researchers have also used this model for forecasting.
Previous studies also highlighted that ARIMA models performed better than other models in forecasting dengue incidence in different countries, which is consistent with our study, and reported that dengue fever outbreaks are more likely to occur during hot, dry weather with high daily temperatures.
Here, AIC, AICc, and BIC were employed to select the model; a previous study also used these criteria to select the most suitable ARIMA model for forecasting.
This study found that the ARIMA (2,1,2) model is appropriate for dengue forecasting in Bangladesh. A similar study found that combining past dengue counts with meteorological variables and autoregressive lag terms could reliably forecast dengue occurrence, whether as transmission figures or as outbreak and non-outbreak periods. However, this model has some limitations, including the lack of variance in the frequency of dengue incidence and the difficulty of properly identifying asymptomatic infections.
Several factors lie behind this outbreak. Previous studies reported that climate variables, urbanization, the density of female mosquitoes, and seasons are contributing factors to the rapid growth of dengue in different countries. Some studies pointed out that climate variables slightly improve the accuracy of dengue incidence forecasts.
It is very difficult to predict the outbreaks of a disease, even more so when it comes to months‐ahead forecasting possibilities.
In forecasting a disease, each choice has a cost associated with it which will result in a gain or loss based on the outcome. Policymakers must choose the course of action that minimizes possible losses. The authors think that this paper will provide evidence for taking appropriate measures to prevent the future outbreak of dengue in Bangladesh.
LIMITATIONS OF THE STUDY
A limitation of this study is that it considered only the dengue incidence recorded in Bangladesh. The prevalence of dengue may also be influenced by climatic and environmental factors such as rainfall and humidity, which were not included in the model. Therefore, there is considerable scope for future studies to develop and validate the model further.
CONCLUSIONS
The dengue epidemic of 2019 has forced individuals and policymakers to think about dengue prevention measures. Moreover, knowing the future scenario of dengue incidence is necessary before making a proper plan for future control strategies. Keeping this in mind, the authors aimed to develop a model that can predict dengue incidence in Bangladesh. The model selection criteria considered in this study indicate that ARIMA (2,1,2) is the most suitable model for fitting the dengue data and forecasting the future dengue scenario in Bangladesh.
The findings indicate that the incidence of dengue in Bangladesh is expected to rise in the future. Therefore, in addition to existing policies, government officials, nongovernmental organizations, and policymakers must undertake nationwide activities, and public awareness should be raised through community education campaigns to tackle the upcoming challenges posed by dengue. These forecasts give an indication of the future number of dengue-infected people, which can help public health policymakers anticipate dengue outbreaks, prepare preventive measures, and adopt suitable policies and strategies to control future dengue outbreaks in Bangladesh.
The authors confirm that the current manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted.