Predicting Seasonal Influenza Based on SARIMA Model, in Mainland China from 2005 to 2018.

Jing Cong1, Mengmeng Ren1, Shuyang Xie2, Pingyu Wang1,2.   

Abstract

Seasonal influenza is one of the mandatorily monitored infectious diseases in China. Making full use of influenza surveillance data helps to predict seasonal influenza. In this study, a seasonal autoregressive integrated moving average (SARIMA) model was used to predict changes in influenza by analyzing monthly influenza incidence data from January 2005 to December 2018 in China. The inter-annual incidence rate fluctuated from 2.76 to 55.07 per 100,000 individuals. The SARIMA (1, 0, 0) × (0, 1, 1)12 model predicted that influenza incidence in 2018 would be similar to that of previous years, and it fitted the seasonal fluctuation. The relative errors between actual and predicted values ranged from 0.0010 to 0.0137, indicating that the predicted values matched the actual values well. This study demonstrated that the SARIMA model can make effective short-term predictions of seasonal influenza.

Keywords:  SARIMA model; influenza; prediction

Year:  2019        PMID: 31783697      PMCID: PMC6926639          DOI: 10.3390/ijerph16234760

Source DB:  PubMed          Journal:  Int J Environ Res Public Health        ISSN: 1660-4601            Impact factor:   3.390


1. Introduction

Influenza has remained a growing public health problem worldwide, particularly since the 1918 influenza pandemic. Although many efforts have focused on measures and strategies to prevent and control influenza [1,2,3,4], it still causes significant mortality, strains health care capacity, and imposes economic costs on society every year. A modeling study estimated that 291,243 to 645,832 seasonal influenza-associated respiratory deaths (4.0 to 8.8 per 100,000 individuals) occurred annually worldwide from 1999 to 2015 [5]. An estimated 11.5% of lower respiratory tract infection episodes were attributable to influenza [6]. The expected annual losses from pandemic risk are about 0.6% of global income [7]. The burden caused by influenza is higher in seasons dominated by A (H3N2) or A (H1N1) pdm 2009 influenza viruses and lower in seasons where pre-pandemic A (H1N1) or influenza B accounts for the majority of cases [8]. In 2019, the WHO and partners launched the Global Influenza Strategy for 2019 to 2030 to strengthen seasonal prevention and control of future pandemics [9]. This strategy includes improving influenza modeling and forecasting. It is therefore important to use influenza surveillance data to model and forecast influenza pandemics. Time series analysis of infection data with suitable models can forecast future values from previously observed values and thereby improve the prevention system [10,11]. These models include the seasonal autoregressive integrated moving average (SARIMA) model, neural network models, exponential smoothing, the grey swing model, etc. [12,13,14,15]. The SARIMA model takes both overall trends and seasonal changes into account and is widely used in time series modeling [13,14,15]. Our previous study demonstrated a trend of increasing influenza incidence in mainland China from 2005 to 2015 [16]. China has a population of about 1.3 billion people, and the disease burden caused by influenza is still heavy.
Influenza accounts for around 8% of all respiratory deaths in China [17]. Because seasonal influenza is one of the notifiable infectious diseases in China, analyzing the influenza surveillance data is a useful way to model and forecast influenza. Furthermore, country-specific estimates should be updated periodically, and a specific model should be developed. We therefore applied the SARIMA model to analyze changes in influenza using surveillance data from January 2005 to December 2018 in China.

2. Materials

2.1. Data Collection

The total number of influenza cases and the monthly influenza incidence data from January 2005 to December 2018 were obtained from the website of the National Health Commission of the People’s Republic of China [18]. Since 2003, the Chinese government has operated a web-based national notifiable infectious disease surveillance system covering 39 infectious diseases, including influenza. Clinicians complete a standard case report card when they identify any probable, clinical, or laboratory-confirmed case of seasonal influenza A or influenza B. On receiving the card, local epidemiologists carry out a field investigation using a standardized form, which improves data accuracy [16,19].

2.2. SARIMA Model

Influenza incidence data from January 2005 to June 2018 were used as the training dataset, and data from July 2018 to December 2018 were used as the forecasting dataset. We established and selected the best SARIMA (p, d, q) × (P, D, Q) model following the steps introduced by Box and Jenkins [20] (Figure 1). Here p, q, P, and Q denote the autoregressive, moving average, seasonal autoregressive, and seasonal moving average lags, respectively, and d and D denote the orders of trend and seasonal differencing.
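For reference, the (p, d, q) × (P, D, Q) notation corresponds to the usual multiplicative seasonal ARIMA form, sketched below with the backshift operator B and a seasonal period s (s = 12 for monthly data). The symbols and the sign convention for the moving average terms are assumptions taken from the common textbook presentation rather than from the paper, and software packages differ on the sign of the moving average coefficients. No constant term is shown. The second line specializes the form to orders (1, 0, 0) × (0, 1, 1) with s = 12, the model reported in the abstract.

```latex
\begin{aligned}
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,y_t
  &= \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t \\[4pt]
(1-\phi_1 B)\,(1-B^{12})\,y_t
  &= (1+\Theta_1 B^{12})\,\varepsilon_t \\[4pt]
\Longleftrightarrow\quad
y_t &= y_{t-12} + \phi_1\,(y_{t-1}-y_{t-13}) + \varepsilon_t + \Theta_1\,\varepsilon_{t-12}
\end{aligned}
```

Here \phi_p and \theta_q are polynomials of degree p and q in B, \Phi_P and \Theta_Q are polynomials of degree P and Q in B^{s}, and \varepsilon_t is white noise.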
Figure 1

The process and method of seasonal autoregressive integrated moving average (SARIMA) model.

2.3. Statistical Analysis

STATA 15.0 (Stata Corp., College Station, TX, USA) and SPSS 22.0 (SPSS Inc., Chicago, IL, USA) were used to build the SARIMA model.

3. Results

3.1. General Trend of Influenza Incidence

A total of 2,686,180 influenza cases were reported in mainland China from January 2005 to December 2018. The annual incidence rate fluctuated from 2.76 to 55.07 per 100,000 individuals. Influenza occurred throughout the year, with two peaks in winter and spring (Figure 2).
Figure 2

The influenza incidence in mainland China from 2005 to 2018: (A) Influenza cases and incidence from January 2005 to December 2011 and (B) influenza cases and incidence from January 2012 to December 2018.

3.2. SARIMA Model

Starting from the raw training data (January 2005 to June 2018), we applied no trend differencing (d = 0) and first-order seasonal differencing (D = 1). The augmented Dickey-Fuller test was used to determine whether the sequence was stationary, and the result supported that the data were a stationary time series (t = −9.247, p < 0.001). Autocorrelation function (ACF) and partial autocorrelation function (PACF) graphs were used to estimate the ranges of the parameters p, P, q, and Q (Figure 3). The ACF and PACF graphs of the first-order seasonally differenced data (Figure 3C,D) behaved better than the others. Several candidate SARIMA models were then assessed for forecasting future values from previously observed values (Table 1). On the basis of the goodness-of-fit statistics, the SARIMA (1, 0, 0) × (0, 1, 1)12 model was selected as the optimal model: among the candidates whose parameter estimates were all significant, it had the lowest Akaike information criterion (AIC = 535.2955) and Bayesian information criterion (BIC = 544.3274) (Table 1). This model also passed the Ljung-Box Q test (Q = 25.607, p = 0.060).
Figure 3

The ACF and PACF graphs for estimating the parameters: (A) the ACF graph of the raw data (d = 0 and D = 0), (B) the PACF graph of the raw data (d = 0 and D = 0), (C) the ACF graph of the first-order seasonally differenced data (d = 0 and D = 1), (D) the PACF graph of the first-order seasonally differenced data (d = 0 and D = 1), (E) the ACF graph of the second-order seasonally differenced data (d = 0 and D = 2), and (F) the PACF graph of the second-order seasonally differenced data (d = 0 and D = 2).
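The train/forecast split and the seasonal differencing step described above can be sketched in plain Python. The function names (`month_labels`, `seasonal_difference`, `sample_acf`) and the toy series are hypothetical: the series is a synthetic placeholder with a 12-month cycle, not the surveillance data, and the ACF uses the common biased sample estimator.

```python
import math
import random


def month_labels(first, last):
    """Return 'YYYY-MM' labels for every month from first to last, inclusive."""
    (year, month), labels = first, []
    while (year, month) <= last:
        labels.append(f"{year:04d}-{month:02d}")
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return labels


def seasonal_difference(series, period=12):
    """First-order seasonal difference: y[t] - y[t - period]."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def sample_acf(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t - lag] for t in range(lag, n)) / denom
        for lag in range(max_lag + 1)
    ]


# Study period and split as stated in the text.
months = month_labels((2005, 1), (2018, 12))
train_months = [m for m in months if m <= "2018-06"]
test_months = [m for m in months if m > "2018-06"]

# Hypothetical placeholder series (NOT the surveillance data):
# a fixed 12-month cycle plus reproducible noise.
rng = random.Random(0)
toy = [
    5.0 + 2.0 * math.cos(2 * math.pi * t / 12) + rng.gauss(0, 0.5)
    for t in range(len(train_months))
]

diffed = seasonal_difference(toy, period=12)
raw_acf = sample_acf(toy, 24)
diff_acf = sample_acf(diffed, 24)

print(len(train_months), len(test_months), len(diffed))
print("lag-12 ACF, raw vs seasonally differenced:", raw_acf[12], diff_acf[12])
```

Seasonal differencing with D = 1 costs the first 12 observations, so a 162-month training window leaves 150 differenced values. On the toy series, the strong lag-12 autocorrelation of the raw data disappears after differencing, which is the kind of pattern the ACF graphs are inspected for.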

Table 1

Comparison of candidate SARIMA models.

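| Model / Parameter | Estimate | Z | p-Value | Ljung-Box Q Statistic | Ljung-Box DF | Ljung-Box p-Value | AIC | BIC | RMSE | MAPE |
|---|---|---|---|---|---|---|---|---|---|---|
| SARIMA (0,0,1) (0,1,1)12 | - | - | - | 22.753 | 16 | 0.121 | 541.661 | 550.692 | 1.439 | 48.744 |
| q | 0.654 | 11.00 | 0.000 | - | - | - | - | - | - | - |
| Q | −0.415 | −2.17 | 0.030 | - | - | - | - | - | - | - |
| SARIMA (1,0,0) (0,1,1)12 | - | - | - | 25.607 | 16 | 0.060 | 535.296 | 544.327 | 1.407 | 44.280 |
| p | 0.668 | 25.68 | 0.000 | - | - | - | - | - | - | - |
| Q | −0.445 | −2.24 | 0.025 | - | - | - | - | - | - | - |
| SARIMA (1,0,1) (0,1,1)12 | - | - | - | 8.157 | 15 | 0.917 | 523.172 | 535.214 | 1.345 | 44.137 |
| p | 0.481 | 3.15 | 0.002 | - | - | - | - | - | - | - |
| q | −0.393 | −3.772 | 0.074 | - | - | - | - | - | - | - |
| Q | 0.473 | −2.53 | 0.012 | - | - | - | - | - | - | - |
| SARIMA (1,0,1) (1,1,1)12 | - | - | - | 7.916 | 14 | 0.894 | 525.083 | 540.136 | 1.348 | 44.021 |
| p | 0.476 | 2.93 | 0.003 | - | - | - | - | - | - | - |
| q | 0.399 | 1.76 | 0.078 | - | - | - | - | - | - | - |
| P | −0.080 | −0.10 | 0.923 | - | - | - | - | - | - | - |
| Q | −0.425 | −0.50 | 0.615 | - | - | - | - | - | - | - |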

AIC: Akaike information criterion; BIC: Bayesian information criterion; RMSE: root mean squared error; MAPE: mean absolute percent error; DF: degree of freedom.
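As a worked check on how AIC and BIC relate, the sketch below recovers the log-likelihood implied by each reported AIC and rebuilds the BIC from it. It uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. Two inputs are assumptions inferred here, not stated in the paper: k counts the ARMA coefficients plus the innovation variance, and n = 150 is the number of observations left after seasonal differencing of the 162-month training window. Under those assumptions the rebuilt BIC values agree with Table 1 to within rounding.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion."""
    return k * math.log(n) - 2 * log_likelihood


# Reported values from Table 1: (AIC, BIC, assumed parameter count k).
# k = number of ARMA coefficients + 1 for the innovation variance (assumption).
table1 = {
    "SARIMA (0,0,1) (0,1,1)12": (541.661, 550.692, 3),
    "SARIMA (1,0,0) (0,1,1)12": (535.296, 544.327, 3),
    "SARIMA (1,0,1) (0,1,1)12": (523.172, 535.214, 4),
    "SARIMA (1,0,1) (1,1,1)12": (525.083, 540.136, 5),
}

# Assumption: 162 training months minus 12 lost to seasonal differencing.
N_EFFECTIVE = 162 - 12

bic_gaps = {}
for model, (aic_reported, bic_reported, k) in table1.items():
    log_likelihood = (2 * k - aic_reported) / 2  # invert the AIC definition
    bic_rebuilt = bic(log_likelihood, k, N_EFFECTIVE)
    bic_gaps[model] = abs(bic_rebuilt - bic_reported)
    print(f"{model}: lnL = {log_likelihood:.3f}, "
          f"BIC rebuilt = {bic_rebuilt:.3f}, reported = {bic_reported:.3f}")
```

Because BIC penalizes each extra parameter by ln n rather than 2, it favors the smaller models more strongly than AIC does once n exceeds about 7.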

The forecasting performance of the model was tested by comparing the predicted values with the actual values. The results showed that the SARIMA (1, 0, 0) × (0, 1, 1)12 model fitted the seasonal fluctuation well (Figure 4). The model was then used to forecast influenza incidence from July to December 2018. The relative errors between actual and predicted values ranged from 0.0010 to 0.0137. All actual values fell within the 95% CI of the predicted values (Table 2).
Figure 4

Comparison of actual and predicted incidence of influenza in mainland China.
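The recursion behind a point forecast from a SARIMA (1, 0, 0) × (0, 1, 1)12 model can be sketched as follows. The coefficients are the Table 1 estimates for that model (p = 0.668, Q = −0.445). Everything else is an assumption made for illustration: the moving average term enters with a plus sign, there is no constant term, pre-sample residuals are set to zero, and the input is a made-up periodic series rather than the surveillance data. This is not the estimation procedure used in STATA or SPSS, and it is not expected to reproduce Table 2.

```python
PHI = 0.668     # non-seasonal AR(1) estimate (Table 1, row "p")
THETA = -0.445  # seasonal MA(1) estimate (Table 1, row "Q"); plus-sign convention assumed
S = 12          # seasonal period for monthly data


def residuals(y, phi=PHI, theta=THETA, s=S):
    """One-step-ahead residuals, with pre-sample residuals set to zero."""
    e = [0.0] * len(y)
    for t in range(s + 1, len(y)):
        predicted = y[t - s] + phi * (y[t - 1] - y[t - s - 1]) + theta * e[t - s]
        e[t] = y[t] - predicted
    return e


def forecast(y, steps, phi=PHI, theta=THETA, s=S):
    """Point forecasts for the next `steps` periods after the end of y."""
    e = residuals(y, phi, theta, s)
    path = list(y)
    n = len(y)
    for h in range(steps):
        t = n + h
        # Residuals inside the sample are known; future shocks have mean zero.
        ma_term = theta * e[t - s] if t - s < n else 0.0
        path.append(path[t - s] + phi * (path[t - 1] - path[t - s - 1]) + ma_term)
    return path[n:]


# Hypothetical monthly pattern (January..December), repeated for five years.
pattern = [9.0, 6.0, 4.0, 2.5, 1.5, 1.0, 1.0, 0.9, 1.0, 1.1, 2.0, 6.0]
toy_history = pattern * 5

six_month_forecast = forecast(toy_history, steps=6)
print(six_month_forecast)
```

On a perfectly periodic input every residual is zero, so the forecast simply repeats the seasonal pattern. With real data, the autoregressive term pulls the forecast toward the most recent year-on-year change, and the seasonal moving average term corrects for last year's forecast error in the same month.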

Table 2

Comparison of predicted values and actual values from July to December 2018 (per 100,000 population).

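| Month | Actual Value | Predicted Value | Relative Error | 95% CI LCL | 95% CI UCL |
|---|---|---|---|---|---|
| July | 1.04 | 2.47 | 0.0137 | −0.49 | 4.82 |
| August | 0.88 | 1.82 | 0.0106 | −1.83 | 5.22 |
| September | 0.95 | 1.34 | 0.0042 | −2.4 | 4.99 |
| October | 1.06 | 0.96 | 0.0010 | −2.79 | 4.69 |
| November | 1.93 | 1.61 | 0.0017 | −2.15 | 5.34 |
| December | 9.35 | 5.72 | 0.0039 | 1.86 | 9.36 |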

LCL: lower confidence limit; UCL: upper confidence limit.
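The two checks reported for Table 2, interval coverage and relative error, can be reproduced with a few lines of Python. The values are copied from Table 2. The `relative_error` function uses the common definition |actual − predicted| / actual, which is an assumption, since the paper does not give its formula. Under that definition the ratios come out at roughly 100 times the tabulated figures, which suggests the table's column uses a different scaling; that reading is an inference from the arithmetic, not something the source states.

```python
# (month, actual, predicted, reported relative error, LCL, UCL), from Table 2
table2 = [
    ("July",      1.04, 2.47, 0.0137, -0.49, 4.82),
    ("August",    0.88, 1.82, 0.0106, -1.83, 5.22),
    ("September", 0.95, 1.34, 0.0042, -2.40, 4.99),
    ("October",   1.06, 0.96, 0.0010, -2.79, 4.69),
    ("November",  1.93, 1.61, 0.0017, -2.15, 5.34),
    ("December",  9.35, 5.72, 0.0039,  1.86, 9.36),
]


def relative_error(actual, predicted):
    """Absolute error as a fraction of the actual value (assumed definition)."""
    return abs(actual - predicted) / actual


# Coverage: does every actual value fall inside its 95% interval?
all_covered = all(lcl <= actual <= ucl for _, actual, _, _, lcl, ucl in table2)

# Ratio between the recomputed and the tabulated relative error, per month.
scale_ratios = []
for month, actual, predicted, reported, lcl, ucl in table2:
    recomputed = relative_error(actual, predicted)
    scale_ratios.append(recomputed / reported)
    print(f"{month:<10} recomputed = {recomputed:.4f}  reported = {reported:.4f}")

print("all actual values inside 95% CI:", all_covered)
```

The coverage check agrees with the text: all six actual values lie inside their intervals, although December (9.35 against an upper limit of 9.36) only just does.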

4. Discussion

Seasonal influenza is an acute respiratory infection caused by influenza viruses. Influenza surveillance is the basis of influenza prevention and control. Global influenza surveillance has been conducted through the WHO’s Global Influenza Surveillance and Response System (GISRS) since 1952. The GISRS network remains alert so that potential threats are recognized in time and the impact of influenza epidemics and pandemics is minimized [21]. Many countries now pay more attention to influenza surveillance. Influenza surveillance systems draw on a variety of data types, including influenza-like illness (ILI), acute respiratory infection (ARI), influenza cases, laboratory-confirmed influenza, Google Flu Trends, protein sequences, etc. [5,10,22,23,24,25]. In China, the current influenza surveillance system covers the following: (i) ILI, (ii) ARI, (iii) outbreak surveillance, and (iv) notifiable infectious disease surveillance. We used notifiable infectious disease surveillance data to analyze the trend of influenza; seasonal influenza has been covered by China's web-based national notifiable infectious disease surveillance system since 2003. The surveillance data in this study are quality controlled and therefore highly accurate, which supports the reliability of the results. Since 2003, the Chinese government has improved the surveillance system. When clinicians identify any case of seasonal influenza A or influenza B, they report it through the web-based national notifiable infectious disease surveillance system within 24 hours. Epidemiologists evaluate each report rigorously, which helps to reduce surveillance bias and improve data accuracy. Influenza is affected by many biological, behavioral, and environmental factors, which lead to seasonal variation.
Previous studies have suggested that influenza is an annual spring or winter epidemic in some cities of China [26,27]. The 14 years of surveillance data in this study agree that influenza varies seasonally, peaking in winter or spring. The two peaks, in winter and spring, are consistent with well-documented peaks for influenza A and influenza B [25,26]. Seasonal epidemics may be related to factors such as a vast population, high residential density, crowded living conditions, the variability of influenza viruses, geographic diversity, cold winter weather, and low vaccination rates [27,28,29]. These multiple factors make the influenza pandemic difficult to model. Several modeling approaches are in use; they can be categorized as time series models, compartmental models, agent-based models, metapopulation models, and approaches from meteorology. Time series analysis has the advantage of forecasting incidence without focusing on specific risk factors: it uses past case counts as features to forecast future case counts as the response. The SARIMA model is fitted to a time series in an automated fashion to maximize prediction accuracy. In addition, it takes both overall trends and seasonal changes into account and has been widely used for time series analysis. The SARIMA model typically assumes that values about three to six months ahead can be predicted from previously observed values [29]. Accordingly, we constructed the SARIMA (1, 0, 0) × (0, 1, 1)12 model to forecast influenza incidence. The model forecasted that influenza incidence from July 2018 to December 2018 would be similar to that of previous years, with seasonal variation during winter. The predicted values matched the actual values well, which supports the usefulness of the SARIMA model for the prevention and control of influenza.
It can capture trends and periodic changes.

5. Limitations

Several limitations of this study should be noted. First, only the SARIMA model was used, and we assumed a linear relationship between influenza incidence and its determinants, such as exposure, susceptibility, and access to care. Many environmental and natural factors are dynamic, so the parameters of the SARIMA model should be reassessed periodically as updated data become available. Second, despite quality control, surveillance bias cannot be excluded from the data, which may affect our results to some extent. Third, we used only data for mainland China as a whole; analysis of subgroups (southern China and northern China) might be more reasonable. Fourth, we collected only monthly data; weekly reports might give better accuracy.

6. Conclusions

This work demonstrates that influenza occurred throughout the year in mainland China, with two peaks in winter and spring, a reminder that influenza never goes away. Additional practical efforts should focus on reducing the burden of seasonal influenza. Our results also indicate that the SARIMA model can make effective short-term predictions of seasonal influenza, which can help decision makers allocate public health resources.
  26 in total

1.  Epidemiological dynamics and phylogeography of influenza virus in southern China.

Authors:  Xiaowen Cheng; Yi Tan; Mingliang He; Tommy Tsan-Yuk Lam; Xing Lu; Cécile Viboud; Jianfan He; Shunxiang Zhang; Jianhua Lu; Chunli Wu; Shishong Fang; Xin Wang; Xu Xie; Hanwu Ma; Martha I Nelson; Hsiang-fu Kung; Edward C Holmes; Jinquan Cheng
Journal:  J Infect Dis       Date:  2012-08-28       Impact factor: 5.226

2.  Effectiveness of Influenza Vaccination on Hospitalizations and Risk Factors for Severe Outcomes in Hospitalized Patients With COPD.

Authors:  Sunita Mulpuru; Li Li; Lingyun Ye; Todd Hatchette; Melissa K Andrew; Ardith Ambrose; Guy Boivin; William Bowie; Ayman Chit; Gael Dos Santos; May ElSherif; Karen Green; Francois Haguinet; Scott A Halperin; Barbara Ibarguchi; Jennie Johnstone; Kevin Katz; Joanne M Langley; Jason LeBlanc; Mark Loeb; Donna MacKinnon-Cameron; Anne McCarthy; Janet E McElhaney; Allison McGeer; Jeff Powis; David Richardson; Makeda Semret; Vivek Shinde; Daniel Smyth; Sylvie Trottier; Louis Valiquette; Duncan Webster; Shelly A McNeil
Journal:  Chest       Date:  2019-01       Impact factor: 9.410

3.  Patients hospitalized with laboratory-confirmed influenza during the 2010-2011 influenza season: exploring disease severity by virus type and subtype.

Authors:  Sandra S Chaves; Deborah Aragon; Nancy Bennett; Tara Cooper; Tiffany D'Mello; Monica Farley; Brian Fowler; Emily Hancock; Pam Daily Kirley; Ruth Lynfield; Patricia Ryan; William Schaffner; Ruta Sharangpani; Leslie Tengelsen; Ann Thomas; Diana Thurston; Jean Williams; Kimberly Yousey-Hindes; Shelley Zansky; Lyn Finelli
Journal:  J Infect Dis       Date:  2013-07-17       Impact factor: 5.226

4.  Estimates of global seasonal influenza-associated respiratory mortality: a modelling study.

Authors:  A Danielle Iuliano; Katherine M Roguski; Howard H Chang; David J Muscatello; Rakhee Palekar; Stefano Tempia; Cheryl Cohen; Jon Michael Gran; Dena Schanzer; Benjamin J Cowling; Peng Wu; Jan Kyncl; Li Wei Ang; Minah Park; Monika Redlberger-Fritz; Hongjie Yu; Laura Espenhain; Anand Krishnan; Gideon Emukule; Liselotte van Asten; Susana Pereira da Silva; Suchunya Aungkulanon; Udo Buchholz; Marc-Alain Widdowson; Joseph S Bresee
Journal:  Lancet       Date:  2017-12-14       Impact factor: 79.321

5.  Real-time forecasting of an epidemic using a discrete time stochastic model: a case study of pandemic influenza (H1N1-2009).

Authors:  Hiroshi Nishiura
Journal:  Biomed Eng Online       Date:  2011-02-16       Impact factor: 2.819

6.  Forecasting influenza in Hong Kong with Google search queries and statistical model fusion.

Authors:  Qinneng Xu; Yulia R Gel; L Leticia Ramirez Ramirez; Kusha Nezafati; Qingpeng Zhang; Kwok-Leung Tsui
Journal:  PLoS One       Date:  2017-05-02       Impact factor: 3.240

7.  Influenza-associated excess respiratory mortality in China, 2010-15: a population-based study.

Authors:  Li Li; Yunning Liu; Peng Wu; Zhibin Peng; Xiling Wang; Tao Chen; Jessica Y T Wong; Juan Yang; Helen S Bond; Lijun Wang; Yiu Chung Lau; Jiandong Zheng; Shuo Feng; Ying Qin; Vicky J Fang; Hui Jiang; Eric H Y Lau; Shiwei Liu; Jinlei Qi; Juanjuan Zhang; Jing Yang; Yangni He; Maigeng Zhou; Benjamin J Cowling; Luzhao Feng; Hongjie Yu
Journal:  Lancet Public Health       Date:  2019-09

8.  Risk Factors and Attack Rates of Seasonal Influenza Infection: Results of the Southern Hemisphere Influenza and Vaccine Effectiveness Research and Surveillance (SHIVERS) Seroepidemiologic Cohort Study.

Authors:  Q Sue Huang; Don Bandaranayake; Tim Wood; E Claire Newbern; Ruth Seeds; Jacqui Ralston; Ben Waite; Ange Bissielo; Namrata Prasad; Angela Todd; Lauren Jelley; Wendy Gunn; Anne McNicholas; Thomas Metz; Shirley Lawrence; Emma Collis; Amanda Retter; Sook-San Wong; Richard Webby; Judy Bocacao; Jennifer Haubrock; Graham Mackereth; Nikki Turner; Barbara McArdle; John Cameron; Edwin G Reynolds; Michael G Baker; Cameron C Grant; Colin McArthur; Sally Roberts; Adrian Trenholme; Conroy Wong; Susan Taylor; Paul Thomas; Jazmin Duque; Diane Gross; Mark G Thompson; Marc-Alain Widdowson
Journal:  J Infect Dis       Date:  2019-01-09       Impact factor: 5.226

9.  Time series analysis of influenza incidence in Chinese provinces from 2004 to 2011.

Authors:  Xin Song; Jun Xiao; Jiang Deng; Qiong Kang; Yanyu Zhang; Jinbo Xu
Journal:  Medicine (Baltimore)       Date:  2016-06       Impact factor: 1.889

10.  Time series analysis of temporal trends in the pertussis incidence in Mainland China from 2005 to 2016.

Authors:  Qianglin Zeng; Dandan Li; Gui Huang; Jin Xia; Xiaoming Wang; Yamei Zhang; Wanping Tang; Hui Zhou
Journal:  Sci Rep       Date:  2016-08-31       Impact factor: 4.379
