Farzaneh Sadat Tabataba1,2, Prithwish Chakraborty3, Naren Ramakrishnan3,4, Srinivasan Venkatramanan4, Jiangzhuo Chen4, Bryan Lewis4, Madhav Marathe3,4.
Abstract
BACKGROUND: Over the past few decades, numerous methods have been proposed for epidemic forecasting. These methods can be classified along several axes, such as deterministic vs. probabilistic and comparative vs. generative. In some of the more popular comparative methods, researchers compare observed epidemiological data from the early stages of an outbreak with the output of proposed models to forecast the future trend and prevalence of the pandemic. A significant problem in this area is the lack of standard, well-defined evaluation measures for selecting the best algorithm among several candidates, as well as the best configuration of a particular algorithm.
Keywords: Epidemic forecasting; Epidemic-Features; Error Measure; Performance evaluation; Ranking
Year: 2017 PMID: 28506278 PMCID: PMC5433189 DOI: 10.1186/s12879-017-2365-1
Source DB: PubMed Journal: BMC Infect Dis ISSN: 1471-2334 Impact factor: 3.090
Fig. 1 Software Framework: The framework contains four packages: the Epi-features package, the Error Measure package, the Ranking schema and the Visualization module. The packages are independent and are connected only through the data they exchange
Fig. 2 Predicting the Epidemic Curve. The red arrow points to the prediction time k, at which the prediction is made based on the first k data points of the time-series. The red dashed line is the predicted epidemic curve and the black line is the observed one
Notation and Symbols
| Symbol | Definition |
|---|---|
| y(t) | number of new cases of disease in the t-th week (observed) |
| x(t) | number of new cases of disease in the t-th week (predicted) |
| | number of new cases of disease predicted at the start of epidemic season |
| | predicted value of the maximum number of new cases of the disease |
| | duration of the epidemic season |
| | Total number of infected persons during specified period |
| | The population size at the start of specified period |
| | Total number of infected persons with specific age during the specified period |
| | The population size with specific age at the start of specified period |
| | Arithmetic Mean of a set of Errors |
| | Median value of a set of Errors |
| | Root Mean Square of a set of Errors |
Definitions of different Epidemiologically Relevant features (Epi-features)
| Epi-feature name | Definition |
|---|---|
| Peak value | Maximum number of new infected cases in a given week in the epidemic time-series |
| Peak time | The week when peak value is attained |
| Total attack rate | Fraction of individuals ever infected in the whole population |
| Age-specific attack rate | Fraction of individuals ever infected belonging to a specific age window |
| First take-off (value) | Sharp increase in the number of new infected case counts over a few consecutive weeks |
| First take-off (time) | The start time of the sudden increase in the number of new infected case counts |
| Intensity duration | The number of weeks (usually consecutive) where the number of new infected case counts is more than a specific threshold |
| Speed of epidemic | Rate at which the case counts approach the peak value |
| Start-time of disease season | Time at which the fraction of infected individuals exceeds a specific threshold |
Fig. 3 Illustration of Intensity Duration. The Intensity Duration length (ID) is the number of weeks in which the number of new infected case counts exceeds a specific threshold
Fig. 4 Illustration of Speed of Epidemic. Speed of Epidemic (SpE) is the steepness of the line that connects the first data point of the time-series to the peak data point. SpE indicates how fast the infected case counts reach the peak value
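Several of the Epi-features defined above can be read directly off a weekly case-count series. The following is a minimal sketch, not the paper's Epi-features package: the function name `epi_features`, its parameters and the sample counts are hypothetical. It assumes that Intensity Duration counts every week above the threshold (consecutive or not) and that Speed of Epidemic is the slope from the first data point to the peak.

```python
def epi_features(cases, threshold, population):
    """Extract a few Epi-features from a weekly series of new case counts."""
    peak_value = max(cases)
    peak_time = cases.index(peak_value)  # first week (0-based) attaining the peak
    # Weeks whose new case count exceeds the threshold (not required to be consecutive)
    weeks_above = [week for week, count in enumerate(cases) if count > threshold]
    # Slope of the line joining the first data point to the peak data point
    speed = (peak_value - cases[0]) / peak_time if peak_time > 0 else 0.0
    return {
        "peak_value": peak_value,
        "peak_time": peak_time,
        "intensity_duration": len(weeks_above),
        "id_start_time": weeks_above[0] if weeks_above else None,
        "speed_of_epidemic": speed,
        "total_attack_rate": sum(cases) / population,
    }


weekly_cases = [10, 20, 50, 120, 200, 150, 80, 30]  # made-up counts for illustration
features = epi_features(weekly_cases, threshold=100, population=10_000)
print(features)
```

Applying the same function to an observed curve and to a predicted curve gives, feature by feature, the pairs of values on which the error measures are then computed.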
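List of main Error Measures. These measures are calculated from absolute errors and the arithmetic mean, so positive and negative deviations do not cancel each other out; as a result, the measures carry no information about the direction of the errors
| Measure name | Formula | Description | Scaled | Outlier Protection | Other forms | Penalize extreme deviation | Other Specification |
|---|---|---|---|---|---|---|---|
| Mean Absolute Error (MAE) | $\frac{1}{T}\sum_{t=1}^{T}\lvert e_t\rvert$ | Demonstrates the magnitude of overall error | No | Not Good | GMAE | No | - |
| Root Mean Squared Error (RMSE) | $\sqrt{\frac{1}{T}\sum_{t=1}^{T} e_t^{2}}$ | Square root of the average squared error | No | Not Good | MSE | Yes | - |
| Mean Absolute Percentage Error (MAPE) | $\frac{1}{T}\sum_{t=1}^{T}\left\lvert\frac{e_t}{y_t}\right\rvert$ | Measures the average of absolute percentage error | Yes | Not Good | MdAPE | No | - |
| symmetric Mean Absolute Percentage Error (sMAPE) | $\frac{1}{T}\sum_{t=1}^{T}\frac{\lvert e_t\rvert}{(y_t+x_t)/2}$ | Scales the error by dividing it by the average of the observed and predicted values | Yes | Good | MdsAPE | No | Less prone to division by zero than MAPE. |
| Mean Absolute Relative Error (MARE) | $\frac{1}{T}\sum_{t=1}^{T}\left\lvert\frac{e_t}{e_t^{RW}}\right\rvert$ | Measures the average ratio of absolute error to Random Walk error | Yes | Fair | MdRAE, GMRAE | No | - |
| Relative Measures: e.g. RelMAE (RMAE) | $\frac{\sum_{t=1}^{T}\lvert e_t\rvert}{\sum_{t=1}^{T}\lvert e_t^{RW}\rvert}$ | Ratio of accumulated errors to the cumulative error of the Random Walk method | Yes | Not Good | RelRMSE, LMR | No | - |
| Mean Absolute Scaled Error (MASE) | $\frac{1}{T}\sum_{t=1}^{T}\frac{\lvert e_t\rvert}{\frac{1}{T-1}\sum_{i=2}^{T}\lvert y_i-y_{i-1}\rvert}$ | Measures the average ratio of error to the average error of the one-step Random Walk method | Yes | Fair | RMSSE | No | - |
| Percent Better (PB) | $\frac{1}{T}\sum_{t=1}^{T} I\{\lvert e_t\rvert<\lvert e_t^{RW}\rvert\}$ | Demonstrates the average number of times that the method outperforms the Random Walk method | Yes | Good | - | No | Not good for calibration or for closely competitive methods. |
| Mean Arctangent Absolute Percentage Error (MAAPE) | $\frac{1}{T}\sum_{t=1}^{T}\arctan\left\lvert\frac{e_t}{y_t}\right\rvert$ | Calculates the average arctangent of absolute percentage error | Yes | Good | MdAAPE | No | Smooths large errors. Solves the division-by-zero problem. |
| Normalized Mean Squared Error (NMSE) | $\frac{\frac{1}{T}\sum_{t=1}^{T} e_t^{2}}{\sigma_y^{2}}$ | Normalized version of MSE: the value of the error is balanced | No | Not Good | NA | No | Error is balanced by dividing by the variance of the real data. |
Md denotes Median; RMS denotes Root Mean Square. The formulas are given in their standard forms, where $y_t$ is the observed value and $x_t$ the predicted value at time $t$, $e_t = y_t - x_t$, $T$ is the number of time points, $e_t^{RW}$ is the error of the Random Walk forecast, $\sigma_y^{2}$ is the variance of the observed data, and $I\{\cdot\}$ is the indicator function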
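As a rough illustration of how the measures in this table behave, the sketch below implements a handful of them in plain Python using their textbook definitions. It is not the paper's Error Measure package: the function names and the sample series are hypothetical, and the Random Walk forecast for week t is assumed to be the observed value of week t-1.

```python
import math


def mae(y, x):
    """Mean Absolute Error between observed y and predicted x."""
    return sum(abs(a - b) for a, b in zip(y, x)) / len(y)


def rmse(y, x):
    """Root Mean Squared Error; penalizes large deviations more than MAE."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, x)) / len(y))


def mape(y, x):
    """Mean Absolute Percentage Error as a fraction; requires nonzero observations."""
    return sum(abs((a - b) / a) for a, b in zip(y, x)) / len(y)


def maape(y, x):
    """Mean Arctangent Absolute Percentage Error; each term is below pi/2."""
    return sum(math.atan(abs((a - b) / a)) for a, b in zip(y, x)) / len(y)


def mase(y, x):
    """MAE scaled by the mean absolute one-step Random Walk error on y."""
    naive = sum(abs(y[i] - y[i - 1]) for i in range(1, len(y))) / (len(y) - 1)
    return mae(y, x) / naive


def percent_better(y, x):
    """Fraction of weeks (from the second on) where x beats the Random Walk."""
    wins = sum(abs(y[t] - x[t]) < abs(y[t] - y[t - 1]) for t in range(1, len(y)))
    return wins / (len(y) - 1)


observed = [100, 120, 150, 130]   # made-up weekly counts
predicted = [110, 115, 140, 135]  # made-up forecasts
print(mae(observed, predicted), rmse(observed, predicted))
print(mape(observed, predicted), maape(observed, predicted))
print(mase(observed, predicted), percent_better(observed, predicted))
```

Because the arctangent of a positive number is always smaller than the number itself, MAAPE never exceeds MAPE on the same data, which is one way to see the smoothing of large errors noted in the table.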
Ranking of methods for predicting peak value based on different error measures for Region 1 over whole season (2013-2014). The color spectrum demonstrates different ranking levels. Dark green represents the best rank, whereas dark orange represents the worst one
| Method | MAE | RMSE | MAPE | sMAPE | MdAPE | MdsAPE | Consensus | Median |
|---|---|---|---|---|---|---|---|---|
| Method 1 | 6 | 6 | 6 | 6 | 5 | 6 | 5.83 | 6 |
| Method 2 | 5 | 5 | 5 | 5 | 2 | 3 | 4.17 | 5 |
| Method 3 | 2 | 3 | 2 | 4 | 3 | 4 | 3.00 | 3 |
| Method 4 | 1 | 1 | 1 | 2 | 1 | 1 | 1.17 | 1 |
| Method 5 | 4 | 4 | 4 | 3 | 6 | 4 | 4.17 | 4 |
| Method 6 | 3 | 2 | 3 | 1 | 3 | 1 | 2.17 | 2.5 |
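The Consensus and Median columns of this table can be reproduced as the arithmetic mean and the median of each method's six per-measure ranks. A minimal sketch, with the ranks copied from the table and hypothetical variable names:

```python
from statistics import median

# Ranks per method under MAE, RMSE, MAPE, sMAPE, MdAPE, MdsAPE (from the table)
ranks = {
    "Method 1": [6, 6, 6, 6, 5, 6],
    "Method 2": [5, 5, 5, 5, 2, 3],
    "Method 3": [2, 3, 2, 4, 3, 4],
    "Method 4": [1, 1, 1, 2, 1, 1],
    "Method 5": [4, 4, 4, 3, 6, 4],
    "Method 6": [3, 2, 3, 1, 3, 1],
}

# Consensus Ranking: mean rank across error measures, rounded as in the table
consensus = {m: round(sum(r) / len(r), 2) for m, r in ranks.items()}
median_rank = {m: median(r) for m, r in ranks.items()}

for method in ranks:
    print(method, consensus[method], median_rank[method])
```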
Fig. 5 HHS region map, based on the “U.S. Department of Health & Human Services” division [32]
Different errors for predicting peak value for Region 1 over whole season (2013-2014)
| Method | MAE | RMSE | MAPE | sMAPE | MdAPE | MdsAPE |
|---|---|---|---|---|---|---|
| Method 1 | 4992.0 | 9838.6 | 4.9 | 1.04 | 1.7 | 1.03 |
| Method 2 | 4825.2 | 9770.4 | 4.7 | 0.99 | 1.4 | 0.95 |
| Method 3 | 3263.0 | 5146.5 | 3.2 | 0.96 | 1.5 | 1.01 |
| Method 4 | 2990.7 | 4651.3 | 2.9 | 0.899 | 1.1 | 0.85 |
| Method 5 | 3523.2 | 5334.8 | 3.4 | 0.95 | 2.1 | 1.01 |
| Method 6 | 3310.9 | 4948.5 | 3.2 | 0.896 | 1.5 | 0.85 |
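The ranks reported for Region 1 appear consistent with sorting the methods by ascending error under each measure and letting tied methods share the better rank. The sketch below shows that ranking rule on three columns of this table; the helper name `competition_rank` is hypothetical and the tie-handling rule is an assumption inferred from the published ranks.

```python
def competition_rank(errors):
    """Rank values ascending (1 = smallest error); ties share the better rank."""
    return [1 + sum(other < value for other in errors) for value in errors]


# Error values for Methods 1 to 6, copied from the table
errors = {
    "MAE": [4992.0, 4825.2, 3263.0, 2990.7, 3523.2, 3310.9],
    "sMAPE": [1.04, 0.99, 0.96, 0.899, 0.95, 0.896],
    "MdsAPE": [1.03, 0.95, 1.01, 0.85, 1.01, 0.85],
}

rank_table = {measure: competition_rank(vals) for measure, vals in errors.items()}
for measure, ranking in rank_table.items():
    print(measure, ranking)
```

With the rounded MAPE values shown in the table, Methods 3 and 6 tie at 3.2, whereas the published ranking separates them; presumably the ranks were computed on unrounded errors.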
Fig. 6 Box-whisker plot showing the Consensus Ranking of forecasting methods in predicting Peak value for Region 1, aggregated over different error measures
Fig. 7 Consensus Ranking of forecasting methods over all error measures for predicting different Epi-features for Region 1. Method 4 is superior in predicting five Epi-features out of eight, but is far behind other methods in predicting the three other Epi-features
Fig. 8 The box-whisker diagrams show the median, mean and variance of the Consensus Ranking of methods over all Epi-features for Region 1
Average Consensus Ranking over different error measures for all Epi-features- Region 1
| Method | Peak value | Peak time | Take-off-value | Take-off-time | ID length | ID start time | Start of flu season | Speed of epidemic | Average | Median |
|---|---|---|---|---|---|---|---|---|---|---|
| M1 | 5.83 | 3.83 | 6 | 1 | 3.33 | 5.67 | 6 | 5.83 | 4.69 | 5.67 |
| M2 | 4.17 | 4.5 | 5 | 2 | 1 | 4.33 | 5.0 | 4.5 | 3.81 | 4.33 |
| M3 | 3 | 2.83 | 3.83 | 3 | 3.33 | 3.17 | 3 | 3.17 | 3.17 | 3.17 |
| M4 | 1.17 | 3.33 | 1.17 | 5 | 4.00 | 1.0 | 1 | 1.17 | 2.23 | 1.17 |
| M5 | 4.17 | 1.17 | 3 | 4 | 4.33 | 4.67 | 3 | 4.17 | 3.56 | 4 |
| M6 | 2.17 | 2.33 | 1.50 | 6 | 4.67 | 2.00 | 1.00 | 1.67 | 2.67 | 2.17 |
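The Average column, and the observation in the Fig. 7 caption that Method 4 leads on five of the eight Epi-features, can both be checked directly from the per-feature Consensus Rankings in this table. A small sketch with hypothetical variable names; a tie for the best rank is counted as a win for each tied method.

```python
features = ["Peak value", "Peak time", "Take-off-value", "Take-off-time",
            "ID length", "ID start time", "Start of flu season", "Speed of epidemic"]

# Consensus Ranking per Epi-feature for Region 1, copied from the table
consensus = {
    "M1": [5.83, 3.83, 6, 1, 3.33, 5.67, 6, 5.83],
    "M2": [4.17, 4.5, 5, 2, 1, 4.33, 5.0, 4.5],
    "M3": [3, 2.83, 3.83, 3, 3.33, 3.17, 3, 3.17],
    "M4": [1.17, 3.33, 1.17, 5, 4.00, 1.0, 1, 1.17],
    "M5": [4.17, 1.17, 3, 4, 4.33, 4.67, 3, 4.17],
    "M6": [2.17, 2.33, 1.50, 6, 4.67, 2.00, 1.00, 1.67],
}

# Average Consensus Ranking over the eight Epi-features
average = {m: sum(r) / len(r) for m, r in consensus.items()}

# Number of Epi-features on which each method attains the best (lowest) rank
wins = {m: 0 for m in consensus}
for j in range(len(features)):
    best = min(r[j] for r in consensus.values())
    for m, r in consensus.items():
        if r[j] == best:
            wins[m] += 1

print(average)
print(wins)
```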
Fig. 9 Consensus Ranking over all Epi-features - Regions 1-6. The box-whisker diagrams show the median, mean and variance of the Consensus Ranking of methods in predicting different Epi-features
Fig. 10 Consensus Ranking over all Epi-features - Regions 7-10. The box-whisker diagrams show the median, mean and variance of the Consensus Ranking of methods in predicting different Epi-features
Fig. 11 Consensus Ranking over all 10 HHS regions. The box-whisker diagrams show the median, mean and variance of the Consensus Ranking of methods in predicting the Epi-features for all HHS regions
Average Consensus Ranking of methods over different Epi-features- Regions 1 - 10
| Method | Region1 | Region2 | Region3 | Region4 | Region5 | Region6 | Region7 | Region8 | Region9 | Region10 | Ave |
|---|---|---|---|---|---|---|---|---|---|---|---|
| M1 | 4.69 | 3.31 | 4.6 | 3.94 | 3.65 | 2.21 | 4.3 | 3.94 | 3.46 | 4.29 | 3.84 |
| M2 | 3.81 | 2.77 | 4.23 | 4.0 | 3.71 | 1.29 | 3.73 | 3.69 | 3.79 | 3.96 | 3.50 |
| M3 | 3.17 | 3.46 | 1.96 | 2.68 | 2.67 | 2.21 | 3.03 | 2.73 | 2.17 | 2.33 | 2.64 |
| M4 | 2.23 | 3.19 | 2.04 | 2.7 | 3.08 | 1.29 | 2.93 | 2.60 | 2.44 | 3.71 | 2.62 |
| M5 | 3.56 | 1.79 | 1.79 | 2.41 | 2.77 | 2.21 | 2.67 | 3.06 | 2.88 | 2.67 | 2.58 |
| M6 | 2.67 | 3.23 | 2.13 | 2.48 | 2.83 | 1.29 | 2.60 | 3.27 | 3.13 | 3.58 | 2.72 |
Fig. 12 Horizon Ranking of six methods for predicting the peak value, calculated based on APE and sAPE, for Region 1
Fig. 13 Horizon Ranking of six methods for predicting the peak time, calculated based on APE and sAPE, for Region 1. Methods 4 and 6 are dominant for the first eight weeks of prediction, and then method 1 takes first place for seven weeks. In the next eight weeks, methods 1, 3, and 5 are jointly superior
Fig. 14 Horizon Ranking of six methods for predicting the Intensity Duration length and start time, calculated based on APE and sAPE, for Region 1
Fig. 15 Horizon Ranking of six methods for predicting the Take-off value and time, calculated based on APE and sAPE, for Region 1
Fig. 16 Horizon Ranking graphs of forecasting methods in predicting Speed of Epidemic and Start of flu season, for Region 1
Fig. 17 Visual comparison of the 1-step-ahead predicted curves generated by six methods vs. the observed curve, Region 1: The first and second methods show larger deviations from the observed curve, especially in the first half of the season. As the six methods are different configurations of one algorithm, their outputs are closely competitive and sometimes similar to each other; methods 3 and 5, and methods 4 and 6, show some similarity in their one-step-ahead epidemic curves, which is consistent with the Horizon Ranking charts for various Epi-features
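List of advanced error measures for aggregating the error values across multiple series
| Measure name | Formula | Description |
|---|---|---|
| Absolute Percentage Error (APE) | $APE_{t,s} = \left\lvert\frac{e_{t,s}}{y_{t,s}}\right\rvert$ | where $e_{t,s} = y_{t,s} - x_{t,s}$ is the error for series $s$ in time horizon $t$ |
| Mean Absolute Percentage Error (MAPE) | $MAPE_t = \frac{1}{S}\sum_{s=1}^{S} APE_{t,s}$ | where $S$ is the number of series |
| Median Absolute Percentage Error (MdAPE) | Median Observation of $APE_{t,s}$ | Obtaining median of APE errors over series. |
| Relative Absolute Error (RAE) | $RAE_{t,s} = \frac{\lvert e_{t,s}\rvert}{\lvert e_{t,s}^{RW}\rvert}$ | Measures the ratio of absolute error to Random Walk error in time horizon t. |
| Geometric Mean Relative Absolute Error (GMRAE) | $GMRAE_t = \left(\prod_{s=1}^{S} RAE_{t,s}\right)^{1/S}$ | Measures the geometric average ratio of absolute error to Random Walk error |
| Median Relative Absolute Error (MdRAE) | Median Observation of $RAE_{t,s}$ | Measures the median observation of $RAE_{t,s}$ |
| Cumulative Relative Error (CumRAE) | $CumRAE_s = \frac{\sum_{t}\lvert e_{t,s}\rvert}{\sum_{t}\lvert e_{t,s}^{RW}\rvert}$ | Ratio of accumulated errors to the cumulative error of the Random Walk method |
| Geometric Mean Cumulative Relative Error (GMCumRAE) | $\left(\prod_{s=1}^{S} CumRAE_s\right)^{1/S}$ | Geometric Mean of Cumulative Relative Error across all series. |
| Median Cumulative Relative Error (MdCumRAE) | Median Observation of $CumRAE_s$ | Median of Cumulative Relative Error across all series. |
| Root Mean Squared Error (RMSE) | $RMSE_t = \sqrt{\frac{1}{S}\sum_{s=1}^{S} e_{t,s}^{2}}$ | Square root of average squared error across series in time horizon t |
| Percent Better (PB) | $PB_t = \frac{1}{S}\sum_{s=1}^{S} I\{\lvert e_{t,s}\rvert<\lvert e_{t,s}^{RW}\rvert\}$ | Demonstrates the average number of times that the method outperforms the Random Walk method in time horizon t. |
The formulas are given in their standard forms, where $y_{t,s}$ and $x_{t,s}$ are the observed and predicted values for series $s$ in time horizon $t$, $e_{t,s}^{RW}$ is the corresponding Random Walk error, and $I\{\cdot\}$ is the indicator function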
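To make the cross-series aggregation concrete, the sketch below computes the Relative Absolute Error of each series at one time horizon and then summarizes it by geometric mean, median and Percent Better. It follows the textual descriptions in the table rather than the paper's own code; the function names, the tuple layout and the numbers are hypothetical, and every error is assumed to be nonzero so that ratios and logarithms are defined.

```python
import math
from statistics import median


def relative_absolute_error(observed, predicted, random_walk):
    """Ratio of the method's absolute error to the Random Walk absolute error."""
    return abs(observed - predicted) / abs(observed - random_walk)


def aggregate_over_series(series):
    """Summarize RAE over several series at one time horizon."""
    raes = [relative_absolute_error(*s) for s in series]
    # Geometric mean computed through logarithms
    gmrae = math.exp(sum(math.log(r) for r in raes) / len(raes))
    mdrae = median(raes)
    # Percent Better: share of series where the method beats the Random Walk
    pb = sum(r < 1 for r in raes) / len(raes)
    return {"GMRAE": gmrae, "MdRAE": mdrae, "PB": pb}


# (observed, method forecast, Random Walk forecast) for three made-up series
series_at_horizon = [(100, 90, 80), (200, 220, 240), (50, 60, 55)]
summary = aggregate_over_series(series_at_horizon)
print(summary)
```

Values of GMRAE or MdRAE below 1 indicate that, in aggregate, the method does better than the Random Walk baseline at that horizon.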
Notation Table II
| Symbol | Definition |
|---|---|
| | Random variable |
| | Probability density function (pdf) of random variable |
| | Mean value for the random variable |
| | Standard deviation for the random variable |
| | Mean value of the samples belonging to random variable |
| | Standard deviation of the samples belonging to random variable |
| | Number of sample set |
| | Random variable |
| | Probability density function (pdf) of random variable |
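Distance functions to measure dissimilarity between the probability density functions of stochastic observations and stochastic predicted outputs
| Distance function | Formula (continuous form) | Formula (discrete form) |
|---|---|---|
| Bhattacharyya | $D_B(p,q) = -\ln BC(p,q)$, $BC(p,q) = \int \sqrt{p(x)\,q(x)}\,dx$ | $D_B(p,q) = -\ln BC(p,q)$, $BC(p,q) = \sum_{i}\sqrt{p_i\,q_i}$ |
| Hellinger | $D_H(p,q) = \sqrt{1 - \int \sqrt{p(x)\,q(x)}\,dx}$ | $D_H(p,q) = \sqrt{1 - \sum_{i}\sqrt{p_i\,q_i}}$ |
| Jaccard | - | |
The Bhattacharyya and Hellinger formulas are given in their standard forms, where $p$ and $q$ denote the pdfs of the observed and predicted random variables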
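For probabilistic forecasts, the first two distance functions listed above can be sketched for discrete distributions as follows. The code uses the textbook definitions based on the Bhattacharyya coefficient; the function names and the example distributions are hypothetical, and both inputs are assumed to be probability vectors over the same bins.

```python
import math


def bhattacharyya_coefficient(p, q):
    """Overlap between two discrete distributions (1 = identical, 0 = disjoint)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def bhattacharyya_distance(p, q):
    """Negative log of the Bhattacharyya coefficient; infinite if no overlap."""
    bc = bhattacharyya_coefficient(p, q)
    return math.inf if bc == 0 else -math.log(bc)


def hellinger_distance(p, q):
    """Hellinger distance, bounded between 0 and 1."""
    # max() guards against tiny negative values caused by rounding
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))


observed_pdf = [0.1, 0.4, 0.4, 0.1]   # made-up distribution of observations
predicted_pdf = [0.2, 0.3, 0.3, 0.2]  # made-up distribution of predictions
print(bhattacharyya_distance(observed_pdf, predicted_pdf))
print(hellinger_distance(observed_pdf, predicted_pdf))
```

Unlike the Bhattacharyya distance, the Hellinger distance stays finite when the two distributions do not overlap, which can make it easier to compare across methods.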
Fig. 18 Comparison of the domain and range spectra of MAPE and sMAPE: The red borders in the left graph (a) belong to the predicted curves x(t)=2×y(t) and x(t)=0×y(t), for which MAPE = 1, and the red borders in the right chart (b) correspond to x(t)=3×y(t) and x(t)=(1/3)×y(t), which generate sMAPE = 1. The black borders in graphs c and d correspond to the predicted epidemic curves which generate MAPE = 2 and sMAPE = 2 in the left and right charts, respectively
Fig. 19 Colored spectrum of the MAPE range: MAPE has no upper bound, which results in forecasts with large overestimates being eliminated
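The boundary cases named in the Fig. 18 caption can be checked numerically. The sketch below assumes MAPE is the mean of |y - x| / y and sMAPE is the mean of |y - x| divided by the average of y and x, which are the forms that reproduce the values quoted in the caption; the function names and the sample curve are hypothetical.

```python
def mape(y, x):
    """Mean absolute error relative to the observed value (as a fraction)."""
    return sum(abs(a - b) / a for a, b in zip(y, x)) / len(y)


def smape(y, x):
    """Mean absolute error relative to the average of observed and predicted."""
    return sum(abs(a - b) / ((a + b) / 2) for a, b in zip(y, x)) / len(y)


observed = [10.0, 20.0, 40.0, 20.0]  # made-up observed curve y(t)


def scaled(factor):
    """Predicted curve x(t) = factor * y(t)."""
    return [factor * value for value in observed]


print(mape(observed, scaled(2)), mape(observed, scaled(0)))       # both equal 1
print(smape(observed, scaled(3)), smape(observed, scaled(1 / 3)))  # both close to 1
print(mape(observed, scaled(100)), smape(observed, scaled(100)))   # 99 vs. below 2
```

Under these definitions, a forecast that overestimates by a factor of 100 yields a MAPE of 99 while sMAPE stays below its upper limit of 2, which illustrates why MAPE treats large overestimates so harshly.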
Different error measures calculated for one-step-ahead epidemic curve over whole season (2013-2014), averaged across all HHS regions: Comparing Methods M1 to M6 and ARIMA approach
| Method | MAE | RMSE | MAPE | sMAPE | MdAPE | MdsAPE |
|---|---|---|---|---|---|---|
| Method 1 | 316.18 | 378.63 | | 0.33 | 0.34 | 0.29 |
| Method 2 | 293.76 | 357.34 | | 0.31 | 0.30 | 0.26 |
| Method 3 | 224.53 | 293.52 | | 0.22 | 0.22 | 0.20 |
| Method 4 | 204.5 | 274.41 | | 0.21 | 0.18 | 0.18 |
| Method 5 | 224.57 | 293.90 | | 0.22 | 0.22 | 0.20 |
| Method 6 | 204.25 | 274.97 | | 0.20 | 0.18 | 0.18 |
| ARIMA | 1015.60 | 1187.62 | | 0.74 | 0.78 | 0.75 |
Fig. 20 The 1-step-ahead predicted curve generated by ARIMA vs. the observed curve: The large gap between the predicted and observed curves shows that ARIMA's performance lags behind the other six approaches, and confirms that a clustering approach based on the MAPE value could be a good criterion for discriminating between methods with markedly different performance