Aji Prasetya Wibawa, Agung Bella Putra Utama, Hakkun Elmunsyah, Utomo Pujianto, Felix Andika Dwiyanto, Leonel Hernandez.
Abstract
CNN originates in image processing and is not commonly used as a forecasting technique in time-series analysis, where performance depends on the quality of the input data. One way to improve that quality is to smooth the data. This study introduces a novel hybrid of exponential smoothing and CNN, called Smoothed-CNN (S-CNN). Such combinations of tactics tend to outperform most individual solutions in forecasting. S-CNN was compared with the original CNN method and with other forecasting methods, namely the Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM). The dataset is a one-year time series of daily website visitors. Since there are no firm rules for choosing the number of hidden layers, the Lucas numbers were used. The results show that S-CNN outperforms MLP and LSTM, with a best MSE of 0.012147693 using 76 hidden layers at an 80%:20% data composition.
Keywords: CNN; Exponential smoothing; Optimum smoothing factor; Time-series
Year: 2022 PMID: 35495076 PMCID: PMC9040363 DOI: 10.1186/s40537-022-00599-y
Source DB: PubMed Journal: J Big Data ISSN: 2196-1115
Types of data smoothing
| Method | Advantages | Disadvantages |
|---|---|---|
| Exponential smoothing/simple exponential smoothing | • Ease of calculation • Flexibility • Good performance | • Not capable of managing trends well |
| Moving average | • Best used when there is slight or no seasonal variation | • Might not accurately reflect the most recent trends |
| Random walk | • Simple to use • Can easily handle flows around complicated boundaries | • Does not precisely conserve the mean position of the vorticity in free space • The computed solutions are noisy due to the statistical errors |
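Simple exponential smoothing replaces each observation with a weighted average of the current value and the previous smoothed value, s_t = α·x_t + (1 − α)·s_{t−1}. A minimal sketch in Python (the function name and the example series are illustrative, not taken from the paper):

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]  # initialise with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

# Example: a short, noisy daily-visitor-like series, using the paper's alpha = 0.57
raw = [120, 150, 130, 170, 160]
print(exponential_smoothing(raw, 0.57))
```

A larger α tracks the raw series more closely; a smaller α smooths more aggressively, which is the trade-off the paper tunes via the optimum smoothing factor.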
Related Works
| Title | Year | Method | Summary |
|---|---|---|---|
| Convolutional Neural Network–Component Transformation (CNN–CT) for confirmed COVID-19 cases | 2021 | CNN-CT (ARIMA and ES) | The combination of strategies outperformed most individual methods |
| A comparison between Seasonal Autoregressive Integrated Moving Average (SARIMA) and Exponential Smoothing (ES) based on Time Series Model for forecasting road accidents | 2021 | SARIMA and ES | The ES model outperformed the SARIMA model in terms of mean absolute error, root mean square error, mean absolute percentage error, and normalized Bayesian information criterion |
| On short-term load forecasting using machine learning techniques and a Novel Parallel Deep LSTM-CNN approach | 2021 | ARIMA, ES, Linear Regression, SVR, DNN, LSTM, LSTM-CNN, PLCNet | ARIMA and ES are two well-known time-series analysis approaches that require some parameter adjustment to perform well |
| A study on the prediction of power demand for electric vehicles using exponential smoothing techniques | 2021 | ES and ARIMA | ES is 9% more accurate than ARIMA as a model for electric-vehicle power-demand prediction |
| Smoothing and stationarity enforcement framework for deep learning time-series forecasting | 2021 | ES and CNN-LSTM | ES increases deep-learning forecasting performance |
| A hybrid method of exponential smoothing and recurrent neural networks for time series forecasting | 2020 | ES-RNN | In the winning hybrid method, ES is used for data deseasonalization, normalization, and extrapolation |
| Forecasting time series with multiplicative trend exponential smoothing and LSTM: COVID-19 case study | 2020 | MTES and LSTM | MTES outperformed LSTM in terms of RMSE |
Fig. 1 Experimental design of Smoothed-CNN (S-CNN) with optimum α
Dataset Description
| | Dataset 1 | Dataset 2 | Dataset 3 | Dataset 4 |
|---|---|---|---|---|
| Dataset characteristics | Multivariate | Multivariate | Multivariate | Multivariate |
| Attribute characteristics | Numeric | Numeric | Numeric | Numeric |
| Instances | 365 | 365 | 365 | 365 |
| Attributes | 4 (sessions, page views, visitors, new visitors) | 4 (page loads, unique visits, first-time visits, returning visits) | 2 (sell, target) | 32 cities in India |
| Missing data | No | No | No | No |
| Selected attribute | Sessions | Unique visits | Sell | Delhi |
| Sources | | | | |
Fig. 2 Training–testing data composition
Fig. 3 Time-series component of optimum α
Fig. 4 Sequence of Fibonacci and Lucas numbers
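The hidden-layer counts tested below (3, 4, 7, 11, 18, 29, 47, 76) are consecutive Lucas numbers, which follow the Fibonacci recurrence but start from 2 and 1. A short sketch (the function name is illustrative):

```python
def lucas_numbers(n):
    """Return the first n Lucas numbers: 2, 1, 3, 4, 7, 11, 18, ..."""
    seq = [2, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]

# The paper's hidden-layer counts correspond to Lucas numbers L2..L9
print(lucas_numbers(10)[2:])  # [3, 4, 7, 11, 18, 29, 47, 76]
```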
CNN forecast component parameters
| Category | Parameter | Value |
|---|---|---|
| Convolutional layer | Type of convolution | Conv1D |
| | Number of convolutional layers | 3 |
| | Number of filters | 128 |
| | Filter size | 2 |
| | Activation function | ReLU |
| Pooling layer | Type of pooling | MaxPooling1D |
| | Number of pooling layers | 1 |
| | Size of the pooling window | 2 |
| Flatten layer | Number of flatten layers | 1 |
| Fully connected layer | Number of hidden layers | Lucas number |
| | Number of units (neurons) | 100 |
| | Output activation function | ReLU |
| | Loss function | MSE |
| | Optimizer | Adam |
| | Number of epochs | 10,000 |
| | Batch size | 16 |
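The convolution and pooling operations parameterised above can be illustrated with a bare-bones NumPy forward pass. This is a sketch of the operations only, not the paper's implementation: it uses a single hand-picked filter instead of 128 learned ones, and the example series is invented.

```python
import numpy as np

def conv1d(x, kernel):
    """Valid 1-D convolution (cross-correlation) with a single filter."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

def relu(x):
    return np.maximum(x, 0)

def max_pool1d(x, window=2):
    """Non-overlapping max pooling, as in MaxPooling1D with pool size 2."""
    n = len(x) // window
    return x[:n * window].reshape(n, window).max(axis=1)

series = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
feature_map = relu(conv1d(series, np.array([0.5, -0.5])))  # filter size 2
pooled = max_pool1d(feature_map, window=2)                 # pooling window 2
print(pooled)
```

In the paper's model this feature map would then be flattened and passed through the Lucas-numbered stack of dense ReLU layers.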
MSE of CNN and S-CNN (α = 0.57) in all scenarios
| Number of Hidden Layers | Scenario 1 CNN | Scenario 1 S-CNN | Scenario 2 CNN | Scenario 2 S-CNN |
|---|---|---|---|---|
| 3 | 0.039486471 | 0.036637854 | 0.015127435 | 0.015027847 |
| 4 | 0.038327266 | 0.035072528 | 0.022477422 | 0.020706612 |
| 7 | 0.026439827 | 0.022077946 | 0.019107487 | 0.017952291 |
| 11 | 0.025968207 | 0.025052203 | 0.029150361 | 0.023890096 |
| 18 | 0.024530865 | 0.020869428 | 0.017511018 | 0.017088791 |
| 29 | 0.026677929 | 0.025410001 | 0.017774397 | 0.016944807 |
| 47 | 0.026606092 | 0.026262877 | 0.013246628 | 0.013057781 |
| 76 | 0.026771594 | 0.020868076 | 0.013227105 | 0.012147693 |
| Average | 0.029351031 | 0.026531364 | 0.018452732 | 0.017101990 |
Training time (s) of CNN and S-CNN (α = 0.57) in all scenarios
| Number of Hidden Layers | Scenario 1 CNN | Scenario 1 S-CNN | Scenario 2 CNN | Scenario 2 S-CNN |
|---|---|---|---|---|
| 3 | 1401 | 1221 | 1901 | 1671 |
| 4 | 1451 | 1391 | 2021 | 1722 |
| 7 | 1551 | 1421 | 2121 | 1941 |
| 11 | 1650 | 1470 | 2380 | 2120 |
| 18 | 1810 | 1711 | 2630 | 2420 |
| 29 | 2120 | 1950 | 2840 | 2631 |
| 47 | 2601 | 2451 | 3241 | 2941 |
| 76 | 3521 | 3410 | 3641 | 3351 |
| Average | 2013 | 1878 | 2597 | 2350 |
MAPE of CNN and S-CNN (α = 0.57) in all scenarios
| Number of Hidden Layers | Scenario 1 CNN | Scenario 1 S-CNN | Scenario 2 CNN | Scenario 2 S-CNN |
|---|---|---|---|---|
| 3 | 10.38339615 | 10.15580773 | 10.32497287 | 9.86823878 |
| 4 | 10.63442349 | 9.45147180 | 10.23149490 | 10.26693096 |
| 7 | 10.55925727 | 9.48089667 | 9.29571771 | 10.58340192 |
| 11 | 10.44860005 | 10.49708843 | 10.37959695 | 10.50005768 |
| 18 | 10.86777925 | 10.29217958 | 10.22896290 | 10.13189130 |
| 29 | 10.48959136 | 10.39933443 | 10.74029326 | 9.49165793 |
| 47 | 10.60612321 | 9.936217308 | 10.73232293 | 9.50004619 |
| 76 | 10.66244721 | 10.57700276 | 10.75060725 | 9.70475220 |
| Average | 10.58145225 | 10.09874984 | 10.33549610 | 10.00587208 |
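The MSE and MAPE figures in these tables follow the standard definitions; a minimal sketch (function names and example values are illustrative):

```python
def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; assumes no zero actuals."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 330.0]
print(mse(y_true, y_pred), mape(y_true, y_pred))
```

MSE penalises large deviations quadratically, while MAPE is scale-free, which is why the paper reports both.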
Fig. 5 The forecasting results of CNN, scenario 1
Fig. 6 The forecasting results of S-CNN, scenario 1
Fig. 7 Comparison of CNN and S-CNN with scenario 1: a MSE; b Processing time
Fig. 8 The forecasting results of CNN, scenario 2
Fig. 9 The forecasting results of S-CNN, scenario 2
Fig. 10 Comparison of CNN and S-CNN with scenario 2: a MSE; b Processing time
Performance comparison using various values of the smoothing factor
| Alpha (α) | MSE Scenario 1 | MSE Scenario 2 | MAPE Scenario 1 | MAPE Scenario 2 | Training Time (s) Scenario 1 | Training Time (s) Scenario 2 |
|---|---|---|---|---|---|---|
| 0.1 | 0.041253364 | 0.037273699 | 10.53615943 | 10.38117275 | 1931 | 2565 |
| 0.2 | 0.040030418 | 0.036722839 | 10.52603677 | 10.40674120 | 1773 | 2459 |
| 0.3 | 0.037012976 | 0.032575541 | 10.47502294 | 10.42701104 | 1808 | 2354 |
| 0.4 | 0.033451774 | 0.028436491 | 10.27203577 | 10.21281549 | 1908 | 2470 |
| 0.5 | 0.028844447 | 0.019506374 | 10.26041789 | 10.04790929 | 1625 | 2622 |
| 0.6 | 0.028121199 | 0.019009872 | 10.32034706 | 10.23355049 | 1766 | 2466 |
| 0.7 | 0.027940839 | 0.019117107 | 10.20982302 | 10.12943395 | 1968 | 2285 |
| 0.8 | 0.028185234 | 0.020425496 | 10.32522286 | 10.22963096 | 1648 | 2566 |
| 0.9 | 0.028846332 | 0.018137853 | 10.35108365 | 10.13430657 | 1728 | 2444 |
| 0.57 | 0.026531364 | 0.01710199 | 10.09874984 | 10.00587208 | 1878 | 2350 |
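Picking the smoothing factor can be framed as a one-dimensional search: smooth the series with each candidate α, evaluate a forecast error, and keep the α with the lowest error. A toy sketch that scores each α by one-step-ahead error of the smoothed series itself (this stands in for the paper's full CNN pipeline; names and the example series are illustrative):

```python
def exponential_smoothing(series, alpha):
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

def one_step_mse(series, alpha):
    """MSE of using s_{t-1} as the forecast of x_t after smoothing."""
    s = exponential_smoothing(series, alpha)
    errors = [(x - s_prev) ** 2 for x, s_prev in zip(series[1:], s[:-1])]
    return sum(errors) / len(errors)

series = [120, 150, 130, 170, 160, 180, 165, 190]
candidates = [a / 100 for a in range(10, 100)]
best_alpha = min(candidates, key=lambda a: one_step_mse(series, a))
print(round(best_alpha, 2))
```

The paper performs the analogous sweep with the trained S-CNN as the evaluator, arriving at α = 0.57.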
Paired t-test result based on Lucas hidden layers
| | MSE Scenario 1 CNN | MSE Scenario 1 S-CNN | MSE Scenario 2 CNN | MSE Scenario 2 S-CNN | MAPE Scenario 1 CNN | MAPE Scenario 1 S-CNN | MAPE Scenario 2 CNN | MAPE Scenario 2 S-CNN | Time Scenario 1 CNN | Time Scenario 1 S-CNN | Time Scenario 2 CNN | Time Scenario 2 S-CNN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lucas hidden layers | 0.0312 | 0.0312 | 0.0311 | 0.0311 | 0.0709 | 0.0559 | 0.0608 | 0.0604 | 0.0001 | 0.0001 | 0.0001 | 0.0001 |
Paired t-test result based on alpha
| | MSE Scenario 1 | MSE Scenario 2 | MAPE Scenario 1 | MAPE Scenario 2 | Training Time (s) Scenario 1 | Training Time (s) Scenario 2 |
|---|---|---|---|---|---|---|
| Alpha (α) | 0.0003 | 0.0003 | 0.0001 | 0.0001 | 0.0001 | 0.0001 |
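The paired t-test compares matched measurements, e.g. CNN vs. S-CNN errors over the same hidden-layer settings. A self-contained sketch of the test statistic, t = mean(d) / (sd(d)/√n) with n − 1 degrees of freedom (the p-value lookup, omitted here, is what `scipy.stats.ttest_rel` provides; the sample values are the Scenario 1 MSE columns rounded to four decimals):

```python
import math

def paired_t_statistic(a, b):
    """t = mean(d) / (sd(d) / sqrt(n)) for paired samples a, b."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n)

cnn = [0.0394, 0.0383, 0.0264, 0.0259, 0.0245, 0.0266, 0.0266, 0.0267]
s_cnn = [0.0366, 0.0350, 0.0220, 0.0250, 0.0208, 0.0254, 0.0262, 0.0208]
print(round(paired_t_statistic(cnn, s_cnn), 3))
```

A positive t here indicates the CNN errors are systematically larger than the paired S-CNN errors, consistent with the significant p-values in the table.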
Forecasting comparison
| Method | Scenario 1 (70%:30%) MSE | Scenario 1 MAPE | Scenario 1 Training Time (s) | Scenario 2 (80%:20%) MSE | Scenario 2 MAPE | Scenario 2 Training Time (s) |
|---|---|---|---|---|---|---|
| MLP | 0.642934239 | 11.50568091 | 1340 | 0.635622626 | 10.94862916 | 1637 |
| LSTM | 1.701240209 | 11.39517313 | 2498 | 0.835000882 | 10.60269367 | 2927 |
| CNN | 0.029351031 | 10.58145225 | 2013 | 0.018452732 | 10.33549610 | 2597 |
| S-CNN | 0.026531364 | 10.09874984 | 1878 | 0.017101990 | 10.00587208 | 2350 |
Comparison with other datasets
| Method | MSE Dataset 2 | MSE Dataset 3 | MSE Dataset 4 | MAPE Dataset 2 | MAPE Dataset 3 | MAPE Dataset 4 | Training Time (s) Dataset 2 | Training Time (s) Dataset 3 | Training Time (s) Dataset 4 |
|---|---|---|---|---|---|---|---|---|---|
| MLP | 1.846002201 | 5.650526501 | 0.910128122 | 10.98622048 | 9.98928011 | 9.92809521 | 3332 | 2720 | 3214 |
| LSTM | 1.844627951 | 5.597686482 | 0.577352403 | 10.98383963 | 9.98898726 | 9.93592591 | 3041 | 3263 | 3996 |
| CNN | 0.126788296 | 0.194362491 | 0.145794914 | 10.79435834 | 7.41526792 | 10.81731200 | 3911 | 3260 | 3420 |
| S-CNN | 0.100307581 | 0.170961120 | 0.101014294 | 10.60502933 | 7.30898141 | 10.68965315 | 3870 | 2660 | 3403 |