Jihoon Moon, Sungwoo Park, Seungmin Rho, Eenjun Hwang.
Abstract
Daily peak load forecasting (DPLF) and total daily load forecasting (TDLF) are essential for optimal power system operation from one day to one week ahead. This study develops a Cubist-based incremental learning model to perform accurate and interpretable DPLF and TDLF. To this end, we employ time-series cross-validation to effectively reflect recent electrical load trends and patterns when constructing the model. We also analyze variable importance to identify the most crucial factors in the Cubist model. In the experiments, we used two publicly available building datasets and three educational building cluster datasets. The results showed that the proposed model yielded averages of 7.77 and 10.06 in mean absolute percentage error and coefficient of variation of the root mean square error, respectively. We also confirmed that temperature and holiday information are significant external factors, and that electrical loads from one day and one week earlier are significant internal factors.
Year: 2022 PMID: 35178079 PMCID: PMC8847022 DOI: 10.1155/2022/6892995
Source DB: PubMed Journal: Comput Intell Neurosci
Summary of recent STLF studies based on ML techniques.
| Author (Year) | Dataset | Granularity | ML method | Rolling procedure | Model interpretability |
|---|---|---|---|---|---|
| Lee and Han [ | South Korea provided by Korea Power Exchange (KPX) | Daily peak load | MLR | Yes | Yes |
| Fan et al. [ | Australian Energy Market Operator (AEMO) | 8 h | KNN | No | No |
| Dong et al. [ | Qingdao City in China | 1 h | Bagging | No | No |
| Sun et al. [ | Tai'an City, Shandong Province in China | 1 h | ANN | No | No |
| Truong et al. [ | Residential building with a renewable energy system | 1 h | AANN | No | No |
| Fan et al. [ | New South Wales (NSW) in Australia | 30 min | EMD | No | Yes |
| Zhang et al. [ | Queensland (QLD) in Australia | 30 min | VMD | Yes | No |
| Bouktif et al. [ | Metropolitan France | 30 min | GA | Yes | Yes |
| Wang et al. [ | University campus in Florida | 1 h | RF | No | Yes |
| Ruiz-Abellón et al. [ | University campus in Spain | 1 h | Bagging | No | Yes |
| Abbasi et al. [ | AEMO | 30 min | XGB | No | Yes |
| Zhang et al. [ | More than 1,400 enterprises in Yangzhong High-Tech Zone, China | Daily | K-means clustering | No | Yes |
Figure 1. Architecture of interpretable short-term electrical load forecasting model (DA: day-ahead, WA: week-ahead, DPLF: daily peak load forecasting, and TDLF: total daily load forecasting).
Figure 2. Data preprocessing for Cubist modeling (DA: day-ahead, WA: week-ahead, DPLF: daily peak load forecasting, and TDLF: total daily load forecasting).
Building information.
| Dataset # | Number of buildings | Building type (description) | Location | Dataset period | Public access |
|---|---|---|---|---|---|
| Building 1 | 1 | Commercial (office) | Richland, Washington | Jan. 2, 2009–Dec. 31, 2011 | Yes |
| Building 2 | 1 | Commercial (office) | Richland, Washington | Jan. 2, 2009–Dec. 31, 2011 | Yes |
| Cluster 1 | 16 | Educational (dormitory) | Seoul, South Korea | Jan. 1, 2016–Dec. 31, 2018 | No |
| Cluster 2 | 32 | Educational (humanities bldg.) | Seoul, South Korea | Jan. 1, 2016–Dec. 31, 2018 | No |
| Cluster 3 | 5 | Educational (engineering bldg.) | Seoul, South Korea | Jan. 1, 2016–Dec. 31, 2018 | No |
Statistics on daily peak electrical load data (unit: kW).
| Statistics | Building 1 | Building 2 | Cluster 1 | Cluster 2 | Cluster 3 |
|---|---|---|---|---|---|
| Number of valid cases | 1094 | 1094 | 1096 | 1096 | 1096 |
| Mean | 49.16 | 54.46 | 1575.94 | 4132.02 | 2606.01 |
| Standard deviation | 21.69 | 21.15 | 308.91 | 1327.07 | 451.57 |
| Trimmed mean | 50.40 | 56.23 | 1552.32 | 4325.76 | 2623.04 |
| Median | 48.59 | 54.52 | 1561.97 | 4176.24 | 2670.00 |
| Median absolute deviation | 19.87 | 18.52 | 321.31 | 1537.16 | 437.66 |
| Minimum | 8.86 | 10.97 | 878.40 | 1426.56 | 1579.20 |
| Maximum | 141.11 | 135.00 | 2623.68 | 6900.48 | 3549.60 |
| Range | 132.25 | 124.03 | 1745.28 | 5473.92 | 1970.40 |
| Skew | 0.34 | 0.05 | 0.42 | –0.25 | –0.31 |
| Kurtosis | 0.43 | 0.09 | –0.15 | –1.03 | –0.71 |
| Standard error | 0.66 | 0.64 | 9.33 | 40.09 | 13.66 |
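Several rows of this table can be cross-checked against the others: the range is the maximum minus the minimum, and the standard error is the standard deviation divided by the square root of the number of valid cases. The sketch below shows these relations together with a trimmed mean and a scaled median absolute deviation. The 10% trimming fraction and the 1.4826 scaling constant are assumptions (common defaults in statistical software), not values stated in the table, and the function names are illustrative.

```python
import math
import statistics


def standard_error(std_dev, n_cases):
    """Standard error of the mean: SD / sqrt(n)."""
    return std_dev / math.sqrt(n_cases)


def trimmed_mean(values, trim=0.1):
    """Mean after dropping a fraction `trim` of points from each tail (assumed 10%)."""
    data = sorted(values)
    k = int(len(data) * trim)
    kept = data[k:len(data) - k] if k > 0 else data
    return sum(kept) / len(kept)


def scaled_mad(values, constant=1.4826):
    """Median absolute deviation times an assumed consistency constant."""
    center = statistics.median(values)
    deviations = [abs(v - center) for v in values]
    return constant * statistics.median(deviations)


# Cross-check against the Building 1 column: SD 21.69, 1094 valid cases.
print(round(standard_error(21.69, 1094), 2))  # 0.66
print(round(141.11 - 8.86, 2))                # 132.25
```

Both printed values agree with the Standard error and Range rows reported for Building 1.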
Statistics on total daily electrical load data (unit: kW).
| Statistics | Building 1 | Building 2 | Cluster 1 | Cluster 2 | Cluster 3 |
|---|---|---|---|---|---|
| Number of valid cases | 1094 | 1094 | 1096 | 1096 | 1096 |
| Mean | 719.31 | 852.63 | 29802.41 | 62563.10 | 49440.52 |
| Standard deviation | 250.53 | 290.95 | 5350.39 | 17167.95 | 6409.23 |
| Trimmed mean | 723.27 | 850.15 | 29390.76 | 62514.48 | 49506.54 |
| Median | 714.88 | 844.97 | 29569.92 | 62872.23 | 49771.80 |
| Median absolute deviation | 229.40 | 244.27 | 5407.28 | 20269.87 | 6802.47 |
| Minimum | 198.22 | 242.09 | 19013.04 | 27961.44 | 32546.40 |
| Maximum | 1527.30 | 2130.50 | 49235.76 | 98475.84 | 64403.70 |
| Range | 1329.08 | 1888.41 | 30222.72 | 70514.40 | 31857.30 |
| Skew | 0.16 | 0.44 | 0.45 | –0.13 | –0.11 |
| Kurtosis | –0.32 | 0.98 | –0.03 | –1.06 | –0.64 |
| Standard error | 7.57 | 8.80 | 161.61 | 518.58 | 193.86 |
Residual standard error and R-squared statistics for daily peak electrical load data.
| Statistics | Building 1 (1D) | Building 1 (2D) | Building 2 (1D) | Building 2 (2D) | Cluster 1 (1D) | Cluster 1 (2D) | Cluster 2 (1D) | Cluster 2 (2D) | Cluster 3 (1D) | Cluster 3 (2D) |
|---|---|---|---|---|---|---|---|---|---|---|
| Residual standard error (unit: kW) | 17.7 | 17.1 | 17.4 | 16.5 | 285.8 | 271.4 | 1128 | 1121 | 365.9 | 351.9 |
| Multiple R-squared (unit: %) | 33.9 | 38.6 | 32.6 | 39.7 | 14.6 | 23.2 | 27.9 | 29.1 | 36.4 | 41.3 |
| Adjusted R-squared (unit: %) | 33.7 | 38.2 | 32.4 | 39.4 | 14.4 | 22.8 | 27.7 | 28.7 | 36.2 | 41.0 |
Residual standard error and R-squared statistics for total daily electrical load data.
| Statistics | Building 1 (1D) | Building 1 (2D) | Building 2 (1D) | Building 2 (2D) | Cluster 1 (1D) | Cluster 1 (2D) | Cluster 2 (1D) | Cluster 2 (2D) | Cluster 3 (1D) | Cluster 3 (2D) |
|---|---|---|---|---|---|---|---|---|---|---|
| Residual standard error (unit: kW) | 211 | 203 | 256 | 228 | 4958 | 4698 | 14680 | 14670 | 5699 | 5417 |
| Multiple R-squared (unit: %) | 29.3 | 34.9 | 22.9 | 39.0 | 14.4 | 23.3 | 27.1 | 27.4 | 27.4 | 34.6 |
| Adjusted R-squared (unit: %) | 29.1 | 34.6 | 22.7 | 38.7 | 14.1 | 22.9 | 26.9 | 27.0 | 27.2 | 34.2 |
Figure 3. Example of mid-term forecast provided by the KMA.
Input variables for day-ahead and week-ahead forecasts.
| IV # | Input variables for day-ahead forecasting | Input variables for week-ahead forecasting | Variable type |
|---|---|---|---|
| 01 | Month_x | Month_x | Continuous [–1, 1] |
| 02 | Month_y | Month_y | Continuous [–1, 1] |
| 03 | Day_x | Day_x | Continuous [–1, 1] |
| 04 | Day_y | Day_y | Continuous [–1, 1] |
| 05 | Day of the week_x | Day of the week_x | Continuous [–1, 1] |
| 06 | Day of the week_y | Day of the week_y | Continuous [–1, 1] |
| 07 | Holiday | Holiday | Binary |
| 08 | Minimum temperature | Minimum temperature | Continuous |
| 09 | Average temperature | Average temperature | Continuous |
| 10 | Maximum temperature | Maximum temperature | Continuous |
| 11 | Holiday (the day before seven days) | Holiday (the day before four weeks) | Binary |
| 12 | Electrical load (the day before seven days) | Electrical load (the day before four weeks) | Continuous |
| 13 | Holiday (the day before six days) | Holiday (the day before three weeks) | Binary |
| 14 | Electrical load (the day before six days) | Electrical load (the day before three weeks) | Continuous |
| 15 | Holiday (the day before five days) | Holiday (the day before two weeks) | Binary |
| 16 | Electrical load (the day before five days) | Electrical load (the day before two weeks) | Continuous |
| 17 | Holiday (the day before four days) | Holiday (the day before one week) | Binary |
| 18 | Electrical load (the day before four days) | Electrical load (the day before one week) | Continuous |
| 19 | Holiday (the day before three days) | — | Binary |
| 20 | Electrical load (the day before three days) | — | Continuous |
| 21 | Holiday (the day before two days) | — | Binary |
| 22 | Electrical load (the day before two days) | — | Continuous |
| 23 | Holiday (the day before one day) | — | Binary |
| 24 | Electrical load (the day before one day) | — | Continuous |
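The [–1, 1] range of the x/y calendar variables is consistent with mapping each periodic field onto a circle, and the remaining rows pair a holiday flag with the load at each lag. The following is a minimal sketch under stated assumptions: x is taken as the sine and y as the cosine of the angle, the period for the day of the month is the length of that month, and all function and key names are hypothetical. None of these choices is confirmed by the table itself.

```python
import calendar
import math
from datetime import date


def cyclic_pair(value, period):
    """Map a periodic value onto the unit circle; both outputs lie in [-1, 1]."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def calendar_features(d):
    """Assumed encoding of input variables 01-06 for one date."""
    month_x, month_y = cyclic_pair(d.month, 12)
    days_in_month = calendar.monthrange(d.year, d.month)[1]
    day_x, day_y = cyclic_pair(d.day, days_in_month)
    dow_x, dow_y = cyclic_pair(d.isoweekday(), 7)
    return {
        "month_x": month_x, "month_y": month_y,
        "day_x": day_x, "day_y": day_y,
        "dow_x": dow_x, "dow_y": dow_y,
    }


# Lags in days, following the two columns of the table.
DAY_AHEAD_LAGS = [7, 6, 5, 4, 3, 2, 1]   # seven days before ... one day before
WEEK_AHEAD_LAGS = [28, 21, 14, 7]        # four weeks before ... one week before


def lagged_inputs(loads, holidays, index, lags):
    """Return holiday flag and load for each lag, in table order.

    `index` must be at least max(lags) so that every lag exists.
    """
    row = []
    for lag in lags:
        row.append(holidays[index - lag])
        row.append(loads[index - lag])
    return row
```

With these lag lists, a day-ahead row carries 14 lagged values and a week-ahead row carries 8, matching the row counts in the table.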
Figure 4. Flowchart of interpretable electrical load forecasting based on Cubist modeling (DA: day-ahead, WA: week-ahead, DPLF: daily peak load forecasting, and TDLF: total daily load forecasting).
Figure 5. Time-series cross-validation for day-ahead forecasting and week-ahead forecasting. (a) Day-ahead forecasting. (b) Week-ahead forecasting.
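One common way to realize time-series cross-validation is an expanding window: the model is refit on all data observed so far and evaluated on the next block, which then joins the training set. The sketch below assumes that scheme, with a one-day block for day-ahead forecasting and a seven-day block for week-ahead forecasting. The caption does not say whether the window expands or slides, and the function name and arguments are hypothetical.

```python
def expanding_window_splits(n_samples, initial_train, horizon):
    """Yield (train_indices, test_indices) pairs in chronological order.

    The training range always starts at 0 and grows by `horizon` each round,
    so no test point ever precedes the data used to fit the model.
    """
    start = initial_train
    while start < n_samples:
        stop = min(start + horizon, n_samples)
        yield range(0, start), range(start, stop)
        start = stop


# Day-ahead style: refit before every single test day.
for train, test in expanding_window_splits(n_samples=10, initial_train=6, horizon=1):
    print(len(train), list(test))
```

Calling the same function with `horizon=7` gives week-long test blocks, which is one plausible reading of panel (b).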
List of the hyperparameters used to build optimal forecasting models.
| Methods | Package | Hyperparameters and their range |
|---|---|---|
| MLR [ | lm | None (automatic identification) |
| PLS [ | pls, caret | ncomp (vector of positive integers): 1 to one less than the number of input variables |
| MARS [ | earth, caret | degree (maximum degree of interactions): 1 to 3 |
| | | nprune (number of terms retained in the final model): 2, 13, 24, 35, 46, 56, 67, 78, 89, 100 |
| KNN [ | caret | k (number of neighbors): 2 |
| SVR [ | kernlab, caret | sigma (sigma): 0.35, 0.4, 0.1 |
| | | C (cost): 1, 3, 5, 8, 10, 12 |
| DT [ | rpart | maxdepth (maximum depth of any node of the final tree): automatic identification |
| Bagging [ | ipred | None (automatic identification) |
| RF [ | randomForest | mtry (number of variables randomly chosen at each split): number of input variables divided by 3 |
| | | ntree (number of trees to grow): 128 |
| GBM [ | gbm | n.trees (number of trees to grow): 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000 |
| | | interaction.depth (maximum depth of each tree): 5 |
| | | shrinkage (shrinkage or learning rate parameter): 0.001 |
| | | bag.fraction (subsampling rate): 0.5 |
| XGB [ | xgboost, caret | nrounds (number of trees to grow): 50, 100, 250, 500 |
| | | eta (shrinkage or learning rate parameter): 0.01, 0.1, 1 |
| | | lambda (L2 regularization term on weights): 0.1, 0.5, 1 |
| | | alpha (L1 regularization term on weights): 0.1, 0.5, 1 |
| CatBoost [ | catboost, caret | learning_rate (shrinkage or learning rate parameter): 0.03, 0.1 |
| | | depth (maximum depth of each tree): 4, 6, 10 |
| | | l2_leaf_reg (coefficient at the L2 regularization term of the cost function): 1, 3, 5, 7, 9 |
| Cubist [ | Cubist | committees (sequence generation of rule-based models (similar to boosting)): 1, 10, 50, 100 |
| | | neighbors (single integer value to adjust the rule-based predictions from the training set): 0, 1, 5, 9 |
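To make the size of a search space concrete, the listed ranges can be expanded into a full grid of candidate settings. The sketch below does this for the two Cubist hyperparameters only. It assumes an exhaustive grid over the listed values, which the table does not state, and it is written in Python purely for illustration, whereas the packages named in the table are R packages. `expand_grid` is a hypothetical helper.

```python
from itertools import product

# Values copied from the Cubist rows of the table.
cubist_grid = {
    "committees": [1, 10, 50, 100],
    "neighbors": [0, 1, 5, 9],
}


def expand_grid(grid):
    """Return every combination of the listed values as a list of dicts."""
    names = list(grid)
    combos = product(*(grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


candidates = expand_grid(cubist_grid)
print(len(candidates))  # 16
```

Under this assumption, each of the 16 candidates would be fitted and scored, and the setting with the lowest validation error retained.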
MAPE comparison of DA-DPLF (%).
| Methods | Building 1 (Holdout) | Building 1 (TSCV) | Building 2 (Holdout) | Building 2 (TSCV) | Cluster 1 (Holdout) | Cluster 1 (TSCV) | Cluster 2 (Holdout) | Cluster 2 (TSCV) | Cluster 3 (Holdout) | Cluster 3 (TSCV) |
|---|---|---|---|---|---|---|---|---|---|---|
| MLR | 21.15 | 20.88 | 18.27 | 17.95 | 5.55 | 5.55 | 8.91 | 8.95 | 3.68 | 3.81 |
| PLS | 21.10 | 33.30 | 18.45 | 28.21 | 5.51 | 10.77 | 8.95 | 22.90 | 3.70 | 9.15 |
| MARS | 20.32 | 18.83 | 18.08 | 16.28 | 4.96 | 4.70 | 5.90 | 5.95 | 2.83 | 2.87 |
| KNN | 22.35 | 22.15 | 19.97 | 19.49 | 6.65 | 6.02 | 9.49 | 8.80 | 3.98 | 3.87 |
| SVR | 22.17 | 20.82 | 17.37 | 15.95 | 5.94 | | 7.42 | 6.56 | 2.81 | |
| DT | 22.11 | 23.20 | 26.04 | 23.10 | 8.71 | 8.05 | 12.07 | 11.56 | 5.22 | 5.05 |
| Bagging | 23.08 | 22.11 | 22.13 | 20.56 | 7.30 | 6.58 | 10.91 | 10.39 | 4.67 | 4.43 |
| RF | 19.72 | 17.78 | 18.42 | 15.42 | 5.90 | 5.12 | 6.45 | 5.99 | 2.98 | 2.63 |
| GBM | 18.17 | 16.99 | 17.90 | 15.38 | 5.42 | 4.87 | 6.21 | 5.82 | 2.99 | 2.71 |
| XGB | 19.93 | 17.17 | 18.50 | 15.08 | 6.23 | 5.32 | 6.25 | 5.70 | 2.94 | 2.68 |
| CatBoost | 22.61 | 20.31 | 19.31 | 16.88 | 6.49 | 5.49 | 7.48 | 6.65 | 3.58 | 3.05 |
| Cubist | 18.60 | | 14.97 | | 4.90 | 4.68 | 5.09 | | 3.15 | 2.78 |
Values in bold indicate the lowest values for the respective datasets.
CVRMSE comparison of DA-DPLF (%).
| Methods | Building 1 (Holdout) | Building 1 (TSCV) | Building 2 (Holdout) | Building 2 (TSCV) | Cluster 1 (Holdout) | Cluster 1 (TSCV) | Cluster 2 (Holdout) | Cluster 2 (TSCV) | Cluster 3 (Holdout) | Cluster 3 (TSCV) |
|---|---|---|---|---|---|---|---|---|---|---|
| MLR | 24.03 | 23.85 | 19.35 | 19.18 | 7.36 | 7.35 | 10.03 | 10.00 | 5.50 | 5.85 |
| PLS | 24.07 | 34.33 | 19.47 | 27.73 | 7.31 | 12.83 | 10.04 | 22.38 | 5.50 | 11.38 |
| MARS | 28.66 | 23.11 | 18.47 | 17.88 | 6.94 | 6.45 | 6.83 | 6.62 | 3.41 | 3.71 |
| KNN | 29.75 | 27.70 | 22.28 | 22.15 | 9.71 | 8.70 | 12.05 | 11.06 | 5.42 | 5.30 |
| SVR | 30.28 | 26.61 | 17.65 | 17.44 | 11.06 | 7.66 | 8.78 | 7.66 | 4.53 | 3.99 |
| DT | 26.34 | 26.86 | 24.73 | 22.67 | 12.28 | 10.66 | 12.61 | 12.49 | 5.81 | 5.62 |
| Bagging | 26.90 | 25.18 | 20.54 | 19.59 | 10.81 | 9.01 | 11.04 | 10.71 | 5.27 | 5.00 |
| RF | 27.93 | 23.33 | 18.26 | 17.03 | 8.81 | 7.19 | 7.09 | 6.55 | 3.57 | |
| GBM | 26.68 | | 18.08 | 16.92 | 8.07 | 6.75 | 6.91 | 6.44 | 3.53 | 3.28 |
| XGB | 29.37 | 23.13 | 18.43 | 17.60 | 9.32 | 7.27 | 7.28 | 6.65 | 3.56 | 3.32 |
| CatBoost | 31.20 | 26.08 | 19.48 | 18.07 | 9.45 | 7.56 | 8.09 | 7.31 | 4.35 | 3.90 |
| Cubist | 26.90 | 23.22 | 17.34 | | 6.72 | | | 5.67 | 4.45 | 3.62 |
Values in bold indicate the lowest values for the respective datasets.
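For reference, the two error metrics reported in these tables have standard definitions: MAPE is the mean of the absolute errors relative to the actual values, and CVRMSE is the root mean square error divided by the mean of the actual values, both expressed in percent. A minimal sketch follows; the function names are illustrative, and it assumes no actual value is zero.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


def cvrmse(actual, forecast):
    """Coefficient of variation of the root mean square error, in percent."""
    n = len(actual)
    rmse = math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)
    return 100.0 * rmse / (sum(actual) / n)


# Toy example: two days of load with 10 kW and 20 kW forecast errors.
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
print(round(mape(actual, forecast), 2))    # 10.0
print(round(cvrmse(actual, forecast), 2))  # 10.54
```

Because CVRMSE squares the errors before averaging, it penalizes occasional large misses more heavily than MAPE, which is why the two metrics can rank methods differently.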
MAPE comparison of DA-TDLF (%).
| Methods | Building 1 (Holdout) | Building 1 (TSCV) | Building 2 (Holdout) | Building 2 (TSCV) | Cluster 1 (Holdout) | Cluster 1 (TSCV) | Cluster 2 (Holdout) | Cluster 2 (TSCV) | Cluster 3 (Holdout) | Cluster 3 (TSCV) |
|---|---|---|---|---|---|---|---|---|---|---|
| MLR | 11.18 | 11.12 | 10.98 | 10.66 | 3.61 | 3.63 | 7.06 | 7.11 | 2.55 | 2.73 |
| PLS | 11.18 | 26.27 | 11.10 | 27.34 | 3.60 | 8.62 | 7.03 | 18.13 | 2.57 | 7.49 |
| MARS | 9.39 | | 11.59 | 9.97 | 3.28 | 3.36 | 4.30 | 4.26 | 2.08 | 2.09 |
| KNN | 13.03 | 12.05 | 18.82 | 17.67 | 6.00 | 5.05 | 7.42 | 6.89 | 4.79 | 4.63 |
| SVR | 9.84 | 8.74 | 17.14 | 10.74 | 3.97 | | 5.19 | 4.46 | 2.89 | 2.15 |
| DT | 17.40 | 17.29 | 23.68 | 19.71 | 6.96 | 6.16 | 9.12 | 8.86 | 4.72 | 4.22 |
| Bagging | 15.89 | 15.24 | 18.90 | 16.11 | 5.80 | 5.01 | 8.33 | 8.00 | 4.18 | 3.48 |
| RF | 10.92 | 9.95 | 15.01 | 9.58 | 4.44 | 3.97 | 4.51 | 4.29 | 2.46 | 2.05 |
| GBM | 10.74 | 9.77 | 14.44 | 10.04 | 4.08 | 3.59 | 4.61 | 4.42 | 2.50 | 2.08 |
| XGB | 10.98 | 9.86 | 13.66 | 9.47 | 4.54 | 3.84 | 4.54 | 4.38 | 2.63 | 2.18 |
| CatBoost | 11.49 | 10.20 | 14.94 | 10.07 | 4.92 | 3.81 | 5.27 | 4.70 | 2.95 | 2.23 |
| Cubist | 8.80 | 8.89 | 11.20 | | 3.26 | 3.24 | | 3.60 | 2.36 | |
Values in bold indicate the lowest values for the respective datasets.
CVRMSE comparison of DA-TDLF (%).
| Methods | Building 1 (Holdout) | Building 1 (TSCV) | Building 2 (Holdout) | Building 2 (TSCV) | Cluster 1 (Holdout) | Cluster 1 (TSCV) | Cluster 2 (Holdout) | Cluster 2 (TSCV) | Cluster 3 (Holdout) | Cluster 3 (TSCV) |
|---|---|---|---|---|---|---|---|---|---|---|
| MLR | 14.23 | 14.17 | 12.98 | 12.79 | 4.84 | 4.83 | 8.31 | 8.31 | 4.43 | 4.45 |
| PLS | 14.13 | 26.59 | 12.95 | 25.95 | 4.82 | 10.19 | 8.31 | 19.07 | 4.45 | 9.14 |
| MARS | 12.59 | | 12.85 | 11.60 | 4.44 | 4.39 | 4.99 | 5.00 | 3.44 | 3.82 |
| KNN | 17.23 | 16.73 | 19.36 | 18.61 | 8.41 | 6.97 | 10.37 | 9.65 | 6.57 | 6.45 |
| SVR | 12.45 | 11.80 | 16.67 | 12.56 | 8.67 | 4.84 | 7.12 | 6.05 | 5.54 | 5.04 |
| DT | 22.07 | 22.38 | 23.08 | 21.57 | 9.61 | 8.38 | 10.41 | 10.46 | 5.41 | 4.95 |
| Bagging | 19.91 | 19.96 | 18.19 | 17.53 | 8.37 | 7.05 | 9.57 | 9.19 | 4.73 | 4.04 |
| RF | 14.48 | 13.59 | 14.89 | 11.68 | 6.86 | 5.65 | 5.43 | 5.16 | 2.95 | |
| GBM | 13.47 | 12.86 | 14.54 | 12.17 | 6.25 | 5.10 | 5.56 | 5.31 | 2.94 | 2.79 |
| XGB | 14.86 | 13.77 | 14.71 | 12.45 | 7.00 | 5.45 | 5.65 | 5.48 | 3.12 | 3.66 |
| CatBoost | 14.34 | 13.52 | 15.07 | 12.34 | 7.34 | 5.36 | 6.74 | 5.86 | 3.57 | 2.85 |
| Cubist | 12.09 | 11.72 | 13.16 | | 4.46 | | | 4.24 | 3.27 | 3.34 |
Values in bold indicate the lowest values for the respective datasets.
MAPE comparison of WA-DPLF (%).
| Datasets | Evaluation | MLR | PLS | MARS | KNN | SVR | DT | Bagging | RF | GBM | XGB | CatBoost | Cubist |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Building 1 | Holdout | 25.89 | 26.25 | 22.15 | 24.11 | 20.03 | 26.97 | 24.64 | 19.92 | 19.44 | 19.45 | 21.15 | 20.68 |
| Building 1 | TSCV (avg.) | 26.08 | 28.62 | 21.06 | 23.54 | 19.36 | 25.58 | 23.59 | 19.48 | 19.16 | 19.00 | 20.15 | 19.13 |
| Building 2 | Holdout | 21.52 | 21.60 | 19.28 | 21.97 | 18.92 | 24.82 | 22.04 | 18.60 | 18.31 | 17.44 | 18.76 | 16.92 |
| Building 2 | TSCV (avg.) | 21.26 | 23.89 | 17.10 | 21.17 | 16.99 | 23.58 | 20.97 | 15.87 | 15.56 | 15.65 | 16.87 | 14.55 |
| Cluster 1 | Holdout | 8.00 | 8.01 | 6.82 | 8.53 | 7.91 | 9.60 | 8.89 | 7.89 | 7.55 | 8.25 | 7.89 | 6.77 |
| Cluster 1 | TSCV (avg.) | 8.16 | 10.47 | 7.01 | 8.07 | 6.80 | 9.47 | 8.71 | 7.26 | 6.99 | 7.77 | 7.26 | 6.53 |
| Cluster 2 | Holdout | 11.27 | 11.38 | 8.75 | 9.79 | 8.11 | 12.48 | 11.49 | 7.75 | 7.60 | 7.87 | 8.48 | 7.04 |
| Cluster 2 | TSCV (avg.) | 11.18 | 18.27 | 7.32 | 9.88 | 7.56 | 11.92 | 11.07 | 7.46 | 7.18 | 7.41 | 8.24 | 6.87 |
| Cluster 3 | Holdout | 4.77 | 4.85 | 3.82 | 4.86 | 3.14 | 5.76 | 5.24 | 3.50 | 3.53 | 3.57 | 3.93 | 3.40 |
| Cluster 3 | TSCV (avg.) | 4.93 | 7.38 | 3.57 | 4.63 | 2.83 | 5.56 | 4.94 | 3.19 | 3.22 | 3.43 | 3.57 | 3.25 |
CVRMSE comparison of WA-DPLF (%).
| Datasets | Evaluation | MLR | PLS | MARS | KNN | SVR | DT | Bagging | RF | GBM | XGB | CatBoost | Cubist |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Building 1 | Holdout | 31.49 | 31.55 | 48.77 | 33.57 | 32.38 | 34.54 | 32.32 | 31.44 | 29.74 | 32.20 | 32.90 | 30.09 |
| Building 1 | TSCV (avg.) | 31.40 | 35.83 | 39.15 | 32.88 | 29.03 | 32.16 | 29.13 | 27.82 | 27.40 | 26.73 | 29.61 | 27.44 |
| Building 2 | Holdout | 23.31 | 23.36 | 19.83 | 23.62 | 19.77 | 23.07 | 20.54 | 19.09 | 18.93 | 19.51 | 20.17 | 19.02 |
| Building 2 | TSCV (avg.) | 23.04 | 27.03 | 18.99 | 22.79 | 19.43 | 23.84 | 19.98 | 18.23 | 17.50 | 18.42 | 19.31 | 18.07 |
| Cluster 1 | Holdout | 10.89 | 10.89 | 9.21 | 11.97 | 11.94 | 13.40 | 12.45 | 11.05 | 10.57 | 11.69 | 10.98 | 9.17 |
| Cluster 1 | TSCV (avg.) | 10.82 | 13.66 | 9.28 | 11.39 | 9.81 | 12.76 | 11.77 | 10.02 | 9.57 | 10.59 | 9.87 | 8.92 |
| Cluster 2 | Holdout | 12.13 | 12.19 | 9.60 | 11.94 | 9.11 | 13.29 | 11.95 | 8.47 | 8.57 | 9.36 | 9.41 | 8.08 |
| Cluster 2 | TSCV (avg.) | 12.18 | 20.30 | 8.35 | 12.10 | 8.57 | 12.90 | 11.60 | 8.29 | 8.24 | 8.70 | 9.00 | 8.00 |
| Cluster 3 | Holdout | 6.61 | 6.47 | 4.58 | 5.93 | 4.21 | 6.33 | 5.80 | 4.12 | 4.12 | 4.25 | 4.92 | 4.14 |
| Cluster 3 | TSCV (avg.) | 7.15 | 10.07 | 4.44 | 5.69 | 3.89 | 6.13 | 5.50 | 3.94 | 3.87 | 4.21 | 4.43 | 4.01 |
MAPE comparison of WA-TDLF (%).
| Datasets | Evaluation | MLR | PLS | MARS | KNN | SVR | DT | Bagging | RF | GBM | XGB | CatBoost | Cubist |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Building 1 | Holdout | 17.18 | 17.61 | 12.68 | 16.02 | 12.51 | 20.41 | 16.69 | 12.56 | 12.45 | 13.37 | 13.31 | 11.62 |
| Building 1 | TSCV (avg.) | 17.23 | 18.35 | 12.05 | 15.11 | 11.96 | 18.40 | 15.95 | 11.80 | 11.87 | 12.05 | 12.54 | 11.46 |
| Building 2 | Holdout | 17.47 | 17.86 | 15.77 | 21.78 | 17.01 | 24.42 | 19.64 | 16.28 | 16.45 | 15.13 | 16.98 | 15.24 |
| Building 2 | TSCV (avg.) | 16.19 | 16.80 | 12.95 | 20.37 | 11.28 | 17.82 | 15.59 | 10.82 | 11.88 | 10.85 | 12.57 | 10.94 |
| Cluster 1 | Holdout | 6.76 | 6.87 | 6.11 | 8.58 | 6.70 | 9.03 | 7.73 | 7.04 | 6.61 | 7.08 | 7.14 | 5.90 |
| Cluster 1 | TSCV (avg.) | 6.90 | 9.05 | 5.71 | 8.01 | 5.49 | 8.76 | 7.82 | 6.46 | 6.05 | 6.54 | 6.25 | 5.51 |
| Cluster 2 | Holdout | 9.73 | 9.92 | 5.98 | 7.90 | 6.18 | 9.79 | 9.13 | 5.89 | 5.57 | 5.85 | 6.47 | 4.91 |
| Cluster 2 | TSCV (avg.) | 9.84 | 15.07 | 5.63 | 7.76 | 5.63 | 9.81 | 9.18 | 5.86 | 5.53 | 5.67 | 6.15 | 5.11 |
| Cluster 3 | Holdout | 4.51 | 4.54 | 3.45 | 5.97 | 3.88 | 5.40 | 4.79 | 3.20 | 3.28 | 3.29 | 3.63 | 2.96 |
| Cluster 3 | TSCV (avg.) | 4.76 | 5.97 | 3.05 | 5.56 | 3.62 | 4.87 | 4.24 | 2.75 | 2.86 | 2.91 | 3.06 | 2.88 |
CVRMSE comparison of WA-TDLF (%).
| Datasets | Evaluation | MLR | PLS | MARS | KNN | SVR | DT | Bagging | RF | GBM | XGB | CatBoost | Cubist |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Building 1 | Holdout | 21.71 | 22.25 | 15.16 | 19.58 | 15.71 | 24.91 | 20.65 | 16.35 | 15.10 | 16.57 | 17.49 | 15.04 |
| Building 1 | TSCV (avg.) | 21.59 | 25.73 | 14.18 | 18.26 | 15.33 | 22.97 | 20.26 | 15.43 | 14.82 | 15.73 | 16.33 | 14.82 |
| Building 2 | Holdout | 19.73 | 20.43 | 16.75 | 22.77 | 17.00 | 23.67 | 19.69 | 16.15 | 16.77 | 15.88 | 17.20 | 16.33 |
| Building 2 | TSCV (avg.) | 19.27 | 22.77 | 14.39 | 21.44 | 13.95 | 20.83 | 17.68 | 13.73 | 13.98 | 14.21 | 15.42 | 13.41 |
| Cluster 1 | Holdout | 9.21 | 9.27 | 7.82 | 11.24 | 10.24 | 12.06 | 10.60 | 9.57 | 8.93 | 9.84 | 9.67 | 7.78 |
| Cluster 1 | TSCV (avg.) | 9.17 | 11.85 | 7.39 | 10.60 | 7.94 | 11.47 | 10.52 | 8.62 | 8.08 | 8.73 | 8.30 | 7.39 |
| Cluster 2 | Holdout | 11.35 | 11.43 | 7.47 | 10.80 | 8.04 | 11.46 | 10.16 | 6.93 | 6.72 | 7.37 | 7.82 | 6.08 |
| Cluster 2 | TSCV (avg.) | 11.55 | 17.95 | 6.63 | 10.62 | 7.18 | 11.48 | 10.45 | 6.94 | 6.74 | 7.23 | 7.47 | 6.27 |
| Cluster 3 | Holdout | 6.48 | 6.24 | 4.19 | 7.78 | 6.55 | 5.96 | 5.32 | 3.82 | 3.82 | 3.94 | 4.35 | 3.68 |
| Cluster 3 | TSCV (avg.) | 7.43 | 7.83 | 3.95 | 7.49 | 6.40 | 5.69 | 4.89 | 3.41 | 3.44 | 3.72 | 3.82 | 3.65 |
Ranks of each model based on the performance metrics and average rank.
| Methods | MAPE (DA-DPLF) | MAPE (DA-TDLF) | MAPE (WA-DPLF) | MAPE (WA-TDLF) | CVRMSE (DA-DPLF) | CVRMSE (DA-TDLF) | CVRMSE (WA-DPLF) | CVRMSE (WA-TDLF) | Average rank |
|---|---|---|---|---|---|---|---|---|---|
| MLR | 8.2 | 7.4 | 9.8 | 9.4 | 7.8 | 7 | 9.4 | 9.6 | 8.6 |
| PLS | 12 | 12 | 12 | 11.4 | 12 | 12 | 11.8 | 12 | 11.9 |
| MARS | 4.8 | 2.8 | 5.4 | 4.6 | 3.8 | 2.6 | 6 | 3.2 | 4.2 |
| KNN | 9 | 9.6 | 8.2 | 9.8 | 9.8 | 9.8 | 9.2 | 9.8 | 9.4 |
| SVR | 4 | 4.4 | 3.8 | 3.8 | 7 | 6.2 | 4.6 | 4.8 | 4.8 |
| DT | 11 | 10.8 | 10.8 | 10.8 | 10.6 | 10.6 | 10.4 | 10 | 10.6 |
| Bagging | 9.8 | 9.4 | 9.2 | 8.6 | 8.6 | 9 | 8 | 8.2 | 8.9 |
| RF | 4 | 4.4 | 4.2 | 3.2 | 3.2 | 4.2 | 3.8 | 3.6 | 3.8 |
| GBM | 3.2 | 4.2 | 2.6 | 3.2 | | 3.8 | | 3 | 3.0 |
| XGB | 3.2 | 4.8 | 4 | 4.6 | 4.2 | 6 | 4.6 | 5.6 | 4.6 |
| CatBoost | 6.8 | 6.6 | 5.8 | 6.2 | 6.8 | 5 | 6.2 | 6.2 | 6.2 |
| Cubist | | | | | 2.2 | | 2.2 | | |
Values in bold indicate the lowest values for the respective electrical load forecasting types (DA: day-ahead; WA: week-ahead).
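The last column of the rank table appears to be the plain mean of a method's eight per-task ranks, and the fractional per-task ranks (multiples of 0.2) are consistent with averaging over the five datasets. The sketch below illustrates both steps. The helper names are hypothetical, and ties are broken by input order here because the source does not say how ties were handled.

```python
def rank_ascending(scores):
    """Rank methods by error: 1 for the lowest value (ties keep input order)."""
    ordered = sorted(scores, key=scores.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


def average_rank(ranks):
    """Plain mean of a method's ranks across tasks or datasets."""
    return sum(ranks) / len(ranks)


# Ranking three methods on one dataset (illustrative error values only).
print(rank_ascending({"A": 3.0, "B": 1.0, "C": 2.0}))

# MLR row of the table: eight per-task ranks, then the reported average.
mlr_ranks = [8.2, 7.4, 9.8, 9.4, 7.8, 7, 9.4, 9.6]
print(round(average_rank(mlr_ranks), 1))  # 8.6
```

The same calculation on the RF row (4, 4.4, 4.2, 3.2, 3.2, 4.2, 3.8, 3.6) gives 3.8, again matching the reported average rank.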
Prediction performance of Lee and Han's MLR and Cubist.
| Datasets | Metrics | DPLF: MLR [ | DPLF: Cubist | TDLF: MLR [ | TDLF: Cubist |
|---|---|---|---|---|---|
| Building 1 | MAPE | 28.98 | 16.98 | 17.19 | 8.89 |
| Building 1 | CVRMSE | 29.58 | 23.22 | 19.57 | 11.72 |
| Building 2 | MAPE | 24.44 | 13.51 | 16.05 | 8.39 |
| Building 2 | CVRMSE | 24.57 | 16.16 | 17.82 | 10.55 |
| Cluster 1 | MAPE | 7.22 | 4.68 | 5.00 | 3.24 |
| Cluster 1 | CVRMSE | 9.17 | 6.29 | 6.49 | 4.36 |
| Cluster 2 | MAPE | 13.62 | 5.03 | 16.59 | 3.60 |
| Cluster 2 | CVRMSE | 10.22 | 5.67 | 13.66 | 4.24 |
| Cluster 3 | MAPE | 6.20 | 2.78 | 5.12 | 1.98 |
| Cluster 3 | CVRMSE | 8.83 | 3.62 | 7.71 | 3.34 |
Paired sample t-test of holdout and TSCV.
| Statistics | MAPE | CVRMSE |
|---|---|---|
| t-statistic (t) | 11.136 | 11.167 |
| Degrees of freedom (df) | 219 | 219 |
| Significance level of the test (p-value) | 2.2 × 10⁻¹⁶ | 2.2 × 10⁻¹⁶ |
| Confidence interval (conf.int) of the mean differences at 95% | [0.787, 1.125] | [0.821, 1.173] |
| Mean differences between pairs (sample estimates) | 0.956 | 0.997 |
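The rows of this table are linked by the usual paired t-test relations: the t statistic is the mean of the pairwise differences divided by its standard error, and the confidence interval is the mean difference plus or minus a critical value times that standard error. The sketch below assumes the differences are taken as holdout minus TSCV (suggested by the positive mean difference, but not stated in the table). The critical value 1.971 is an approximate two-sided 95% value for 219 degrees of freedom that is supplied here, not taken from the source, and the function names are hypothetical.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t statistic, degrees of freedom, mean difference) for paired samples."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_err, n - 1, mean_diff


def interval_from_summary(mean_diff, t_stat, t_crit):
    """Rebuild a confidence interval from a reported mean difference and t statistic."""
    std_err = mean_diff / t_stat
    return mean_diff - t_crit * std_err, mean_diff + t_crit * std_err


# MAPE column: mean difference 0.956 and t = 11.136, with the assumed t_crit.
low, high = interval_from_summary(0.956, 11.136, 1.971)
print(round(low, 3), round(high, 3))  # 0.787 1.125
```

The rebuilt interval matches the reported [0.787, 1.125], and the CVRMSE column (0.997, 11.167) reproduces [0.821, 1.173] in the same way.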
Results of Wilcoxon signed-rank and Friedman tests.
| Methods | Wilcoxon signed-rank test (MAPE) | Wilcoxon signed-rank test (CVRMSE) | Friedman test (MAPE) | Friedman test (CVRMSE) |
|---|---|---|---|---|
| MLR | 1.907 × 10⁻⁶ | 1.907 × 10⁻⁶ | 2.2 × 10⁻¹⁶ | 2.2 × 10⁻¹⁶ |
| PLS | 1.907 × 10⁻⁶ | 1.907 × 10⁻⁶ | | |
| MARS | 3.624 × 10⁻⁵ | 0.005841 | | |
| KNN | 1.907 × 10⁻⁶ | 1.907 × 10⁻⁶ | | |
| SVR | 0.009463 | 5.722 × 10⁻⁶ | | |
| DT | 1.907 × 10⁻⁶ | 1.907 × 10⁻⁶ | | |
| Bagging | 1.907 × 10⁻⁶ | 1.907 × 10⁻⁶ | | |
| RF | 0.000168 | 0.001432 | | |
| GBM | 8.202 × 10⁻⁵ | 0.019580 | | |
| XGB | 6.294 × 10⁻⁵ | 0.000210 | | |
| CatBoost | 1.907 × 10⁻⁶ | 0.000175 | | |
Figure 6. Example of variable importance for DA-DPLF (Building 1).
Figure 7. Example of variable importance for DA-TDLF (Building 1).
Figure 8. Example of variable importance for WA-DPLF (Building 1).
Figure 9. Example of variable importance for WA-TDLF (Building 1).