Justyna Stańczyk, Joanna Kajewska-Szkudlarek, Piotr Lipiński, Paweł Rychlikowski.
Abstract
Modern solutions in water distribution systems are based on monitoring the quality and quantity of drinking water. Identifying the volume of water consumption is the main element of the tools embedded in water demand forecasting (WDF) systems. A crucial issue in forecasting is the influence of random factors on water consumption, including weather conditions and anthropogenic aspects. The paper proposes an approach to forecasting water demand based on a linear regression model combined with evolutionary strategies for extracting weekly seasonality, and presents its results. The authors' model is compared with Support Vector Regression (SVR), Multilayer Perceptron (MLP), and Random Forest (RF). The implemented daily forecasting procedure reduced the MAPE, in some cases to below 2%, for water consumption at the water supply zone level, that is, the District Metered Area (DMA). The research may be implemented as a component of WDF systems in water companies, especially at the data preprocessing stage, with the main goal of improving short-term water demand forecasting.
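The idea of evolving a weekly periodicity vector can be illustrated with a small, dependency-free sketch. It is not the authors' algorithm: to stay short it replaces the linear regression with a one-day persistence forecast on the deseasonalised series, uses a simple elitist (1+λ) evolution strategy, and runs on synthetic demand. The names `forecast`, `evolve_weekly_vector`, `pattern`, and all parameter values are assumptions made for illustration.

```python
import random


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.02 means 2%)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def forecast(demand, weekly):
    """One-day-ahead forecast: remove yesterday's weekday factor, apply today's."""
    return [demand[t - 1] / weekly[(t - 1) % 7] * weekly[t % 7]
            for t in range(1, len(demand))]


def evolve_weekly_vector(demand, generations=300, offspring=10, sigma=0.02, seed=0):
    """Elitist (1+lambda) evolution strategy over a 7-element weekly vector."""
    rng = random.Random(seed)
    parent = [1.0] * 7  # all ones: no weekly effect, i.e. a pure persistence forecast
    parent_err = mape(demand[1:], forecast(demand, parent))
    for _ in range(generations):
        # Gaussian mutation of every weekday factor, kept strictly positive.
        children = [[max(0.1, w + rng.gauss(0.0, sigma)) for w in parent]
                    for _ in range(offspring)]
        err, child = min((mape(demand[1:], forecast(demand, c)), c) for c in children)
        if err < parent_err:  # keep the parent unless a child is strictly better
            parent, parent_err = child, err
    return parent, parent_err


# Synthetic demand (hypothetical values): a constant level times a weekday pattern.
pattern = [1.00, 1.02, 1.01, 1.00, 1.03, 0.90, 0.85]
demand = [1000.0 * pattern[t % 7] for t in range(70)]

baseline_err = mape(demand[1:], forecast(demand, [1.0] * 7))
weekly, evolved_err = evolve_weekly_vector(demand)
print(f"persistence MAPE: {baseline_err:.4f}, evolved MAPE: {evolved_err:.4f}")
```

Because selection is elitist, the evolved error can never exceed the error of the starting vector. On real data the deseasonalised series would feed a regression model rather than a persistence forecast.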
Year: 2022 PMID: 35941276 PMCID: PMC9360038 DOI: 10.1038/s41598-022-17177-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Review of the water demand forecasting methodology.
| Authors | Year | Prediction methods | Forecast horizon | Length of the time series |
|---|---|---|---|---|
| Alvisi and Franchini | 2014 | MCP, ANN, Patt_for, Hybridᵃ | Hourly demand over a time horizon of 24 h | 3 years |
| Baker et al. | 2014 | AHM, Transfer/-noiseᵃ, MLR | Daily | 6 years |
| Candelieri et al. | 2014, 2015 | SVMᵃ, ANN | Daily | 2 years |
| Chen and Boccelli | 2014 | SARIMA | Daily | 39 days |
| Kofinas et al. | 2014 | WAM, ANNᵃ, ARIMAᵃ | Monthly, seasonal | 3 years |
| Romano and Kapelan | 2014 | EANN | Daily, hourly | 181 days |
| Tiwari et al. | 2015, 2016 | WBANN, ELMW, ANNB | Monthly, weekly, daily | 3 years |
| Arandia et al. | 2016 | SARIMA | Sub-hourly, hourly and daily water demand | 10/19 months |
| Brentan et al. | 2017 | SVM-AFS | Monthly, weekly, daily, hourly | 1.5 years |
| Ghiassi et al. | 2017 | DAN2ᵃ, FTDNN, KNN | Monthly, weekly, daily | 8 years |
| Duerr et al. | 2018 | Regression models, AR(1), ARIMA, GAM, spatio-temporal (ST) Gaussian process models, GBM, RF, BART | Monthly | 12 years |
| Kozłowski et al. | 2018 | Phase Trend Method (PTM), harmonic analysis | Hourly demand over a time horizon of 1 month | 2 months |
| Xenochristou et al. | 2018 | Random Forests | Daily | 3 years |
| Ambrosio et al. | 2019 | Committee machinesᵃ: MLP, SVM, ELM, RF, ANFIS, GMDH | Hourly | 33 months |
| Xu et al. | 2019 | CDBESNᵃ, SVR, ESN, CDBNN | Hourly | 11 months |
| Guo et al. | 2020 | SMWOAᵃ, ASLWOA, WMWOA, WOA | Yearly demand for water resources over a time horizon of 5 years | 13 years |
| Karamaziotis et al. | 2020 | ARIMAᵃ, ETS, Theta, Opt.Theta, MAPA, MLP, Ensemble | Monthly | 7 years |
| Smolak et al. | 2020 | ET, SVR, RFᵃ, ARIMA/ARIMAX, Blind | Weekly, daily | 51 days |
| Bata et al. | 2020 | RT with SOM | Hourly, daily, weekly | 4 months |
| Shirkoohi et al. | 2021 | ANN with genetic algorithm | 15-min | 5 years and 23 months |

ᵃ The best-performing method among those compared.
Figure 1. The methodology scheme.
Figure 2. The general scheme of the evolutionary algorithm.
Mean absolute percentage error for linear regression and selected input data.
| Feature subset description | MAPE (–) | Input data |
|---|---|---|
| Best combination without history | 0.03662 | Days of the week, wind speed, maximum temperature |
| All features | 0.02814 | All 15 features |
| Best one feature | 0.02676 | History0 |
| Best two features | 0.02463 | History0, history1 |
| Best for one week history | 0.02405 | History0, days of the week, precipitation, wind speed, minimum temperature |
| Best three features | 0.02399 | History0, history1, wind speed |
| Best without week days | 0.02383 | History0, history1, precipitation, wind speed, mean temperature |
| Best feature combination | 0.02379 | History0, history1, days of the week, precipitation, wind speed, ground temperature |
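The tables appear to report MAPE as a fraction rather than a percentage. The source does not spell out the formula; assuming the standard definition, with $y_t$ the observed daily demand, $\hat{y}_t$ its forecast, and $n$ the number of forecast days (symbols introduced here for illustration), it reads:

```latex
\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|
```

Under this reading, the value 0.02379 for the best feature combination corresponds to an error of about 2.38%.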
Forecast mean absolute percentage error comparison for selected methods.
| Methods | All features | Selected features |
|---|---|---|
| Linear Regression | 0.02150 | 0.02197 |
| SVR-linear | 0.02481 | 0.02449 |
| RF-5 | 0.02396 | 0.02408 |
| RF-10 | 0.02404 | 0.02319 |
| CART-Tree | 0.03700 | 0.03356 |
| MLP20-15 | 0.03986 | 0.03508 |
| MLP30-10 | 0.04610 | 0.03162 |
| MLP15-10 | 0.05010 | 0.02969 |
| Average | 0.03342 | 0.02796 |
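The Average row can be reproduced from the eight method rows, and the same numbers show how feature selection affects each method. The short check below uses only values copied from the table; the variable names are illustrative.

```python
# MAPE values (as fractions) copied from the comparison table:
# method -> (all features, selected features).
results = {
    "Linear Regression": (0.02150, 0.02197),
    "SVR-linear": (0.02481, 0.02449),
    "RF-5": (0.02396, 0.02408),
    "RF-10": (0.02404, 0.02319),
    "CART-Tree": (0.03700, 0.03356),
    "MLP20-15": (0.03986, 0.03508),
    "MLP30-10": (0.04610, 0.03162),
    "MLP15-10": (0.05010, 0.02969),
}

mean_all = sum(a for a, _ in results.values()) / len(results)
mean_selected = sum(s for _, s in results.values()) / len(results)

# Relative reduction of the mean error when moving to the selected features.
relative_reduction = (mean_all - mean_selected) / mean_all

# Methods whose error dropped or rose after feature selection.
improved = [name for name, (a, s) in results.items() if s < a]
worsened = [name for name, (a, s) in results.items() if s > a]

print(f"mean MAPE, all features:      {mean_all:.5f}")
print(f"mean MAPE, selected features: {mean_selected:.5f}")
print(f"relative reduction of the mean: {relative_reduction:.1%}")
print("improved:", improved)
print("worsened:", worsened)
```

The mean error falls with the selected features, but not uniformly: the gain comes mainly from the MLP and CART models, while Linear Regression and RF-5 are marginally worse on the selected subset.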
Prediction error for a method containing an evolutionary approach.
| Test dataset no. | Test period start | Test period end | Training | Test |
|---|---|---|---|---|
| 1 | 2016-01-28 | 2016-02-27 | 0.02054 | 0.01467 |
| 2 | 2016-04-22 | 2016-05-22 | 0.02000 | 0.02425 |
| 3 | 2016-09-15 | 2016-10-15 | 0.02055 | |
| 4 | 2016-11-05 | 2016-12-05 | 0.02051 | 0.01913 |
| 5 | 2017-10-31 | 2017-11-30 | 0.02008 | 0.01782 |
| 6 | 2018-06-01 | 2018-07-01 | 0.01889 | 0.01917 |
| 7 | 2018-07-21 | 2018-08-20 | 0.01806 | |
| 8 | 2019-01-16 | 2019-02-15 | 0.01981 | 0.01640 |
| 9 | 2019-06-01 | 2019-07-01 | 0.02019 | 0.02948 |
| 10 | 2019-10-27 | 2019-11-26 | 0.02093 | 0.02476 |
| Average | | | 0.01996 | 0.02119 |
The best and the worst results on the test datasets are emphasised in bold.
Figure 3. Research results for the best variant of forecasting: test dataset (dataset no. 3).
Figure 4. Research results for the best variant of forecasting: training dataset (dataset no. 3).
Figure 5. Research results for the worst variant of forecasting: test dataset (dataset no. 7).
Figure 6. Research results for the worst variant of forecasting: training dataset (dataset no. 7).
Figure 7. Weekly periodicity vector of water demand.
Figure 8. The evolution of the vectors of weekly periodicity in the successive iterations.
Figure 9. The evolution of the vectors of weekly periodicity in the successive iterations.
| Methods | Average | Set 1 | Set 2 | Set 3 | Set 4 | Set 5 | Set 6 | Set 7 | Set 8 | Set 9 | Set 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Linear regression | 0.02197 | 0.01628 | 0.02329 | 0.01408 | 0.02201 | 0.01896 | 0.02024 | 0.03414 | 0.01675 | 0.02895 | 0.02500 |
| SVR-linear | 0.02449 | 0.01430 | 0.02184 | 0.02063 | 0.02806 | 0.02153 | 0.02240 | 0.03502 | 0.01996 | 0.02991 | 0.03126 |
| RF-5 | 0.02408 | 0.02626 | 0.02204 | 0.01868 | 0.02340 | 0.02030 | 0.02493 | 0.03134 | 0.01701 | 0.03116 | 0.02570 |
| RF-10 | 0.02319 | 0.02359 | 0.02211 | 0.01780 | 0.02551 | 0.01870 | 0.02254 | 0.03035 | 0.01663 | 0.02939 | 0.02533 |
| CART-Tree | 0.03356 | 0.03937 | 0.02674 | 0.03355 | 0.03578 | 0.02990 | 0.02995 | 0.03084 | 0.02347 | 0.05135 | 0.03466 |
| MLP20-15 | 0.03508 | 0.04454 | 0.03069 | 0.02690 | 0.03521 | 0.03822 | 0.02439 | 0.03599 | 0.04117 | 0.03774 | 0.03592 |
| MLP30-10 | 0.03162 | 0.02257 | 0.02337 | 0.03470 | 0.02322 | 0.02532 | 0.03753 | 0.03745 | 0.02285 | 0.04982 | 0.03933 |
| MLP15-10 | 0.02969 | 0.02444 | 0.02445 | 0.02669 | 0.02540 | 0.02656 | 0.03170 | 0.04073 | 0.03646 | 0.02603 | 0.03443 |
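The per-set table above can be summarised by asking which method has the lowest error on each test set. Its Average column matches the "Selected features" column of the method comparison table. The snippet below is a small illustrative check on the values copied from the table; the variable names are assumptions.

```python
from collections import Counter

# Per-set MAPE values (Set 1 .. Set 10) copied from the table above.
sets = {
    "Linear regression": [0.01628, 0.02329, 0.01408, 0.02201, 0.01896,
                          0.02024, 0.03414, 0.01675, 0.02895, 0.02500],
    "SVR-linear": [0.01430, 0.02184, 0.02063, 0.02806, 0.02153,
                   0.02240, 0.03502, 0.01996, 0.02991, 0.03126],
    "RF-5": [0.02626, 0.02204, 0.01868, 0.02340, 0.02030,
             0.02493, 0.03134, 0.01701, 0.03116, 0.02570],
    "RF-10": [0.02359, 0.02211, 0.01780, 0.02551, 0.01870,
              0.02254, 0.03035, 0.01663, 0.02939, 0.02533],
    "CART-Tree": [0.03937, 0.02674, 0.03355, 0.03578, 0.02990,
                  0.02995, 0.03084, 0.02347, 0.05135, 0.03466],
    "MLP20-15": [0.04454, 0.03069, 0.02690, 0.03521, 0.03822,
                 0.02439, 0.03599, 0.04117, 0.03774, 0.03592],
    "MLP30-10": [0.02257, 0.02337, 0.03470, 0.02322, 0.02532,
                 0.03753, 0.03745, 0.02285, 0.04982, 0.03933],
    "MLP15-10": [0.02444, 0.02445, 0.02669, 0.02540, 0.02656,
                 0.03170, 0.04073, 0.03646, 0.02603, 0.03443],
}

# Method with the lowest MAPE on each of the ten sets.
winners = [min(sets, key=lambda m: sets[m][i]) for i in range(10)]
wins = Counter(winners)

# Row means, for comparison with the Average column.
row_means = {m: sum(v) / len(v) for m, v in sets.items()}

for i, name in enumerate(winners, start=1):
    print(f"Set {i}: {name}")
print(dict(wins))
```

Linear regression has the lowest average error, but it is not the best method on every set: SVR-linear, RF-10, and MLP15-10 each win on some of them.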