Huan Wu1,2, Shuiping Cheng1, Kunlun Xin1, Nian Ma2,3, Jie Chen2,4, Liang Tao2, Min Gao5.
Abstract
Water pollution seriously endangers people's lives and restricts sustainable economic development. Water quality prediction is essential for the early warning and prevention of water pollution. However, the nonlinear characteristics of water quality data make it challenging for traditional methods to predict accurately. Recently, deep-learning-based methods have been shown to handle these nonlinear characteristics better, which improves prediction performance. Still, they rarely consider the relationships among the multiple water quality indicators being predicted. These relationships are crucial for prediction because they provide additional associated auxiliary information. To this end, this paper proposes a prediction method based on exploring the correlations among multi-indicator water quality prediction tasks. We explore four sharing structures for multi-indicator prediction to train deep neural network models that capture the highly complex nonlinear characteristics of water quality data. Experiments on datasets from more than 120 water quality monitoring sites in China show that the proposed models outperform state-of-the-art baselines.
Keywords: multi-task learning; multiple indicator prediction; water quality prediction
Year: 2022 PMID: 35955054 PMCID: PMC9368028 DOI: 10.3390/ijerph19159699
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 4.614
Figure 1The basic framework of multi-task learning. The blue part represents the shared parameter layer, and the orange and yellow parts represent the models for different tasks forming the tower layer.
Figure 2Hard parameter sharing structure of multi-indicator water quality prediction. The dark blue part represents the shared parameter layer, and the orange and yellow parts represent the models for different tasks forming the tower layer.
Figure 3Soft parameter sharing structure of multi-indicator water quality prediction.
Figure 4Gating parameter sharing structure of multi-indicator water quality prediction.
Figure 5Gated hidden parameter sharing structure of multi-indicator water quality prediction.
Figure 6An example of the kind of time series for each indicator.
The structure of the proposed four water quality prediction models.
| Name | Layer | Design |
|---|---|---|
| Mt-Hard | Shared parameter layer | 1 × (MLP + ReLU) |
| | Tower layer | pH: 3 × (MLP + ReLU) |
| Mt-Soft | Shared parameter layer | pH, DO, CODMn, NH3-N: 1 × (MLP + ReLU) |
| | Tower layer | pH: 3 × (MLP + ReLU) |
| Mt-Gate | Shared parameter layer | pH, DO, CODMn, NH3-N: 1 × (MLP + ReLU) |
| | Tower layer | pH: Softmax + 3 × (MLP + ReLU) |
| Mt-GH | Shared parameter layer | pH, DO, CODMn, NH3-N: 1 × (MLP + ReLU) |
| | Tower layer | pH: Softmax + 3 × (MLP + ReLU) |
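To make the table concrete, below is a minimal NumPy sketch of the hard-parameter-sharing design (Mt-Hard): one shared MLP + ReLU layer feeds four task-specific towers of three MLP + ReLU layers each. All dimensions, weights, and function names are illustrative assumptions, not taken from the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_relu(x, w, b):
    """One fully connected layer followed by ReLU."""
    return np.maximum(0.0, x @ w + b)

# Illustrative sizes: input window of 10 steps, 32 hidden units.
seq_len, hidden = 10, 32
tasks = ["pH", "DO", "CODMn", "NH3-N"]

# Shared parameter layer: 1 x (MLP + ReLU), used by all four indicators.
w_shared, b_shared = rng.normal(size=(seq_len, hidden)), np.zeros(hidden)

# Tower layer: 3 x (MLP + ReLU) per indicator, plus a linear output head.
towers = {
    t: [(rng.normal(size=(hidden, hidden)), np.zeros(hidden)) for _ in range(3)]
    for t in tasks
}
heads = {t: rng.normal(size=(hidden, 1)) for t in tasks}

def predict(x):
    """x: (batch, seq_len) history window; returns one prediction per task."""
    h = mlp_relu(x, w_shared, b_shared)   # shared representation
    out = {}
    for t in tasks:
        z = h
        for w, b in towers[t]:
            z = mlp_relu(z, w, b)         # task-specific tower
        out[t] = (z @ heads[t]).ravel()   # scalar prediction per sample
    return out

preds = predict(rng.normal(size=(4, seq_len)))
print({t: p.shape for t, p in preds.items()})
```

The soft, gating, and gated-hidden variants differ in how the shared layer is organized (per-task shared layers, or softmax-weighted mixtures of them), but the tower structure is analogous.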
Dataset statistics.
| Name | Number of Sites | D-s (Time) | D-l (Time) |
|---|---|---|---|
| Total data set | 120 | 2013.1–2015.2 | 2012.6–2018.4 |
| Pearl River | 8 | 2013.1–2015.2 | 2012.6–2018.4 |
| The Yangtze River | 22 | 2013.1–2015.2 | 2012.6–2018.4 |
| Songhua River | 11 | 2013.1–2015.2 | 2012.6–2018.4 |
| Liaohe River | 7 | 2013.1–2015.2 | 2012.6–2018.4 |
| The Yellow River | 12 | 2013.1–2015.2 | 2012.6–2018.4 |
| Huaihe River | 26 | 2013.1–2015.2 | 2012.6–2018.4 |
| Haihe River | 6 | 2013.1–2015.2 | 2012.6–2018.4 |
| Taihu Lake | 6 | 2013.1–2015.2 | 2012.6–2018.4 |
| Poyang Lake | 4 | 2013.1–2015.2 | 2012.6–2018.4 |
| Other | 18 | 2013.1–2015.2 | 2012.6–2018.4 |
Comparison of the overall performance of prediction for single-indicator on D-s dataset.
| Model | pH | | | DO | | | CODMn | | | NH3-N | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | RMSE | MAE | MAPE | RMSE | MAE | MAPE | RMSE | MAE | MAPE | RMSE | MAE | MAPE |
| Linear | 0.560 | 0.413 | 0.054 | 2.657 | 1.978 | 0.299 | 2.793 | 1.236 | 0.354 | 1.366 | 0.423 | 0.948 |
| XGB | 0.327 | 0.245 | 0.032 | 1.594 | 1.135 | 0.121 | 1.732 | 0.614 | 0.179 | 0.487 | 0.182 | 0.335 |
| MLP | 0.299 | 0.211 | 0.028 | 1.218 * | 0.828 | 0.910 | | | 0.178 | 0.474 | 0.164 * | |
| CNN | 0.429 | 0.327 | 0.043 | 2.066 | 1.506 | 0.167 | 2.541 | 1.197 | 0.350 | 0.718 | 0.347 | 0.702 |
| LSTM | 0.294 | 0.208 | 0.027 | 1.396 | 0.956 | 0.103 | 1.658 | 0.617 | 0.174 | 0.467 | 0.171 | 0.444 |
| GRU | 0.282 * | 0.206 * | 0.027 * | 1.230 | 0.821 * | 0.087 * | 1.742 | 0.610 | 0.169 | 0.457 * | 0.172 | 0.434 |
| ATT | 0.478 | 0.387 | 0.050 | 1.672 | 1.079 | 0.106 | 2.035 | 0.618 | 0.168 * | 0.681 | 0.193 | 0.416 |
| Mt-Hard | 0.270 | 0.217 | 0.028 | 1.273 | 0.869 | 0.094 | 1.535 | 0.602 | 0.169 | 0.432 | 0.244 | 0.640 |
| Mt-Soft | 0.293 | 0.209 | 0.028 | 1.186 | 0.801 | 0.087 | 1.534 | 0.597 | 0.178 | 0.430 | 0.168 | 0.386 |
| Mt-Gate | 0.292 | 0.211 | 0.027 | 1.235 | 0.851 | 0.091 | 1.547 | 0.585 | | 0.448 | 0.178 | 0.386 |
| Mt-GH | | | | | | | 1.515 | 0.592 | 0.173 | | | 0.331 |
| Improv. | 7.1% | 12.1% | 11.1% | 3.0% | 3.0% | 1.1% | - | - | 1.6% | 11.7% | 6.3% | - |
* In the table, the bold numbers are the best, and the numbers with asterisk are the second best.
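The tables throughout report RMSE, MAE, and MAPE. For reference, a minimal sketch of these three standard error metrics (the sample values are made up for illustration):

```python
import numpy as np

def rmse(y, yhat):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - yhat)))

def mape(y, yhat):
    """Mean absolute percentage error (as a fraction, matching the tables).
    Undefined when y contains zeros; water quality indicators here are positive."""
    return float(np.mean(np.abs((y - yhat) / y)))

y = np.array([7.2, 7.4, 7.1])      # toy "true" pH values
yhat = np.array([7.0, 7.5, 7.3])   # toy predictions
print(rmse(y, yhat), mae(y, yhat), mape(y, yhat))
```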
Comparison of the overall performance of prediction for single-indicator on D-l dataset.
| Model | pH | | | DO | | | CODMn | | | NH3-N | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | RMSE | MAPE | MAE | RMSE | MAPE | MAE | RMSE | MAPE | MAE | RMSE | MAPE | MAE |
| CNN | 0.456 | 0.047 | 0.357 | 2.049 | 0.192 | 1.547 | 1.487 | 0.346 | 1.051 | 0.323 | 0.946 | 0.188 |
| LSTM | 0.268 | 0.025 | 0.195 | 1.198 | 0.110 | 0.828 | 0.939 | 0.189 | 0.582 | 0.238 | 0.430 | 0.118 |
| GRU | | | | 1.200 | 0.107 | | 0.944 | 0.190 | 0.579 | 0.219 | 0.510 | 0.116 |
| Mt-Soft | 0.252 | 0.023 | 0.178 | 1.196 | 0.106 | 0.823 | 0.949 | 0.188 | 0.577 | 0.222 | 0.395 | 0.114 |
| Mt-Gate | 0.251 | 0.023 | 0.178 | 1.197 | 0.108 | 0.823 | 0.939 | 0.194 | 0.582 | 0.217 | 0.415 | 0.109 |
| Mt-GH | 0.256 | 0.023 | 0.181 | | | 0.824 | | | | | | |
Comparison of four indicators and three indicators multi-task learning models on D-l.
| Model | pH | | | DO | | | CODMn | | | NH3-N | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | RMSE | MAPE | MAE | RMSE | MAPE | MAE | RMSE | MAPE | MAE | RMSE | MAPE | MAE |
| 4-task | | | | | | | | | | | | |
| 3-task1 | 0.324 | 0.031 | 0.235 | 1.248 | 0.116 | 0.877 | 0.959 | 0.194 | 0.592 | — | — | — |
| 3-task2 | — | — | — | 1.277 | 0.116 | 0.892 | 0.965 | 0.193 | 0.594 | 0.234 | 0.514 | 0.136 |
| 3-task3 | 0.316 | 0.030 | 0.226 | 1.242 | 0.111 | 0.861 | — | — | — | 0.222 | 0.650 | 0.127 |
| 3-task4 | 0.342 | 0.033 | 0.255 | — | — | — | 0.977 | 0.189 | 0.596 | 0.221 | 0.440 | 0.107 |
Comparison of the overall performance of prediction for multi-indicator.
| Model | RMSE | MAE | MAPE |
|---|---|---|---|
| Linear | 7.376 | 4.052 | 1.656 |
| XGB | 4.139 | 2.177 | 0.667 |
| MLP | 3.476 * | 1.792 * | 0.615 * |
| CNN | 5.753 | 3.377 | 1.262 |
| LSTM | 3.814 | 1.952 | 0.748 |
| GRU | 3.709 | 1.809 | 0.716 |
| ATT | 4.866 | 2.277 | 0.730 |
| Mt-Hard | 3.511 | 1.932 | 0.930 |
| Mt-Soft | 3.443 | 1.775 | 0.679 |
| Mt-Gate | 3.523 | 1.823 | 0.670 |
| Mt-GH | | | |
| Improv. | 3.3% | 3.8% | 1.6% |
* In the table, the bold numbers are the best, and the numbers with asterisk are the second best.
The performance comparison of tower structures in the Multi-Task-GH model.
| Tower | pH | | | DO | | | CODMn | | | NH3-N | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | RMSE | MAE | MAPE | RMSE | MAE | MAPE | RMSE | MAPE | MAE | RMSE | MAPE | MAE |
| LSTM | 7.009 | 6.975 | 0.904 | 9.178 | 8.872 | 0.919 | 4.523 | 0.735 | 2.978 | 1.080 | 0.515 | 0.343 |
| GRU | 0.847 | 0.396 | 0.051 | 2.311 | 1.642 | 0.172 | 2.705 | 0.369 | 1.313 | 1.120 | 0.627 | 0.423 |
| CNN | 0.464 | 0.359 | 0.046 | 1.949 | 1.407 | 0.154 | 2.576 | 0.371 | 1.304 | 0.950 | 0.896 | 0.367 |
| ATT | 7.725 | 7.708 | 1.0 | 1.459 | 0.952 | 0.098 | 1.987 | | 0.614 | 0.474 | 0.430 | 0.186 |
| MLP | | | | | | | | 0.173 | | | | |
Figure 7The difference between predictions and real data.
Figure 8The learning curve of the Multi-Task-GH model.
Hyper-parameters.
| Name | Experimental Set |
|---|---|
| MLP layer | 1, 2, 3 |
| MLP hidden units | 16, 32, 64 |
| Epoch | 100, 200, 400 |
| Batch size | 8 |
| Learning rate | 0.001 |
| Node number | 120 |
| Sequence length | 10 |
| Prediction targets | 4 |
The details of input-output parameters.
| Inputs | Outputs | Time Window Size | Mean | Standard Deviation | Maximum | Minimum |
|---|---|---|---|---|---|---|
| pH | pH | 10 | 7.240 | 0.330 | 7.950 | 6.390 |
| DO | DO | 10 | 7.340 | 0.640 | 10.000 | 6.200 |
| CODMn | CODMn | 10 | 1.820 | 0.450 | 3.300 | 0.800 |
| NH3-N | NH3-N | 10 | 0.194 | 0.045 | 0.340 | 0.110 |
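Each indicator is fed to the model as a sliding window of its last 10 observations, with the next observation as the target. A minimal sketch of this windowing step (the toy series and function name are illustrative assumptions):

```python
import numpy as np

def make_windows(series, window=10):
    """Split a 1-D time series into (input window, next-step target) pairs."""
    n = len(series) - window
    X = np.array([series[i:i + window] for i in range(n)])
    y = np.array([series[i + window] for i in range(n)])
    return X, y

series = np.arange(25, dtype=float)   # toy stand-in for one site's pH record
X, y = make_windows(series, window=10)
print(X.shape, y.shape)  # (15, 10) (15,)
```

In the paper's setting this is done per indicator (pH, DO, CODMn, NH3-N) and per monitoring site, and the windows are typically normalized using the statistics in the table above before training.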