Abstract
Harmful algal blooms are an annual phenomenon that causes environmental damage, economic losses, and disease outbreaks. A fundamental solution to this problem is still lacking; thus, the best option for counteracting the effects of algal blooms is to improve advance warnings (predictions). However, existing physical prediction models have difficulty establishing clear coefficients for the relationships between factors when predicting algal blooms, and they require many variable data sources for the analysis. These limitations entail high time and economic costs. Meanwhile, artificial intelligence and deep learning methods have become increasingly common in scientific research, and attempts to apply the long short-term memory (LSTM) model to environmental research problems are increasing because the LSTM model performs well for time-series prediction. However, few studies have applied deep learning models or LSTM to algal bloom prediction, especially in South Korea, where algal blooms occur annually. Therefore, we employed the LSTM model for algal bloom prediction in four major rivers of South Korea. We conducted short-term (one-week) predictions by employing regression analysis and deep learning techniques on a newly constructed water quality and quantity dataset drawn from 16 dammed pools on the rivers. Three deep learning models (multilayer perceptron, MLP; recurrent neural network, RNN; and long short-term memory, LSTM) were used to predict chlorophyll-a, a recognized proxy for algal activity. The results were compared with those from ordinary least squares (OLS) regression analysis and with actual data based on the root mean square error (RMSE). The LSTM model showed the highest prediction rate for harmful algal blooms, and all deep learning models outperformed the OLS regression analysis. Our results reveal the potential for predicting algal blooms using LSTM and deep learning.
Keywords: LSTM; algal blooms; artificial intelligence; chlorophyll-a; deep learning
Year: 2018 PMID: 29937531 PMCID: PMC6069434 DOI: 10.3390/ijerph15071322
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
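The abstract ranks the models by root mean square error (RMSE). As a minimal sketch of that metric, the function below computes RMSE for two equal-length sequences. The sample values are hypothetical and are not taken from the study's dataset.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    if not actual:
        raise ValueError("sequences must be non-empty")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical chlorophyll-a values, for illustration only.
observed = [10.0, 12.0, 15.0, 11.0]
forecast = [11.0, 12.0, 13.0, 11.0]
print(rmse(observed, forecast))
```

A lower RMSE means the predictions sit closer to the observed values, which is why the comparison tables below favor the model with the smallest figure.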
Figure 1. Research framework showing the three stages used to compare deep learning models.
Figure 2. The deep learning model structure using nine input variables, three hidden layers, and one output layer (all fully connected).
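The layout described in the Figure 2 caption (nine inputs, three fully connected hidden layers, one output) can be sketched as a plain forward pass. The hidden width of 16 units, the ReLU activation, and the random initial weights are assumptions made for illustration; the source does not state them, and this is not the authors' implementation.

```python
import random

N_INPUTS = 9        # from the Figure 2 caption
HIDDEN_UNITS = 16   # assumption: hidden width is not given in the source


def relu(z):
    return max(0.0, z)


def init_layer(n_in, n_out, rng):
    """Create one fully connected layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases, activation=None):
    """Apply one fully connected layer: activation(W x + b)."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


rng = random.Random(0)
layer_sizes = [N_INPUTS, HIDDEN_UNITS, HIDDEN_UNITS, HIDDEN_UNITS, 1]
layers = [init_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]


def forward(x):
    """Run the nine input values through three hidden layers and a linear output."""
    h = x
    for i, (w, b) in enumerate(layers):
        is_output = i == len(layers) - 1
        h = dense(h, w, b, activation=None if is_output else relu)
    return h[0]


sample = [0.5] * N_INPUTS  # hypothetical, already-scaled input vector
print(forward(sample))
```

The output layer is left linear because the target, chlorophyll-a, is a continuous quantity.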
Figure 3. The deep learning prediction process using temporal data.
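Prediction from temporal data is commonly set up by slicing a series into fixed-length input windows paired with a later target value. The sketch below shows that idea for a one-step-ahead target. The window length of three steps and the sample series are assumptions for illustration; the source does not specify how the authors windowed their data.

```python
def make_windows(series, window, horizon=1):
    """Pair each run of `window` consecutive values with the value `horizon` steps later."""
    samples = []
    for start in range(len(series) - window - horizon + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((inputs, target))
    return samples


# Hypothetical chlorophyll-a readings at regular intervals.
readings = [5.0, 7.0, 9.0, 12.0, 8.0, 6.0]
pairs = make_windows(readings, window=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

Recurrent models such as RNN and LSTM consume each window step by step, whereas an MLP would receive the window as one flattened input vector.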
Variables used in model tests and their basic statistical descriptions.
| Variable Name | Variable Description | Source | Number of Data | Average | Standard Deviation | Minimum Value | Maximum Value |
|---|---|---|---|---|---|---|---|
| temperature | water temperature (°C) | Ministry of Environment | 4464 | 17.40 | 8.16 | 0.30 | 34.30 |
| pH | potential of hydrogen | Ministry of Environment | 4464 | 8.00 | 0.54 | 5.70 | 9.70 |
| DO | dissolved oxygen (mg/L) | Ministry of Environment | 4464 | 10.70 | 2.66 | 2.20 | 19.20 |
| BOD | biochemical oxygen demand (mg/L) | Ministry of Environment | 4464 | 2.00 | 1.26 | 0.30 | 9.60 |
| COD | chemical oxygen demand (mg/L) | Ministry of Environment | 4464 | 5.80 | 1.82 | 1.80 | 19.50 |
| cyanobacteria | cyanobacteria cell number | Ministry of Environment | 4464 | 4041 | 20,695 | 0 | 556,740 |
| chlorophyll | chlorophyll-a | Ministry of Environment | 4464 | 23.76 | 23.15 | 0.10 | 177.90 |
| water level | water level (el.m) | Ministry of Land, Infrastructure and Transport | 4464 | 19.49 | 13.78 | 1.50 | 47.52 |
| pondage | pondage (million m³) | Ministry of Land, Infrastructure and Transport | 4464 | 43.67 | 30.64 | 4.829 | 205.58 |
Number of Data: 279 × 16 Dammed Pools.
Multiple linear regression analysis results.
| Variable Name | Coefficient | Standard Error | p-value |
|---|---|---|---|
| temperature | 3.262 × 10⁻¹ | 5.918 × 10⁻² | 3.74 × 10⁻⁸ *** |
| pH | 3.218 × 10⁻¹ | 6.706 × 10⁻¹ | 6.31 × 10⁻¹ |
| DO | 1.466 | 1.912 × 10⁻¹ | 2.13 × 10⁻¹⁴ *** |
| BOD | 2.222 | 3.455 × 10⁻¹ | 1.39 × 10⁻¹⁰ *** |
| COD | 2.580 | 2.635 × 10⁻¹ | 2 × 10⁻¹⁶ *** |
| cyanobacteria | −6.105 × 10⁻⁵ | 1.46 × 10⁻⁵ | 2.93 × 10⁻⁵ *** |
| water level | −4.891 × 10⁻¹ | 2.85 × 10⁻² | 2 × 10⁻¹⁶ *** |
| pondage | −1.260 × 10⁻¹ | 1.076 × 10⁻² | 2 × 10⁻⁶ *** |
| _cons | −4.18 | 4.692 | 3.73 × 10⁻¹ |
| p-value | 2.2 × 10⁻¹⁶ | | |
| R² | 0.3032 | | |
| adjusted R² | 0.302 | | |
| number of observations | 4464 | | |
Significance level: *** p < 0.001.
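To show how the regression coefficients combine into a prediction, the sketch below evaluates the fitted linear equation at the average value of each predictor from the variable statistics table. The variable names in the dictionaries are illustrative labels, and pairing the reported averages with the reported coefficients is an exercise added here, not a calculation from the source.

```python
# Coefficients and constant as reported in the regression table.
coefficients = {
    "temperature": 3.262e-1,
    "pH": 3.218e-1,
    "DO": 1.466,
    "BOD": 2.222,
    "COD": 2.580,
    "cyanobacteria": -6.105e-5,
    "water_level": -4.891e-1,
    "pondage": -1.260e-1,
}
intercept = -4.18

# Average of each predictor, from the variable statistics table.
means = {
    "temperature": 17.40,
    "pH": 8.00,
    "DO": 10.70,
    "BOD": 2.00,
    "COD": 5.80,
    "cyanobacteria": 4041,
    "water_level": 19.49,
    "pondage": 43.67,
}

predicted_chlorophyll = intercept + sum(
    coefficients[name] * means[name] for name in coefficients
)
print(f"{predicted_chlorophyll:.2f}")
```

The result is about 23.88, close to the reported chlorophyll-a average of 23.76. An OLS fit with a constant term passes through the point of means, so the small gap is plausibly due to rounding in the published figures.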
Figure 4. Comparison of MLP (left) and LSTM (right) model results for chlorophyll-a at South Korean dammed pool sites. Blue lines show actual values and red lines show predicted values.
Results of RMSE comparison by epoch.
| Measuring Point | MLP, 100 epochs | MLP, 300 epochs | MLP, 500 epochs | MLP, 700 epochs | LSTM, 100 epochs | LSTM, 300 epochs | LSTM, 500 epochs | LSTM, 700 epochs |
|---|---|---|---|---|---|---|---|---|
| Ipo | 7.84871 | 8.42762 | 9.2777 | 10.7004 | 7.67382 | 8.31658 | 8.73067 | 9.06951 |
| Yeoju | 5.49547 | 5.6166 | 6.07492 | 4.49033 | 5.61138 | 5.73774 | 5.81824 | 6.13268 |
| Gangcheon | 3.64954 | 3.99566 | 4.431429 | 35.5032 | 3.60946 | 3.83244 | 3.86594 | 3.86588 |
| Sejong | 39.6119 | 35.9101 | 33.4814 | 10.3041 | 30.8273 | 31.0018 | 30.9622 | 31.7447 |
| Gongju | 42.7369 | 35.3198 | 33.7732 | 12.2273 | 31.9164 | 31.9498 | 32.1146 | 33.1228 |
| Baekje | 36.477 | 27.7607 | 26.5994 | 12.7383 | 27.3673 | 27.1477 | 27.1138 | 27.0187 |
| Sangju | 14.7071 | 14.0804 | 14.0761 | 26.0294 | 14.4853 | 14.4771 | 14.2902 | 14.1571 |
| Nakdan | 9.96028 | 10.4088 | 10.5436 | 14.0869 | 9.84722 | 9.50639 | 10.137 | 10.0699 |
| Gumi | 11.0802 | 10.2963 | 10.1364 | 32.6677 | 10.5159 | 10.2251 | 10.0779 | 10.2275 |
| Chilgok | 10.3898 | 10.2936 | 9.80221 | 29.1884 | 11.0027 | 10.5753 | 10.2638 | 10.204 |
| Gangjeong goryeoung | 9.2862 | 8.20598 | 9.2837 | 5.8793 | 7.85588 | 8.00324 | 8.78411 | 8.99846 |
| Dalseong | 10.2755 | 11.3021 | 11.7977 | 9.91085 | 12.6251 | 12.7122 | 13.1197 | 13.4175 |
| Hapcheon | 15.0435 | 13.9717 | 13.9468 | 27.4545 | 14.1113 | 13.9893 | 14.1398 | 13.9613 |
| Changnyeong haman | 12.1064 | 12.2411 | 12.0053 | 12.288 | 13.2302 | 12.6724 | 12.501 | 12.416 |
| Seungchon | 36.0572 | 29.4183 | 29.0971 | 9.44017 | 30.4663 | 36.1613 | 37.7719 | 40.2004 |
| Juksan | 29.7646 | 28.017 | 28.4197 | 14.114 | 26.3498 | 26.865 | 26.9653 | 26.7871 |
| Sum of RMSE | | | | | | | | |
Results of RMSE comparison.
| Measuring Point | OLS | MLP | RNN | LSTM |
|---|---|---|---|---|
| Ipo | 13.21 | 9.28 | 7.93 | 7.67 |
| Yeoju | 9.13 | 6.07 | 5.60 | 5.61 |
| Gangcheon | 6.50 | 4.43 | 3.58 | 3.61 |
| Sejong | 29.78 | 33.48 | 30.42 | 30.83 |
| Gongju | 32.30 | 33.77 | 32.08 | 31.92 |
| Baekje | 25.30 | 26.60 | 25.95 | 27.37 |
| Sangju | 10.18 | 14.08 | 14.37 | 14.49 |
| Nakdan | 11.88 | 10.54 | 9.34 | 9.85 |
| Gumi | 13.32 | 10.14 | 10.26 | 10.52 |
| Chilgok | 11.82 | 9.80 | 10.55 | 11.00 |
| Gangjeong goryeoung | 10.02 | 9.28 | 8.11 | 7.86 |
| Dalseong | 19.63 | 11.80 | 13.24 | 12.63 |
| Hapcheon | 14.87 | 13.95 | 14.35 | 14.11 |
| Changnyeong haman | 19.40 | 12.01 | 12.83 | 13.23 |
| Seungchon | 34.24 | 29.10 | 33.25 | 30.47 |
| Juksan | 22.44 | 28.42 | 26.22 | 26.35 |
| RMSE average | 17.75 | 16.42 | 16.13 | 16.09 |
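The "RMSE average" row can be checked by averaging each model's column over the 16 measuring points. The sketch below recomputes those averages from the per-site values in the table above; the variable names are illustrative.

```python
# Per-site RMSE values copied from the comparison table: (OLS, MLP, RNN, LSTM).
site_rmse = {
    "Ipo": (13.21, 9.28, 7.93, 7.67),
    "Yeoju": (9.13, 6.07, 5.60, 5.61),
    "Gangcheon": (6.50, 4.43, 3.58, 3.61),
    "Sejong": (29.78, 33.48, 30.42, 30.83),
    "Gongju": (32.30, 33.77, 32.08, 31.92),
    "Baekje": (25.30, 26.60, 25.95, 27.37),
    "Sangju": (10.18, 14.08, 14.37, 14.49),
    "Nakdan": (11.88, 10.54, 9.34, 9.85),
    "Gumi": (13.32, 10.14, 10.26, 10.52),
    "Chilgok": (11.82, 9.80, 10.55, 11.00),
    "Gangjeong goryeoung": (10.02, 9.28, 8.11, 7.86),
    "Dalseong": (19.63, 11.80, 13.24, 12.63),
    "Hapcheon": (14.87, 13.95, 14.35, 14.11),
    "Changnyeong haman": (19.40, 12.01, 12.83, 13.23),
    "Seungchon": (34.24, 29.10, 33.25, 30.47),
    "Juksan": (22.44, 28.42, 26.22, 26.35),
}
models = ("OLS", "MLP", "RNN", "LSTM")

# Average each model's column across all measuring points.
averages = {
    model: sum(row[i] for row in site_rmse.values()) / len(site_rmse)
    for i, model in enumerate(models)
}
for model, value in averages.items():
    print(f"{model}: {value:.3f}")

best_model = min(averages, key=averages.get)
print("lowest average RMSE:", best_model)
```

The recomputed averages agree with the reported row to within rounding, and LSTM has the lowest average, although the margin over RNN is small and OLS is lower at several individual sites.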