
Algal Bloom Prediction Using Extreme Learning Machine Models at Artificial Weirs in the Nakdong River, Korea.

Hye-Suk Yi, Sangyoung Park, Kwang-Guk An, Keun-Chang Kwak.

Abstract

In this study, we design an intelligent model to predict chlorophyll-a concentration, which is the primary indicator of algal blooms, using extreme learning machine (ELM) models. Modeling algal blooms is important for environmental management and ecological risk assessment. For this purpose, the performance of the designed models was evaluated for four artificial weirs in the Nakdong River, Korea. The Nakdong River has harmful algal blooms every year, which can affect health through exposure to toxins. In contrast to conventional neural networks (NNs) that use backpropagation (BP) learning methods, ELMs are fast-learning feedforward neural networks that use least squares estimates (LSE) for regression. The weights connecting the input layer to the hidden nodes are randomly assigned and are never updated. The dataset used in this study includes air temperature, rainfall, solar radiation, total nitrogen, total phosphorus, N/P ratio, and chlorophyll-a concentration, which were collected on a weekly basis from January 2013 to December 2016. Here, upstream chlorophyll-a concentration data were used in our ELM2 model to improve algal bloom prediction performance. In contrast, the ELM1 model only uses downstream chlorophyll-a concentration data. The experimental results revealed that the ELM2 model showed better performance in comparison to the ELM1 model. Furthermore, the ELM2 model showed good prediction and generalization performance compared to multiple linear regression (LR), conventional neural network with backpropagation (NN-BP), and adaptive neuro-fuzzy inference system (ANFIS).

Keywords:  ANFIS; Nakdong River; environmental management; extreme learning machine; harmful algal bloom; prediction modeling; regulated river

Year:  2018        PMID: 30248912      PMCID: PMC6210959          DOI: 10.3390/ijerph15102078

Source DB:  PubMed          Journal:  Int J Environ Res Public Health        ISSN: 1660-4601            Impact factor:   3.390


1. Introduction

Climate change and economic development around the world have come at the cost of environmental deterioration, which has affected human life both directly and indirectly. Among environmental degradation issues, algal blooms refer to the explosive increase and high concentration of phytoplankton in an ecosystem. They have become a challenge for human society as they grow more prevalent throughout the world [1,2]. Water quality, hydrology, and climate are the main factors influencing algal blooms in terms of chlorophyll dynamics. Although the general relationship between environmental conditions and phytoplankton dynamics has been extensively studied in the past [3,4,5], controlling algal blooms is difficult because the relationship between environmental factors and algal blooms varies with geography and time. In terms of model development, there are typically two types of forecasting models: deductive and inductive. Deductive models are based on existing theories and knowledge, which enable users to simulate the behavior of various systems. Numerical modeling, a form of deductive modeling, has an advantage in forecasting because it accounts for geographical and hydrodynamic variations [6]. However, this method requires extensive input data, calibration, and validation, resulting in uncertainty in the model parameters [7,8]. As an alternative, computational artificial intelligence techniques have been developed in recent years as more efficient tools for predicting or forecasting algal blooms. With the development of artificial intelligence models, an artificial neural network (ANN) was applied to predict algal blooms by assessing eutrophication and simulating chlorophyll-a concentration. Support vector machine and deep learning techniques were also applied to predict phytoplankton abundance. 
Zhang [2] presented a novel prediction approach for algal blooms based on deep learning to represent and predict highly dynamic and complex phenomena. Tian [3] optimized a traditional ANN model for predicting chlorophyll dynamics with the goal of decreasing the cost of in-situ aquatic environmental monitoring and increasing the accuracy of bloom forecasting. Loi [9] attempted to develop an ELM-based predictive model to simulate dynamic changes in phytoplankton abundance in the Macau Reservoir, given a variety of water variables. Rogers [10] presented a new approach to nonlinear groundwater management with the aid of ANNs and optimized aquifer remediation. Considering the drawbacks of ANNs, the extreme learning machine (ELM) has recently been regarded as a better solution. "Extreme" means that its learning speed is extremely fast while it generalizes better than gradient-descent-based learning [11]. In fact, it has been shown that ELM can be 10 times faster than some traditional algorithms with backpropagation [12]. ELM has been used to assess the stability of electric power systems, optimize the lifetime of transportation systems, and predict electrical power. Xu [13] developed an ELM-based predictor for real-time frequency stability assessment to enhance the dynamic security of power systems. Sun [14] proposed a two-stage approach to optimize the lifetime of transportation systems by combining linear programming (LP) with an ELM. Vergara [15] applied two machine learning techniques to predict active electrical power in buildings by comparing ELM against multilayer perceptron methods. Yeom [16] proposed a new design method based on an ELM with automatic knowledge representation from numerical datasets for short-term electricity-load forecasting. 
ELM has been applied to water resource management, including discharge forecasting, future projection-based climate change scenarios using an online sequential ELM (OS-ELM), and algal bloom prediction using an ELM on reservoirs. Yadav [17] applied OS-ELM as a new technique capable of updating the model equation based on new data entry without much increase in computational cost; this was used in flood forecasting on the Neckar River, Germany. Yin [18] projected future variability of reference evapotranspiration using an ELM and support vector regression in a mountainous inland watershed in north-west China. Wang [19] proposed a hybrid mechanism modeling method, which synthesized the advantages of an ecological dynamic model and a data-driven model. To obtain an appropriate model, a function model library with key impact factors (IFs) in algal bloom formation was first established, and then Tabu search and a genetic algorithm were applied for model structure optimization and parameter calibration, respectively. Wang [20] constructed a mechanism-based model according to algal bloom nonlinear temporal dynamics to reflect nonlinear dynamic changes in the algal bloom formation mechanism. However, studies have not focused on predicting algal blooms using machine learning with water quality and climate data in regulated rivers. Wang [21] proposed an integrated variable fuzzy evaluation model, which combines algorithmic precision with operability for the assessment of river water quality, based on case studies of reservoir and river water. Olyaie [22] compared various artificial intelligence methods, particularly ANNs, adaptive neuro-fuzzy inference system, and coupled wavelet and neural network, for estimating the suspended sediment load in a river system. 
Fotovatikhah [23] applied flood management systems, which are the most promising approaches with respect to accuracy and error rate for flood debris forecasting and management. In this study, we attempt to apply ELM-based models to predict chlorophyll-a concentration as an indicator of algal blooms, which was monitored on a weekly basis in four weirs in the Nakdong River, South Korea. Despite a growing awareness of the problems associated with algal blooms in rivers, and particularly in regulated rivers, the drivers of bloom formation and abundance in rivers are not well understood [24]. The Nakdong River is a regulated river that includes many artificial weirs. Social interest in this topic has grown since the construction of weirs in the Nakdong River, and algal blooms have occurred more often downstream than upstream. It has also been difficult to predict algal blooms using numerical models. Thus, it is important to rapidly and accurately predict algal blooms. It is more difficult to understand and predict algal blooms in regulated rivers that are connected in sequence because hydrodynamics and water quality are more diverse than in other rivers. ELM models were applied to four weirs in the Nakdong River to explore the best input structure and develop a model describing weekly chlorophyll-a concentration while considering parameter minimization, water quality monitoring, and overfitting. Upstream chlorophyll-a concentration data were used in our ELM2 model to improve algal bloom prediction performance. In contrast, the ELM1 model only uses downstream chlorophyll-a concentration data. The ELM1 and ELM2 models showed good performance compared to the conventional neural network with backpropagation (NN-BP) and adaptive neuro-fuzzy inference system (ANFIS). This paper is organized in the following manner: Section 2 describes the study area and the water quality variables that are relevant to the Nakdong River. 
Section 3 describes the ELM architecture and learning method. Section 4 presents our simulation results for algal bloom prediction from water quality and weather data. Finally, concluding comments are presented in Section 5.

2. Study Area

With a length of 525 km, the Nakdong River is the longest river in South Korea, and its watershed area is 23,384 km2, which is equivalent to approximately 20% of the country's area. The Nakdong River has eight weirs which were built in sequence starting in 2012. In particular, four of these weirs (Gangjeong-Goryeong weir, Dalseong weir, Hapcheon-Changnyeong weir, and Changnyeong-Haman weir) in the mid-lower Nakdong River region experience harmful algal blooms every summer, causing many problems for agricultural, residential, and commercial water supplies. Harmful algal blooms refer to toxic, hypoxia-generating cyanobacterial bloom genera controlled by the synergistic effects of nutrients (nitrogen and phosphorus), light, temperature, water residence, and biotic interactions [25]. Since the construction of the weirs, the public and the government have been interested in managing algal blooms. Figure 1 shows the locations of the four weirs on the Nakdong River in South Korea and the watershed area.
Figure 1

The study area and main streams.

All data in this study were obtained from the Korean governmental database system, including the Water Environment Information System and Korea Meteorological Administration System. Monitoring stations were located 500 m upstream from each weir, and water quality data have been gathered weekly since weir construction. Chlorophyll-a was used as the primary indicator for algal blooms, and other water quality variables were monitored, including water temperature, pH, dissolved oxygen, electrical conductivity, total nitrogen, nitrate, ammonia nitrogen, total phosphorus, phosphate, biological oxygen demand, chemical oxygen demand, and total organic carbon. In the Gangjeong-Goryeong weir, chlorophyll-a concentration was 19.0 μg/L on average and 106.7 μg/L maximum. In the Dalseong weir, chlorophyll-a concentration was 26.0 μg/L on average and 104.1 μg/L maximum. In the Hapcheon-Changnyeong weir, chlorophyll-a concentration was 23.2 μg/L on average and 100.7 μg/L maximum. In the Changnyeong-Haman weir, which is located furthest downstream, chlorophyll-a concentration was 25.2 μg/L on average, with the highest maximum of the four weirs at 123.3 μg/L. Table 1 shows chlorophyll-a, total nitrogen, and total phosphorus statistical values for the four weirs from 2013 to 2016. In the Gangjeong-Goryeong weir, correlation analysis showed that the correlation coefficient between total nitrogen and chlorophyll-a concentration was 0.263 (positive), between total phosphorus and chlorophyll-a was −0.013 (negative), and between N/P ratio and chlorophyll-a was 0.092 (positive). 
In the Changnyeong-Haman weir, the correlation coefficient between total nitrogen and chlorophyll-a was −0.036 (negative), between total phosphorus and chlorophyll-a was 0.144 (positive), and between N/P ratio and chlorophyll-a was −0.239 (negative).
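The source does not say which correlation measure was used; the sketch below assumes the ordinary Pearson coefficient. The function name `pearson_r` and the sample values are hypothetical and are not taken from the paper's data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical weekly samples (illustration only, not the paper's data).
total_nitrogen = [2.1, 2.8, 3.0, 2.4, 3.5]      # mg/L
chlorophyll_a = [12.0, 20.5, 18.0, 15.2, 30.1]  # ug/L
print(round(pearson_r(total_nitrogen, chlorophyll_a), 3))
```

Coefficients as small as those reported (for example 0.092 or −0.013) indicate almost no linear relationship, which is consistent with the paper's choice of a nonlinear model.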
Table 1

Water quality variables at 4 weirs in the Nakdong River (2013 to 2016).

Variables | Gangjeong-Goryeong Weir | Dalseong Weir | Hapcheon-Changnyeong Weir | Changnyeong-Haman Weir
Chlorophyll-a (μg/L) | 19.0 (2.2–106.7) | 26.0 (2.7–104.1) | 23.2 (1.7–100.7) | 25.2 (2.9–123.3)
Total Nitrogen (mg/L) | 2.605 (1.201–4.100) | 3.723 (1.814–6.433) | 3.397 (1.842–6.207) | 2.778 (1.249–5.483)
Total Phosphorus (mg/L) | 0.048 (0.012–0.157) | 0.061 (0.017–0.163) | 0.058 (0.016–0.163) | 0.054 (0.015–0.174)

Note: Average (Min.–Max.).

There are many methods for classifying ecosystems into trophic categories using nutrients and algal biomass. The boundaries that aquatic scientists place between these categories are similar, but they are not universal. The United States Environmental Protection Agency (US EPA) has suggested that an annual average chlorophyll-a concentration exceeding 10 μg/L indicates a eutrophic state [26]. The 4-year average chlorophyll-a concentration in the Nakdong River ranged between 19.0 and 26.0 μg/L. Forsberg and Ryding suggested that annual average total nitrogen concentrations ranging from 0.6 to 1.5 mg/L also indicate a eutrophic state [27]. The 4-year average total nitrogen concentration in the Nakdong River ranged from 2.605 to 3.723 mg/L. Also, the Organization for Economic Co-operation and Development (OECD) suggested that an annual average total phosphorus concentration exceeding 0.035 mg/L indicates a eutrophic state [28]. The 4-year average total phosphorus concentration in the Nakdong River varied between 0.048 and 0.061 mg/L. Thus, the Nakdong River can be considered to be in a eutrophic state based on all three metrics. Figure 2 and Figure 3 show water quality variations in the Gangjeong-Goryeong and Dalseong weirs from 2013 to 2016.
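The threshold comparison in this paragraph can be rechecked against the Table 1 averages with a few lines of Python. This is only a sketch: the function and constant names are hypothetical, and treating any total nitrogen value at or above the lower bound of the Forsberg and Ryding range as "at least eutrophic" is an assumption made here, since the river's averages lie above the upper bound of that range.

```python
# Thresholds cited in the text (annual averages).
CHLA_EUTROPHIC_UG_L = 10.0   # US EPA [26]: above this is eutrophic
TN_EUTROPHIC_MIN_MG_L = 0.6  # Forsberg and Ryding [27]: lower bound of the eutrophic range
TP_EUTROPHIC_MG_L = 0.035    # OECD [28]: above this is eutrophic

# 4-year averages from Table 1: (chlorophyll-a ug/L, total N mg/L, total P mg/L).
WEIR_AVERAGES = {
    "Gangjeong-Goryeong": (19.0, 2.605, 0.048),
    "Dalseong": (26.0, 3.723, 0.061),
    "Hapcheon-Changnyeong": (23.2, 3.397, 0.058),
    "Changnyeong-Haman": (25.2, 2.778, 0.054),
}

def eutrophic_flags(chla, tn, tp):
    """Return, per metric, whether the average is at least in the eutrophic range."""
    return {
        "chlorophyll-a": chla > CHLA_EUTROPHIC_UG_L,
        # Assumption: values above the 0.6-1.5 mg/L range still count as eutrophic or worse.
        "total nitrogen": tn >= TN_EUTROPHIC_MIN_MG_L,
        "total phosphorus": tp > TP_EUTROPHIC_MG_L,
    }

for weir, averages in WEIR_AVERAGES.items():
    print(weir, eutrophic_flags(*averages))
```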
Figure 2

Weekly total nitrogen, total phosphorus, and chlorophyll-a data at the Gangjeong-Goryeong weir from 2013 to 2016 (n = 201).

Figure 3

Weekly total nitrogen, total phosphorus, and chlorophyll-a data at Dalseong weir from 2013 to 2016 (n = 205).

3. Extreme Learning Machine

3.1. Architecture and Learning Method for ELM

ELM was originally proposed as a learning scheme for single hidden layer feedforward neural networks (SLFNs). It was later extended to generalized SLFNs, where the hidden layer need not be neuron-like [29,30]. In the past, gradient-descent-based approaches were used for feedforward neural networks, where all parameters required tuning, which usually requires significant time. ELM has only one hidden layer, and the parameters of this hidden layer, including the input weights and hidden node biases, need not be tuned. Instead, these hidden node parameters are assigned randomly, which means that they may be independent of the training data. ELM can be thousands of times faster than traditional feedforward network learning algorithms with backpropagation while obtaining better generalization performance. The ELM structure has input, hidden, and output layers, as shown in Figure 4.
Figure 4

Architecture of the conventional ELM predictor.

The ELM model has advantages regarding real-time learning and good prediction capability. The output function of generalized SLFNs with $L$ hidden nodes is expressed as:

$$f_L(x) = \sum_{i=1}^{L} \beta_i \, G(w_i, b_i, x) = h(x)\beta$$

where $w_i$ and $b_i$ are the weight and bias between the input layer and the $i$-th hidden node, respectively. The output weights $\beta = [\beta_1, \ldots, \beta_L]^T$ are the parameters to be estimated. The output function of the hidden layer mapping is as follows:

$$h(x) = [G(w_1, b_1, x), \ldots, G(w_L, b_L, x)]$$

Various activation functions can be used as the output functions $G$ of the hidden nodes. Well-known choices are sigmoid networks, Radial-Basis Function (RBF) networks, polynomial networks, complex networks, and the sine function. Conventional random projection is just a specific case of ELM random feature mapping when a linear additive hidden node is used. ELM not only proves the existence of the networks, but it also provides learning solutions. In what follows, we review the processing procedure of an ELM as a predictor. For a training set, given the activation function and hidden node number, the ELM algorithm can be summarized as three steps: (1) randomly generate the input weights, (2) calculate the hidden layer output matrix, and (3) calculate the output weights matrix. In marked contrast to traditional learning algorithms, ELM requires no iterative adjustment of the network parameters during training. Given a training set $\{(x_j, t_j)\}_{j=1}^{N}$, hidden node output function $G(w, b, x)$, and number of hidden nodes $L$, the ELM determines the hidden node parameters and output weights using the following steps:

(Step 1) Randomly assign the hidden node parameters $(w_i, b_i)$, $i = 1, \ldots, L$.

(Step 2) Calculate the hidden layer output matrix $H = [h(x_1)^T, \ldots, h(x_N)^T]^T$.

(Step 3) Calculate the output weights $\beta$ using a least squares estimate (LSE):

$$\beta = H^{\dagger} T$$

where $T = [t_1, \ldots, t_N]^T$ and $H^{\dagger}$ is the Moore-Penrose generalized inverse of matrix $H$. When $H^T H$ is nonsingular, $H^{\dagger} = (H^T H)^{-1} H^T$ is the pseudo-inverse of $H$. This is a standard LSE problem, and the best solution for $\beta$ is expressed as follows [16]:

$$\hat{\beta} = (H^T H)^{-1} H^T T$$
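The three training steps described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes sigmoid hidden nodes, draws weights and biases uniformly from [-1, 1] (a range chosen here), and solves the normal equations by Gaussian elimination, which is only valid when the matrix H^T H is nonsingular. All function names are hypothetical, and the toy data are invented.

```python
import math
import random

def random_hidden_params(n_inputs, n_hidden, seed=0):
    """Step 1: draw input weights and biases at random; they are never updated."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    return W, b

def hidden_output(X, W, b):
    """Step 2: hidden layer output matrix H, one row per sample (sigmoid nodes)."""
    return [
        [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(w_i, row)) + b_i)))
         for w_i, b_i in zip(W, b)]
        for row in X
    ]

def solve_linear(A, y):
    """Solve A z = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [y[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("H^T H is numerically singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def output_weights(H, t):
    """Step 3: least squares estimate beta = (H^T H)^{-1} H^T t."""
    n_hidden = len(H[0])
    HtH = [[sum(row[i] * row[j] for row in H) for j in range(n_hidden)]
           for i in range(n_hidden)]
    Htt = [sum(row[i] * t_k for row, t_k in zip(H, t)) for i in range(n_hidden)]
    return solve_linear(HtH, Htt)

def predict(X, W, b, beta):
    """Network output f(x) = h(x) beta for each sample."""
    return [sum(h * w for h, w in zip(row, beta)) for row in hidden_output(X, W, b)]

# Toy data (illustration only): five samples with two inputs each.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
t = [0.0, 1.0, 1.0, 2.0, 1.0]
W, b = random_hidden_params(n_inputs=2, n_hidden=3, seed=0)
beta = output_weights(hidden_output(X, W, b), t)
print(predict(X, W, b, beta))
```

Only `beta` is fitted; `W` and `b` stay at their random initial values, which is why no iterative training loop appears. In practice a library pseudo-inverse routine would be more robust than the normal equations when H^T H is ill-conditioned.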

3.2. Model Application

Water quality and weather data in the Nakdong River were collected weekly from January 2013 to December 2016 by the Ministry of Environment and the Korea Meteorological Administration. We excluded 2012 because the study area had an unstable ecosystem during the early period after weir construction, and 2017 because a low water level was maintained intermittently due to social and political issues. The dataset includes daily air temperature, rainfall, and solar radiation, as well as weekly total nitrogen, total phosphorus, N/P ratio (ratio of total nitrogen to total phosphorus), and chlorophyll-a as inputs to the ELM model. The weekly chlorophyll-a concentration, which is the primary indicator of algal blooms, was used as the model output. Parameters such as phosphate, nitrate, and ammonia nitrogen, which have been used for predicting algal blooms, were not applied for model training in order to minimize and optimize the model inputs [6,7,8,9]. Table 2 shows the input variables and their sources.
Table 2

The variables for chlorophyll-a prediction model.

Items | Variables | Source
Weather | Air temperature, Rainfall, Solar radiation | Korea Meteorological Administration (http://kma.go.kr)
Water quality | Total Nitrogen, Total Phosphorus, N/P ratio, chlorophyll-a | Ministry of Environment, National Institute of Environmental Research (http://water.nier.go.kr)

The ELM1 model was applied to predict algal blooms using total nitrogen, total phosphorus, N/P ratio, temperature, precipitation, solar radiation, and chlorophyll-a concentration as independent parameters, and the ELM2 model additionally used upstream chlorophyll-a as an independent parameter to improve the predictive power. Figure 5 shows the diagram of the ELM model (ELM2), where t indicates each week, AT is air temperature, RF is rainfall, SR is solar radiation, TN is total nitrogen, TP is total phosphorus, NP is the ratio of total nitrogen to total phosphorus, Chla is chlorophyll-a concentration, and Chla_u is the upstream chlorophyll-a concentration. Air temperature data were collected daily and the weekly average values were used, where the 4-year average annual air temperature was 13.4–14.6 °C. Rainfall data were also collected daily and the weekly accumulated values were used, where the 4-year average annual rainfall was 1051–1438 mm/year. Solar radiation data were collected daily and the weekly average values were used for algal bloom prediction, where the 4-year average annual solar radiation was 14.1–14.7 MJ/m2. Water quality data with chlorophyll-a concentration were collected weekly and the weekly data were used for algal bloom prediction. For the chlorophyll-a input, the concentration measured 7 days earlier was used.
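The aggregation rules described above (weekly mean for air temperature and solar radiation, weekly sum for rainfall, chlorophyll-a lagged by one week) can be sketched as follows. The record layout, the function names, and the choice to take weather and nutrient inputs from the same week as the target are assumptions made for illustration; the paper's Figure 5 defines the actual input timing. An ELM2-style sample would append the upstream chlorophyll-a value in the same way.

```python
from statistics import mean

def aggregate_week(daily_temp, daily_rain, daily_solar):
    """Weekly AT (mean), RF (accumulated), and SR (mean) from 7 daily values each."""
    return mean(daily_temp), sum(daily_rain), mean(daily_solar)

def build_samples(weeks):
    """Build (inputs, target) pairs in the ELM1 input layout.

    weeks: chronological list of dicts with keys 'temp', 'rain', 'solar'
    (lists of daily values) and 'tn', 'tp', 'chla' (weekly measurements).
    The chlorophyll-a input is the value from the previous week (7 days prior).
    """
    samples = []
    for prev, cur in zip(weeks, weeks[1:]):
        at, rf, sr = aggregate_week(cur["temp"], cur["rain"], cur["solar"])
        np_ratio = cur["tn"] / cur["tp"]
        inputs = [at, rf, sr, cur["tn"], cur["tp"], np_ratio, prev["chla"]]
        samples.append((inputs, cur["chla"]))
    return samples

# Hypothetical two-week record (illustration only).
weeks = [
    {"temp": [10] * 7, "rain": [0] * 7, "solar": [12.0] * 7,
     "tn": 3.0, "tp": 0.05, "chla": 20.0},
    {"temp": [12, 14, 16, 14, 12, 14, 16], "rain": [1, 0, 0, 2, 0, 0, 0],
     "solar": [14.0] * 7, "tn": 2.0, "tp": 0.04, "chla": 25.0},
]
print(build_samples(weeks))
```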
Figure 5

Diagram for ELM model (ELM2). AT: air temperature; RF: rainfall; SR: solar radiation; TN: total nitrogen; TP: total phosphorus; NP: ratio of total nitrogen over total phosphorus; Chla: chlorophyll-a concentration; Chla_u: upstream chlorophyll-a concentration.

In each weir, 50% of the dataset was used for training and the other 50% was used for testing the algal bloom prediction. The performance of the models in the four weirs was evaluated using the coefficient of determination (R2) and root-mean-square error (RMSE) between the observed and predicted values. These indicators are defined as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

where $n$ is the number of data; $y_i$ and $\bar{y}$ are the observed data and the mean of the observed data, respectively; and $\hat{y}_i$ is the value predicted by the model.
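As a minimal sketch of the evaluation described above, the two indicators and the 50/50 split can be written in a few lines. The R2 form used here (one minus the ratio of residual to total sum of squares) is an assumption inferred from the symbols the text defines, and splitting chronologically into a first and second half is also an assumption; the source does not say how the halves were chosen. Function names are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """R2 as 1 - SS_res / SS_tot, using the mean of the observed data."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def split_half(samples):
    """Assumed chronological split: first half for training, second half for testing."""
    mid = len(samples) // 2
    return samples[:mid], samples[mid:]

# Hypothetical chlorophyll-a values in ug/L (illustration only).
observed = [12.0, 30.0, 18.0, 45.0]
predicted = [15.0, 26.0, 20.0, 38.0]
print(round(rmse(observed, predicted), 2), round(r_squared(observed, predicted), 2))
```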

4. Results and Discussion

4.1. Experimental Results

We applied ELM to chlorophyll-a concentration prediction in four weirs located in the Nakdong River. In the design of ELM, sigmoid networks were adopted as the activation function. We performed additional experiments with the RBF and sine functions. The experimental results revealed that the RBF and sine functions showed performance similar to that of the sigmoid function. ELM models were constructed to determine the optimum number of nodes in the hidden layer. The number of hidden nodes was increased from 2 to 30, and the value at which the test set error used for model validation reached a minimum was selected. The training and testing performance of the ELM are shown in Figure 6.
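The hidden-node search described above amounts to a one-dimensional sweep. The sketch below assumes a caller-supplied function that trains an ELM with a given number of hidden nodes and returns its test RMSE; here a made-up error curve stands in for that function, so the names and numbers are hypothetical.

```python
def select_hidden_nodes(test_error, candidates=range(2, 31)):
    """Return the hidden-node count with the lowest test error, plus all errors."""
    errors = {n_hidden: test_error(n_hidden) for n_hidden in candidates}
    best = min(errors, key=errors.get)
    return best, errors

def fake_test_rmse(n_hidden):
    """Hypothetical stand-in for 'train an ELM with n_hidden nodes, return test RMSE'."""
    return 13.0 + (n_hidden - 12) ** 2 / 50.0

best, errors = select_hidden_nodes(fake_test_rmse)
print(best, errors[best])
```

Because the hidden weights are random, a real sweep would normally fix the random seed or average over several draws before comparing node counts; the source does not say which was done.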
Figure 6

Performance of ELM as a function of the number of hidden nodes. (a) Gangjeong-Goryeong weir and (b) Dalseong weir.

The performance of the ELM1 models is shown in Table 3 and Figure 7. The prediction results for chlorophyll-a concentration in Gangjeong-Goryeong weir show R2 = 0.61 for training and 0.47 for testing, and RMSE of 8.6 μg/L for training and 14.5 μg/L for testing. The prediction results in Dalseong weir show R2 = 0.55 for training and 0.44 for testing, and RMSE of 12.6 for training and 13.5 for testing. The ELM model shows better performance in Gangjeong-Goryeong weir than in Dalseong weir. The prediction results in Hapcheon-Changnyeong weir show R2 = 0.38 for training and 0.41 for testing, and RMSE of 15.3 for training and 13.1 for testing. The prediction results in Changnyeong-Haman weir show R2 = 0.29 for training and 0.36 for testing, and RMSE of 16.6 for training and 12.4 for testing. The Akaike information criterion (AIC) was developed for comparing models, based on information theory [31]. AIC applied to Gangjeong-Goryeong weir has a value of 371.2 for training and 452.2 for testing, and 444.6 for training and 455.8 for testing in Dalseong weir. The AIC value in Hapcheon-Changnyeong weir is 461.3 for training and 436.1 for testing data sets, and 469.0 for training and 421.9 for testing data sets in Changnyeong-Haman weir. The predictive power of the ELM model was found to be better in upstream weirs than in downstream weirs. This is because the downstream Nakdong River has more algal blooming factors, such as tributaries, water intakes, and dam discharge, which are difficult to control and manage.
Table 3

Statistical analysis of ELM1 model results at four artificial weirs.

ELM1 Model | Data set | Gangjeong-Goryeong Weir | Dalseong Weir | Hapcheon-Changnyeong Weir | Changnyeong-Haman Weir
R2 | Training | 0.61 | 0.55 | 0.38 | 0.29
R2 | Testing | 0.47 | 0.44 | 0.41 | 0.36
RMSE | Training | 8.6 | 12.6 | 15.3 | 16.6
RMSE | Testing | 14.5 | 13.5 | 13.1 | 12.4
AIC | Training | 371.2 | 444.6 | 461.3 | 469.0
AIC | Testing | 452.2 | 455.8 | 436.1 | 421.9

Figure 7

Training and testing results from the ELM1 model for chlorophyll-a prediction. (a) Training results and (b) testing results.

To improve the accuracy of the chlorophyll-a concentration prediction model, the upstream chlorophyll-a concentration was used as an independent variable in the ELM2 model. The ELM2 (ELM1) model showed better performance with R2 = 0.71 (0.61) for training and 0.45 (0.47) for testing, RMSE = 6.8 (8.6) for training and 13.8 (14.5) for testing, and AIC = 333.8 (371.2) for training and 446.2 (452.2) for testing in Gangjeong-Goryeong weir. The ELM2 (ELM1) model showed better performance with R2 = 0.76 (0.55) for training and 0.45 (0.44) for testing, RMSE = 8.9 (12.6) for training and 13.4 (13.5) for testing, and AIC = 388.1 (444.6) for training and 456.9 (455.8) for testing in Dalseong weir. Table 4 and Figure 8 show the results from the ELM2 model in Gangjeong-Goryeong weir and Dalseong weir.
Table 4

Statistical analysis of ELM2 model at four artificial weirs.

ELM2 Model | Data set | Gangjeong-Goryeong Weir | Dalseong Weir | Hapcheon-Changnyeong Weir | Changnyeong-Haman Weir
R2 | Training | 0.71 | 0.76 | 0.44 | 0.32
R2 | Testing | 0.45 | 0.45 | 0.43 | 0.46
RMSE | Training | 6.8 | 8.9 | 14.6 | 16.3
RMSE | Testing | 13.8 | 13.4 | 13.1 | 11.4
AIC | Training | 333.8 | 388.1 | 455.8 | 468.3
AIC | Testing | 446.2 | 456.9 | 437.5 | 410.5

Figure 8

Training and testing results from the ELM2 model for chlorophyll-a prediction. (a) Training results and (b) testing results.

The ELM2 (ELM1) model showed better performance with R2 = 0.44 (0.38) for training and 0.43 (0.41) for testing, RMSE of 14.6 (15.3) for training and 13.1 (13.1) for testing, and AIC of 455.8 (461.3) for training and 437.5 (436.1) for testing in Hapcheon-Changnyeong weir. The ELM2 (ELM1) model showed better performance with R2 = 0.32 (0.29) for training and 0.46 (0.36) for testing, RMSE = 16.3 (16.6) for training and 11.4 (12.4) for testing, and AIC = 468.3 (469.0) for training and 410.5 (421.9) for testing in Changnyeong-Haman weir. The ELM2 results from both downstream weirs were similar to those of the ELM1 model (Figure 9, Figure 10, Figure 11, Figure 12 and Figure 13). This is because the downstream Nakdong River has more algal blooming factors such as tributaries, water intakes, and dam discharge, which are difficult to control and manage. Thus, these algal blooming factors need to be applied to the ELM2 model for more accurate prediction. On the other hand, upstream chlorophyll-a concentration can be a good indicator to predict algal blooms in upstream weirs. Moreover, we compared ELM2 with the well-known conventional neural network with backpropagation (BP) in Table 5. Here, the learning rate was 0.001 and the number of epochs was 1000. In the case of Gangjeong-Goryeong weir, the RMSE values for the training and testing sets were 9.27 and 15.73, respectively. We also obtained RMSE values of 11.44 and 14.12 for training and testing data in Dalseong weir, respectively. In Hapcheon-Changnyeong weir, the RMSE values for training and testing were 14.69 and 13.43, respectively. We also obtained RMSE values of 16.68 and 11.35 for training and testing in Changnyeong-Haman weir, respectively. We also compared ELM2 with multiple linear regression (LR) in Table 5. In the case of Gangjeong-Goryeong weir, the RMSE values for the training and testing sets were 11.3 and 17.5, respectively. 
We also obtained RMSE values of 15.3 and 20.7 for training and testing data in Dalseong weir, respectively. In Hapcheon-Changnyeong weir, the RMSE values for training and testing are 14.7 and 13.9, respectively. We also obtained RMSE values of 16.9 and 14.0 for training and testing in Changnyeong-Haman weir, respectively.
Figure 9

Performance of the ELM1 and ELM2 models in Gangjeong-Goryeong weir. (a) ELM1 model and (b) ELM2 model.

Figure 10

Performance of the ELM1 and ELM2 models in Dalseong weir. (a) ELM1 model and (b) ELM2 model.

Figure 11

Performance of the ELM1 and ELM2 models in Hapcheon-Changnyeong weir. (a) ELM1 model and (b) ELM2 model.

Figure 12

Performance of the ELM1 and ELM2 models in Changnyeong-Haman weir. (a) ELM1 model and (b) ELM2 model.

Figure 13

Comparison between the ELM1 and ELM2 model results for chlorophyll-a prediction in all four weirs. (a) Training results and (b) testing results. GG: Gangjeong-Goryeong weir; D: Dalseong weir; HC: Hapcheon-Changnyeong weir; CH: Changnyeong-Haman weir.

Table 5

RMSE results of other methods comparing ELM2 at four artificial weirs.

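Model | RMSE | Gangjeong-Goryeong Weir | Dalseong Weir | Hapcheon-Changnyeong Weir | Changnyeong-Haman Weir
ELM2 | Training | 6.8 | 8.9 | 14.6 | 16.3
ELM2 | Testing | 13.8 | 13.4 | 13.1 | 11.4
Multiple LR | Training | 11.3 | 15.3 | 14.7 | 16.9
Multiple LR | Testing | 17.5 | 20.7 | 13.9 | 14.0
NN with BP | Training | 9.3 | 11.4 | 14.7 | 16.7
NN with BP | Testing | 15.7 | 14.1 | 13.4 | 11.4
ANFIS-FCM (r = 2) | Training | 7.8 | 9.3 | 13.3 | 14.2
ANFIS-FCM (r = 2) | Testing | 16.7 | 13.2 | 15.1 | 13.0
ANFIS-FCM (r = 3) | Training | 6.7 | 8.9 | 12.9 | 12.2
ANFIS-FCM (r = 3) | Testing | 29.9 | 16.8 | 15.2 | 14.6
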
Furthermore, we compared ELM2 with the adaptive neuro-fuzzy inference system (ANFIS), which is frequently used for regression and prediction problems. The effectiveness of ANFIS has been demonstrated in real-world applications [32,33,34,35,36,37]. ANFIS is also known as the most representative neuro-fuzzy model [38]. Here, fuzzy c-means (FCM) clustering is used to determine the fuzzy if-then rules in the design of ANFIS. This ANFIS-FCM program is available in the Fuzzy Logic Toolbox of MATLAB R2018a [38]. As listed in Table 5, the experiments were performed as the number of fuzzy rules (r) increased. ANFIS with four or more fuzzy rules was excluded due to overfitting, because the number of parameters exceeds the number of training data. ANFIS-FCM was built in one pass based on LSE, without iterative learning. The results clearly showed that the generalization capability of ELM2 outperformed that of ANFIS-FCM. Water residence time was also applied as an independent parameter to improve the model performance, and we compared these results with the ELM2 model. The Gangjeong-Goryeong weir results show an RMSE value of 6.2 for training and 14.4 for testing, and the Dalseong results show an RMSE value of 9.6 for training and 13.3 for testing. The Hapcheon-Changnyeong weir results show an RMSE value of 15.3 for training and 13.0 for testing, and the Changnyeong-Haman weir results show 16.2 for training and 11.7 for testing. The performance results including water residence time were similar to those of the ELM2 model. Comparing ELM2 with ELM1, the prediction results for Gangjeong-Goryeong weir show that R2 improved by 16.4% for training and −4.3% for testing, and RMSE improved by 20.9% for training and 4.8% for testing. The prediction results for Dalseong weir show that R2 improved by 38.2% for training and 2.3% for testing, and RMSE improved by 29.4% for training and 0.7% for testing. 
The prediction results for Hapcheon-Changnyeong weir show that R2 improved by 15.8% for training and 4.9% for testing, and RMSE improved by 4.6% for training and 0.0% for testing. The prediction results for Changnyeong-Haman weir show that R2 improved by 10.3% for training and 27.8% for testing, and RMSE improved by 1.8% for training and 8.1% for testing. Figure 13 shows a performance comparison between the ELM1 and ELM2 model results in all four weirs.
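The percentage changes quoted above can be recomputed from the values in Tables 3 and 4. The sketch below assumes improvement is measured relative to the ELM1 value, with the sign flipped for RMSE because lower is better; that convention is inferred from the numbers rather than stated in the source, and the function name is hypothetical.

```python
def improvement_pct(elm1, elm2, higher_is_better):
    """Percentage improvement of ELM2 over ELM1, relative to the ELM1 value."""
    change = (elm2 - elm1) if higher_is_better else (elm1 - elm2)
    return 100.0 * change / elm1

# Gangjeong-Goryeong weir, from Tables 3 and 4: (ELM1, ELM2, higher_is_better).
gangjeong_goryeong = {
    "R2 training": (0.61, 0.71, True),
    "R2 testing": (0.47, 0.45, True),
    "RMSE training": (8.6, 6.8, False),
    "RMSE testing": (14.5, 13.8, False),
}

for name, (elm1, elm2, higher_is_better) in gangjeong_goryeong.items():
    print(name, round(improvement_pct(elm1, elm2, higher_is_better), 1))
```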

4.2. ELM Performance Discussion

As shown in Figures 9–13 and Table 5, the experimental results indicate that ELM showed good performance and generalization capability in comparison to multiple LR, NN with BP, and ANFIS-FCM. The features of ELM can be summarized as follows: ELM consists of a simple, tuning-free, three-step algorithm. The learning speed of ELM is extremely fast. The hidden node parameters are independent of the training data. Although hidden nodes are important, they need not be tuned. ELM can generate the hidden node parameters before seeing the training data. ELM can be effectively applied to most real-world problems such as compression, feature learning, clustering, regression, and classification. In general, the superiority of ELM over the conventional neural network has been demonstrated through real-world applications in the previous literature [8,9,39,40,41]. On the other hand, the ANFIS has been successfully applied to several water quality and algal bloom problems [32,33,34,35,36,37]. Following experimental methods similar to those in the previous literature, we examined the effect of learning in the design of ANFIS-FCM. Figure 14 shows the error curves for 100 epochs of training in the design of ANFIS-FCM (r = 2). Learning for ANFIS-FCM is performed by a forward and a backward pass in the hybrid learning procedure based on BP and LSE. As shown in Figure 14, the training error decreased for all four weirs, but the test error increased. The test error is usually taken as the true measure of performance, so the best model is the one with the minimal test RMSE. Although further training decreased the training error, it degraded the performance of the ANFIS-FCM on unseen inputs. That is, for this algal bloom data, training the ANFIS-FCM beyond the first epoch brings no benefit.
For this reason, the ANFIS-FCM in Table 5 was designed in one pass without learning, in the same manner as ELM2. These results lead us to the conclusion that the ELM2 design for algal bloom prediction is an innovative approach to constructing a computationally intelligent model, as shown by the performance comparison with representative models from statistics, neural networks, and neuro-fuzzy systems.
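The three-step ELM procedure summarized above (random hidden parameters, hidden-layer output, least-squares output weights) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sigmoid activation, the normal distribution for the random weights, the use of NumPy's pseudoinverse for the LSE step, and the names `elm_fit` and `elm_predict` are all assumptions.

```python
import numpy as np


def _hidden_output(X, W, b):
    """Sigmoid activations of the hidden layer (activation is an assumption)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def elm_fit(X, y, n_hidden=20, seed=0):
    """Fit a single-hidden-layer ELM for regression.

    X: array of shape (n_samples, n_features); y: array of shape (n_samples,).
    """
    rng = np.random.default_rng(seed)
    # Step 1: draw input weights and biases at random; they are never updated
    # and do not depend on the training data.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    # Step 2: compute the hidden-layer output matrix H.
    H = _hidden_output(X, W, b)
    # Step 3: solve for the output weights by least squares (pseudoinverse).
    beta = np.linalg.pinv(H) @ y
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Predict with a fitted ELM using the stored random hidden parameters."""
    return _hidden_output(X, W, b) @ beta
```

Because only the output weights are solved for, and in closed form, there is no iterative training loop, which is the reason for the fast learning speed noted above. The number of hidden nodes (`n_hidden` here) remains a choice to validate against test error.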
Figure 14

RMSE curves obtained by training of ANFIS-FCM for four weirs (num. of rule = 2). (a) Gangjeong-Goryeong weir; (b) Dalseong weir; (c) Hapcheon-Changnyeong weir; (d) Changnyeong-Haman weir.

5. Conclusions

As chlorophyll-a concentration is the primary indicator of algal blooms, an ELM was applied to predict chlorophyll-a concentration in the Nakdong River, Korea. Water quality and weather data were collected on a weekly basis from January 2013 to December 2016 by the Ministry of Environment and the Korea Meteorological Administration. Parameters in the dataset include air temperature, rainfall, solar radiation, total nitrogen, total phosphorus, N/P ratio, and chlorophyll-a concentration. For each weir, 50% of the dataset was used for training and the other 50% for testing. Prediction of chlorophyll-a concentration in the Gangjeong-Goryeong and Dalseong weirs, which are located upstream, shows good results for training and testing. Prediction in the Hapcheon-Changnyeong and Changnyeong-Haman weirs, which are located downstream, shows worse training and testing results than in the two upstream weirs. The predictive power of the ELM models was thus better in the upstream weirs. To improve the accuracy of the chlorophyll-a concentration prediction model, upstream chlorophyll-a concentration was used as an independent variable in the ELM2 model. Compared to the original ELM1 prediction model, the ELM2 model showed better performance, with higher R2 and lower RMSE values for the training and testing datasets in the upstream weirs. The ELM2 model also showed higher R2 and lower RMSE values for the training and testing datasets in the downstream weirs, although performance there was similar to that of the ELM1 model. This is because the downstream Nakdong River has more diverse algal blooming factors, such as tributaries, water intakes, and dam discharge, which are difficult to control and manage. ELM-based prediction models for chlorophyll-a concentration in the Nakdong River are proposed in this paper.
The purpose of this study is to apply an ELM algorithm to algal bloom prediction in a regulated river with artificial weirs. We included the chlorophyll-a concentration measured at the upstream weir to improve the performance of the algal bloom prediction model. Because the dataset available for the ELM is small, the accuracy of the algal bloom prediction model can be improved in the future by examining the water quality in tributaries and accumulating more data. The two ELM models in this study showed superior prediction power, and including the upstream chlorophyll-a concentration improved prediction accuracy for phytoplankton dynamics in a river with sequential weirs. These results show that the extreme learning machine handles the nonlinearity of algal blooms better than linear regression, neural networks, and neuro-fuzzy systems. Furthermore, these results lead us to the conclusion that the presented ELMs are effective models for monitoring and managing algal blooms in the regulated river. In future research, we will develop algal bloom prediction models for artificial weirs on the Nakdong and Youngsan rivers in Korea using Recurrent Neural Network (RNN) and Deep Neural Network (DNN) methods.
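The data handling described in these conclusions, adding upstream chlorophyll-a as an extra input and splitting each weir's data in half, can be sketched as below. This is a hedged illustration only: the helper names are hypothetical, and the chronological split is an assumption, since the text states the 50/50 proportion but not how the samples were assigned.

```python
def add_upstream_input(rows, upstream_chla):
    """Return new feature rows with upstream chlorophyll-a appended (ELM2-style).

    rows: list of per-week feature lists (e.g. weather and nutrient values).
    upstream_chla: one upstream chlorophyll-a value per week.
    The input rows are left unmodified.
    """
    if len(rows) != len(upstream_chla):
        raise ValueError("need one upstream value per weekly sample")
    return [list(row) + [value] for row, value in zip(rows, upstream_chla)]


def split_half(samples):
    """Split samples 50/50, first half for training (chronological split assumed)."""
    mid = len(samples) // 2
    return samples[:mid], samples[mid:]
```

An ELM1-style model would use `rows` as they are, while an ELM2-style model would use the output of `add_upstream_input`, so that both are compared on the same weeks.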