Long Short-Term Memory-based simulation study of river happiness evaluation - A case study of Jiangsu section of Huaihe River Basin in China.

Tingting Zhu1, Juqin Shen2, Fuhua Sun2.   

Abstract

Real-time prediction of the state of a river and of the degree to which it benefits people is a leading way to achieve human-water harmony. Using the indicator scoring method for evaluation, we treated river evaluation data and results with time-series characteristics as features and labels, and applied the concept of transfer learning to Long Short-Term Memory (LSTM) to build models for six subsystems: water safety, water quality, economic contribution, water ecology, water management and water culture. With these models we conducted a real-time rolling evaluation simulation of river happiness in the Jiangsu section of the Huaihe River Basin in China. The empirical results show that the maximum Root Mean Square Error (RMSE) over the training and test sets of each subsystem is 0.0226 and the lowest coefficient of determination R² is 0.9011, which indicates that the model fits well. Feeding in the relevant data for the watershed in June 2022 gives an evaluation result of 89.77 points. The overall trend is good, but the economic contribution level shows a certain tendency to fall back, and the reasons are analyzed objectively.
© 2022 The Authors.

Keywords:  Long short-term memory; River happiness evaluation simulation; Transfer learning

Year:  2022        PMID: 36119861      PMCID: PMC9479020          DOI: 10.1016/j.heliyon.2022.e10550

Source DB:  PubMed          Journal:  Heliyon        ISSN: 2405-8440


Introduction

On September 18, 2019, General Secretary Xi Jinping held a symposium on ecological protection and high-quality development of the Yellow River Basin in Zhengzhou, China, and called for efforts to “make the Yellow River a happy river for the benefit of the people” (Liu and Cheng, 2020). Since then, river evaluation work built around the “happiness scale” has emerged. A happy river maintains its own health and supports the economic and social development of the watershed. It also embodies the idea of “harmony between human and water” and provides a high level of security and satisfaction for the people in the watershed (Happy River Research Group, 2020). From this meaning we can extract four vital influencing factors: safe operation, continuous supply, ecological health, and human-water harmony (Zuo et al., 2020a). Safe operation means that the structure and function of the river itself are sound: a happy river has unobstructed water and sediment channels and is protected against flood and drought (Dupuits et al., 2019; Hubble, 2010). Continuous supply means that the river can provide sufficient, high-quality water resources for residential life, industry, agriculture, etc. (Gumbo and Kapangaziwiri, 2021). Ecological health encompasses the health of both the water ecology and the water environment: the river should have good-quality water and sediment, and it should have high biodiversity (Wolfram et al., 2021). Human-water harmony means that people’s development and protection of rivers can be synergistic (Zuo et al., 2020b). These factors shape the evaluation of a happy river. Happy River Evaluation is the process by which water practitioners score a particular river through a series of procedures, which generally include identifying critical influencing factors, selecting evaluation indicators and selecting evaluation methods (Chen et al., 2022a).
Scholars have conducted preliminary studies on the evaluation indicators of happy rivers, and the evaluation target layer has changed as the research developed. Initially, the happiness of rivers was evaluated in terms of their natural attributes, human and social attributes, and the degree of human-water harmony (Han and Xia, 2020). As the research progressed, academics proposed a more comprehensive set of goals: “flood prevention and security, quality water resources, healthy water ecology, livable water environment, and advanced water culture” (Jin et al., 2022; Sally, 2021; Chen et al., 2022b; Xia et al., 2022). This target system, however, contains no expectations for water management. Scholars have also conducted preliminary studies on the evaluation methods of happy rivers. At present, these methods can be divided into two types according to their intrinsic nature: subjective methods and combined subjective-objective methods. The main idea of subjective evaluation is to analyze the influencing factors, form a complete evaluation system, and use different weighting and evaluation methods to produce a score. The models used in existing studies include the entropy-weighted physical element model, cloud model, fuzzy evaluation method, and improved grey TOPSIS model (Wang et al., 2021a, Wang et al., 2021b; Huang et al., 2021; Han and Xia, 2020; Chen, 2021). These models have two apparent shortcomings: they are highly subjective, and their evaluation process is cumbersome. As a result, scholars have begun to explore evaluation methods that combine subjectivity and objectivity. The idea is to train a neural network on existing evaluation data, treating it as a black box, so that it produces predictions that approximate the actual values (Qiaozhen et al., 2019).
Currently, the mainstream neural network models include Radial Basis Function (RBF), Back Propagation (BP), Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). BP, RBF, and LSTM have been repeatedly applied to evaluate water ecological health and water quality (Tong et al., 2022; Cui, 2012; Shi et al., 2021). To overcome the shortcomings of the studies above, we added the evaluation index of “water management” and applied LSTM to the evaluation of Happy River. “Efficient water management” means that the management and services related to the river are both in place and efficient. Key influencing factors include the professional quality of water staff, the degree of information management of water work, etc. Adding water management expectations enriches the meaning of Happy River. We chose LSTM as the evaluation method because it handles time series data better than BP and RBF (Smagulova and James, 2019), and this evaluation is precisely a time series data processing task. As mentioned previously, LSTM has been used several times to evaluate water quality with good fitting results (Zhou et al., 2021). Compared with water quality evaluation, the Happy River evaluation only replaces the evaluation indices. Therefore, we migrated and retrained the model, adjusting the hyperparameters of the LSTM so that the model’s fit remains good. In summary, the innovations of this paper are as follows: we have enriched the evaluation index system of Happy River, and we have used LSTM to make the evaluation of Happy River more objective and efficient.

Research methodology and data sources

Research methodology

Weighting methods

This research combines the expert scoring method, the entropy method and CRITIC for comprehensive weighting. Expert scoring calculates weights from experts' assessment of the importance of indicators; it is a subjective weighting method (Chen et al., 2018). The entropy method calculates weights from the information entropy of the data, i.e., the amount of information. It is applicable when the data fluctuate, and it treats those fluctuations as a kind of information (Dash and Kalamdhad, 2021). CRITIC calculates weights using the correlations between data (Zhu and Chang, 2020; Yjc and Dza, 2021). First, we believe that river evaluation is highly specialized and that the water resources field is prone to information asymmetry, so professional advice from the water sector helps the evaluation itself. Therefore, the index assignment table provided by experts from the China Institute of Water Resources and Hydropower Research was selected for this study. Second, we combined the entropy method and CRITIC for objective weighting to reduce the subjective arbitrariness of the evaluation. The CRITIC method does not measure the degree of dispersion between indicators, while the entropy method tends to ignore the correlations that may exist among the indicators, so the two complement each other. We believe that such a combination takes into account both the variability of the data of each indicator and the correlation between the data (Fu and Chu, 2020).
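As an illustration of the two objective weighting methods, the following minimal sketch computes entropy weights and CRITIC weights for a small, made-up indicator matrix. It uses the commonly cited textbook forms of both methods (CRITIC here multiplies each indicator's standard deviation by its summed conflict, 1 minus the correlation, with the other indicators). The function names, the sample data and the simple averaging of the two weight vectors are assumptions for illustration only; the paper does not state how the two sets of weights were combined.

```python
import math
from statistics import mean, pstdev


def min_max(column):
    """Rescale one indicator column to [0, 1] (benefit-type indicator assumed)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]


def entropy_weights(columns):
    """Entropy method: more dispersed indicators receive larger weights."""
    m = len(columns[0])  # number of samples (e.g. months)
    divergence = []
    for col in columns:
        total = sum(col)
        if total == 0:
            divergence.append(0.0)  # a constant column carries no information
            continue
        probs = [v / total for v in col]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(m)
        divergence.append(1.0 - entropy)
    total_div = sum(divergence)
    return [d / total_div for d in divergence]


def pearson(a, b):
    """Pearson correlation of two equally long columns (0.0 if one is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def critic_weights(columns):
    """CRITIC: weight = standard deviation * summed conflict with other indicators."""
    info = []
    for col in columns:
        conflict = sum(1.0 - pearson(col, other) for other in columns)
        info.append(pstdev(col) * conflict)
    total_info = sum(info)
    return [c / total_info for c in info]


# Hypothetical data: 3 indicators (one list each) observed over 4 months.
raw = [
    [80.0, 85.0, 90.0, 95.0],
    [60.0, 62.0, 61.0, 90.0],
    [70.0, 50.0, 65.0, 55.0],
]
normalized = [min_max(col) for col in raw]
w_entropy = entropy_weights(normalized)
w_critic = critic_weights(normalized)
# Assumed combination rule: plain average of the two objective weight vectors.
w_objective = [(a + b) / 2 for a, b in zip(w_entropy, w_critic)]
print(w_entropy, w_critic, w_objective)
```

Each weight vector sums to 1, so the averaged vector does as well; a different combination rule (for example a multiplicative one) would need renormalization.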

Transfer learning

Transfer learning is the process of taking a model learned in an old domain and applying it to a new domain based on similarities between data, tasks, or models (Panigrahi et al., 2021). Transfer learning can be classified according to four criteria: the presence or absence of labels in the target domain, learning methods, features, and offline versus online forms (Chen, 2019), as shown in Figure 1. The transfer learning used in this study is model-based migration, which achieves migration by finding the parameter information shared between the source and target domains. This form of migration requires the assumption that the data in the source domain and the data in the target domain can share some parameters of the model (Fernandes and Cardoso, 2017; Kaya et al., 2019; Bayoudh et al., 2020). Fine-tuning, the first method developed for deep network migration, is the one used in this paper. Existing studies have shown that (1) deep migration networks are more effective than random initialization of weights; (2) deep migration networks have advantages in suppressing data variability; (3) migrating network layers can accelerate network learning and optimization; and (4) the first three layers of a neural network learn general features, which makes them more effective for migration (Yosinski et al., 2014).
Figure 1

Schematic diagram of transfer learning classification.


Long Short-Term Memory

Recurrent neural networks (RNNs) are a powerful class of neural network models for processing and predicting sequential data (Yang et al., 2018). Long Short-Term Memory (LSTM) operates on a similar principle to RNNs. However, because the structure of the LSTM black box (the internal unit of the LSTM) is richer and more detailed, it has more powerful information storage and prediction capabilities. It is a model that overcomes the inherent flaws of RNNs, namely vanishing and exploding gradients (Bengio, 2002). Due to these properties, many scholars use LSTM for research related to sequence data, such as behaviour simulation, image recognition, medical diagnosis, etc. (Yin et al., 2016; Zhang et al., 2020a, Zhang et al., 2020b; Xia et al., 2018). The LSTM architecture used in this study is from Graves and Schmidhuber (2005). Its basic structure is a chain. As shown in Figure 2, the chain contains several black boxes A. Let us take the second black box in the figure as an example. The black box A forms a nonlinear mapping between the input value x_t and the output value h_t at time t. The black box does not give us a mathematical expression relating the input and output values, but if we give it a large number of input and output values, it is trained into a neural network with high accuracy. When we then enter a new value, the black box returns an output value that is very close to the actual value. This is how the black box, or LSTM, works. In LSTM, the black box A also keeps the information at moment t (C_t and h_t) and transmits it to moment t + 1, where it modifies how the input values at moment t + 1 are processed. Therefore, the most remarkable feature of LSTM is its powerful time series data processing capability.
Figure 2

Diagram of LSTM structure.

In the following, we expand the black box A, as shown in Figure 3. The boxes labelled “σ” are the gates of the LSTM cell: f_t is the forget gate, i_t is the update gate, and o_t is the output gate. Each gate can be regarded as a fully connected layer, and the gates help the LSTM store and update information (Houdt et al., 2020). Specifically, gating is implemented by Sigmoid functions and element-wise multiplication, and gating does not provide additional information. In addition, Figure 3 shows that the LSTM black box has three input values and three output values. The three input values are the cell state at moment t − 1 (C_{t−1}), the hidden state at moment t − 1 (h_{t−1}) and the sample vector at moment t (x_t). The three output values are two copies of the hidden state at moment t (h_t) and one cell state at moment t (C_t). The output values at moment t − 1 are the input values at moment t. Therefore, we only need to explore how the output values are formed.
Figure 3

Diagram of LSTM internal structure.

The general form of the gating control is shown in Eq. (1), where σ(x) = 1/(1 + exp(−x)) is called the Sigmoid function. The Sigmoid function is often used as an activation function for LSTM because of two main properties. First, it is easy to differentiate, which facilitates the subsequent use of gradient-based parameter optimization algorithms. Second, it can control the passage rate of information: when the output value is 0, the gate passes no information, and when the output value is 1, the gate passes all the information. In addition, x and h are the input and output values of the black box, respectively, and w and b are the weight matrix and bias term, respectively. As shown in Figure 3, the value of a gate at time t depends on x_t and h_{t−1}. Hochreiter defines the general expression for gating: multiplying x_t and h_{t−1} by the corresponding weight matrices, adding the bias term, and applying the activation function generates the gate unit (Yu et al., 2019). The initial weights and bias terms are random, but this does not affect the final training accuracy of the neural network, because the LSTM is trained by supervised learning: the weights and biases are updated through repeated forward and backward propagation, and we can keep adjusting the hyperparameters by trial and error until the LSTM reaches high accuracy. After obtaining the general expression for gating, we can obtain the specific expressions for the three gates. The expression of the forget gate is shown in Eq. (2), that of the update gate in Eq. (3), and that of the output gate in Eq. (4).
As shown in Figure 3, C_t, the memory cell at moment t, is the sum of information from two sources. On the one hand, there is the element-by-element multiplication of f_t and C_{t−1}. On the other hand, there is the element-by-element multiplication of i_t and the candidate state C′_t. Thus, C_t is essentially the sum of the information at moment t − 1 (after forgetting) and the information at moment t (after updating). The expression of C_t is shown in Eq. (5), where ⊙ denotes element-wise multiplication and “tanh” is the hyperbolic tangent function, which, like the Sigmoid function, is widely used in LSTM. Finally, we give the expression for the last output value at moment t (the hidden state h_t). As shown in Figure 3, h_t results from passing C_t through the tanh activation function and multiplying the result element by element with o_t. The formula for h_t is shown in Eq. (6).
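Eqs. (1) to (6) are referenced in the text but are not reproduced in this version. For orientation, the standard LSTM formulation that matches the verbal description (a Sigmoid gate over the current input and the previous hidden state, a cell state formed from a forgotten part and an updated part, and a hidden state gated by the output gate) can be written as below. The symbols W, U and b with gate subscripts, the generic gate g_t, and the explicit form of the candidate state C'_t are assumed notation and may differ from the authors' own.

```latex
\begin{align}
g_t  &= \sigma\left(W_g x_t + U_g h_{t-1} + b_g\right) \tag{1} \\
f_t  &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \tag{2} \\
i_t  &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \tag{3} \\
o_t  &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \tag{4} \\
C_t  &= f_t \odot C_{t-1} + i_t \odot C'_t,
        \qquad C'_t = \tanh\left(W_C x_t + U_C h_{t-1} + b_C\right) \tag{5} \\
h_t  &= o_t \odot \tanh\left(C_t\right) \tag{6}
\end{align}
```

Because every gate value lies in (0, 1), Eq. (5) scales the previous cell state and the candidate state element by element rather than adding new information through the gates themselves.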

Data sources

The data required to construct the LSTM evaluation simulation model are the samples and the labels: the samples are the original data of the river happiness evaluation index system, and the labels are the happiness scores. The degree of river happiness is a collective name for the degree of health of the river itself, the degree to which it supports high-quality economic and social development of the basin, the degree to which it carries cultural soft power, the degree of human-water harmony, etc. Based on a study of the relevant literature, we took the professional recommendations of the “China River and Lake Happiness Index Report 2020” as the main body and combined them with the important speech of General Secretary Jinping Xi (Xi, 2019) to develop three levels: target level, guideline level and indicator level. The resulting evaluation system has six subsystems, “excellent water security”, “quality water resources”, “positive water economy”, “harmonious water ecology”, “efficient water management” and “advanced water culture”, with a total of 34 indicators. The excellent water security subsystem indicates that the river can withstand flood and drought hazards. Key influencing factors include the extent of flooding in the basin, flood recovery efficiency, and flood prevention efficiency. The quality water resources subsystem means that the river has excellent and stable water quality. Key influencing factors include surface water and groundwater quality conditions, etc. The positive water economy subsystem indicates that the river can satisfy agricultural, industrial, and domestic water use with high water use efficiency. Key influencing factors include the degree of water resource development and utilization, water supply security, etc. The harmonious water ecology subsystem indicates that the river ecosystem is stable and rich in biodiversity. Key influencing factors include the degree of natural habitat retention, the degree of soil and water conservation, and the degree of biodiversity. The efficient water management subsystem indicates that the management and services associated with the river are both in place and efficient. Key influencing factors include the professional quality of water staff, the degree of information management of water work, etc. The advanced water culture subsystem indicates that the transmission and innovation of river-related culture are both in place and effective. Key influencing factors include the impact of the water landscape, public awareness of water conservation, etc. As shown in Table 1, the above data constitute the sample set. The label set is the comprehensive score obtained with the indicator scoring method, using three weighting methods (expert scoring, the entropy method and CRITIC) chosen for professionalism, data differentiation and data relevance, respectively.
Table 1

Evaluation indicators of happy river.

Target Level | Guideline Level | Primary Indicator Level | Secondary Indicator Level
Assessment Of Watershed Happiness A | Excellent Water Security B1 | Flooding Human Mortality Rate C1 |
 | | Flooding Economic Loss Rate C2 |
 | | Rate Of Flood Prevention Projects Meeting Standards C3 | Dike Flood Control Project Standard Attainment Rate D1
 | | | Reservoir Flood Control Project Standard Attainment Rate D2
 | | | Sluice Gates Flood Control Project Standard Attainment Rate D3
 | | Flood Resilience C4 |
 | Quality Water Resources B2 | River And Lake Water Quality Index C5 |
 | | Qualified Rate Of Centralized Drinking Water Sources For Surface Water C6 |
 | | Groundwater Resources Protection Index C7 |
 | Positive Water Economy B3 | Water Resources Development Utilization Rate C8 |
 | | Water Security Rate C9 | Urban And Rural Water Supply Penetration Rate D4
 | | | Proportion Of Actual Irrigated Area D5
 | | | Water Withdrawal of 10,000 RMB of Industrial Added Value D6
 | | The Ability of Water Resources to Support High-quality Development C10 | GDP Output Per Unit Of Water D7
 | | | Water Use Elasticity Coefficient D8
 | | Resident Well-being Index C11 | GDP Per Capita D9
 | | | Engel Coefficient D10
 | | | Average Life Expectancy D11
 | Harmonious Water Ecology B4 | Retention Rate Of Natural Habitats In Rivers And Lakes C12 | Water Area Retention Rate D12
 | | | Percentage Of River Vertical Connectivity Above Medium Level D13
 | | Rate Of Ecological Flow Of Important Rivers And Lakes Meeting Standards C13 |
 | | Soil And Water Conservation Rate C14 |
 | | Aquatic Biodiversity Index C15 |
 | | Urban And Rural Residents Pro-Water Index C16 |
 | Efficient Water Management B5 | Percentage Of Middle And Senior Workers In The Water Sector C17 |
 | | Water Education Base Opening Rate C18 |
 | | Water Resources Management Information System Construction C19 |
 | | Water Institutional Reform C20 |
 | Advanced Water Culture B6 | Historical Water Culture Protection And Inheritance Index C21 | Historic Water Cultural Heritage Preservation Index D14
 | | | Historical Water Culture Dissemination Power D15
 | | Modern Water Culture Creation Innovation Index C22 |
 | | Water Landscape Impact Index C23 |
 | | Public Water Governance Awareness Participation C24 | Public Water Awareness Penetration Rate D16
 | | | Public Participation In Water Governance D17

The index calculation method and assignment method are shown in Table 2.

Table 2

Schematic table of the calculation and scoring methods of the main evaluation indicators.

Index | Calculation method | Assignment method
C1 | C1′ = the average of the monthly flooding mortality rate in the last twelve months within the basin, where the monthly flooding mortality rate = the total flooding death and disappearance population in that month (unit: person)/the total population in the basin in that month (unit: million people) ∗ 100% | C1′ = 0, C1 = 100. C1′ ≥ 0.42 persons per million, C1 = 0. Other cases are assigned points by linear interpolation
C2 | C2′ = the average monthly flood economic loss rate in the last twelve months within the basin, where the monthly flood economic loss rate = direct economic loss due to flood in that month (unit: million yuan)/GDP in that month within the basin (unit: million yuan) ∗ 100% | C2′ = 0%, C2 = 100. C2′ ≥ 1.5%, C2 = 0. Other cases are assigned by linear interpolation
D1 | D1′ = the length of dikes that meet the standard (unit: km)/the total length of planned dikes (unit: km) ∗ 100% | D1 = D1′ ∗ 100
D2 | D2′ = the number of reservoirs that can play a usual role in flood control according to the design/the total number of reservoirs with flood control function ∗ 100%, where reservoirs are counted separately as large and medium-sized or small, with weights of 0.6 and 0.4 respectively | D2 = D2′ ∗ 100
D3 | D3′ = number of sluice gates that can play a usual role in flood control according to the design/total number of sluice gates with flood control function planned ∗ 100% | D3 = D3′ ∗ 100
C4 | The expert experience scoring method is used to evaluate four parameters: economic strength of the basin, development level, rescue and relief capacity, and post-disaster recovery action power | The total score of the 4 parameters is 100 points. Based on the expert experience scoring method, the score of post-flood recovery capacity is calculated as a weighted average, with weights of 0.3, 0.2, 0.25 and 0.25 for the four parameters
C5 | The paper conducts this evaluation based on the proportion of I-III river lengths and the proportion of poor V river lengths. The proportion of I-III river lengths is the proportion of the length of rivers with water quality categories better than or equal to III to the length of the evaluated rivers. The proportion of poor V river length is the proportion of the length of rivers with the water quality category of poor V to the length of the evaluated rivers | The table of river water quality indicators uses the relevant provisions of the Technical Regulations for Surface Water Resources Quality Evaluation (SL395-2007)
C6 | C6′ = the number of qualified surface water centralized drinking water sources/total number of surface water centralized drinking water sources ∗ 100% | C6 = C6′ ∗ 100
C7 | C7′ = total regional shallow groundwater extraction/regional groundwater extractable volume | C7′ ≤ 0.3, C7 = 100. C7 is reduced by 10 points for each 0.1 increase in C7′. C7′ ≥ 1.3, C7 = 0
C8 | C8′ = water supply volume/total water resources ∗ 100%, where the water supply volume does not include the net transfer of water (transfer in - transfer out) and the water supply volume of other water resources | C8′ ≤ 40%, C8 = 100. C8′ ≤ 50%, C8 = 80. C8′ ≤ 67%, C8 = 60. C8′ ≤ 75%, C8 = 40. C8′ ≤ 90%, C8 = 20. C8′ > 90%, C8 = 0. The assignment criteria table is based on the “Technical Guidelines for River and Lake Health Assessment” (SL/T793-2020)
D4 | D4′ = (urban water supply penetration rate ∗ urban population + county water supply penetration rate ∗ county population + formed town water supply penetration rate ∗ formed town population + rural tap water penetration rate ∗ rural population)/total basin population ∗ 100% | D4 = D4′ ∗ 100
D5 | D5′ = actual irrigated area of arable land/irrigated area ∗ 100% | D5 = D5′ ∗ 100
D6 | D6′ = industrial water consumption (unit: billion cubic meters)/industrial added value (unit: million yuan) ∗ 100% | D6 = D6′ ∗ 100
D7 | D7′ = 10,000/10,000 Yuan GDP water consumption | D7 = D7′/baseline value ∗ 100; if D7 ≥ 100, count 100. The baseline value is taken as the median water consumption level of high-income countries, US$130m3, which translates into a GDP output of 531 yuan per cubic meter of water (in RMB)
D8 | D8′ = average monthly growth rate of water consumption/average monthly growth rate of GDP (less than 1, 100 points, 1–2, 80 points, etc.) | D8′ ≤ 1, D8 = 100. D8 is reduced by 10 points for every increase of 1 in D8′
D9 | D9′ = basin GDP/basin population. The arithmetic mean of annual data was used for monthly data | D9 = D9′/benchmark value ∗ 100; if D9 ≥ 100, count 100. The benchmark value is taken from the lower level of high-income countries, i.e. US$20,000, with an exchange rate of 689.85 RMB/US$100
D10 | D10′ = Σ(ENCi ∗ CAPi)/ΣCAPi, where ENCi is the Engel coefficient of municipality i in the basin and CAPi is the population of municipality i in the basin | D10 = benchmark value/D10′ ∗ 100; if D10 ≥ 100, count 100. The benchmark value is taken as the middle level of the UN's affluence standard (20%–30%), i.e. 25%
D11 | D11′ = Σ(ALEi ∗ CAPi)/ΣCAPi, where ALEi is the average life expectancy of municipality i in the basin and CAPi is the population of municipality i in the basin | D11 = D11′/baseline value ∗ 100; if D11 ≥ 100, count 100. The baseline value is taken as the median of 81 years in high-income countries
D12 | D12′ = area of watershed space (rivers, lakes, reservoirs, beaches, mudflats, swamps) (unit: km²)/area of watershed space in 1980s (unit: km²) | D12 = D12′ ∗ 100
D13 | D13′ = (Σ_{i=1..n} ai ∗ bi)/Lj ∗ 100, with bi = (bLi + bQi)/2, bLi = [(Lai/Lj)(Lbi/Lj)/(((Lai/Lj) + (Lbi/Lj))/2)]^α and bQi = (Qi/Qj)^β. Here D13′ is the longitudinal connectivity index of the river segment; ai is the barrier coefficient corresponding to the barrage of the ith type; bi is the location correction factor of the barrier of the ith type; bLi is the location correction factor characterizing the influence of the location of the barrier on the longitudinal connectivity of the river at this level; bQi is the influence of the location of the barrier on the connectivity between the river segment and the confluent main stream (estuary); Lai is the distance of the barrier from the source of the river; Lbi is the distance of the barrier from the estuary (or confluence into the main stream); Qi is the multi-year average natural runoff at the barrier; Qj is the multi-year average natural runoff at the estuary (or confluence into the main stream); and α, β are the standardization coefficients, taking the values of 0.78 and 0.5 respectively. D13″ = (Σ_{j=1..n} D13′j ∗ Lj)/(Σ_{j=1..n} Lj), where D13″ is the vertical connectivity index of the major rivers in the primary zone, n is the number of rivers in the region with an area greater than 10,000 km², and Lj is the length of the jth river | According to the existing results of the national water ecology protection and restoration plan for major rivers and lakes, the national water resources protection plan, etc., combined with the actual basin, the standardization method of the vertical connectivity index of major rivers is determined: D13 = (1 − D13″/2.5) ∗ 100; when D13″ > 2.5, D13 = 0
C13 | C13′ = number of control sections (points) that meet the ecological flow target/number of evaluation sections (points) ∗ 100% | C13 = C13′ ∗ 100
C14 | C14′ = area with soil erosion intensity below mild/area of evaluation area ∗ 100% | C14 = C14′/soil and water conservation rate threshold ∗ 100
C15 | C15′ = the diversity indices of aquatic organisms (benthos, algae, phytoplankton, zooplankton) in the basin for the month |
C16 | C16′ = number of National Scenic Water Conservancy Areas in the basin (unit: one)/basin area (unit: 100,000 km²) | C16′ = 0, C16 = 0; C16′ ∈ (0,1], C16 = 20; C16′ ∈ (1,5], C16 = 40; C16′ ∈ (5,10], C16 = 60; C16′ ∈ (10,20], C16 = 80; C16′ ∈ (20,+∞), C16 = 100
C17 | C17′ = number of senior workers in local water conservancy sector (unit: person)/total number of workers in water conservancy sector (unit: person) | C17 = C17′ ∗ 100
C18 | C18′ = number of national water education bases within the basin (unit: one)/total number of national water education bases (unit: one) | C18 = C18′ ∗ 100
C19 | | 100 points if water resources management information systems have been established at all levels down to the county; 80 points if a water management information system has been established down to the municipal level; 60 points in other cases
C20 | | 100 points if water system reform has been completed for municipalities, districts and counties; 80 points if water system reform has been completed for districts and counties; 60 points in other cases
D14 | D14′ = (number of world-class heritage ∗ 5 + number of national heritage ∗ 2 + number of provincial heritage) (unit: one)/basin area (unit: 100,000 km²) | D14′ = 0, D14 = 0. D14′ ≥ 10, D14 = 100. Other cases are assigned by linear interpolation
D15 | D15′ = (number of national museums or bases ∗ 2 + number of provincial museums or bases) (unit: one)/watershed area (unit: 100,000 km²) | D15′ = 0, D15 = 0. D15′ ≥ 6, D15 = 100. Other cases are assigned by linear interpolation
C22 | C22′ = [number of national-level current year (scientific research projects with acceptance conditions + scientific research papers + awards + authorized patents) ∗ 2 + number of provincial-level current year (scientific research projects with acceptance conditions + scientific research papers + awards + invention patents)]/basin area (unit: 100,000 km²) | C22′ = 0, C22 = 0. C22′ ≥ 6, C22 = 100. Other cases are assigned by linear interpolation
C23 | C23′ = [number of world-class natural heritage water landscapes ∗ 5 + number of national-level (natural heritage water landscapes + wetland parks + national parks) ∗ 2 + number of provincial-level (natural heritage water landscapes + wetland parks + national parks)]/total resident population in the watershed (unit: million people) | C23′ ≤ 1, C23 = 50. C23′ ≥ 10, C23 = 100. Other cases are assigned by linear interpolation
D16 | Questionnaire survey | Questionnaires are used to analyze the popularity of public awareness of water, respect for water, care for water and water conservation. Each questionnaire has a total score of 100, and the average score is calculated over all questionnaires
D17 | Questionnaire survey | Questionnaires are used for statistical analysis of public participation in activities related to water science, water construction, water supervision, etc. Each questionnaire has a total score of 100, and the average score is calculated over all questionnaires
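Most assignment rules in Table 2 are either a linear interpolation between two anchor points (for example C1, C2, D14, C23) or a step function over thresholds (for example C8). The sketch below shows one way such rules could be coded; the function names are hypothetical, and the anchor values are the ones listed in Table 2 for C1, C8 and C23.

```python
def interpolate_score(value, x0, s0, x1, s1):
    """Linear interpolation between (x0, s0) and (x1, s1), clamped outside [x0, x1]."""
    if value <= x0:
        return float(s0)
    if value >= x1:
        return float(s1)
    return s0 + (s1 - s0) * (value - x0) / (x1 - x0)


def score_c1(mortality_per_million):
    """C1: 100 points at 0 deaths per million, 0 points at 0.42 or more."""
    return interpolate_score(mortality_per_million, 0.0, 100.0, 0.42, 0.0)


def score_c23(landscape_index):
    """C23: 50 points at 1 or less, 100 points at 10 or more."""
    return interpolate_score(landscape_index, 1.0, 50.0, 10.0, 100.0)


def score_c8(utilization_percent):
    """C8: step thresholds on the water resources development utilization rate."""
    thresholds = [(40, 100), (50, 80), (67, 60), (75, 40), (90, 20)]
    for upper_bound, score in thresholds:
        if utilization_percent <= upper_bound:
            return score
    return 0  # above 90 %


print(score_c1(0.21), score_c23(5.5), score_c8(45))
```

Keeping the anchor points as arguments means the same helper covers every "assigned by linear interpolation" row, whether the score rises or falls with the raw value.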

Empirical analysis

Study area

The Jiangsu section of the Huaihe River Basin in China mainly flows through the north-central region of Jiangsu Province. It involves eight prefecture-level cities, namely Xuzhou, Nantong, Lianyungang, Huaian, Yancheng, Yangzhou, Taizhou and Suqian, and is located at 116°22′–121°00′E and 32°23′–35°07′N. As shown in Figure 4, it is connected to the Tong Yang Canal and the Yangtze River basin in the south, reaching the Yiliu hilly mountains, and to the Yellow River basin in the north, reaching the Yimeng Mountains. It is the easternmost section of the Huaihe River Basin. With the abandoned course of the Yellow River as the boundary, it is divided into the Yishushi region and the lower reaches of the Huai River, with the cities of Xuzhou and Lianyungang belonging to the Yishushi region and the remaining six cities belonging to the lower reaches of the Huai River. There are many lakes and rivers in the Jiangsu section of the basin, including Hongze Lake, the Beijing-Hangzhou Grand Canal and the Huai-Shu New River. Among them, the Hongze Lake Wetland is an important freshwater wetland reserve in China with a good environment and a variety of biological and plant resources.
Figure 4

Jiangsu section of Huaihe River Basin, China.


Data processing

Data stationarity analysis

The Augmented Dickey-Fuller (ADF) test fits an autoregressive model to the series, selects the number of lagged terms by optimizing an information criterion, and tests whether the series contains a unit root. The null hypothesis is that the series has a unit root, i.e., that it is non-stationary; the alternative hypothesis is that it is stationary. If the p-value of the test is less than or equal to the threshold (0.05), the null hypothesis is rejected and the series is treated as stationary; if it is higher than the threshold, the null hypothesis cannot be rejected and the series is treated as non-stationary. The ADF statistic is also compared with a critical value, generally the one at the 1% significance level: a statistic below (more negative than) that critical value rejects the null hypothesis. In this study, the test was conducted in Econometric Views (EViews). The p-values of all 34 feature series were less than 0.05, and every ADF statistic was negative and below the 1% critical value, so the null hypothesis was rejected for each series, i.e., the time series are stationary. In summary, the original data are stationary, and the next stage is data cleaning.
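
To make the logic of the test concrete, the sketch below computes a simplified Dickey-Fuller statistic in plain Python. It is not the EViews procedure used in the study: it omits the lagged-difference ("augmented") terms and the information-criterion lag selection, and the threshold of -3.43 is an assumed, approximate large-sample 1% critical value for the constant-only case. The function name and the synthetic series are hypothetical.

```python
import math
import random


def dickey_fuller_stat(series):
    """t-statistic of gamma in  dy_t = alpha + gamma * y_{t-1} + e_t  (no lagged differences)."""
    y_lag = series[:-1]
    dy = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(dy)
    mean_x = sum(y_lag) / n
    mean_d = sum(dy) / n
    sxx = sum((x - mean_x) ** 2 for x in y_lag)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(y_lag, dy))
    gamma = sxy / sxx                      # OLS slope
    alpha = mean_d - gamma * mean_x        # OLS intercept
    ssr = sum((d - alpha - gamma * x) ** 2 for x, d in zip(y_lag, dy))
    se_gamma = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / se_gamma


# Assumed approximate 1% critical value (constant, no trend, large sample).
CRITICAL_1PCT = -3.43

# Synthetic white noise with 120 points, mimicking 120 monthly observations.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(120)]
stat = dickey_fuller_stat(noise)
is_stationary = stat < CRITICAL_1PCT
print(round(stat, 2), is_stationary)
```

A statistic far below the critical value, as white noise produces here, rejects the unit-root null hypothesis. A random walk would typically give a statistic above the critical value.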

Data cleaning

The data cleaning process removes erroneous values and fills the scattered null values and individual missing values in the data. Given the data set and the research objective of this paper, a 5-point moving average is used to fill in the data. The missing value $M_t$ at time t is given by Eq. (7):

$$M_t = \frac{X_{t-1} + X_{t-2} + X_{t-3} + X_{t-4} + X_{t-5}}{5} \tag{7}$$

where $M_t$ is the missing value at moment t, and $X_{t-1}, X_{t-2}, X_{t-3}, X_{t-4}, X_{t-5}$ denote the five data values preceding moment t. In summary, the data cleaning in this study is completed by removing the wrong values and filling the missing values.
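
The filling rule described above, the mean of the five values preceding the gap, can be sketched as follows. The function name `fill_missing` and the use of `None` to mark a missing value are illustrative assumptions. The sketch also assumes at least five observations precede every gap.

```python
def fill_missing(values, window=5):
    """Replace each None with the mean of the `window` values that precede it."""
    filled = list(values)
    for t, value in enumerate(filled):
        if value is None:
            if t < window:
                raise ValueError("not enough preceding values to fill position %d" % t)
            previous = filled[t - window:t]  # earlier gaps are already filled
            filled[t] = sum(previous) / window
    return filled


# Hypothetical series with one missing value, for illustration only.
series = [1.0, 2.0, 3.0, 4.0, 5.0, None, 7.0]
cleaned = fill_missing(series)
print(cleaned)
```

Because the list is processed in time order, a filled value can itself contribute to the average for a later gap.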

Data pre-processing

First, the data are normalized to remove differences in magnitude between features. Second, the processed data are divided into a training set and a test set; in this experiment, 90% of the data are used as the training set and 10% as the test set (Yang et al., 2018).
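
A minimal sketch of these two steps is given below. The text does not name the normalization method, so min-max scaling to [0, 1] is an assumption here. The chronological (unshuffled) split is also an assumption, chosen because the data are a time series. Both function names are hypothetical.

```python
def min_max_normalize(values):
    """Scale a feature to [0, 1] (assumed normalization; the paper does not name one)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def chronological_split(samples, test_fraction=0.1):
    """Keep time order: the last `test_fraction` of the samples becomes the test set."""
    n_test = round(len(samples) * test_fraction)
    cut = len(samples) - n_test
    return samples[:cut], samples[cut:]


# 120 placeholder monthly samples (indices only, not study data).
months = list(range(120))
train, test = chronological_split(months)
print(len(train), len(test))
print(min_max_normalize([2.0, 4.0, 6.0]))
```

With 120 monthly samples, a 90%/10% split gives 108 training samples and 12 test samples.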

Analysis of results

First, we assign indicator weights with three methods: the expert scoring method, the entropy method and the CRITIC method. The results of the weight assignment are shown in Table 3. Table 3 shows that the weights derived from the entropy method and the CRITIC method differ somewhat, but their general trend is the same, since both are calculated from the information contained in the data. The results of the expert scoring method, which is based on the work experience of water experts, differ slightly from these trends. Taking the arithmetic average of the three results gives the full score (the maximum attainable score) of each subsystem. The full scores of the water security, water resources, water economy, water ecology, water management, and water culture subsystems are 23.13, 11.37, 20.07, 20.03, 12.7 and 12.7, respectively, which sum to 100.
Table 3

Schematic table of the results of the three methods of assigning weights.

| Indicator | Expert scoring method | Entropy method | CRITIC |
|---|---|---|---|
| C1 | 0.075 | 0.023 | 0.047 |
| C2 | 0.075 | 0.012 | 0.018 |
| D1 | 0.030 | 0.022 | 0.025 |
| D2 | 0.030 | 0.016 | 0.045 |
| D3 | 0.015 | 0.011 | 0.022 |
| C4 | 0.025 | 0.161 | 0.042 |
| C5 | 0.060 | 0.040 | 0.040 |
| C6 | 0.045 | 0.035 | 0.025 |
| C7 | 0.045 | 0.025 | 0.026 |
| C8 | 0.050 | 0.052 | 0.031 |
| D4 | 0.030 | 0.006 | 0.018 |
| D5 | 0.023 | 0.011 | 0.046 |
| D6 | 0.023 | 0.009 | 0.019 |
| D7 | 0.031 | 0.011 | 0.020 |
| D8 | 0.031 | 0.003 | 0.028 |
| D9 | 0.020 | 0.008 | 0.024 |
| D10 | 0.023 | 0.010 | 0.021 |
| D11 | 0.020 | 0.009 | 0.025 |
| D12 | 0.015 | 0.010 | 0.038 |
| D13 | 0.015 | 0.146 | 0.030 |
| C13 | 0.045 | 0.026 | 0.025 |
| C14 | 0.030 | 0.030 | 0.052 |
| C15 | 0.030 | 0.032 | 0.020 |
| C16 | 0.015 | 0.024 | 0.018 |
| C17 | 0.010 | 0.013 | 0.019 |
| C18 | 0.030 | 0.091 | 0.029 |
| C19 | 0.030 | 0.039 | 0.031 |
| C20 | 0.030 | 0.017 | 0.042 |
| D14 | 0.015 | 0.011 | 0.019 |
| D15 | 0.010 | 0.008 | 0.036 |
| C22 | 0.025 | 0.017 | 0.037 |
| C23 | 0.025 | 0.009 | 0.034 |
| D16 | 0.015 | 0.036 | 0.023 |
| D17 | 0.010 | 0.026 | 0.025 |
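
The averaging step can be checked with a short sketch. The weight triples below are copied from Table 3 in the order (expert scoring, entropy, CRITIC). Treating the last six indicator columns of the table as the water culture subsystem is an assumption, suggested only by the fact that their combined weight reproduces that subsystem's full score of 12.7. The function name is hypothetical.

```python
def combined_weight(weights):
    """Arithmetic mean of the weights assigned to one indicator by the three methods."""
    return sum(weights) / len(weights)


# (expert scoring, entropy, CRITIC) weights for the first three indicators of Table 3.
first_three = {
    "C1": (0.075, 0.023, 0.047),
    "C2": (0.075, 0.012, 0.018),
    "D1": (0.030, 0.022, 0.025),
}
combined = {name: combined_weight(w) for name, w in first_three.items()}

# Last six indicator columns of Table 3 (D14 onward), assumed here to be water culture.
last_six = [
    (0.015, 0.011, 0.019),
    (0.010, 0.008, 0.036),
    (0.025, 0.017, 0.037),
    (0.025, 0.009, 0.034),
    (0.015, 0.036, 0.023),
    (0.010, 0.026, 0.025),
]
culture_full_score = 100 * sum(combined_weight(w) for w in last_six)
print(combined, round(culture_full_score, 2))
```

Because a weighted score is linear in the weights, averaging the three weight sets first and then scoring gives the same result as scoring with each weight set and averaging the three scores.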
Second, we score the river with each of the three sets of weights and take the arithmetic average of the three scores. We use this average as the evaluation result of the river happiness level over the ten years from 2012 to 2021. The scores are shown in Table 4. Table 4 shows that river happiness in the Jiangsu section of the Huaihe River Basin in China was good and improved overall from 2012 to 2021: the total score rose from 76.40 points in 2012 to 87.34 points in 2021, an increase of 14.32%. Over the same period, the scores of the water security, water resources, water economy, water ecology, water management, and water culture subsystems increased by 7.68%, 1.41%, 26.21%, 1.42%, 62.33%, and 14.60%, respectively. In 2021, these subsystems reached 92.78%, 82.23%, 84.70%, 89.07%, 83.46%, and 87.17% of their maximum scores, respectively. The water safety, water ecology, and water culture subsystems currently score high, which indicates that they are in good condition. The water economy and water management subsystems had the highest rates of increase, which indicates that they have made the greatest progress.
Table 4

Happiness score of Jiangsu section of Huaihe river basin, 2012–2022.

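| Year | Water Safety | Water Quality | Economic Contribution | Water Ecology | Water Management | Water Culture | Total Score |
|---|---|---|---|---|---|---|---|
| 2012 | 19.93 | 9.22 | 13.47 | 17.59 | 6.53 | 9.66 | 76.40 |
| 2013 | 20.83 | 9.44 | 13.69 | 17.83 | 6.62 | 9.81 | 78.22 |
| 2014 | 21.79 | 9.11 | 13.96 | 17.86 | 6.69 | 9.88 | 79.29 |
| 2015 | 21.07 | 8.55 | 14.57 | 17.55 | 7.28 | 10.06 | 79.08 |
| 2016 | 21.00 | 9.10 | 15.14 | 17.77 | 9.89 | 10.36 | 83.26 |
| 2017 | 21.00 | 8.60 | 16.27 | 17.81 | 10.07 | 10.40 | 84.14 |
| 2018 | 21.00 | 9.27 | 17.34 | 17.74 | 9.99 | 10.74 | 86.08 |
| 2019 | 20.99 | 9.48 | 16.83 | 17.70 | 10.43 | 10.86 | 86.29 |
| 2020 | 21.55 | 9.32 | 16.67 | 17.96 | 10.57 | 10.95 | 87.03 |
| 2021 | 21.46 | 9.35 | 17.00 | 17.84 | 10.60 | 11.07 | 87.34 |
| 2022 | 21.97 | 9.94 | 16.58 | 18.59 | 10.98 | 11.71 | 89.77 |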
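
The growth rates and score rates quoted for Table 4 follow from two one-line formulas, sketched below with values taken from Table 4 and the water safety full score of 23.13 stated in the text. The function names are hypothetical.

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


def score_rate(score, full_score):
    """Share of the maximum attainable score that was reached, in percent."""
    return score / full_score * 100.0


total_growth = percent_change(76.40, 87.34)          # total score, 2012 -> 2021
safety_rate_2021 = score_rate(21.46, 23.13)          # water safety, 2021
economy_change_2022 = percent_change(17.00, 16.58)   # economic contribution, 2021 -> 2022
print(round(total_growth, 2), round(safety_rate_2021, 2), round(economy_change_2022, 2))
```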
Finally, we use the raw data as features and the scores as labels. We selected 108 samples (monthly data from 2012 to 2020) as the training set and 12 samples (monthly data from 2021) as the test set, and used them to build an LSTM simulation model of the degree of river happiness. In constructing the model, we borrowed parameters from other well-trained models as the initial parameters and then tuned them repeatedly to obtain the best model parameters, which are shown in Table 5. As mentioned earlier, we divided the system into six subsystems (water security, water resources, water economy, water ecology, water management, and water culture) and built an LSTM model for each. The results of the model fitting are shown in Table 6. Across the subsystems, the training sets had a maximum RMSE of 0.0226 and a minimum coefficient of determination R2 of 0.9699; the test sets had a maximum RMSE of 0.0193 and a minimum R2 of 0.9011. These low errors and high R2 values indicate that the models fit well. We use the first subsystem, water security, to demonstrate the fitting effect; the other subsystems can be interpreted in the same way, so we do not describe them in detail. The fit of the water security subsystem is shown in Figures 5 and 6, which show that the model fits well on both the training and test sets. The fits of the other subsystems are shown in Annexes 4–13.
Table 5

Selection of LSTM parameters.

| Parameter | Value |
|---|---|
| numHiddenUnits | 128 |
| miniBatchSize | 64 |
| LearnRateDropPeriod | 250 |
| LearnRateDropFactor | 0.2 |
| MaxEpochs | 500 |
| InitialLearnRate | 0.001 |
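
The three learning-rate parameters in Table 5 together describe a step schedule. The sketch below shows one plausible reading, assuming a piecewise schedule in which the rate is multiplied by LearnRateDropFactor after every LearnRateDropPeriod epochs and epochs are counted from 1. The paper does not state the schedule type, and the function name is hypothetical.

```python
INITIAL_LEARN_RATE = 0.001
LEARN_RATE_DROP_FACTOR = 0.2
LEARN_RATE_DROP_PERIOD = 250
MAX_EPOCHS = 500


def learning_rate(epoch):
    """Learning rate in a given epoch (1-based) under the assumed piecewise schedule."""
    drops = (epoch - 1) // LEARN_RATE_DROP_PERIOD
    return INITIAL_LEARN_RATE * LEARN_RATE_DROP_FACTOR ** drops


# Print the rate at the start, around the drop, and at the last epoch.
for epoch in (1, 250, 251, MAX_EPOCHS):
    print(epoch, learning_rate(epoch))
```

Under this reading, the first 250 epochs train at 0.001 and the remaining 250 at one fifth of that rate.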
Table 6

Schematic table of the fitting effects of the training set and test set for each subsystem.

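| Subsystem | Train RMSE | Train R2 | Test RMSE | Test R2 |
|---|---|---|---|---|
| Water Safety | 0.0051 | 0.9885 | 0.0193 | 0.9884 |
| Water Quality | 0.0163 | 0.9780 | 0.0045 | 0.9011 |
| Economic Contribution | 0.0073 | 0.9868 | 0.0174 | 0.9041 |
| Water Ecology | 0.0107 | 0.9874 | 0.0148 | 0.9723 |
| Water Management | 0.0226 | 0.9898 | 0.0091 | 0.9458 |
| Water Culture | 0.0013 | 0.9699 | 0.0089 | 0.9012 |
| Total RMSE (all subsystems) | 0.0145 | | 0.0113 | |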
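
For reference, the two fit metrics reported in Table 6 can be computed as in the sketch below. It uses the standard definitions of RMSE and of the coefficient of determination. The sample values are made up for illustration and are not model outputs from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up normalized scores, for illustration only.
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
print(rmse(actual, predicted), r_squared(actual, predicted))
```

An RMSE close to 0 and an R2 close to 1 both indicate that the predictions track the labels closely, which is how the values in Table 6 are read.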
Figure 5

Schematic diagram of the fitting effect of the training set of the water safety subsystem.

Figure 6

Schematic diagram of the fitting effect of the test set of the water safety subsystem.

After demonstrating the feasibility of the model, we input the river data for June 2022 and obtained the river happiness scores for the Jiangsu section of the Huaihe River Basin, which are shown in Table 4. The overall score was 89.77, an increase of 2.78% compared to 2021. The scores of the water security, water resources, water ecology, water management, and water culture subsystems increased by 2.38%, 6.31%, 4.20%, 3.58%, and 5.78%, respectively, while the score of the water economy subsystem decreased by 2.47%. In summary, the general trend of happy river development is positive, but the economic contribution level shows a tendency to fall back. The contribution of water to the economy has entered a period of stability, reflected mainly in stable demand, stable costs and stable channels. Shipping, hydropower generation, drinking water supply, river and lake biological supply and other channels have formed relatively mature pricing systems and trading markets. This is the main reason the water economy subsystem is stable and advancing only in small steps at the macro level. However, to contain the COVID-19 epidemic, the state has taken a series of control measures that affect shipping, river and lake biological supply, etc. These measures have caused problems such as increased transportation costs, lower income levels of the population and reduced demand, which account for a large part of the decline in the water economy subsystem score. The relevant literature also confirms this finding (Du et al., 2021; Sun et al., 2021; Zhang et al., 2020b).

Conclusion

We identified two shortcomings after reviewing the existing studies on the evaluation of the Happy River. First, the target layer of the existing indicator systems lacks expectations for water management. To address this, we added indicators such as the opening rate of water education bases, the degree of construction of water resources management information systems, and the proportion of senior staff in the water sector, and used them as the basis for water management evaluation. Second, the existing evaluation methods are rather subjective, and the evaluation process is cumbersome. To address this, we focused on an evaluation method that combines subjectivity and objectivity: neural networks. We chose LSTM as the method for evaluating the Happy River because of its feasibility for water quality evaluation. The empirical results show that, across the training and test sets of all subsystems, the maximum RMSE is 0.0226 and the lowest coefficient of determination R2 is 0.9011, which indicates that the model fits well. Compared with the existing research results, we have enriched the evaluation index system of the Happy River and, by using LSTM, made its evaluation more objective and efficient. This study also has limitations. Because of the small sample size, we divided the whole system into six subsystems for modelling rather than modelling it as a whole. In future research, we will therefore improve the model by adding optimization algorithms such as genetic algorithms and particle swarm algorithms. We expect these algorithms to significantly improve the accuracy of the model fit and thus make it possible to model the whole system.

Declaration

Author contribution statement

Tingting Zhu: Conceived and designed the experiments; Performed the experiments; Wrote the paper. Juqin Shen: Contributed reagents, materials, analysis tools or data. Fuhua Sun: Analyzed and interpreted the data.

Funding statement

This work was supported by the Social Science Foundation of Jiangsu Province [19GLD002], the Fundamental Research Funds for the Central Universities [2018B58814], the Central University Basic Scientific Research Business Expenses Special Funds [2019B69214], and the Water Conservancy Science and Technology Project of Jiangsu Province [2019013].

Data availability statement

The authors do not have permission to share data.

Declaration of interests statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.
