
Application of Entropy Spectral Method for Streamflow Forecasting in Northwest China.

Gengxi Zhang, Zhenghong Zhou, Xiaoling Su, Olusola O. Ayantobo.

Abstract

Streamflow forecasting is vital for reservoir operation, flood control, power generation, river ecological restoration, irrigation and navigation. Although monthly streamflow time series are stochastic, they also exhibit seasonal and periodic patterns. Using maximum Burg entropy spectral analysis (BESA), maximum configurational entropy spectral analysis (CESA) and minimum relative entropy spectral analysis (RESA), forecasting models for monthly streamflow series were constructed for five hydrological stations in northwest China. Average relative error (RE), root mean square error (RMSE), correlation coefficient (R) and determination coefficient (DC) were selected as performance metrics. Results indicated that the RESA model had the highest overall forecasting accuracy, followed by the CESA model. However, the BESA model had the highest forecasting accuracy in the low-flow period, whereas the RESA and CESA models were more accurate in the flood season. In future research, these entropy spectral analysis methods can be applied to other rivers to verify their applicability to monthly streamflow forecasting in China.

Keywords:  burg entropy; configurational entropy; relative entropy; spectral analysis; streamflow forecasting

Year:  2019        PMID: 33266848      PMCID: PMC7514611          DOI: 10.3390/e21020132

Source DB:  PubMed          Journal:  Entropy (Basel)        ISSN: 1099-4300            Impact factor:   2.524


1. Introduction

Accurate streamflow forecasting is vital for flood control, reservoir management, restoration of the river environment, irrigation, and navigation, among other uses [1]. It can also guide policy makers in the utilization and management of water resources and in the formulation of plans to protect water environmental health. The simulation of monthly streamflow is an active research topic in hydrology, but it is still under exploration and development because of the limitations of forecasting methods. Traditional time series models such as autoregressive (AR) or autoregressive moving average (ARMA) models are often used to simulate streamflow, but they cannot address the seasonality that exists in monthly streamflow series [2]. Entropy spectral analysis, coupled with time series analysis, can extract significant information from the streamflow process and forecast monthly streamflow accurately. The spectral method has been used successfully for monthly streamflow forecasting with different types of entropy, including Burg entropy [3], configuration entropy [1,2], and minimum relative entropy [4,5]. Burg [6] proposed Burg entropy theory (BET) in the frequency domain and then developed the maximum Burg entropy spectral analysis method (BESA) for time series forecasting. As a classic method for hydrologic forecasting, BESA has been widely used in groundwater level forecasting [7], flood forecasting [8], and streamflow forecasting [3], and it has shown an advantage for long-term streamflow forecasting. However, BESA has lower resolution when determining multi-peak spectra, and monthly streamflow rarely exhibits only a single period. The maximum configuration entropy spectral analysis method (CESA) is an alternative for forecasting series with multi-peak spectra.
The concept of CESA was initially proposed by Frieden [9] for image restoration. Thereafter, Gull and Daniell [10] applied the concept to image reconstruction in astronomy. In time series analysis, CESA performs better than BESA in determining the spectral density function for ARMA and MA models but has no practical advantage for AR models [11]. Cui [2] applied CESA to streamflow forecasting and found it more reliable than BESA. As an extension of BESA, minimum relative entropy spectral analysis (RESA), proposed by Shore [12,13], has also been applied to time series forecasting. In RESA, the spectral power is considered a random variable. Tzannes et al. [14] and Woodbury and Ulrych [15] developed RESA further and extended the theory and practice of minimum relative entropy. RESA spectra have higher resolution and detect peak locations more accurately than other spectral estimation methods [16]. RESA has been used for monthly streamflow forecasting [4,5,16] and has produced smaller errors than the other two entropy spectral methods. However, few studies have reported the application of these methods to streamflow forecasting in China. Moreover, few researchers have examined how the length of the streamflow record used for training should be selected. Therefore, the main objectives of this paper are (1) to use three entropy spectral methods for monthly streamflow forecasting in Northwest China, (2) to select the appropriate training period for the models, and (3) to compare the three models and select the best model for streamflow forecasting in Northwest China.

2. Methods

Suppose there is a streamflow series, y(t), that is converted to the frequency domain with frequency f. If f is considered a random variable, the spectral density function p(f) is normalized so that it can be treated as a probability density function. Burg entropy can then be expressed as:

$$H_B(f) = \int_{-W}^{W} \ln[p(f)]\,df \tag{1}$$

where W = 1/(2Δt) is the Nyquist fold-over frequency and Δt is the sampling period. The definition of configuration entropy is similar to Burg entropy and is defined as:

$$H_C(f) = -\int_{-W}^{W} p(f)\ln[p(f)]\,df \tag{2}$$

With the given prior spectral density function q(f), the relative entropy can be defined as:

$$H_R(f) = \int_{-W}^{W} p(f)\ln\frac{p(f)}{q(f)}\,df \tag{3}$$

The prior spectral density acts like background noise against the peaks of the observed periodicity. When the prior spectral density is uniform, the relative entropy equals the negative of the configuration entropy up to an additive constant, so minimizing the relative entropy is equivalent to maximizing the configuration entropy. The processes shown in Figure 1 mainly include (1) calculating parameters; (2) determining the spectral density function; (3) extending the autocorrelation function; and (4) forecasting streamflow and comparing the three methods to select the most appropriate one.
Figure 1

The flow chart of streamflow forecasting using entropy spectral method. RE: average relative error; RMSE: root mean square error; R: correlation coefficient; DC: determination coefficient.
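To make the three entropy definitions concrete, the following minimal Python sketch evaluates them on a discretized frequency grid. The function name, the grid, and the example density are hypothetical, and the sign convention for Burg entropy (the integral of ln p) is an assumption rather than something the source confirms.

```python
import math

def spectral_entropies(p, q, df):
    """Discretized Burg, configurational and relative entropies.

    p, q : positive spectral density values on a uniform frequency grid
    df   : grid spacing
    Sign conventions are an assumption (Burg entropy = integral of ln p).
    """
    burg = sum(math.log(pi) for pi in p) * df
    config = -sum(pi * math.log(pi) for pi in p) * df
    relative = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q)) * df
    return burg, config, relative

# Hypothetical 4-point density on [-0.5, 0.5) that integrates to 1,
# compared against a uniform prior q(f) = 1.
p = [0.5, 1.5, 1.5, 0.5]
q = [1.0, 1.0, 1.0, 1.0]
h_burg, h_config, h_rel = spectral_entropies(p, q, df=0.25)
```

With the uniform prior used here, the relative entropy is exactly the negative of the configurational entropy, which is the limiting case described in the text.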

2.1. Deriving Spectral Density Function

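To obtain the least biased spectral density under the given constraints, the Burg and configuration entropies are maximized, while the relative entropy is minimized. According to the relationship between the spectral density function and the autocorrelation function, the constraints can be given as:

$$\int_{-W}^{W} p(f)\,e^{i2\pi f n\Delta t}\,df = \rho(n), \quad -N \le n \le N \tag{4}$$

where ρ(n) is the autocorrelation function at the n-th lag, and N usually equals 1/4 to 1/2 of the length of the streamflow series. Subject to these constraints, the entropy can be maximized or minimized using the Lagrangian function, which can be formulated as:

$$L = H(f) - \sum_{n=-N}^{N}\lambda_n\left[\int_{-W}^{W} p(f)\,e^{i2\pi f n\Delta t}\,df - \rho(n)\right] \tag{5}$$

where λ_n are the Lagrangian multipliers and H(f) is the entropy function (the Burg or configuration entropy for maximization, and the negative of the relative entropy for minimization). The partial derivative of L with respect to the spectral density is taken and equated to zero. The least biased spectral densities obtained by maximizing Burg entropy and configuration entropy and by minimizing relative entropy are, respectively:

$$p_B(f) = \frac{1}{\sum_{n=-N}^{N}\lambda_n e^{i2\pi f n\Delta t}} \tag{6}$$

$$p_C(f) = \exp\left(-1-\sum_{n=-N}^{N}\lambda_n e^{i2\pi f n\Delta t}\right) \tag{7}$$

$$p_R(f) = q(f)\exp\left(-1-\sum_{n=-N}^{N}\lambda_n e^{i2\pi f n\Delta t}\right) \tag{8}$$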
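The constraints are built from the autocorrelation function of the streamflow series. As a minimal sketch of how ρ(n) might be estimated from data, the following Python function computes the sample autocorrelation up to a chosen lag. The function name and the choice of the biased estimator (dividing by the full series length) are assumptions, not the authors' stated procedure.

```python
import math

def autocorrelation(y, max_lag):
    """Sample autocorrelation rho(0..max_lag), biased estimator (assumed)."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / (n * c0)
        for k in range(max_lag + 1)
    ]

# Hypothetical monthly series with a pure 12-month cycle (10 years).
y = [math.sin(2 * math.pi * t / 12) for t in range(120)]
rho = autocorrelation(y, max_lag=30)  # N = 1/4 of the series length
```

For this synthetic series the autocorrelation is strongly positive at lag 12 and strongly negative at lag 6, which is the kind of seasonal structure the constraints pass on to the spectral density.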

2.2. Calculating Lagrangian Multipliers

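The methods of calculating the Lagrangian multipliers differ because the spectral densities take different forms. For Burg entropy, the Levinson–Burg algorithms [6,17] are applied to determine the Lagrangian multipliers, whereas for configuration entropy and relative entropy, cepstrum algorithms are applied. Taking the logarithm of the RESA spectral density in Equation (8) gives 1 + ln[p_R(f)/q(f)] = −Σ_s λ_s exp(i2πfsΔt); applying the inverse Fourier transform, normalized by 2W, to both sides, we obtain:

$$\frac{1}{2W}\int_{-W}^{W}\left\{1+\ln\frac{p_R(f)}{q(f)}\right\}e^{i2\pi f n\Delta t}\,df = -\frac{1}{2W}\sum_{s=-N}^{N}\lambda_s\int_{-W}^{W}e^{i2\pi f(n+s)\Delta t}\,df \tag{9}$$

The prior and posterior cepstra of the autocorrelations, which are transformed from the prior and posterior spectral densities, are expressed as e_q(n) and e_p(n):

$$e_q(n)=\frac{1}{2W}\int_{-W}^{W}\ln[q(f)]\,e^{i2\pi f n\Delta t}\,df \tag{10}$$

$$e_p(n)=\frac{1}{2W}\int_{-W}^{W}\ln[p_R(f)]\,e^{i2\pi f n\Delta t}\,df \tag{11}$$

Then Equation (9) can be abbreviated as:

$$\delta(n)+e_p(n)-e_q(n)=-\sum_{s=-N}^{N}\lambda_s\,\delta(n+s) \tag{12}$$

where δ is a delta function, equal to 1 when its argument is 0 and 0 otherwise. Because the spectral densities are real and even in f, the cepstra are even in n and λ_{−n} = λ_n, so the Lagrangian multipliers of the relative entropy can be solved from the linear equations in Equation (12):

$$\lambda_0=-1-e_p(0)+e_q(0),\qquad \lambda_n=e_q(n)-e_p(n),\quad 1\le n\le N \tag{13}$$

For configuration entropy, e_q(n) = 0, and the Lagrangian multipliers can then be calculated by:

$$\lambda_0=-1-e_p(0),\qquad \lambda_n=-e_p(n),\quad 1\le n\le N \tag{14}$$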
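The cepstrum route to the Lagrangian multipliers can be checked numerically. The sketch below is a minimal illustration under stated assumptions: Δt = 1, a real and even spectral density of the exponential form p(f) = q(f)·exp(−1 − Σ λ_n e^{i2πfn}) with λ_{−n} = λ_n, and cepstral coefficients approximated by a grid average. All function names are hypothetical. It builds ln p(f) from known multipliers and recovers them from the cepstra.

```python
import math

def cepstrum(log_density, freqs, n):
    """n-th cepstral coefficient of an even spectrum (dt = 1 assumed):
    grid average of ln p(f) * cos(2*pi*f*n) over one period."""
    total = sum(lp * math.cos(2 * math.pi * f * n)
                for lp, f in zip(log_density, freqs))
    return total / len(freqs)

def multipliers(log_p, freqs, order, log_q=None):
    """Recover lambda_0..lambda_order from posterior and prior cepstra."""
    if log_q is None:
        log_q = [0.0] * len(freqs)  # uniform prior q(f) = 1, i.e. the CESA case
    lam = []
    for n in range(order + 1):
        diff = cepstrum(log_q, freqs, n) - cepstrum(log_p, freqs, n)
        lam.append(diff - 1.0 if n == 0 else diff)
    return lam

# Round trip with hypothetical multipliers.
M = 64
freqs = [j / M - 0.5 for j in range(M)]
true_lam = [0.2, -0.6, 0.3]
log_p = [
    -1.0 - true_lam[0]
    - 2.0 * sum(true_lam[n] * math.cos(2 * math.pi * f * n) for n in (1, 2))
    for f in freqs
]
recovered = multipliers(log_p, freqs, order=2)
```

The recovered values match the multipliers used to build the density, which is consistent with solving for the multipliers directly from cepstral coefficients rather than by iterative optimization.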

2.3. Forecasting Streamflow

BESA extends the autocorrelation function as a linear combination of the previous autocorrelation values weighted by prediction coefficients, where the coefficients a are obtained using the reflection recursion method proposed by Burg [18]. For the configurational and relative entropies, the autocorrelation functions are extended with separate expressions. From the extended autocorrelation functions, the forecasting equations of the three entropy spectral methods are obtained; in these equations, C(T + k) is the cepstrum of the streamflow series and m is the order of the model, which is determined by the Bayesian information criterion (BIC):

$$\mathrm{BIC}(m) = N\ln\sigma_\varepsilon^2 + m\ln N$$

where N is the length of the streamflow series and σ_ε² is the variance of the residuals between observed and forecasted streamflow.
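Order selection by BIC can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the Levinson–Durbin recursion on given autocovariances in place of Burg's reflection recursion, and it assumes the common form BIC(m) = N·ln(σ²) + m·ln(N). The function names and the AR(1)-type example are hypothetical.

```python
import math

def prediction_error_variances(r, max_order):
    """Levinson-Durbin recursion: residual variance for AR orders 0..max_order.

    r : autocovariances r[0..max_order]
    """
    a = []            # a[k] holds coefficient a_(k+1) of the current order
    err = r[0]
    errors = [err]
    for m in range(1, max_order + 1):
        acc = r[m] - sum(a[k] * r[m - 1 - k] for k in range(m - 1))
        k_m = acc / err                       # reflection coefficient
        a = [a[k] - k_m * a[m - 2 - k] for k in range(m - 1)] + [k_m]
        err *= (1.0 - k_m * k_m)
        errors.append(err)
    return errors

def bic_order(r, n_obs, max_order):
    """Order minimizing the assumed BIC(m) = N ln(sigma^2) + m ln(N)."""
    errors = prediction_error_variances(r, max_order)
    bic = [n_obs * math.log(e) + m * math.log(n_obs)
           for m, e in enumerate(errors)]
    return bic.index(min(bic))

# Hypothetical autocovariances of an AR(1) process with coefficient 0.8.
r = [0.8 ** k for k in range(4)]
best_order = bic_order(r, n_obs=100, max_order=3)
```

For these autocovariances the residual variance stops decreasing after order 1, so the ln(N) penalty makes order 1 the minimizer.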

2.4. Evaluating the Precision of Forecasting Results

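In this paper, average relative error (RE), root mean square error (RMSE), correlation coefficient (R) and determination coefficient (DC) were selected as evaluation indicators for the forecasted results. They are expressed as:

$$RE = \frac{1}{n}\sum_{t=1}^{n}\frac{|f(t)-x(t)|}{x(t)}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}[x(t)-f(t)]^2}$$

$$R = \frac{\sum_{t=1}^{n}[x(t)-\bar{x}][f(t)-\bar{f}]}{\sqrt{\sum_{t=1}^{n}[x(t)-\bar{x}]^2\,\sum_{t=1}^{n}[f(t)-\bar{f}]^2}}$$

$$DC = 1-\frac{\sum_{t=1}^{n}[x(t)-f(t)]^2}{\sum_{t=1}^{n}[x(t)-\bar{x}]^2}$$

where x̄ represents the average value of the observed streamflow x(t), f̄ represents the average value of the forecasted streamflow f(t), and n is the length of the forecasting period (months). According to the “forecasting norm for hydrology intelligence”, the determination coefficient (DC) is classified into three levels, as shown in Table 1.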
Table 1

Model forecasting accuracy rating.

| Criterion | A | B | C |
|---|---|---|---|
| DC | ≥0.9 | 0.9~0.7 | 0.7~0.5 |
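The four evaluation indicators and the rating in Table 1 can be computed with a few lines of Python. This is a minimal sketch: the function names are hypothetical, RE is assumed to use absolute errors, and the rating boundaries are assumed to be inclusive at the lower end of each class.

```python
import math

def evaluate(obs, fc):
    """Return (RE, RMSE, R, DC) for observed and forecasted series."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_f = sum(fc) / n
    re = sum(abs(f - x) / x for x, f in zip(obs, fc)) / n
    sse = sum((x - f) ** 2 for x, f in zip(obs, fc))
    rmse = math.sqrt(sse / n)
    cov = sum((x - mean_o) * (f - mean_f) for x, f in zip(obs, fc))
    var_o = sum((x - mean_o) ** 2 for x in obs)
    var_f = sum((f - mean_f) ** 2 for f in fc)
    r = cov / math.sqrt(var_o * var_f)
    dc = 1.0 - sse / var_o
    return re, rmse, r, dc

def grade(dc):
    """Accuracy level from Table 1; None when DC falls below 0.5."""
    if dc >= 0.9:
        return "A"
    if dc >= 0.7:
        return "B"
    if dc >= 0.5:
        return "C"
    return None

# Hypothetical observed and forecasted flows (m3/s).
obs = [10.0, 20.0, 30.0, 40.0]
fc = [12.0, 18.0, 33.0, 37.0]
re, rmse, r, dc = evaluate(obs, fc)
level = grade(dc)
```

Because RE weights each error by the observed flow while RMSE and DC weight squared errors, low-flow months dominate RE and high-flow months dominate RMSE and DC.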

3. Application

3.1. Data Preprocessing

Observed streamflow data from five hydrological stations in Northwest China, namely Yingluoxia, Zamusi, Jiutiaoling, Xiangtang and Tangnaihai, were selected to verify the three entropy spectral methods. The stations are located in the Yellow River, Heihe River and Shiyang River basins. Tangnaihai station is located on the mainstream of the Yellow River, while Xiangtang is located on a tributary of the Yellow River. Zamusi and Jiutiaoling stations are situated on the Shiyang River. Yingluoxia station is located on the Heihe River and marks the boundary between its upper and middle reaches [1]. Basic information on the five hydrological stations is shown in Figure 2 and Table 2.
Figure 2

Location of hydrologic stations in Northwest China.

Table 2

Basic information of streamflow data for selected hydrologic stations [1].

| Hydrologic Station | Longitude | Latitude | River | Catchment Area (km²) | Control Area (km²) | Annual Runoff (m³/s) |
|---|---|---|---|---|---|---|
| Yingluoxia | 100°11′ E | 38°48′ N | Hei River | 130,000 | 10,009 | 51 |
| Zamusi | 102°34′ E | 37°42′ N | Zamu River | 851 | 851 | 8 |
| Jiutiaoling | 102°03′ E | 37°52′ N | Xiying River | 1,120 | 1,077 | 10 |
| Xiangtang | 102°51′ E | 36°22′ N | Datong River | 15,133 | 15,126 | 88 |
| Tangnaihai | 100°09′ E | 35°30′ N | Yellow River | 752,443 | 121,972 | 633 |

The entropy spectral analysis model is an autocorrelation-based method, and the input data should be a standardized stationary random sequence. To meet this requirement, the streamflow sequences were transformed using the Box–Cox method, which reduces data skewness and brings the distribution of the data closer to normal [17]. In addition, the sequences were standardized. To test whether the transformed sequences were stationary, we tested them for a unit root: if a unit root exists in a sequence, it is not a stationary random sequence, and vice versa [19]. The adftest function in the Econometrics Toolbox of MATLAB 2010b (2010b, MathWorks, Beijing, China, https://ww2.mathworks.cn/products/matlab-online.html) was used for this test. The adftest function takes the existence of a unit root as its null hypothesis. If the null hypothesis is rejected, the returned logical value H is 1 and the sequence is considered stationary at the corresponding confidence level. If the null hypothesis cannot be rejected, the returned logical value H is 0. The test results for the transformed streamflow sequences of the five hydrological stations show that all the sequences are stationary and homogeneous (Table 3).
Table 3

Adftest test results of monthly streamflow in each hydrologic station.

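| Hydrologic Stations | Yingluoxia | Zamusi | Jiutiaoling | Xiangtang | Tangnaihai |
|---|---|---|---|---|---|
| Returned value | 1 | 1 | 1 | 1 | 1 |
| p Value | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
| Confidence coefficient (%) | 99.9 | 99.9 | 99.9 | 99.9 | 99.9 |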
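The preprocessing steps (Box–Cox transformation followed by standardization) can be sketched in plain Python as follows. This is a minimal illustration with hypothetical function names and data, and the Box–Cox parameter is fixed by hand here, whereas in practice it is usually estimated, for example by maximum likelihood. The unit-root test itself is not reimplemented; outside MATLAB, an augmented Dickey–Fuller test such as `adfuller` in the statsmodels package could play the role of adftest.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform of positive values for a given parameter lam."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

# Hypothetical monthly flows (m3/s); lam = 0 is the log-transform case.
flows = [12.0, 15.0, 30.0, 80.0, 95.0, 40.0]
z = standardize(box_cox(flows, lam=0.0))
```

The standardized sequence has zero mean and unit variance, which is the form the autocorrelation-based models expect as input.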

3.2. Determining Training Period

In previous research, the training periods were always less than 100 months, and very few papers discussed the influence of the training period on the forecasted results. In this paper, the observed streamflow data from 2008 to 2012 were selected as the validation period. Observed data spanning 3 to 40 years were then used as training periods to evaluate the influence of the training period length on the forecasted results. To determine the order of the models, the relationship between the training period length and the optimal model order is shown in Figure 3. As seen in Figure 3, when the training period is short, the optimal fitting order of the models is low; as the training period lengthens, the optimal order tends to stabilize.
Figure 3

The model order corresponding to the different calibration periods. (a) Yingluoxia station; (b) Jiutiaoling station; (c) Zamusi station; (d) Xiangtang station; (e) Tangnaihai station.

Beyond that, the relationship between the training period and the DC of the validation period was explored (Figure 4). The forecasting performance is weak and unstable when the training period is shorter than 15 years, whereas lengthening the training period increases and stabilizes the DC. To make use of expert opinion and increase the forecasting precision, the calibration period was set to 26 years.
Figure 4

Evaluation index (DC) corresponding to different lengths of calibration period. (a) Yingluoxia station; (b) Jiutiaoling station; (c) Zamusi station; (d) Xiangtang station; (e) Tangnaihai station.

3.3. Estimating Spectral Density

Spectral analysis is a powerful method for checking periodicity by locating the frequencies of spectral peaks. The spectral densities estimated by the three entropy spectral methods were compared with the spectral density estimated by the fast Fourier transform (FFT) (Figure 5). The five stations were used to show the ability of BESA, CESA, and RESA to estimate spectral densities. For RESA, a prior spectral density function was hypothesized from the data. The procedure for determining the prior spectral density functions is described in Appendix A. Figure 5 shows that all of the stations displayed a peak at a frequency of 1/12 per month (a 12-month period). In addition, there were other peaks near frequencies of 1/4 and 1/6 in the spectral densities at Zamusi, Jiutiaoling and Tangnaihai stations.
Figure 5

Spectral density estimated by BESA, CESA, RESA and the fast Fourier transform (FFT) method for five hydrological stations in Northwest China. (a) Yingluoxia station; (b) Jiutiaoling station; (c) Zamusi station; (d) Xiangtang station; (e) Tangnaihai station.

For single-peak streamflow series, BESA, CESA, and RESA identify the periodicity as well as FFT does. For multi-peak streamflow series, however, BESA was less effective in detecting the principal periodicity, whereas CESA and RESA correctly identified the most significant peak at a frequency of 1/12. CESA neglected all secondary spectral peaks and kept only the most significant peak at 1/12, while RESA also detected the less significant peaks and was consistent with the FFT results. To examine whether this difference affects forecasting precision, we used the three methods to forecast streamflow at the five hydrological stations and to select the optimal model for northwest China.
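The FFT benchmark used for comparison amounts to locating peaks in a periodogram. The following minimal Python sketch computes a periodogram with a direct discrete Fourier transform and picks the two strongest frequencies. The synthetic series (a 12-month cycle plus a weaker 6-month cycle) and the function name are hypothetical and stand in for a real monthly record.

```python
import cmath
import math

def periodogram(y):
    """Return (frequency, power) pairs for k = 1..n/2.

    Frequency is in cycles per sample (per month for monthly data).
    """
    n = len(y)
    mean = sum(y) / n
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum((y[t] - mean) * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        result.append((k / n, abs(coeff) ** 2 / n))
    return result

# Hypothetical 10-year monthly series: strong annual and weaker semiannual cycle.
y = [10 + 5 * math.sin(2 * math.pi * t / 12) + 2 * math.sin(2 * math.pi * t / 6)
     for t in range(120)]
top_two = sorted(periodogram(y), key=lambda fp: fp[1], reverse=True)[:2]
```

For this series the strongest peak sits at a frequency of 1/12 and the second at 1/6, mirroring the primary and secondary periods discussed for the station records.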

3.4. Streamflow Forecasting Analysis

Streamflow was forecasted using the three entropy spectral methods for the five hydrological stations (Figure 6 and Figure 7) with a validation period of five years. The forecasting accuracy was lowest at Tangnaihai station, where the DC is less than 0.6 (Table 4) and falls in level C, whereas the other four stations all reach level B. The reason may be that the catchment area of Tangnaihai station is much larger than those of the other stations. Moreover, intensive anthropogenic activities might also have a severe impact on the streamflow at Tangnaihai station. It is therefore difficult to forecast streamflow accurately from the streamflow of previous months alone using autoregression-based models.
Figure 6

Streamflow forecasting using entropy spectral methods for five hydrological stations. (a) Yingluoxia station; (b) Jiutiaoling station; (c) Zamusi station; (d) Xiangtang station; (e) Tangnaihai station.

Figure 7

Comparison between observed and forecasted streamflow. (a) Yingluoxia station; (b) Jiutiaoling station; (c) Zamusi station; (d) Xiangtang station; (e) Tangnaihai station.

Table 4

Performance metrics of the three models at each of the selected hydrological stations.

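| Hydrological Station | BESA RE | BESA RMSE (m³/s) | BESA R | BESA DC | CESA RE | CESA RMSE (m³/s) | CESA R | CESA DC | RESA RE | RESA RMSE (m³/s) | RESA R | RESA DC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Yingluoxia | 0.173 | 18.348 | 0.934 | 0.859 | 0.194 | 16.807 | 0.942 | 0.882 | 0.196 | 16.571 | 0.944 | 0.885 |
| Zamusi | 0.216 | 3.395 | 0.924 | 0.734 | 0.259 | 3.303 | 0.892 | 0.748 | 0.268 | 3.521 | 0.876 | 0.714 |
| Jiutiaoling | 0.224 | 4.972 | 0.911 | 0.716 | 0.273 | 4.771 | 0.911 | 0.739 | 0.276 | 4.592 | 0.911 | 0.758 |
| Xiangtang | 0.260 | 27.237 | 0.924 | 0.797 | 0.232 | 25.760 | 0.907 | 0.818 | 0.234 | 22.636 | 0.928 | 0.859 |
| Tangnaihai | 0.315 | 303.303 | 0.765 | 0.545 | 0.324 | 302.749 | 0.780 | 0.547 | 0.326 | 291.922 | 0.796 | 0.579 |
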
By comparing the forecasting accuracy of the three models at the five hydrological stations during the validation period, we found that, by the DC, RMSE and R criteria, the models ranked RESA > CESA > BESA at Yingluoxia station (Table 4). At Zamusi station on the Shiyang River, however, the CESA model had the highest DC and the lowest RMSE (Table 4, Figure 6). At the remaining three hydrological stations, the accuracy of the three models was similar, and the RESA model was more accurate than the CESA and BESA models by the DC, RMSE and R criteria. However, the RE between observed and forecasted streamflow was smaller for BESA than for the other methods at every station except Xiangtang. The reason is that RE reflects the linear (first-power) error between observed and forecasted values, while RMSE, R and DC reflect the squared error. When the forecasting error in the flood season is smaller, the RMSE, R and DC are better; when the forecasting error in the non-flood season is smaller, the RE is better. To verify this conjecture, each year was divided into a flood season from July to October and a low-flow season covering January to June, November, and December. We extracted the forecasted streamflow of the non-flood season and compared it with the observed streamflow at the five stations (Table 5). As shown, BESA performs better than the other methods in the low-flow season: its DC is the highest at all five stations, and its forecasted streamflow is close to the observations. However, the overall forecasting accuracy of the RESA and CESA models was higher. Because streamflow forecasting serves the optimal allocation of water resources, annual or flood-season runoff prediction is more meaningful.
As a whole, the RESA model can better adapt to the streamflow forecasting for the five hydrological stations in northwest China. Combining precipitation as a predictor, selecting one or more models with high accuracy in the flood season, and using the entropy spectrum model and its combination [1] to forecast streamflow could be a future research direction.
Table 5

Performance metrics of the three models in the non-flood season at each selected hydrological station.

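| Hydrological Station | BESA RE | BESA RMSE (m³/s) | BESA R | BESA DC | CESA RE | CESA RMSE (m³/s) | CESA R | CESA DC | RESA RE | RESA RMSE (m³/s) | RESA R | RESA DC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Yingluoxia | 0.179 | 8.01 | 0.943 | 0.870 | 0.239 | 10.54 | 0.939 | 0.787 | 0.231 | 10.51 | 0.943 | 0.788 |
| Zamusi | 0.224 | 3.00 | 0.936 | 0.726 | 0.255 | 3.15 | 0.901 | 0.692 | 0.263 | 3.47 | 0.887 | 0.665 |
| Jiutiaoling | 0.229 | 4.02 | 0.863 | 0.654 | 0.257 | 4.42 | 0.864 | 0.631 | 0.268 | 4.40 | 0.864 | 0.636 |
| Xiangtang | 0.245 | 12.72 | 0.857 | 0.807 | 0.233 | 13.11 | 0.841 | 0.750 | 0.259 | 15.36 | 0.847 | 0.683 |
| Tangnaihai | 0.261 | 153.08 | 0.802 | 0.630 | 0.276 | 165.70 | 0.813 | 0.567 | 0.284 | 158.23 | 0.827 | 0.605 |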
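The seasonal comparison separates each year into a flood season (July to October) and a low-flow season (the remaining months) and scores each part on its own. The following minimal Python sketch shows one way to do this for the determination coefficient. The function names and the one-year data set are hypothetical and are not taken from the station records.

```python
FLOOD_MONTHS = {7, 8, 9, 10}  # July to October, as defined in the text

def determination_coefficient(obs, fc):
    """DC = 1 - SSE / sum of squared deviations of the observations."""
    mean_o = sum(obs) / len(obs)
    sse = sum((x - f) ** 2 for x, f in zip(obs, fc))
    return 1.0 - sse / sum((x - mean_o) ** 2 for x in obs)

def seasonal_dc(months, obs, fc):
    """Return (DC in the flood season, DC in the low-flow season)."""
    flood = [(x, f) for m, x, f in zip(months, obs, fc) if m in FLOOD_MONTHS]
    low = [(x, f) for m, x, f in zip(months, obs, fc) if m not in FLOOD_MONTHS]
    dc_flood = determination_coefficient(*zip(*flood))
    dc_low = determination_coefficient(*zip(*low))
    return dc_flood, dc_low

# Hypothetical year: forecasts are exact in low-flow months, 10 m3/s too high in flood months.
months = list(range(1, 13))
obs = [10.0, 12.0, 15.0, 20.0, 30.0, 45.0, 80.0, 90.0, 70.0, 50.0, 25.0, 15.0]
fc = [x + 10.0 if m in FLOOD_MONTHS else x for m, x in zip(months, obs)]
dc_flood, dc_low = seasonal_dc(months, obs, fc)
```

Scoring the two seasons separately makes visible a case like BESA's, where a model is strong in low-flow months even though its whole-period DC is lower.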

4. Conclusions

In this paper, the BESA, CESA, and RESA models were applied to spectral analysis and streamflow forecasting in northwest China using monthly streamflow data from five hydrological stations. The estimated spectral densities and prediction accuracies of the three methods were compared based on the optimal length of the training period. The spectral density functions of BESA, CESA and RESA were smoother than that of FFT, and all of them clearly identified the 12-month primary period of the monthly streamflow sequences without deviation. However, the spectral density function of BESA could not detect the other significant secondary periods, while that of CESA could detect the secondary periods for multi-period sequences despite a certain degree of leakage. Comparing the three entropy spectral methods showed that all of them could forecast streamflow accurately. Among them, the RESA model has the highest prediction accuracy, followed by the CESA model. Owing to the lack of data, this paper applied the entropy spectral theory to monthly streamflow forecasting for only a few rivers in northwest China. In future research, the three entropy spectral analysis methods can be applied to other rivers to verify their applicability to monthly streamflow forecasting in China.
Table A1

Hypothesis on the prior spectral density.

| Number | Period | Spectral Density Function |
|---|---|---|
| Assumption 1 | None | p(f)=1 |
| Assumption 2 | 12 months | p(f)=p, p(1/12)=1000×p |
| Assumption 3 | 12 months, 6 months | p(f)=p, p(1/12)=900×p, p(1/6)=100×p |
| Assumption 4 | 12 months, 4 months | p(f)=p, p(1/12)=900×p, p(1/4)=100×p |
| Assumption 5 | 12 months, 4 months, 6 months | p(f)=p, p(1/12)=900×p, p(1/6)=p(1/4)=50×p |
| Assumption 6 | 12 months, 4 months, 6 months | p(f)=p, p(1/12)=900×p, p(1/4)=75×p, p(1/6)=25×p |

Note: .

Table A2

Itakura–Saito distance between CESA spectral density and each hypothesis spectral density for RESA.

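| Hydrologic Station | Assumption 1 | Assumption 2 | Assumption 3 | Assumption 4 | Assumption 5 | Assumption 6 |
|---|---|---|---|---|---|---|
| Yingluoxia | 3.4663 | **1.4495** | 1.4536 | 1.4638 | 1.4636 | 1.4629 |
| Zamusi | 3.2100 | 1.3080 | 1.2888 | 1.2735 | 1.2508 | **1.2502** |
| Jiutiaoling | 3.5851 | 1.3211 | 1.2961 | 1.2622 | **1.2267** | 1.2303 |
| Xiangtang | 3.2742 | **1.3225** | 1.3227 | 1.3274 | 1.3237 | 1.3230 |
| Tangnaihai | 3.1384 | 1.4834 | 1.4239 | 1.4790 | **1.4158** | 1.4163 |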

Note: Boldface represents the optimal spectral density functions for RESA in five hydrologic stations.
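Selecting a prior by Itakura–Saito distance can be sketched as follows. This is a minimal illustration: it uses the standard discrete form of the distance, the mean of P/Q − ln(P/Q) − 1 over the frequency grid, and assumes that the reference spectrum is the first argument and the candidate prior the second. The function names and the four-point spectra are hypothetical.

```python
import math

def itakura_saito(p, q):
    """Discrete Itakura-Saito distance between positive spectra p and q."""
    return sum(pi / qi - math.log(pi / qi) - 1.0
               for pi, qi in zip(p, q)) / len(p)

def closest_prior(reference, priors):
    """Name of the candidate prior with the smallest distance to the reference."""
    return min(priors, key=lambda name: itakura_saito(reference, priors[name]))

# Hypothetical reference spectrum with one dominant peak and two candidate priors.
reference = [1.0, 9.0, 1.0, 1.0]
priors = {
    "flat": [3.0, 3.0, 3.0, 3.0],
    "one_peak": [1.2, 8.4, 1.2, 1.2],
}
best = closest_prior(reference, priors)
```

The distance is zero only when the two spectra coincide, so the candidate with the smallest value is the one whose peak structure best matches the reference, which is how the boldface entries in Table A2 are chosen.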

1.  Restoring with maximum likelihood and maximum entropy.

Authors:  B R Frieden
Journal:  J Opt Soc Am       Date:  1972-04