Lijun Pei1, Kewei Wang1. 1 School of Mathematics and Statistics, Zhengzhou University, Zhengzhou 450001, Henan, People's Republic of China.
Abstract
Precipitation, as meteorological data, is closely related to human life. For this reason, we hope to propose a new method to forecast it more accurately. In this article, we aim to forecast precipitation by reservoir computing with some additional processing. Reservoir computing is a machine learning paradigm characterized by a three-layered architecture (input, reservoir and output layers). What distinguishes it from other machine learning algorithms is that only the output layer is trained and optimized for a particular task. Since precipitation data are non-smooth, they are very difficult to predict via the classical methods for nonlinear time series. For the precipitation data, we take the first-order moving average to make the series smoother, then take the logarithm of the smoothed nonzero data and assign the same negative constant to the smoothed zero data to obtain a new series. We train the obtained series by reservoir computing and predict its future values. After taking the exponential, the predicted values for the original precipitation data are obtained. This indicates that reservoir computing, combined with these additional processes, can potentially yield accurate precipitation forecasts.
Precipitation is an important and complicated climate phenomenon [1], and heavy or long-term rainfall may cause losses in agriculture, animal husbandry and public safety [2]. Therefore, precipitation forecasting is essential for emergency management and loss reduction. In 2011, Guo et al. forecast precipitation using numerical weather prediction models [3]. Building on these models, Clark et al. used convection-permitting numerical weather prediction models to forecast precipitation over the UK and the Alpine region, respectively, in 2016 [4]. In the absence of convection-permitting numerical weather prediction ensembles, Khain et al. used a smoothed time-lagged ensemble method to forecast precipitation in 2019 [5]. In addition, the mechanism of precipitation and the circulation of water resources are also important. To address fouling issues and industrial wastewater, Tlili et al. applied a cutting-edge membrane technology, electrospun nanofibrous membranes [6], and Mahmood et al. studied a fungus suitable for the bioremediation of textile industrial wastewater, F. pinicola IEBL-4 [7]. Besides, Tlili et al. reduced the energy required for desalination by studying flat-sheet direct contact membrane distillation [8]. Similar works can be found in [9]. Gao et al. [10] and Nayak et al. [11] aimed to increase the heat transfer efficiency of water, which is also a meaningful line of study. The concept of reservoir computing (RC) was proposed independently by Jaeger [12] and Maass et al. [13]. More precisely, Jaeger proposed the concept of echo state networks (ESNs) and Maass et al. proposed the concept of the liquid state machine (LSM); later, both were unified under the concept of reservoir computing [14]. RC consists of a three-layered architecture, i.e., input, reservoir and output layers.
The outstanding feature of RC is that only the output layer is trained and optimized for the particular task, which greatly reduces the training cost. A review article introduces reservoir computing in detail [15]. Moreover, RC is generally well suited for temporal classification, regression and prediction tasks, such as chaotic time series prediction [16, 17], attractor reconstruction [18, 19], fault diagnosis [20] and so on. In recent years, more and more improved RC methods have been proposed to further enhance its performance. For instance, Chembo proposed RC with time-delayed optoelectronic and photonic systems [21], and Goldmann et al. discovered the correlation between the memory capacity and the conditional Lyapunov exponents of dynamical systems using deep time-delayed RC [22]. On July 20, 2021, Zhengzhou City, the capital of Henan Province, China, encountered an extremely heavy rainstorm, which caused huge losses of human life and property. To this end, we aim to forecast heavy precipitation in this paper. We first apply RC to chaotic time series prediction; the results for four chaotic systems are excellent. However, when we forecast precipitation with plain RC, the error between the real and predicted data is large, and unrealistic negative predicted values occur. We therefore optimize our approach with additional processing. The first step is taking the first-order moving average of the original precipitation data to make the series smoother. The second step is taking the logarithm of the smoothed nonzero data and assigning the same negative constant to the smoothed zero data. The final step is training on the obtained data and predicting its future states by RC. By taking the exponential of the predicted data, we get the final forecast result. We can thus forecast precipitation from historical data alone through a model-free approach, and the results show that the forecasts remain good even over long horizons.
Beyond precipitation, our method can be applied to river runoff forecasting, meteorological forecasting of temperature and humidity, electricity consumption forecasting, stock price forecasting, epidemic forecasting (e.g., COVID-19), oil futures price forecasting and so on. Since the precipitation data are non-smooth and the inverse of the moving average cannot be taken, the forecast results are not as accurate as those for chaotic time series. Despite this, the consistency of the predicted results with actual precipitation may allow us to issue early warnings of heavy rainfall, because we can accurately predict some of the times of heavy rainfall. The constraint and limitation of the method is that, since the precipitation data are non-smooth and the inverse of the moving average is unavailable, we cannot recover the exact forecast of the original precipitation time series. This paper is structured as follows. In Sect. 2, we introduce RC in detail, including the model, node types, theoretical capabilities, reservoir creation and so on. In Sect. 3, we predict chaotic time series by RC. In Sect. 4, we forecast precipitation by RC with additional processing. Finally, we briefly summarize and discuss our work.
Reservoir computing
In this section, we introduce the model, node types, theoretical capabilities, reservoir creation, training methods, reservoir adaptation, structured reservoirs and measures of dynamics of RC. Here, we mainly introduce the ESN architecture with K input units, N internal units and L output units (see Fig. 1). The model of reservoir computing is as follows: the internal state x of the reservoir and the output signal y are updated at each discrete time step n,

x(n + 1) = f(W x(n) + W_in u(n)),   (1)
y(n) = W_out x(n),   (2)

where f(·) is a vector nonlinear function, often chosen as the hyperbolic tangent [21], u is the input signal, W is the internal connectivity matrix of the reservoir layer, and W_in and W_out represent the input connectivity matrix and the readout matrix, respectively. In the case of supervised learning, the optimal readout matrix can be obtained by ridge regression as follows,

W_out = Y X^T (X X^T + β I)^(-1),   (3)

where X is a suitably designed matrix that concatenates the internal states x obtained with some training input vectors u, Y is the target matrix that yields the desired classification outcome, I is the identity matrix, and β is a small regularization factor required to circumvent the ill-posedness of the inversion problem.
Fig. 1
Schematic of the ESN architecture [21]. The input and reservoir connectivity matrices W_in and W are both randomly generated, and only the output connectivity matrix W_out is trained for optimization
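The update and training equations above can be sketched in NumPy. The dimensions, scaling factors and the toy one-step-ahead sine task below are illustrative assumptions, not the settings used in this paper:

```python
import numpy as np

rng = np.random.default_rng(0)

K, N, L = 1, 300, 1                        # input, reservoir and output dimensions
W_in = rng.uniform(-0.5, 0.5, (N, K))      # random input connectivity matrix
W = rng.uniform(-0.5, 0.5, (N, N))         # random internal connectivity matrix
W *= 0.9 / max(abs(np.linalg.eigvals(W)))  # rescale spectral radius below one

def step(x, u):
    """One discrete-time update of the internal state, Eq. (1)."""
    return np.tanh(W @ x + W_in @ u)

# Drive the reservoir with a training input and collect the internal states.
T = 1000
u_train = np.sin(0.1 * np.arange(T)).reshape(T, K)  # toy input signal
y_target = np.roll(u_train, -1, axis=0)             # target: input one step ahead
x = np.zeros(N)
states = np.empty((T, N))
for n in range(T):
    x = step(x, u_train[n])
    states[n] = x

# Ridge regression for the readout, Eq. (3): W_out = Y X^T (X X^T + beta I)^(-1).
X, Y = states.T, y_target.T
beta = 1e-6
W_out = Y @ X.T @ np.linalg.inv(X @ X.T + beta * np.eye(N))
```

Only `W_out` is fitted; `W_in` and `W` stay fixed after their random initialization, which is the defining property of RC described in the text.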
For node types, many different neuron types have already been used in RC, such as linear nodes, threshold logic gates, hyperbolic tangent nodes and spiking neurons [23]. The network with the best memory capacity consists of linear neurons, but on many tasks its optimum is far from that of systems with spiking neurons. Therefore, there is no clear understanding of which node types are optimal for specific tasks. For theoretical capabilities, taking ESNs as an example, necessary and sufficient conditions for the echo state property of a recurrent network are based on the spectral radius and the largest singular value of the connection matrix [12]. The echo state property means that the current state of the network is uniquely determined by the network input up to now, not by the initial state. For reservoir creation, reservoirs are created randomly and rescaled using measures based on stability bounds in ESNs [24]; in particular, the spectral radius is set slightly below one to ensure that the reservoir has the echo state property. A tighter bound on the echo state property has been derived from ideas in robust control theory, which is even exact for some special connection topologies. For training in RC, the original LSM concept states that the dynamic reservoir states can be processed by any statistical classification or regression technique, whereas ESNs use only linear regression as the readout [14]. In both cases, the readout can be trained with off-line or on-line learning rules. For reservoir adaptation, although the original RC concept treats the fixed, randomly created reservoir as its main advantage, it has been verified that altering the reservoir can improve performance on a particular task [25]; for instance, a search algorithm can find reservoirs whose performance is better than average. For structured reservoirs, it has been theorized that a single reservoir can only support a limited number of ‘timescales’ [23].
This can be alleviated using a structured reservoir. For measures of dynamics, the performance of a reservoir system is highly dependent on the actual dynamics of the system [26]. And the actual reservoir dynamics, when applied to a specific task, are highly dependent on the actual inputs, since a reservoir is an input-driven system that never reaches a steady state.
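The reservoir-creation recipe described above (measure the spectral radius and largest singular value, then rescale so the spectral radius sits slightly below one) can be sketched as follows; the matrix size and the target radius of 0.95 are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(0.0, 1.0, (200, 200))   # random reservoir candidate

rho = max(abs(np.linalg.eigvals(W)))   # spectral radius (necessary-side bound)
sigma = np.linalg.norm(W, 2)           # largest singular value (sufficient-side bound)

# The usual ESN recipe: rescale so the spectral radius is slightly below one.
W_esn = W * (0.95 / rho)
```

Since the largest singular value always bounds the spectral radius from above, the rescaled matrix satisfies the spectral-radius criterion even when the stricter singular-value condition does not hold.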
Chaotic time series prediction based on RC
We mentioned that RC performs well in the prediction of chaotic systems. In this section, we present prediction results for some traditional and novel chaotic systems. In Eq. (2), the output y(n) will approximate the desired target when W_out is chosen well. First, we obtain the phase coordinates (data) of the systems as training and testing sets with the numerical simulation software WinPP, and then tune the parameters in RC to train on these data and predict their future states. Here, we use the Normalized Root Mean Square Error (NRMSE) to evaluate the performance of RC, which reflects the degree of difference between variables. A smaller NRMSE means better prediction performance, and its specific expression is given in Eq. (4),

NRMSE = sqrt( (1/N) Σ_{k=1}^{N} (x(k) − x̂(k))² / σ²(x̂) ),   (4)

where N is the length of the testing sequence, x(k) is the reservoir output, x̂(k) is the desired output and σ²(x̂) represents the variance of x̂.

Lorenz system. Its equations are

dx/dt = σ(y − x),
dy/dt = x(r − z) − y,
dz/dt = xy − bz.

Here, σ = 10, b = 2.66667 and r = 24.5. After training on the first 10000 data points of each variable of the Lorenz system, the prediction of the next 500 data points is shown in Fig. 2. In the remainder of this section, the red curves in the figures represent the actual data and the blue curves the predicted data.
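A trajectory like the one used for training can be generated, and the NRMSE of Eq. (4) computed, as in the sketch below; the fourth-order Runge-Kutta integrator and the initial condition are illustrative assumptions (the paper's data come from WinPP):

```python
import numpy as np

def lorenz(state, sigma=10.0, b=2.66667, r=24.5):
    """Right-hand side of the Lorenz system with the parameters above."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (r - z) - y, x * y - b * z])

def rk4(f, state, dt):
    """One fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def nrmse(pred, target):
    """NRMSE as in Eq. (4): RMS error normalized by the target's variance."""
    return np.sqrt(np.mean((pred - target) ** 2) / np.var(target))

# Generate a trajectory (the initial condition is an illustrative choice).
state = np.array([1.0, 1.0, 1.0])
traj = []
for _ in range(10000):
    state = rk4(lorenz, state, 0.005)   # step size matching Delta T = 0.005
    traj.append(state)
traj = np.array(traj)
```

The step size 0.005 matches the Delta T reported for WinPP later in this section; the per-variable NRMSEs in Figs. 2 to 5 are computed with exactly this kind of normalization.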
Fig. 2
Comparisons of prediction and actual results of Lorenz system. a–c represent the time histories of variables x, y and z of this system, respectively. In the 500 testing data, their NRMSEs are 0.046, 0.065 and 0.115, respectively
Rössler system. Its equations are

dx/dt = −y − z,
dy/dt = x + ay,
dz/dt = b + z(x − c).

Here, a = 0.1, b = 0.1 and c = 14. After training on the first 10000 data points of each variable of the Rössler system, the prediction of the next 2000 data points is shown in Fig. 3.
Fig. 3
Comparisons of prediction and actual results of Rössler system. a–c represent the time histories of variables x, y and z of this system, respectively. In the 2000 testing data, their NRMSEs are 0.025, 0.027 and 0.094, respectively
Chen system. Its equations are

dx/dt = a(y − x),
dy/dt = (c − a)x − xz + cy,
dz/dt = xy − bz.

Here, a = 35, b = 3 and c = 28. After training on the first 10000 data points of each variable of the Chen system, the prediction of the next 200 data points is shown in Fig. 4.
Fig. 4
Comparisons of prediction and actual results of Chen system. a–c represent the time histories of variables x, y and z of this system, respectively. In the 200 testing data, their NRMSEs are 0.109, 0.163 and 0.181, respectively
Rabinovich–Fabrikant system. Here, a = 0.96 and b = 1.3. After training on the first 10000 data points of each variable of the Rabinovich–Fabrikant system, the prediction of the next 2000 data points is shown in Fig. 5.
Fig. 5
Comparisons of prediction and actual results of Rabinovich–Fabrikant system. a–c represent the time histories of variable x, y and z of this system, respectively. In the 2000 testing data, their NRMSEs are 0.011, 0.01 and 0.014, respectively
The parameter settings in WinPP are as follows: Total is 9000, Transient is 7000 and Delta T is 0.005 for all chaotic systems, and Nout is 3 for all chaotic systems except the Rössler system, whose Nout is 5. Their maximum Lyapunov exponents are 0.865, 0.097, 2.027 and 0.17, respectively; thus, their Lyapunov times are approximately 1.156, 10.309, 0.453 and 5.88. This verifies that prediction of chaotic time series based on RC can reach horizons of several Lyapunov times. We summarize the prediction results in Table 1. They indicate that we can accurately predict the time series of (smooth) chaotic systems via RC. For non-smooth time series, however, such as precipitation data, prediction is very difficult and additional processing is needed.
Table 1  NRMSEs of the prediction of chaotic systems based on RC

System                 NRMSE (x, y, z)
Lorenz                 0.046, 0.065, 0.115
Rössler                0.025, 0.027, 0.094
Chen                   0.109, 0.163, 0.181
Rabinovich–Fabrikant   0.011, 0.01, 0.014
Precipitation forecast based on RC
In the last section, we concluded that RC performs excellently in chaotic time series prediction. But when we use RC on precipitation data directly, the result is not satisfactory. Unlike chaotic time series, daily precipitation is a series of discontinuous data with numerous zero values. This causes two main disadvantages: poor forecasting results and unrealistic negative predicted values. To solve both problems, we process the data in two steps. First, we take the moving average of the precipitation data to make the series smoother; here, the crucial question is what order of moving average yields the best result. Then, we take the logarithm of the smoothed nonzero data and assign the same negative constant to the smoothed zero data; here, the key question is which negative value for the zero data yields the best result. We thus obtain a new series corresponding to the original precipitation data. Afterwards, we train on the obtained data by RC to predict its future values, and obtain the corresponding precipitation forecast by taking the exponential. Then, we can evaluate the forecast performance. In Fig. 6, we take the original data, its first-order moving average and its second-order moving average, respectively, and set a negative constant for the smoothed zero precipitation, to forecast precipitation by RC. As before, the red curves represent the actual data and the blue curves depict the forecast data in this section. The original data consist of daily precipitation data of Xinzheng International Airport Weather Station, Zhengzhou, Henan Province, China, from January 1, 2013 to October 31, 2021. All precipitation data in this article come from http://www.wheata.cn/. Here we take the first 3100 data points as the training set and the next 100 as the testing set. In Fig. 7, we take the first-order moving average of the data and then set three different negative constants for the smoothed zero precipitation, respectively, to forecast precipitation by RC. Again, we take the first 3100 data points as the training set and the next 100 as the testing set.
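The two preprocessing steps and the inverse exponential step can be sketched as below. The window size of 2 (as one reading of "first-order moving average") and the negative constant of −5.0 are illustrative assumptions; the paper tunes both choices:

```python
import numpy as np

def preprocess(rain, window=2, neg_const=-5.0):
    """Moving average, then log of nonzero values; smoothed zeros get a
    fixed negative constant (window and neg_const are illustrative)."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(rain, kernel, mode="valid")
    # np.maximum avoids evaluating log(0); those entries are then
    # overwritten by neg_const anyway.
    return np.where(smoothed > 0,
                    np.log(np.maximum(smoothed, 1e-12)),
                    neg_const)

def postprocess(series):
    """Invert the log transform; negative-constant entries map back to
    (near) zero precipitation via the exponential."""
    return np.exp(series)

# Toy daily precipitation series (mm), mostly zeros with a few rain events.
rain = np.array([0.0, 0.0, 3.2, 11.5, 0.4, 0.0, 0.0, 27.0])
coded = preprocess(rain)
recovered = postprocess(coded)
```

In the full pipeline, `coded` is what RC is trained on, and `postprocess` is applied to the RC prediction; since the exponential of the negative constant is small but positive, the unrealistic negative forecasts disappear, while the moving average itself is not inverted, matching the limitation discussed in the text.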
Fig. 6
a–c depict forecast performances of the original precipitation data, its first-order moving average and its second-order moving average for the precipitation data from Xinzheng International Airport Weather Station, respectively. Their NRMSEs are 0.94, 0.78 and 0.54, respectively
Fig. 7
a–c depict forecast performances of the precipitation data of Xinzheng International Airport Weather Station, where we take the first-order moving average of the precipitation data and set three different negative constants, respectively, for the zero precipitation. Their NRMSEs are 0.8, 0.78 and 0.83, respectively
In Fig. 6, it is obvious that direct forecasting without a moving average has the worst performance. From a purely numerical point of view, the prediction performance of the second-order moving average is better than that of the first order. In fact, however, the second-order moving average may shift the predicted precipitation several days earlier or later than the actual precipitation. Moreover, the second-order moving average is weaker at forecasting heavy rainfall, because a higher-order moving average reduces the amplitude of the precipitation data; in other words, a higher-order moving average can bring about larger errors. In Fig. 7a, we observe an undesirable phenomenon: small forecast amplitudes appear on rainless days, i.e., rainless weather is misreported as rainy. From Fig. 7c, it is obvious that the forecast amplitude and the number of rainy days greatly exceed the actual precipitation data, i.e., heavy rain is erroneously forecast. To further understand the correlation between predicted and actual precipitation data, we introduce the Pearson Correlation Coefficient (PCC) in Eq. (13),

PCC(X, Y) = cov(X, Y) / (σ_X σ_Y),   (13)

where X and Y are variables of the same dimension, cov(X, Y) represents their covariance and σ_X represents the standard deviation of X.
The PCC reflects the linear correlation between two variables; its value lies between −1 and 1 and determines to what extent their tendencies agree with each other. The greater the absolute value of the PCC, the stronger the linear correlation between the variables. Figure 8 displays the results of precipitation forecasts in some cities of China, i.e., Beijing, Shanghai, Zhengzhou, Shenzhen, Xi’an, Kunming, Hohhot and Hongkong. Table 2 shows the forecast performances. To sum up, we choose the first-order moving average and a suitable negative constant to forecast precipitation based on RC. The results show that the method is significant for long-term precipitation forecasting. Although both the magnitudes and the timing of the predicted precipitation have errors, we can accurately predict some of the times of heavy rainfall with our method. The reason for the former is that we cannot take the inverse of the moving average.
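Eq. (13) can be written directly as a small function; the toy arrays below are illustrative:

```python
import numpy as np

def pcc(x, y):
    """Pearson correlation coefficient, Eq. (13): cov(X, Y) / (sigma_X * sigma_Y)."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

a = np.array([1.0, 2.0, 3.0, 4.0])
assert abs(pcc(a, 2 * a + 1) - 1.0) < 1e-12   # perfect positive linear correlation
assert abs(pcc(a, -a) + 1.0) < 1e-12          # perfect negative linear correlation
```

Because the centered dot products cancel the 1/N factors in the covariance and standard deviations, this is algebraically identical to the textbook definition and to `np.corrcoef(x, y)[0, 1]`.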
Fig. 8
The comparisons of the prediction results with the actual data for Beijing Capital International Airport Weather Station, Shanghai Baoshan Weather Station, Xinzheng International Airport Weather Station, Shenzhen Baoan International Airport Weather Station, Xi’an Yang Tomb Museum Weather Station, Kunming Guandu District Weather Station, Hohhot Baita International Airport Weather Station and Hongkong International Airport Weather Station from January 1, 2013 to October 31, 2021 are presented in a–h respectively
Table 2  Prediction performances: NRMSEs and PCCs of prediction results in some cities of China in Fig. 8

City        NRMSE   PCC
Beijing     0.38    0.67
Shanghai    0.59    0.68
Zhengzhou   0.78    0.35
Shenzhen    0.42    0.62
Xi’an       0.57    0.69
Kunming     0.59    0.44
Hohhot      0.76    0.6
Hongkong    0.62    0.73
Summary and discussions
The contribution of the present paper is to forecast precipitation by a novel machine learning method, RC. In this paper, we first use RC to predict chaotic time series and obtain excellent results. Then, we use RC to forecast daily precipitation data with additional processing. First, we take the first-order moving average of the original precipitation data to make it smoother, like a chaotic time series. Second, we take the logarithm of the smoothed nonzero data and assign a negative constant to the smoothed zero data. Then, we use RC to train on the series and predict its future states. Finally, we take the exponential of the obtained data to get the final forecast result. We are thus able to forecast long-term precipitation using only historical precipitation data. Although the forecast results are not as good as those for chaotic time series, they may still allow us to make early warnings of heavy rainfall, given the consistency of the predicted results with actual precipitation. In the future, we will further improve our method to solve the problem that the inverse of the moving average is unavailable, and apply it to more applications such as river runoff forecasting, meteorological forecasting of temperature and humidity, electricity consumption forecasting, stock price forecasting, epidemic forecasting (e.g., COVID-19) and oil futures price forecasting. More generally, this method of combining RC with additional mathematical processing can forecast non-smooth and noisy time series in science and engineering accurately, in that it smooths the data, filters the noise and avoids unrealistic negative forecast values.