Modeling and computing of stock index forecasting based on neural network and Markov chain.

Yonghui Dai, Dongmei Han, Weihui Dai.

Abstract

The stock index reflects the fluctuation of the stock market, and stock index forecasting has long been an active research topic. However, traditional methods struggle to achieve ideal precision in a dynamic market because of the influence of many factors, such as the economic situation, policy changes, and emergency events. Approaches based on adaptive modeling and conditional probability transfer have therefore attracted renewed attention from researchers. This paper presents a new forecasting method that combines an improved back-propagation (BP) neural network with a Markov chain, together with its modeling and computing technology. The method consists of initial forecasting by the improved BP neural network, division of the Markov state regions, computation of the state transition probability matrix, and adjustment of the prediction. The results of the empirical study show that this method can achieve high accuracy in stock index prediction and could provide a good reference for investment in the stock market.

Year:  2014        PMID: 24782659      PMCID: PMC3982277          DOI: 10.1155/2014/124523

Source DB:  PubMed          Journal:  ScientificWorldJournal        ISSN: 1537-744X


1. Introduction

The stock market combines high risk with high yield. As a barometer of the stock market, the stock index is an important reference for investors when they form investment strategies. However, the stock price index is influenced by many factors, such as the economic situation, policy changes, and emergency events. Despite these complicated challenges, stock index forecasting has attracted the attention of many industry experts and scholars. Lendasse et al. used a nonlinear time series model to forecast the tendency of the Bel 20 stock market index [1]. Lee et al. forecasted the Korean Stock Price Index (KOSPI) with three forecasting models: the back-propagation neural network model (BPNN), the Bayesian Chiao's model (BC), and the seasonal autoregressive integrated moving average model (SARIMA) [2]. Fan and Gao proposed the “Grey Neural Network model (GNNM(1, N))” and argued that the combined model could improve prediction accuracy and reduce computation [3]. Stock prediction remains a hot topic, and many methods have been applied in this field, such as artificial neural networks [4, 5], time series models [6, 7], decision trees [8], Bayesian belief networks [9], evolutionary algorithms [10], fuzzy sets [11], and Markov models [12-14]. However, a single method usually struggles to achieve ideal precision in a dynamic market because of the complicated influencing factors. In recent years, some new hybrid models have shown potential advantages [15-17]. In particular, an approach based on adaptive modeling and conditional probability transfer may suit the characteristics of the problem.
To explore a new solution for improving forecast precision, this paper presents a new method based on the BP neural network and the Markov chain, studies its modeling and computing technology with data from the Chinese Growth Enterprise Market, and then conducts an empirical analysis of the prediction results. The paper is organized in five sections: Section 1 introduces the research background and the most closely related literature; Section 2 describes the methodology and technology, including the combined model based on the BP neural network and the Markov chain; Section 3 discusses the modeling and computing technology of the presented method; Section 4 gives the empirical analysis of the prediction results; and Section 5 presents the conclusion and discussion.

2. Methodology and Technology

2.1. BP Neural Network (BPNN)

BPNN is a multilayer network with one-way (feed-forward) signal propagation that is trained with the BP algorithm. It is based on the gradient descent method, which minimizes the sum of the squared errors between the actual and the desired output values. A three-layer BPNN consists of the input layer, the hidden layer, and the output layer. The BP learning algorithm for three layers can be described as follows [18, 19].

Step 1

Initialize all the values of $w_{ij}(t)$, $w_{jk}(t)$, $\theta_j(t)$, and $\theta_k(t)$ to small random values within the range [−1, 1], where $w_{ij}(t)$ is the connection weight between neuron $i$ in the input layer and neuron $j$ in the hidden layer during the $t$th learning process, $w_{jk}(t)$ is the connection weight between neuron $j$ in the hidden layer and neuron $k$ in the output layer during the $t$th learning process, $\theta_j(t)$ is the threshold value in the hidden layer, and $\theta_k(t)$ is the threshold value in the output layer.

Step 2

Select sample data and then apply the input vector $X(i) = (x_1, x_2, \ldots, x_n)$ and the desired output vector $D(i) = (d_1, d_2, \ldots, d_m)$.

Step 3

Compute the outputs $y'_j$ of the hidden layer and the outputs $y_k$ of the output layer; here, $f(x) = 1/(1 + e^{-x})$ or $f(x) = (1 - e^{-x})/(1 + e^{-x})$ is the adopted activation function.

Step 4

Calculate the error terms $\delta_k$ for the output nodes, where $d_k$ represents the desired output.

Step 5

Calculate the error terms $\delta_j$ for the hidden nodes.

Step 6

Update the weights on the output layer.

Step 7

Update the weights on the hidden layer.

Step 8

Calculate the error; repeat Steps 2–8 until the error falls below a predefined threshold, where $m$ is the number of output nodes.

Although the BP algorithm is successful, it has some disadvantages, such as slow convergence and a tendency to become trapped in local minima. Therefore, an improved BP algorithm was applied in our study. Our improved method combines additional momentum with an adaptive learning rate. In the weight adjustment formula with the momentum factor, $w$ represents the network weight, $k$ is the training iteration, $lr$ is the learning rate, $mc$ is the momentum coefficient with $0 < mc < 1$, and $E$ is the error function. In the adaptive learning rate method, $lr$ is the learning rate, $k$ is the training iteration, and $E$ is the error function computed from the actual output value $y$ and the anticipated output value; usually $\alpha = 1.05$, $\beta = 0.7$, and $\gamma = 1.04$ [20].
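As an illustration of Steps 1–8 and of the two improvements, the following is a minimal Python sketch rather than the implementation used in this study. It assumes the standard textbook forms of the BP update for the logistic activation, with thresholds subtracted from the weighted sums. The momentum rule `momentum_delta` and the learning rate rule `adapt_lr` are common conventions assumed here and only reuse the constants α = 1.05, β = 0.7, and γ = 1.04 quoted above. All function names and the toy sample are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic activation f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def train_step(x, d, w_ih, th_h, w_ho, th_o, lr):
    """One pass of Steps 3-8 for a single sample; updates the weights in place."""
    n_in, n_hid, n_out = len(x), len(th_h), len(th_o)
    # Step 3: hidden outputs y'_j and network outputs y_k.
    hid = [sigmoid(sum(w_ih[i][j] * x[i] for i in range(n_in)) - th_h[j])
           for j in range(n_hid)]
    out = [sigmoid(sum(w_ho[j][k] * hid[j] for j in range(n_hid)) - th_o[k])
           for k in range(n_out)]
    # Step 4: error terms of the output nodes (sigmoid derivative is y(1 - y)).
    delta_o = [(d[k] - out[k]) * out[k] * (1.0 - out[k]) for k in range(n_out)]
    # Step 5: error terms of the hidden nodes, using the pre-update weights.
    delta_h = [hid[j] * (1.0 - hid[j])
               * sum(delta_o[k] * w_ho[j][k] for k in range(n_out))
               for j in range(n_hid)]
    # Step 6: update the output-layer weights and thresholds.
    for k in range(n_out):
        for j in range(n_hid):
            w_ho[j][k] += lr * delta_o[k] * hid[j]
        th_o[k] -= lr * delta_o[k]
    # Step 7: update the hidden-layer weights and thresholds.
    for j in range(n_hid):
        for i in range(n_in):
            w_ih[i][j] += lr * delta_h[j] * x[i]
        th_h[j] -= lr * delta_h[j]
    # Step 8: squared error of this sample, measured before the update.
    return 0.5 * sum((d[k] - out[k]) ** 2 for k in range(n_out))


def momentum_delta(prev_delta, grad, lr, mc):
    """Assumed momentum rule: mc * previous change - (1 - mc) * lr * dE/dw."""
    return mc * prev_delta - (1.0 - mc) * lr * grad


def adapt_lr(lr, err_new, err_old, alpha=1.05, beta=0.7, gamma=1.04):
    """Assumed rule: grow lr when the error falls, shrink it when the error
    rises by more than the factor gamma, and keep it otherwise."""
    if err_new < err_old:
        return alpha * lr
    if err_new > gamma * err_old:
        return beta * lr
    return lr


# Step 1: small random initial values in [-1, 1] (2 inputs, 3 hidden, 1 output).
random.seed(0)
w_ih = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_ho = [[random.uniform(-1, 1)] for _ in range(3)]
th_h = [random.uniform(-1, 1) for _ in range(3)]
th_o = [random.uniform(-1, 1)]

# Step 2 and the loop of Step 8: one hypothetical sample, presented repeatedly.
errors = [train_step([0.5, -0.2], [0.8], w_ih, th_h, w_ho, th_o, lr=0.5)
          for _ in range(200)]
print(errors[0], errors[-1])
```

In a full training loop, `adapt_lr` would be called once per epoch with the new and previous total errors, and `momentum_delta` would replace the plain `lr * delta` increments in Steps 6 and 7.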

2.2. Markov Chain

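A discrete-time Markov chain can be described as a sequence of random variables $\{X(t), t \in T\}$, where $T = \{1, 2, \ldots, N\}$ and the state space is $S = \{1, 2, \ldots, M\}$. Suppose that, for any time $n \ge 0$, any states $i_0, i_1, \ldots, i_{n-1}, i, j \in S$, and any positive integer step $k$, the sequence of variables has the following property:

$$P\{X(n+k) = j \mid X(0) = i_0, X(1) = i_1, \ldots, X(n-1) = i_{n-1}, X(n) = i\} = P\{X(n+k) = j \mid X(n) = i\}. \quad (12)$$

We then call the stochastic variable sequence $\{X(t), t \in T\}$ a Markov chain, where $p_{ij}$ is the transition probability from state $i$ to state $j$. These transition probabilities satisfy $\sum_{j \in S} p_{ij} = 1$, $i \in S$, and the matrix $P = (p_{ij})$ is the transition matrix of the chain. If the transition probabilities in (12) do not depend on the time parameter $n$, the chain is called a “time-homogeneous Markov chain.” Since the state space $S$ is countable, we can label the states by integers, such as $S = \{0, 1, 2, \ldots\}$. Under this labeling, the transition matrix can be described as follows:

$$P = \begin{pmatrix} p_{00} & p_{01} & p_{02} & \cdots \\ p_{10} & p_{11} & p_{12} & \cdots \\ p_{20} & p_{21} & p_{22} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$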
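The two properties used later, that each row of the transition matrix sums to one and that $k$-step transition probabilities of a time-homogeneous chain are given by the $k$th matrix power, can be checked with a short Python sketch. The two-state matrix below is purely hypothetical and is not taken from the data of this study; the helper names are illustrative.

```python
def is_stochastic(P, tol=1e-9):
    """True if every entry is non-negative and every row sums to 1."""
    return all(min(row) >= 0.0 and abs(sum(row) - 1.0) <= tol for row in P)


def mat_mul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def k_step(P, k):
    """k-step transition matrix P^k of a time-homogeneous chain."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(k):
        result = mat_mul(result, P)
    return result


# Hypothetical two-state chain, for illustration only.
P = [[0.9, 0.1],
     [0.5, 0.5]]
P2 = k_step(P, 2)
print(is_stochastic(P), P2)
```

The product of two row-stochastic matrices is again row-stochastic, so `is_stochastic(P2)` also holds.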

2.3. Modeling of Forecast Based on Improved BPNN and Markov Chain

The modeling process can be described as follows.

Step 1

Construct the improved BPNN model.

Step 2

Perform the initial forecasting by using the model of Step 1.

Step 3

Normalize the error of prediction.

Step 4

Set the Markov state zoning by normalized upper and lower thresholds. Divide the Markov state region by using the sample average-mean square deviation method. Five ranges are usually used [21], with boundaries at $\bar{x} - \alpha_1 s$, $\bar{x} - \alpha_2 s$, $\bar{x} + \alpha_3 s$, and $\bar{x} + \alpha_4 s$, where $\bar{x}$ is the average and $s$ is the sample standard deviation; usually $\alpha_1$ and $\alpha_4$ lie in the range [1.0, 1.5] and $\alpha_2$ and $\alpha_3$ lie in the range [0.3, 0.6].

Step 5

Define the initial state and calculate the state transition probability matrix.

Step 6

Markov chain test: use the chi-square test to check the Markov property.

Step 7

Forecast. Get the state vector of step $k$ from formula (13) and forecast based on this model.

3. Modeling and Computing

3.1. Sample Data

In this paper, we select the “Chinese Growth Enterprise Market Index (GEMI, 399006.SZ)” as the data set for the empirical study and use it for short-term prediction of the Chinese GEM index price. The data set covers 58 trading days, from 2013-5-24 to 2013-8-16. It is divided into in-sample and out-of-sample parts: the first 41 days are in-sample training data, and the data from day 42 onward are out-of-sample and used for prediction. Because the closing index price is the most important indicator for investment reference, our study focuses on forecasting the closing index price. The daily trading data, including the opening price, highest price, lowest price, closing price, and trading volume, are used for modeling. The sample data of the Chinese GEM index are shown in Table 1.
Table 1

Sample data.

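| Days | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Opening price | 1044.29 | 1075.83 | 1089.77 | 1055.60 | 1063.00 | 1074.48 | 1070.73 | 1049.10 | 1021.02 | 1027.05 |
| Highest price | 1073.78 | 1091.25 | 1090.16 | 1067.03 | 1078.46 | 1086.33 | 1074.82 | 1049.26 | 1038.03 | 1031.36 |
| Lowest price | 1044.29 | 1075.83 | 1055.64 | 1053.18 | 1059.83 | 1068.07 | 1050.21 | 1011.40 | 1021.02 | 1020.83 |
| Closing price | 1073.78 | 1089.72 | 1059.00 | 1065.96 | 1073.87 | 1073.02 | 1052.14 | 1022.46 | 1032.76 | 1022.13 |
| Trading volume | 15023082 | 16132262 | 15413445 | 11569457 | 12702507 | 14032823 | 12668299 | 12745388 | 9397943 | 9843383 |

| Days | 11 | 12 | 13 | 14 | 15 | … | 55 | 56 | 57 | 58 |
|---|---|---|---|---|---|---|---|---|---|---|
| Opening price | 1022.56 | 1019.47 | 1029.93 | 1074.53 | 1073.31 | … | 1161.49 | 1175.63 | 1177.85 | 1158.96 |
| Highest price | 1039.65 | 1029.28 | 1069.81 | 1091.91 | 1078.53 | … | 1180.61 | 1186.29 | 1184.40 | 1178.84 |
| Lowest price | 1018.45 | 986.05 | 1029.93 | 1068.61 | 1053.33 | … | 1151.68 | 1160.02 | 1165.77 | 1131.44 |
| Closing price | 1030.14 | 1029.28 | 1068.15 | 1073.69 | 1072.93 | … | 1172.33 | 1180.66 | 1167.09 | 1132.09 |
| Trading volume | 10557897 | 9936156 | 11803687 | 13475017 | 11576013 | … | 13353812 | 16202467 | 15192214 | 15946678 |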

Data source: the above data are from Wind information database.

3.2. Modeling

3.2.1. Construct BP Neural Network Model

(i) Definition of Layer Number. According to the Kolmogorov theorem, a three-layer network can approximate any continuous function. Therefore, an input layer, a hidden layer, and an output layer are selected in this model.

(ii) Activation Function and Training Target. The activation function of the hidden layer neurons is tansig, and the transfer function of the output layer neurons is purelin. The training function is traingdx. The training target is a mean square error of E = 0.005, and the maximum number of training iterations is 10000. In this model, the initial learning rate is 0.1 and the initial momentum factor is 0.9.

(iii) Number of Neural Node. The input layer has five nodes, namely, the opening price, highest price, lowest price, closing price, and trading volume; the data of “day 1” are the first input to the input layer. The output layer has one node; the “closing price” of “day 2” is the first output of the output layer. The number of hidden layer nodes is determined by experience and repeated training: the number that yields the minimum network error in training is chosen. The network errors corresponding to different numbers of neurons are shown in Table 2. The network has the minimum error of 0.2689 when the number of neurons is eleven; therefore, we select eleven as the number of hidden layer nodes. The data in Table 2 indicate that the network error is not reduced further if we continue to increase the number of hidden layer nodes.
Table 2

Error of repeated training.

| Number of neurons | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|
| Network error | 1.8710 | 0.9392 | 0.2689 | 0.5924 | 0.3357 | 0.7099 |

3.2.2. Training

Training of the network is completed in MATLAB. First, the training sample data are selected and then normalized, that is, limited to a certain interval. Here, the premnmx(·) function is called to limit the training data to [−1, 1]. After normalization, the network is trained with a training set of 41 samples; the learning rate is 0.1 and the momentum is 0.9. Training continued until the mean squared error (MSE) was less than 0.005, which yielded the final model. The dependence of MSE on epochs is shown in Figure 1.
Figure 1

The dependence of MSE on epochs.

It can be seen from Figure 1 that the network reaches the target MSE of less than 0.005 after 8078 training steps.
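The normalization to [−1, 1] used before training is a linear min-max rescaling. The following plain-Python sketch is an analogue of that step, not the MATLAB routine itself; the function name `scale_to_range` is hypothetical, and the input is the closing price of days 1 to 5 from Table 1.

```python
def scale_to_range(values, lo=-1.0, hi=1.0):
    """Linearly map values so that min(values) -> lo and max(values) -> hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        raise ValueError("cannot rescale a constant series")
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]


# Closing prices of days 1-5 (Table 1).
closing = [1073.78, 1089.72, 1059.00, 1065.96, 1073.87]
scaled = scale_to_range(closing)
print([round(v, 4) for v in scaled])
```

The same minimum and maximum must be kept and reused to rescale new inputs and to map network outputs back to index prices.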

3.2.3. Forecast

(i) Initial Forecasting Based on Improved BPNN. Using the trained network and the sample data, we applied a rolling forecasting method to predict the closing index price. Part of the MATLAB code is shown in Algorithm 1.
Algorithm 1

Part of the code of MATLAB.

The simulated daily closing price of the Chinese GEM index is shown in Figure 2. Both the actual and the predicted values are shown for trading days 42 to 56.
Figure 2

Simulation of actual value and predicted value.
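A rolling (one-step-ahead) forecast of the kind described above can be sketched in Python as follows. This is only an illustration of the data flow, in which the observed data of day t − 1 produce the forecast for day t. The predictor `persistence` is a hypothetical stand-in for the trained network, the function names are assumptions, and the rows are days 1 to 5 of Table 1 with the trading volume omitted for brevity.

```python
def rolling_forecast(days, predict_next_close, first_test_day):
    """Forecast the close of each day t from the observed data of day t - 1.

    `days` is a list whose element 0 holds the data of day 1.
    """
    forecasts = {}
    for t in range(first_test_day, len(days) + 1):
        previous_day = days[t - 2]          # data of day t - 1
        forecasts[t] = predict_next_close(previous_day)
    return forecasts


def persistence(day):
    """Placeholder predictor: tomorrow's close equals today's close."""
    _open, _high, _low, close = day
    return close


# (opening, highest, lowest, closing) for days 1-5, from Table 1.
days = [
    (1044.29, 1073.78, 1044.29, 1073.78),
    (1075.83, 1091.25, 1075.83, 1089.72),
    (1089.77, 1090.16, 1055.64, 1059.00),
    (1055.60, 1067.03, 1053.18, 1065.96),
    (1063.00, 1078.46, 1059.83, 1073.87),
]
forecasts = rolling_forecast(days, persistence, first_test_day=4)
print(forecasts)
```

Replacing `persistence` with a call to a trained model leaves the rolling loop unchanged.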

(ii) Computing of Normalization. Calculate the absolute residual rate for the prediction days. The calculation formula is as follows:

$$y_i = \frac{x_i - \hat{x}_i}{x_i} \times 100\%,$$

where $x_i$ is the actual value of the closing index price, $\hat{x}_i$ is the predicted value of the closing index price, and $y_i$ is the absolute residual rate on day $i$. The data set of absolute residual rates is then normalized in MATLAB by calling the mapminmax(·) function. The absolute residual rates and the normalized results are shown in Table 3.
Table 3

Normalization of absolute residual rate.

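| Days | Actual value | Prediction value | Error of absolute residual rate | Normalization value |
|---|---|---|---|---|
| 1 | 1148.81 | 1089.72 | 5.27% | 0.6797 |
| 2 | 1133.16 | 1207.61 | −4.25% | 0.2457 |
| 3 | 1136.40 | 1198.35 | −5.40% | 0.1978 |
| 4 | 1095.65 | 1143.73 | −1.75% | 0.3648 |
| 5 | 1117.34 | 1043.10 | 8.02% | 0.8214 |
| 6 | 1135.45 | 1217.55 | −2.31% | 0.3326 |
| 7 | 1187.31 | 1046.42 | 11.98% | 1.0000 |
| 8 | 1182.54 | 1291.15 | −7.34% | 0.0985 |
| 9 | 1200.69 | 1242.76 | −2.56% | 0.3251 |
| 10 | 1173.26 | 1280.88 | −8.95% | 0.0397 |
| 11 | 1165.32 | 1223.61 | −3.17% | 0.2956 |
| 12 | 1153.45 | 1273.66 | −8.63% | 0.0478 |
| 13 | 1146.02 | 1254.19 | −7.95% | 0.0789 |
| 14 | 1151.68 | 1204.72 | −2.76% | 0.3153 |
| 15 | 1160.02 | 1266.82 | −7.30% | 0.1037 |
| 16 | 1165.77 | 1279.64 | −9.64% | 0.0000 |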
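The residual rate is a signed relative error, (actual − predicted) / actual. As a small Python sketch, with the hypothetical function name `residual_rate`, the rates of days 1 to 3 can be recomputed from the actual and predicted closing values listed for those days in Table 6, which appear to be the values that the reported rates correspond to.

```python
def residual_rate(actual, predicted):
    """Signed relative error (actual - predicted) / actual, as a fraction."""
    return (actual - predicted) / actual


# (actual, improved-BPNN forecast) for days 1-3, taken from Table 6.
pairs = [(1150.39, 1089.72), (1158.35, 1207.61), (1136.98, 1198.35)]
rates_percent = [100.0 * residual_rate(a, p) for a, p in pairs]
print([round(r, 2) for r in rates_percent])
```

A positive rate means that the network underestimated the closing price, and a negative rate means that it overestimated it.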

3.2.4. Empirical Markov Model

(i) State Definition. Based on the normalization values in Table 3, the sample average-mean square deviation method was used for state classification. Usually five intervals are divided, with boundaries at $\bar{x} - \alpha_1 S$, $\bar{x} - \alpha_2 S$, $\bar{x} + \alpha_3 S$, and $\bar{x} + \alpha_4 S$, where $\bar{x}$ is the average, $S$ is the sample standard deviation, $\alpha_1$ and $\alpha_4$ belong to the range [1.0, 1.5], and $\alpha_2$ and $\alpha_3$ belong to the range [0.3, 0.6]. Because the data set is small, the Markov states were divided into four ranges instead; the resulting Markov state ranges are (1) [0, 0.1926], (2) (0.1926, 0.4840], (3) (0.4840, 0.7462], and (4) (0.7462, 1]. The Markov state transitions were then counted, as shown in Table 4.
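The state boundaries can be recomputed from the normalization values of Table 3 with a few lines of Python. The coefficients 0.4, 0.6, and 1.5 used below are not stated in the text; they are an inference, chosen because they lie in the quoted ranges and reproduce the three reported boundaries when combined with the sample mean and the sample standard deviation. The function name `state_of` is hypothetical.

```python
import statistics

# Normalization values of days 1-16 (Table 3).
normalized = [0.6797, 0.2457, 0.1978, 0.3648, 0.8214, 0.3326, 1.0000, 0.0985,
              0.3251, 0.0397, 0.2956, 0.0478, 0.0789, 0.3153, 0.1037, 0.0000]

mean = statistics.mean(normalized)
s = statistics.stdev(normalized)            # sample standard deviation (n - 1)

# Inferred coefficients (assumption, see the lead-in).
a2, a3, a4 = 0.4, 0.6, 1.5
bounds = [mean - a2 * s, mean + a3 * s, mean + a4 * s]


def state_of(value, bounds):
    """Return the 1-based state whose right-closed interval contains value."""
    for index, upper in enumerate(bounds, start=1):
        if value <= upper:
            return index
    return len(bounds) + 1


states = [state_of(v, bounds) for v in normalized]
print([round(b, 4) for b in bounds])
print(states)
```

The resulting state sequence is the input for counting the transitions between consecutive days.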
Table 4

Markov state transition.

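| State | (1) | (2) | (3) | (4) | Total |
|---|---|---|---|---|---|
| (1) | 2 | 3 | 0 | 0 | 5 |
| (2) | 3 | 2 | 0 | 2 | 7 |
| (3) | 0 | 1 | 0 | 0 | 1 |
| (4) | 1 | 1 | 0 | 0 | 2 |
| Total | 6 | 7 | 0 | 2 | 15 |

(ii) Computing of State Transition Probability Matrix. It can be seen from Table 4 that there are 2 transitions from state (1) to state (1), 3 from state (1) to state (2), and none from state (1) to state (3) or state (4); the state transition probabilities can then be calculated as follows:

$$p_{11} = \frac{2}{5} = 0.4, \quad p_{12} = \frac{3}{5} = 0.6, \quad p_{13} = \frac{0}{5} = 0, \quad p_{14} = \frac{0}{5} = 0.$$

Similarly, the other rows are obtained by dividing each count in Table 4 by its row total. Thus, the state transition probability matrix $P$ can be described as follows:

$$P = \begin{pmatrix} 0.4000 & 0.6000 & 0 & 0 \\ 0.4286 & 0.2857 & 0 & 0.2857 \\ 0 & 1.0000 & 0 & 0 \\ 0.5000 & 0.5000 & 0 & 0 \end{pmatrix}$$

A chi-square test indicates that the chain described by $P$ has the Markov property.

(iii) The Step State Vector of Prediction. According to the state transition probability matrix $P$ and the Markov forecast model, the step state vector of prediction can be calculated as follows:

$$\pi(k+1) = \pi(k) P, \quad k = 1, 2, \ldots, 15,$$

starting from the Step 1 vector $\pi(1) = (0, 1, 0, 0)$. The resulting step state vectors of prediction are shown in Table 5.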
Table 5

Probability of step state vector.

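| State | Step 1 | Step 2 | Step 3 | Step 4 |
|---|---|---|---|---|
| (1) | 0.0000 | 0.4286 | 0.4367 | 0.4219 |
| (2) | 1.0000 | 0.2857 | 0.4816 | 0.4405 |
| (3) | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| (4) | 0.0000 | 0.2857 | 0.0816 | 0.1376 |

| State | Step 5 | Step 6 | Step 7 | Step 8 |
|---|---|---|---|---|
| (1) | 0.4264 | 0.4254 | 0.4256 | 0.4255 |
| (2) | 0.4478 | 0.4467 | 0.4468 | 0.4468 |
| (3) | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| (4) | 0.1258 | 0.1279 | 0.1276 | 0.1277 |

| State | Step 9 | Step 10 | Step 11 | Step 12 |
|---|---|---|---|---|
| (1) | 0.4255 | 0.4255 | 0.4255 | 0.4255 |
| (2) | 0.4468 | 0.4468 | 0.4468 | 0.4468 |
| (3) | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| (4) | 0.1277 | 0.1277 | 0.1277 | 0.1277 |

| State | Step 13 | Step 14 | Step 15 | Step 16 |
|---|---|---|---|---|
| (1) | 0.4255 | 0.4255 | 0.4255 | 0.4255 |
| (2) | 0.4468 | 0.4468 | 0.4468 | 0.4468 |
| (3) | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| (4) | 0.1277 | 0.1277 | 0.1277 | 0.1277 |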
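The transition counts, the transition probability matrix, and the step state vectors can be recomputed with the following Python sketch. The state sequence is the classification of the sixteen normalization values of Table 3 into the four state ranges. Taking the initial vector as the one-hot vector of the day 1 state is an assumption; it is consistent with the Step 1 column of Table 5. The variable names are illustrative.

```python
# States of days 1-16, obtained by classifying Table 3 into the four ranges.
states = [3, 2, 2, 2, 4, 2, 4, 1, 2, 1, 2, 1, 1, 2, 1, 1]
M = 4

# Count the transitions between consecutive days.
counts = [[0] * M for _ in range(M)]
for current, following in zip(states, states[1:]):
    counts[current - 1][following - 1] += 1

# Row-normalize the counts to obtain the transition probability matrix.
P = [[c / sum(row) for c in row] for row in counts]

# Assumed initial vector: all probability mass on the state of day 1.
vector = [1.0 if i == states[0] - 1 else 0.0 for i in range(M)]

# Step vectors 1..16: repeated vector-matrix multiplication.
steps = []
for _ in range(16):
    vector = [sum(vector[i] * P[i][j] for i in range(M)) for j in range(M)]
    steps.append(vector)

for number in (1, 3, 16):
    print(number, [round(p, 4) for p in steps[number - 1]])
```

The vectors settle at approximately (0.4255, 0.4468, 0, 0.1277) after a few steps, so the later steps carry no additional information about the state of a particular day.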

4. Empirical Analysis

According to the step state vectors of the Markov model, the prediction results from 2013-7-25 to 2013-8-15 are shown in Table 6. In the table, $V_{col6}$ is the adjustment value in column (6), $P_{\max 5}$ is the maximum probability for a given day in column (5), and the adjustment is based on the average of the corresponding interval in column (4).
Table 6

Prediction result.

| (1) Days | (2) Actual value | (3) Value of improved BPNN forecast | (4) Markov prediction interval | (5) Probability | (6) Adjustment value | (7) Error of absolute residual rate (improved BPNN) | (8) Error of absolute residual rate (adjustment) |
|---|---|---|---|---|---|---|---|
| 1 | 1150.39 | 1089.72 | [985.6, 1030.6]; [1030.6, 1098.8]; [1098.8, 1160.1]; [1160.1, 1219.5] | 0.0000; 1.0000; 0.0000; 0.0000 | 1064.72 | 5.27% | 7.45% |
| 2 | 1158.35 | 1207.61 | [1092.2, 1142.1]; [1142.1, 1217.7]; [1217.7, 1285.6]; [1285.6, 1351.4] | 0.4286; 0.2857; 0.0000; 0.2857 | 1117.18 | −4.25% | 3.55% |
| 3 | 1136.98 | 1198.35 | [1083.8, 1133.3]; [1133.3, 1208.3]; [1208.3, 1275.7]; [1275.7, 1341.0] | 0.4367; 0.4816; 0.0000; 0.0816 | 1170.85 | −5.40% | −2.98% |
| 4 | 1124.10 | 1143.73 | [1034.4, 1081.7]; [1081.7, 1153.2]; [1153.2, 1217.6]; [1217.6, 1279.9] | 0.4219; 0.4405; 0.0000; 0.1376 | 1117.48 | −1.75% | 0.59% |
| 5 | 1134.02 | 1043.10 | [943.4, 986.5]; [986.5, 1051.8]; [1051.8, 1110.5]; [1110.5, 1167.3] | 0.4263; 0.4478; 0.0000; 0.1258 | 1019.17 | 8.02% | 10.13% |
| 6 | 1190.11 | 1217.55 | [1101.2, 1151.5]; [1151.5, 1227.7]; [1227.7, 1296.2]; [1296.2, 1362.5] | 0.4254; 0.4467; 0.0000; 0.1279 | 1189.61 | −2.31% | 0.04% |
| 7 | 1188.85 | 1046.42 | [946.4, 989.7]; [989.7, 1055.1]; [1055.1, 1114.0]; [1114.0, 1171.0] | 0.4256; 0.4468; 0.0000; 0.1276 | 1084.58 | 11.98% | 8.77% |
| 8 | 1202.83 | 1291.15 | [1167.8, 1221.1]; [1221.1, 1301.9]; [1301.9, 1374.6]; [1374.6, 1444.9] | 0.4255; 0.4468; 0.0000; 0.1277 | 1261.51 | −7.34% | −4.88% |
| 9 | 1211.78 | 1242.76 | [1124.0, 1175.4]; [1175.4, 1253.1]; [1253.1, 1323.1]; [1323.1, 1390.7] | 0.4255; 0.4468; 0.0000; 0.1277 | 1214.24 | −2.56% | −0.20% |
| 10 | 1175.70 | 1280.88 | [1158.5, 1211.4]; [1211.4, 1291.5]; [1291.5, 1363.6]; [1363.6, 1433.4] | 0.4255; 0.4468; 0.0000; 0.1277 | 1251.48 | −8.95% | −6.45% |
| 11 | 1185.96 | 1223.61 | [1106.7, 1157.2]; [1157.2, 1233.8]; [1233.8, 1302.7]; [1302.7, 1369.3] | 0.4255; 0.4468; 0.0000; 0.1277 | 1195.53 | −3.17% | −0.81% |
| 12 | 1172.52 | 1273.66 | [1151.9, 1204.6]; [1204.6, 1284.2]; [1284.2, 1355.9]; [1355.9, 1425.3] | 0.4255; 0.4468; 0.0000; 0.1277 | 1244.43 | −8.63% | −6.13% |
| 13 | 1161.87 | 1254.19 | [1134.3, 1186.1]; [1186.1, 1264.6]; [1264.6, 1335.2]; [1335.2, 1403.5] | 0.4255; 0.4468; 0.0000; 0.1277 | 1225.40 | −7.95% | −5.47% |
| 14 | 1172.33 | 1204.72 | [1089.6, 1139.4]; [1139.4, 1214.7]; [1214.7, 1282.5]; [1282.5, 1348.1] | 0.4255; 0.4468; 0.0000; 0.1277 | 1177.07 | −2.76% | −0.40% |
| 15 | 1180.66 | 1266.82 | [1145.7, 1198.1]; [1198.1, 1277.3]; [1277.3, 1348.6]; [1348.6, 1417.6] | 0.4255; 0.4468; 0.0000; 0.1277 | 1237.75 | −7.30% | −4.84% |
| 16 | 1167.09 | 1279.64 | [1157.3, 1210.2]; [1210.2, 1290.2]; [1290.2, 1362.3]; [1362.3, 1432.0] | 0.4255; 0.4468; 0.0000; 0.1277 | 1250.28 | −9.64% | −7.13% |

It can be seen from the “error of absolute residual rate” columns in Table 6 that, over the sixteen trading days, most of the predictions of this model are better than those of the single improved neural network, the exceptions being the first and the fifth day.
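One reading of the adjustment step, inferred from the column descriptions rather than taken from a stated formula, is that the adjusted value is the midpoint of the interval with the highest probability. The Python sketch below applies this reading to the day 1 row of Table 6, for which it agrees with the reported adjustment value up to rounding of the interval bounds; it is not guaranteed to reproduce every row. The function name `adjust` is hypothetical.

```python
def adjust(intervals, probabilities):
    """Midpoint of the interval that carries the highest probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    lower, upper = intervals[best]
    return (lower + upper) / 2.0


# Day 1 of Table 6: intervals of column (4) and probabilities of column (5).
intervals = [(985.6, 1030.6), (1030.6, 1098.8), (1098.8, 1160.1), (1160.1, 1219.5)]
probabilities = [0.0, 1.0, 0.0, 0.0]

adjusted = adjust(intervals, probabilities)
actual = 1150.39
rate_percent = 100.0 * (actual - adjusted) / actual
print(round(adjusted, 2), round(rate_percent, 2))
```

For this row the adjustment moves the forecast further from the actual value, which matches the observation that the first day is one of the two days on which the combined model does worse.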

5. Conclusion and Discussion

Because of the complicated influencing factors in a dynamic stock market, a comprehensive method with hybrid models offers more advantages than a single method in stock index forecasting. This paper presented a new method based on the combination of an improved back-propagation (BP) neural network and a Markov chain, which takes advantage of both the neural network and the Markov model and obtains better results than the single improved BPNN method. This method could provide a good reference for investment in the stock market. The stock market is an open complex adaptive system constantly affected by all kinds of emergency events and by people's psychological and behavioral effects, and many scholars, including famous financial experts, have pointed out that its changes cannot be predicted. Nevertheless, we need to move beyond traditional ideas that rely only on financial theory models and explore new combined methods, such as the TDF (Theory-Data-Feedback) modeling and analyzing framework [22] and the spread model of emotions and behaviors caused by emergency events [23]. We believe that changes in the stock market have their own characteristics and inherent rules and that forecasting is possible, at least in the short term.