Modeling and computing of stock index forecasting based on neural network and Markov chain.

Yonghui Dai¹, Dongmei Han², Weihui Dai³.

Abstract

The stock index reflects the fluctuation of the stock market, and its forecasting has long been the subject of extensive research. However, traditional methods struggle to achieve ideal precision in a dynamic market because of the influence of many factors, such as the economic situation, policy changes, and emergency events. Approaches based on adaptive modeling and conditional probability transfer have therefore attracted new attention from researchers. This paper presents a new forecast method that combines an improved back-propagation (BP) neural network with a Markov chain, together with its modeling and computing technology. The method includes initial forecasting by the improved BP neural network, division of the Markov state regions, computation of the state transition probability matrix, and prediction adjustment. Results of the empirical study show that this method can achieve high accuracy in stock index prediction and could provide a good reference for investment in the stock market.

Year: 2014 PMID: 24782659 PMCID: PMC3982277 DOI: 10.1155/2014/124523

Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X

1. Introduction

The stock market combines high risk with high yield. As a barometer of the stock market, the stock index is an important reference for investors when they form investment strategies. However, the stock price index is influenced by many factors, such as the economic situation, policy changes, and emergency events. Despite these challenges, the forecast of the stock index has attracted the attention of many industrial experts and scholars. Lendasse et al. used a nonlinear time series model to forecast the tendency of the Bel 20 stock market index [1]. Lee et al. forecasted the Korean Stock Price Index (KOSPI) with three forecasting models: the back-propagation neural network model (BPNN), the Bayesian Chiao's model (BC), and the seasonal autoregressive integrated moving average model (SARIMA) [2]. Fan and Gao proposed the “Grey Neural Network model (GNNM(1, N))” and argued that the combined model could improve the prediction accuracy and reduce the computation [3]. Stock prediction remains a hot topic, and many methods have been applied to it, such as artificial neural networks [4, 5], time series models [6, 7], decision trees [8], Bayesian belief networks [9], evolutionary algorithms [10], fuzzy sets [11], and Markov models [12-14]. However, a single method usually cannot achieve ideal precision in a dynamic market because of the complicated influencing factors. In recent years, some new hybrid models have shown potential advantages [15-17]. In particular, approaches based on adaptive modeling and conditional probability transfer may suit the characteristics of the problem.
In order to explore a new solution for improving the forecast precision, this paper presents a new method based on the BP neural network and the Markov chain, studies its modeling and computing technology with data from the Chinese Growth Enterprise Market, and then conducts an empirical analysis of the prediction results. The paper is arranged in five sections: Section 1 introduces the research background and the most closely related literature; Section 2 expounds the methodology and technology as well as the combined model based on the BP neural network and the Markov chain; Section 3 discusses the modeling and computing technology of the presented method; Section 4 gives the empirical analysis of the prediction results; and Section 5 presents the conclusion and discussion.

2. Methodology and Technology

2.1. BP Neural Network (BPNN)

A BPNN is a multilayer feedforward (one-way propagation) network trained with the BP algorithm. Training is based on the gradient descent method, which minimizes the sum of the squared errors between the actual and the desired output values. A three-layer BPNN consists of the input layer, the hidden layer, and the output layer. The three-layer BP learning algorithm can be described as follows [18, 19].

Step 1

Initialize all the values of $w_{ij}(t)$, $w_{jk}(t)$, $\theta_j(t)$, and $\theta_k(t)$ to small random values within the range [−1, 1], where $w_{ij}(t)$ is the connection weight between neuron i in the input layer and neuron j in the hidden layer during the tth learning process, $w_{jk}(t)$ is the connection weight between neuron j in the hidden layer and neuron k in the output layer during the tth learning process, $\theta_j(t)$ is the threshold value in the hidden layer, and $\theta_k(t)$ is the threshold value in the output layer.

Step 2

Select sample data and then apply the input vector $X(i) = (x_1, x_2, \ldots, x_n)$ and the desired output vector $D(i) = (d_1, d_2, \ldots, d_m)$.

Step 3

Compute the outputs $y'_j$ of the hidden layer and the outputs $y_k$ of the output layer; here, $f(x) = 1/(1 + e^{-x})$ or $f(x) = (1 - e^{-x})/(1 + e^{-x})$ is adopted as the activation function.

Step 4

Calculate the error terms $\delta_k$ for the output nodes, where $d_k$ represents the desired output.

Step 5

Calculate the error terms $\delta_j$ for the hidden nodes.

Step 6

Update the weights on the output layer.

Step 7

Update the weights on the hidden layer.

Step 8

Calculate the error, where m is the number of output nodes; repeat Steps 2–8 until the error falls below a predefined threshold.

Although the BP algorithm is successful, it has some disadvantages, such as slow convergence and a tendency to get trapped in local minima. Therefore, an improved BP algorithm was applied in our study. Our improved method combines additional momentum with an adaptive learning rate. In the weight adjustment formula with the momentum factor, w represents the network weight, k is the training iteration, lr is the learning rate, mc is the momentum coefficient (0 < mc < 1), and E is the error function. In the adaptive learning rate method, lr is the learning rate, k is the training iteration, and E is the error function computed from the actual output value y and the anticipated output value; usually α = 1.05, β = 0.7, and γ = 1.04 [20].
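The two ingredients of the improved BP algorithm can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the momentum update uses one common form (the previous step scaled by mc plus a gradient step scaled by 1 − mc), and the learning-rate rule assumes that α is the growth factor applied when the error decreases, β is the shrink factor applied when the error grows by more than the ratio γ, and the rate is otherwise unchanged. The toy quadratic error is only there to exercise the two functions.

```python
import numpy as np

def momentum_step(w, grad, prev_dw, lr, mc=0.9):
    """One weight update with additional momentum (assumed form).

    dw = mc * prev_dw - (1 - mc) * lr * grad
    Returns the new weights and the step taken.
    """
    dw = mc * prev_dw - (1.0 - mc) * lr * grad
    return w + dw, dw

def adapt_learning_rate(lr, err_new, err_old, alpha=1.05, beta=0.7, gamma=1.04):
    """Adaptive learning rate (assumed rule).

    Grow lr by alpha when the error decreased, shrink it by beta when the
    error grew by more than the factor gamma, otherwise keep it unchanged.
    """
    if err_new < err_old:
        return alpha * lr
    if err_new > gamma * err_old:
        return beta * lr
    return lr

# Toy usage on E(w) = 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([1.0, -2.0])
dw = np.zeros_like(w)
lr = 0.1
err_old = 0.5 * np.sum(w ** 2)
for _ in range(50):
    w, dw = momentum_step(w, grad=w, prev_dw=dw, lr=lr)
    err_new = 0.5 * np.sum(w ** 2)
    lr = adapt_learning_rate(lr, err_new, err_old)
    err_old = err_new
```

A full implementation would typically also discard a step that increases the error too much; the sketch leaves that out to keep the two rules visible.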

2.2. Markov Chain

A discrete-time Markov chain can be described as a sequence of random variables {X(t), t ∈ T}, where T = {1, 2, …, N} and the state space is S = {1, 2, …, M}. Suppose that, for any time n ≥ 0, any states $i_0, i_1, \ldots, i_{n-1}, i, j \in S$, and any positive integer step k, the sequence has the following attribute:

$$P\{X(n+k) = j \mid X(0) = i_0, X(1) = i_1, \ldots, X(n-1) = i_{n-1}, X(n) = i\} = P\{X(n+k) = j \mid X(n) = i\}. \qquad (12)$$

We then call the stochastic variable sequence {X(t), t ∈ T} a Markov chain, where $p_{ij}$ is the transition probability from state i to state j. These transition probabilities satisfy $\sum_{j \in S} p_{ij} = 1$, i ∈ S, and the matrix $P = (p_{ij})$ is the transition matrix of the chain. If the transition probabilities in (12) do not depend on the time parameter n, the chain is called a “time-homogeneous Markov chain.” Since the state space S is countable, we can label the states by integers, such as S = {0, 1, 2, …}. Under this labeling, the transition matrix can be described as follows:

$$P = \begin{pmatrix} p_{00} & p_{01} & p_{02} & \cdots \\ p_{10} & p_{11} & p_{12} & \cdots \\ p_{20} & p_{21} & p_{22} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}. \qquad (13)$$
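For a time-homogeneous chain, the k-step behaviour follows from the one-step transition matrix alone. As a brief reminder (a standard result, written here with the assumed notation $\pi(k)$ for the row vector of state probabilities after k steps, which does not appear in the original text):

```latex
% k-step transition probabilities are the entries of the k-th matrix power
p_{ij}^{(k)} = \left(P^{k}\right)_{ij},
\qquad
% so the state distribution propagates by repeated right-multiplication
\pi(k) = \pi(k-1)\,P = \pi(0)\,P^{k}.
```

This is the relation that makes it possible to obtain a state vector for each forecast step from a single estimated transition matrix.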

2.3. Modeling of Forecast Based on Improved BPNN and Markov Chain

The modeling process can be described as follows.

Step 1

Construct the improved BPNN model.

Step 2

Perform the initial forecasting by using the model of Step 1.

Step 3

Normalize the error of prediction.

Step 4

Set the Markov state zoning by normalized upper and lower thresholds. Divide the Markov state region by using the sample average-mean square deviation method: five ranges are delimited by the thresholds $\bar{x} - \alpha_1 s$, $\bar{x} - \alpha_2 s$, $\bar{x} + \alpha_3 s$, and $\bar{x} + \alpha_4 s$ [21], where $\bar{x}$ is the sample average and s is the sample standard deviation; usually $\alpha_1$ and $\alpha_4$ lie in the range [1.0, 1.5] and $\alpha_2$ and $\alpha_3$ lie in the range [0.3, 0.6].

Step 5

Define the initial state and calculate the state transition probability matrix.

Step 6

Markov chain test: use the chi-square statistics test for the Markov property.

Step 7

Forecast. Obtain the k-step state vector from formula (13) and forecast based on this model.

3. Modeling and Computing

3.1. Sample Data

In this paper, we select the Chinese Growth Enterprise Market Index (GEMI, 399006.SZ) as the data set for the empirical study and use it for short-term prediction of the Chinese GEM index price. The data set covers 58 trading days, from 2013-5-24 to 2013-8-16. It is divided into in-sample and out-of-sample data: the first 41 days are in-sample and serve as training data, and the data from day 42 onward are out-of-sample and are used for prediction. Because the closing index price is the most important indicator for investment reference, our study focuses on forecasting the closing index price. The daily trading data, including opening price, highest price, lowest price, closing price, and trading volume, are used for modeling. The sample data of the Chinese GEM index are shown in Table 1.

Table 1

Sample data.

Days	1	2	3	4	5	6	7	8	9	10
Opening price	1044.29	1075.83	1089.77	1055.60	1063.00	1074.48	1070.73	1049.10	1021.02	1027.05
Highest price	1073.78	1091.25	1090.16	1067.03	1078.46	1086.33	1074.82	1049.26	1038.03	1031.36
Lowest price	1044.29	1075.83	1055.64	1053.18	1059.83	1068.07	1050.21	1011.40	1021.02	1020.83
Closing price	1073.78	1089.72	1059.00	1065.96	1073.87	1073.02	1052.14	1022.46	1032.76	1022.13
Trading volume	15023082	16132262	15413445	11569457	12702507	14032823	12668299	12745388	9397943	9843383

Days	11	12	13	14	15	⋯	55	56	57	58
Opening price	1022.56	1019.47	1029.93	1074.53	1073.31	⋯	1161.49	1175.63	1177.85	1158.96
Highest price	1039.65	1029.28	1069.81	1091.91	1078.53	⋯	1180.61	1186.29	1184.40	1178.84
Lowest price	1018.45	986.05	1029.93	1068.61	1053.33	⋯	1151.68	1160.02	1165.77	1131.44
Closing price	1030.14	1029.28	1068.15	1073.69	1072.93	⋯	1172.33	1180.66	1167.09	1132.09
Trading volume	10557897	9936156	11803687	13475017	11576013	⋯	13353812	16202467	15192214	15946678

Data source: the above data are from Wind information database.

3.2. Modeling

3.2.1. Construct BP Neural Network Model

(i) Definition of Layer Number. According to the Kolmogorov theorem, a three-layer network can approximate any continuous function. Therefore, an input layer, a hidden layer, and an output layer are selected in this model.

(ii) Activation Function and Training Target. The activation function of the hidden layer neurons is tansig, and the transfer function of the output layer neurons is purelin. The training function is traingdx. Training ends when the mean square error reaches E = 0.005 or after a maximum of 10000 iterations. In this model, the initial learning rate is 0.1 and the initial momentum factor is 0.9.

(iii) Number of Neural Nodes. The input layer has five nodes, namely, opening price, highest price, lowest price, closing price, and trading volume. The data of “day 1” are regarded as the first input data in the input layer. The output layer has one node; the “closing price” of “day 2” is regarded as the first output data in the output layer. The number of hidden layer nodes is determined by experience and repeated training: the number that gives the minimum network error in training is chosen. The network errors that correspond to different numbers of neurons are shown in Table 2. It can be seen that the neural network has the minimum network error of 0.2689 when the neuron number is eleven. Therefore, we select eleven as the number of hidden layer nodes. The data in Table 2 indicate that the network error is not reduced further even if we continue to increase the number of hidden layer nodes.
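The pairing of inputs and outputs described in (iii) can be sketched as follows. This is only an illustration in Python (the paper's own experiments use MATLAB); the array holds the first five days of Table 1, and the helper name `make_pairs` is hypothetical.

```python
import numpy as np

# Days 1-5 of Table 1; columns: opening, highest, lowest, closing, volume.
table1 = np.array([
    [1044.29, 1073.78, 1044.29, 1073.78, 15023082],
    [1075.83, 1091.25, 1075.83, 1089.72, 16132262],
    [1089.77, 1090.16, 1055.64, 1059.00, 15413445],
    [1055.60, 1067.03, 1053.18, 1065.96, 11569457],
    [1063.00, 1078.46, 1059.83, 1073.87, 12702507],
])

def make_pairs(data, close_col=3):
    """Pair the five features of day t with the closing price of day t+1."""
    inputs = data[:-1]              # days 1 .. n-1, all five features
    targets = data[1:, close_col]   # closing price of days 2 .. n
    return inputs, targets

inputs, targets = make_pairs(table1)
# inputs[0] holds the day-1 features; targets[0] is the day-2 closing price.
```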

Table 2

Error of repeated training.

Number of neuron	9	10	11	12	13	14
Network error	1.8710	0.9392	0.2689	0.5924	0.3357	0.7099

3.2.2. Training

Training of the network is completed in MATLAB. First, the training sample data are selected and normalized. Normalization maps the data into a fixed interval; here, the premnmx(·) function is called to map the training data into [−1, 1]. After normalization, the network is trained with the training set of 41 samples, with a learning rate of 0.1 and a momentum of 0.9, until the Mean Squared Error (MSE) is less than 0.005. The trained network is then used as the forecasting model. The dependence of MSE on epochs is shown in Figure 1.
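As an illustration of this normalization step, the sketch below applies a min-max mapping to [−1, 1] in Python, which is the kind of scaling the text attributes to premnmx(·). The function name `minmax_scale` is hypothetical, and the sample values are the closing prices of days 1 to 10 from Table 1.

```python
import numpy as np

def minmax_scale(x, lo=-1.0, hi=1.0):
    """Linearly map x so that its minimum becomes lo and its maximum hi.

    Assumes x is not constant (otherwise the range is zero).
    """
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return lo + (hi - lo) * (x - x_min) / (x_max - x_min)

# Closing prices of days 1-10 from Table 1.
closing = [1073.78, 1089.72, 1059.00, 1065.96, 1073.87,
           1073.02, 1052.14, 1022.46, 1032.76, 1022.13]
scaled = minmax_scale(closing)
```

The same minimum and maximum have to be kept so that network outputs can be mapped back to index prices after forecasting.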

Figure 1

The dependence of MSE on epochs.

It can be seen from Figure 1 that the network MSE reaches the expected MSE after 8078 steps of training, in which the training MSE is less than 0.005.

3.2.3. Forecast

(i) Initial Forecasting Based on Improved BPNN. According to trained network and sample data, we used rolling forecasting method to predict the closing index price. Part of the code in MATLAB software is shown in Algorithm 1.

Algorithm 1

Part of the code of MATLAB.

The simulation of the daily closing price of the Chinese GEM index is shown in Figure 2, which plots both the actual value and the predicted value for trading days 42 to 56.

Figure 2

Simulation of actual value and predicted value.

(ii) Computing of Normalization. Calculate the absolute residual rate of the prediction days. The calculation formula is as follows:

$$y_i = \frac{x_i - \hat{x}_i}{x_i} \times 100\%,$$

where $x_i$ is the actual value of the closing index price, $\hat{x}_i$ is the predicted value of the closing index price, and $y_i$ is the absolute residual rate of day i. The data set of absolute residual rates is then normalized in MATLAB by calling the mapminmax(·) function. The absolute residual rates and the normalized results are shown in Table 3.

Table 3

Normalization of absolute residual rate.

Days	Actual value	Prediction value	Error of absolute residual rate	Normalization value
1	1148.81	1089.72	5.27%	0.6797
2	1133.16	1207.61	−4.25%	0.2457
3	1136.40	1198.35	−5.40%	0.1978
4	1095.65	1143.73	−1.75%	0.3648
5	1117.34	1043.10	8.02%	0.8214
6	1135.45	1217.55	−2.31%	0.3326
7	1187.31	1046.42	11.98%	1.0000
8	1182.54	1291.15	−7.34%	0.0985
9	1200.69	1242.76	−2.56%	0.3251
10	1173.26	1280.88	−8.95%	0.0397
11	1165.32	1223.61	−3.17%	0.2956
12	1153.45	1273.66	−8.63%	0.0478
13	1146.02	1254.19	−7.95%	0.0789
14	1151.68	1204.72	−2.76%	0.3153
15	1160.02	1266.82	−7.30%	0.1037
16	1165.77	1279.64	−9.64%	0.0000

3.2.4. Empirical Markov Model

(i) State Definition. Based on the normalization values of Table 3, the sample average-mean square deviation method was used for state classification. Usually five intervals are delimited by the thresholds $\bar{x} - \alpha_1 s$, $\bar{x} - \alpha_2 s$, $\bar{x} + \alpha_3 s$, and $\bar{x} + \alpha_4 s$, where $\bar{x}$ is the average, s is the sample standard deviation, $\alpha_1$ and $\alpha_4$ belong to the range [1.0, 1.5], and $\alpha_2$ and $\alpha_3$ belong to the range [0.3, 0.6]. Because the data set is small, the Markov state was divided into only four ranges, $[0, \bar{x} - \alpha_2 s]$, $(\bar{x} - \alpha_2 s, \bar{x} + \alpha_3 s]$, $(\bar{x} + \alpha_3 s, \bar{x} + \alpha_4 s]$, and $(\bar{x} + \alpha_4 s, 1]$; therefore, the Markov state ranges are (1) [0, 0.1926], (2) (0.1926, 0.4840], (3) (0.4840, 0.7462], and (4) (0.7462, 1]. Then, the Markov state transition table was built as shown in Table 4.

Table 4

Markov state transition.

State	(1)	(2)	(3)	(4)	Total
(1)	2	3	0	0	5
(2)	3	2	0	2	7
(3)	0	1	0	0	1
(4)	1	1	0	0	2
Total	6	7	0	2	15
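The state division and the transition counts can be reproduced from the normalization values in Table 3 with a short Python sketch. The coefficients α2 = 0.4, α3 = 0.6, and α4 = 1.5 are an assumption: the text does not state them, but these values lie in the quoted ranges and reproduce the published thresholds 0.1926, 0.4840, and 0.7462. Variable names are illustrative.

```python
import numpy as np

# Normalization values of the 16 prediction days (Table 3).
norm = np.array([0.6797, 0.2457, 0.1978, 0.3648, 0.8214, 0.3326, 1.0000, 0.0985,
                 0.3251, 0.0397, 0.2956, 0.0478, 0.0789, 0.3153, 0.1037, 0.0000])

mean = norm.mean()
s = norm.std(ddof=1)          # sample standard deviation
a2, a3, a4 = 0.4, 0.6, 1.5    # assumed coefficients (not given in the text)
thresholds = np.array([mean - a2 * s, mean + a3 * s, mean + a4 * s])

# Right-closed intervals: a value equal to a threshold falls in the lower state.
states = 1 + np.searchsorted(thresholds, norm, side="left")

# Count transitions between the states of consecutive days.
counts = np.zeros((4, 4), dtype=int)
for a, b in zip(states[:-1], states[1:]):
    counts[a - 1, b - 1] += 1
```

Under these assumptions the 15 day-to-day transitions in `counts` match the entries of Table 4.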

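(ii) Computing of State Transition Probability Matrix. Table 4 shows that state (1) moves to state (1) 2 times, to state (2) 3 times, to state (3) 0 times, and to state (4) 0 times, out of 5 transitions in total; the state transition probabilities from state (1) can therefore be calculated as follows:

$$p_{11} = \frac{2}{5}, \quad p_{12} = \frac{3}{5}, \quad p_{13} = 0, \quad p_{14} = 0.$$

Similarly,

$$p_{21} = \frac{3}{7}, \; p_{22} = \frac{2}{7}, \; p_{23} = 0, \; p_{24} = \frac{2}{7}; \qquad p_{31} = 0, \; p_{32} = 1, \; p_{33} = 0, \; p_{34} = 0; \qquad p_{41} = \frac{1}{2}, \; p_{42} = \frac{1}{2}, \; p_{43} = 0, \; p_{44} = 0.$$

Thus, the state transition probability matrix p can be described as follows:

$$p = \begin{pmatrix} 2/5 & 3/5 & 0 & 0 \\ 3/7 & 2/7 & 0 & 2/7 \\ 0 & 1 & 0 & 0 \\ 1/2 & 1/2 & 0 & 0 \end{pmatrix}.$$

The chi-square statistics test confirms that the chain described by p has the Markov property.

(iii) The Step State Vector of Prediction. According to the state transition probability matrix p and the Markov forecast model, each step state vector of prediction is obtained by multiplying the previous step state vector by p. The resulting step state vectors are shown in Table 5.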

Table 5

Probability of step state vector.

State	Step 1	Step 2	Step 3	Step 4
(1)	0.0000	0.4286	0.4367	0.4219
(2)	1.0000	0.2857	0.4816	0.4405
(3)	0.0000	0.0000	0.0000	0.0000
(4)	0.0000	0.2857	0.0816	0.1376

State	Step 5	Step 6	Step 7	Step 8
(1)	0.4264	0.4254	0.4256	0.4255
(2)	0.4478	0.4467	0.4468	0.4468
(3)	0.0000	0.0000	0.0000	0.0000
(4)	0.1258	0.1279	0.1276	0.1277

State	Step 9	Step 10	Step 11	Step 12
(1)	0.4255	0.4255	0.4255	0.4255
(2)	0.4468	0.4468	0.4468	0.4468
(3)	0.0000	0.0000	0.0000	0.0000
(4)	0.1277	0.1277	0.1277	0.1277

State	Step 13	Step 14	Step 15	Step 16
(1)	0.4255	0.4255	0.4255	0.4255
(2)	0.4468	0.4468	0.4468	0.4468
(3)	0.0000	0.0000	0.0000	0.0000
(4)	0.1277	0.1277	0.1277	0.1277
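The entries of Table 5 can be checked with a few lines of Python. The sketch below row-normalizes the transition counts of Table 4 and starts from the Step 1 vector (0, 1, 0, 0) listed in Table 5; the function name `step_vectors` is hypothetical.

```python
import numpy as np

# Transition counts between states (1)-(4), taken from Table 4.
counts = np.array([[2, 3, 0, 0],
                   [3, 2, 0, 2],
                   [0, 1, 0, 0],
                   [1, 1, 0, 0]], dtype=float)

# Row-normalize the counts to obtain the transition probability matrix.
P = counts / counts.sum(axis=1, keepdims=True)

def step_vectors(P, first, n_steps):
    """Return n_steps state vectors, each equal to the previous one times P."""
    vecs = [np.asarray(first, dtype=float)]
    for _ in range(n_steps - 1):
        vecs.append(vecs[-1] @ P)
    return np.array(vecs)

# Step 1 of Table 5 is (0, 1, 0, 0); later steps follow by multiplication.
steps = step_vectors(P, [0, 1, 0, 0], 16)
```

The vectors settle quickly: from about Step 9 onward they agree to four decimals with (0.4255, 0.4468, 0, 0.1277), the stationary distribution of this matrix, so later forecast days all receive essentially the same state probabilities.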

4. Empirical Analysis

According to the step state vectors of the Markov model, the prediction results from 2013-7-25 to 2013-8-15 are shown in Table 6. In this table, the adjustment value $V_{\mathrm{col}6}$ in the sixth column is the average of the fourth-column interval that corresponds to $P_{\max 5}$, the maximum probability of that day in the fifth column.

Table 6

Prediction result.

Days	Actual value	Value of improved BPNN forecast	Markov prediction interval	Probability	Adjustment value	Error of absolute residual rate (improved BPNN)	Error of absolute residual rate (adjustment)
(1)	(2)	(3)	(4)	(5)	(6)	(7)	(8)
1	1150.39	1089.72	[985.6, 1030.6] [1030.6, 1098.8] [1098.8, 1160.1] [1160.1, 1219.5]	0.0000 1.0000 0.0000 0.0000	1064.72	5.27%	7.45%
2	1158.35	1207.61	[1092.2, 1142.1] [1142.1, 1217.7] [1217.7, 1285.6] [1285.6, 1351.4]	0.4286 0.2857 0.0000 0.2857	1117.18	−4.25%	3.55%
3	1136.98	1198.35	[1083.8, 1133.3] [1133.3, 1208.3] [1208.3, 1275.7] [1275.7, 1341.0]	0.4367 0.4816 0.0000 0.0816	1170.85	−5.40%	−2.98%
4	1124.10	1143.73	[1034.4, 1081.7] [1081.7, 1153.2] [1153.2, 1217.6] [1217.6, 1279.9]	0.4219 0.4405 0.0000 0.1376	1117.48	−1.75%	0.59%
5	1134.02	1043.10	[943.4, 986.5] [986.5, 1051.8] [1051.8, 1110.5] [1110.5, 1167.3]	0.4263 0.4478 0.0000 0.1258	1019.17	8.02%	10.13%
6	1190.11	1217.55	[1101.2, 1151.5] [1151.5, 1227.7] [1227.7, 1296.2] [1296.2, 1362.5]	0.4254 0.4467 0.0000 0.1279	1189.61	−2.31%	0.04%
7	1188.85	1046.42	[946.4, 989.7] [989.7, 1055.1] [1055.1, 1114.0] [1114.0, 1171.0]	0.4256 0.4468 0.0000 0.1276	1084.58	11.98%	8.77%
8	1202.83	1291.15	[1167.8, 1221.1] [1221.1, 1301.9] [1301.9, 1374.6] [1374.6, 1444.9]	0.4255 0.4468 0.0000 0.1277	1261.51	−7.34%	−4.88%
9	1211.78	1242.76	[1124.0, 1175.4] [1175.4, 1253.1] [1253.1, 1323.1] [1323.1, 1390.7]	0.4255 0.4468 0.0000 0.1277	1214.24	−2.56%	−0.20%
10	1175.70	1280.88	[1158.5, 1211.4] [1211.4, 1291.5] [1291.5, 1363.6] [1363.6, 1433.4]	0.4255 0.4468 0.0000 0.1277	1251.48	−8.95%	−6.45%
11	1185.96	1223.61	[1106.7, 1157.2] [1157.2, 1233.8] [1233.8, 1302.7] [1302.7, 1369.3]	0.4255 0.4468 0.0000 0.1277	1195.53	−3.17%	−0.81%
12	1172.52	1273.66	[1151.9, 1204.6] [1204.6, 1284.2] [1284.2, 1355.9] [1355.9, 1425.3]	0.4255 0.4468 0.0000 0.1277	1244.43	−8.63%	−6.13%
13	1161.87	1254.19	[1134.3, 1186.1] [1186.1, 1264.6] [1264.6, 1335.2] [1335.2, 1403.5]	0.4255 0.4468 0.0000 0.1277	1225.40	−7.95%	−5.47%
14	1172.33	1204.72	[1089.6, 1139.4] [1139.4, 1214.7] [1214.7, 1282.5] [1282.5, 1348.1]	0.4255 0.4468 0.0000 0.1277	1177.07	−2.76%	−0.40%
15	1180.66	1266.82	[1145.7, 1198.1] [1198.1, 1277.3] [1277.3, 1348.6] [1348.6, 1417.6]	0.4255 0.4468 0.0000 0.1277	1237.75	−7.30%	−4.84%
16	1167.09	1279.64	[1157.3, 1210.2] [1210.2, 1290.2] [1290.2, 1362.3] [1362.3, 1432.0]	0.4255 0.4468 0.0000 0.1277	1250.28	−9.64%	−7.13%

It can be seen from the “error of absolute residual rate” columns of Table 6 that, over the sixteen trading days, the adjusted predictions of this model have a smaller absolute error than those of the single improved neural network on every day except the first and the fifth.
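The adjustment and the error comparison can be illustrated for one row of Table 6. The Python sketch below assumes that the adjustment value is the midpoint of the interval with the largest probability, which is consistent with the tabulated values up to rounding, and uses the residual rate (actual − predicted) / actual. The function names are hypothetical; the numbers are those of day 2.

```python
import numpy as np

def adjustment_value(intervals, probs):
    """Midpoint of the interval whose state has the highest probability."""
    intervals = np.asarray(intervals, dtype=float)
    k = int(np.argmax(probs))      # first maximum if there is a tie
    return intervals[k].mean()

def residual_rate(actual, predicted):
    """Signed residual rate in percent."""
    return (actual - predicted) / actual * 100.0

# Day 2 of Table 6.
actual, bpnn = 1158.35, 1207.61
intervals = [[1092.2, 1142.1], [1142.1, 1217.7],
             [1217.7, 1285.6], [1285.6, 1351.4]]
probs = [0.4286, 0.2857, 0.0000, 0.2857]

adjusted = adjustment_value(intervals, probs)
err_bpnn = residual_rate(actual, bpnn)
err_adjusted = residual_rate(actual, adjusted)
```

For this day the adjusted forecast lands near 1117, so the error shrinks in magnitude from about 4.25% to about 3.55%, in line with columns (6) to (8) of the table.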

5. Conclusion and Discussion

Because of the complicated influencing factors in the dynamic stock market, a comprehensive method built on hybrid models shows more advantages than a single method in the forecast of the stock index. This paper presented a new method based on the combination of an improved back-propagation (BP) neural network and a Markov chain, which takes advantage of both the neural network and the Markov model and obtains better results than the single improved BPNN method. This method could provide a good reference for investment in the stock market. The stock market is an open complex adaptive system that is constantly affected by all kinds of emergency events and by people's psychological and behavioral effects, and many scholars, including famous financial experts, have pointed out that its changes cannot be predicted. Nevertheless, we had to break with the traditional ideas that rely only on financial theory models and explore new combined methods, such as the TDF (Theory-Data-Feedback) modeling and analyzing framework [22] and the spread model of emotions and behaviors caused by emergency events [23]. We believe that the changes of the stock market have their own characteristics and inherent rules and that forecasting is possible, at least in the short term.
