Ming-Chi Tsai, Ching-Hsue Cheng, Meei-Ing Tsai, Huei-Yuan Shiu.
Abstract
Many time-series methods have been widely used to forecast stock prices for profit. However, previous time-series models still have some problems. To overcome them, this paper proposes a hybrid time-series model based on a feature selection method for forecasting the stock prices of leading industry enterprises. In the proposed model, stepwise regression is adopted first, and multivariate adaptive regression splines and kernel ridge regression are then used to select the key features. Second, the forecasting model is constructed by using a genetic algorithm to optimize the parameters of support vector regression. To evaluate the forecasting performance of the proposed models, this study collects datasets for five leading enterprises in different industries from 2003 to 2012. The collected stock prices are used to verify the accuracy of the proposed model. The results show that the proposed model is more accurate than the other listed models and provides persuasive investment guidance to investors.
Year: 2018 PMID: 30596772 PMCID: PMC6312251 DOI: 10.1371/journal.pone.0209922
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Technical indicators.
| Indicator | Explanation |
|---|---|
| MA5 | |
| MA10 | |
| 5BIAS | The difference between the closing price and MA5, which uses the tendency of stock prices to revert to the average price to analyze stock trends. |
| 10BIAS | The difference between the closing price and MA10, which uses the tendency of stock prices to revert to the average price to analyze stock trends. |
| RSI | RSI compares the magnitude of recent gains to recent losses in an attempt to determine overbought and oversold conditions of an asset. |
| 12PSY | PSY12 (12-day psychological line) = (Dup12 / 12) × 100, where Dup12 is the number of days on which the price rose within 12 days. |
| 10WMS%R | Williams %R is usually plotted with negative values; for analysis and discussion, the negative sign is ignored. It is best to wait until the security’s price changes direction before placing a trade. |
| MACD | MACD is the difference between a fast and a slow exponential moving average (EMA) of closing prices. The fast EMA is a short-period average, and the slow EMA is a long-period one. |
| MO1 | MO1(t) = price(t) − price(t − n), n = 1 |
| MO2 | MO2(t) = price(t) − price(t − n), n = 2 |
| Transaction volume | Transaction volume is a basic yet very important element of market timing strategy. Volume gives clues about the intensity of a given price move. |
| CDP value | Divides the previous price movement into five values and bases the intraday trading decision on those five values. |
| Exponential Moving Average (EMA) | EMA is defined as a linear transformation of time series to a smoother time series by |
| Company-Daily price change | |
| TAIEX-Daily index change | |
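As a rough illustration of the formulas in the table above, the following sketch computes a simple moving average, BIAS (taken literally from the table as the closing price minus the moving average), the psychological line, and momentum. The function names and the choice to return `None` while too little history is available are assumptions made for this example, not part of the paper.

```python
from typing import List, Optional


def moving_average(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out


def bias(prices: List[float], window: int) -> List[Optional[float]]:
    """Closing price minus its moving average, as described in the table."""
    ma = moving_average(prices, window)
    return [None if m is None else p - m for p, m in zip(prices, ma)]


def psy(prices: List[float], window: int = 12) -> List[Optional[float]]:
    """PSY = (number of up days within `window` days / window) * 100."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i < window:
            out.append(None)  # need `window` day-over-day changes
        else:
            ups = sum(
                1 for j in range(i - window + 1, i + 1) if prices[j] > prices[j - 1]
            )
            out.append(ups / window * 100)
    return out


def momentum(prices: List[float], n: int) -> List[Optional[float]]:
    """MO(t) = price(t) - price(t - n)."""
    return [None if t < n else prices[t] - prices[t - n] for t in range(len(prices))]
```

With these helpers, MA5 and 5BIAS correspond to `moving_average(closes, 5)` and `bias(closes, 5)`, 12PSY to `psy(closes, 12)`, and MO1 and MO2 to `momentum(closes, 1)` and `momentum(closes, 2)`, where `closes` is a hypothetical list of daily closing prices.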
Fundamental indicators.
| Indicator | Explanation |
|---|---|
| TAIEX index | This study treats the market index as related to macroeconomic conditions; the TAIEX index serves as an indicator for fundamental analysis. |
| Exchange Rate | Conversion rate of US dollars to NT dollars. |
| Prime / Base rate | The prime rate is an interest rate paid by a borrower (debtor) for the use of money borrowed from a lender. |
| The Final Best Ask Quote | For each transaction, the system discloses the quote for the lowest offer price. |
| The Final Best Bid Quote | For each transaction, the system discloses the quote for the highest bid price. |
| Price earnings ratio | Defined as the market price per share divided by the annual earnings per share. |
| PBR | Compares a company’s current market price to its book value. |
| Dividend Yield | |
| Return of Investment (ROI) | |
| Log (ROI) | |
| Sale Month | This study assumes that monthly sales, the aggregate sales growth rate, the sales growth rate, and the rate of monthly sales compared with the previous month affect the company's stock price. |
| Aggregate Sales Growth Rate | Ryear = (R(t) / R(t − 1) − 1) × 100 (%), where R(t) is the total revenue in year t. |
| Sales Growth Rate | Rmonth = (R(y) / R(y − 1) − 1) × 100 (%), where R(y) is the monthly revenue in year y. |
| Rate compared sale monthly | R = [(R(m) − R(m − 1)) / R(m − 1)] × 100 (%), where R(m) is the revenue in month m. |
| Demand Savings Deposits | This study considers that the rate on demand savings deposits might be a factor in investment. When the rate is low, investors may be more willing to take investment risk. |
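The three sales growth formulas in the table share one form: the current period's revenue relative to the previous period's, expressed as a percentage. A minimal sketch, with a hypothetical function name and made-up revenue figures:

```python
def growth_rate(current: float, previous: float) -> float:
    """Percentage growth between two periods: (current / previous - 1) * 100."""
    if previous == 0:
        raise ValueError("previous revenue must be non-zero")
    return (current / previous - 1) * 100


# Hypothetical revenues (not from the paper): 100 last period, 120 this period.
yearly = growth_rate(120.0, 100.0)

# The month-over-month form [(Rm - Rm-1) / Rm-1] * 100 is algebraically the same.
monthly = (120.0 - 100.0) / 100.0 * 100
```

Both expressions evaluate to a growth of about 20% for these example figures.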
Fig 1. Research process of the proposed model.
The meaning of the model abbreviations.
| Abbreviation | Method |
|---|---|
| MARS | Multivariate adaptive regression splines |
| GA | Genetic algorithm |
| SR | Stepwise Regression |
| KRR | Kernel Ridge Regression |
| GA-SVR | Applying a genetic algorithm to optimize the support vector regression parameters |
| KRR-SR | Employing KRR to select the key features and build the model by SR |
| KRR-GA-SVR | Employing KRR to select the key features and build the model by GA-SVR |
| KRR-MARS | Employing KRR to select the key features and build the model by MARS |
| SR-KRR | Employing SR to select the key features and build the model by KRR |
| SR-MARS | Employing SR to select the key features and build the model by MARS |
| SR-GA-SVR (proposed model B) | Employing SR to select the key features and build the model by GA-SVR |
| MARS-KRR | Employing MARS to select the key features and build the model by KRR |
| MARS-SR | Employing MARS to select the key features and build the model by SR |
| MARS-GA-SVR (proposed model A) | Employing MARS to select the key features and build the model by GA-SVR |
The experiments with long and short test periods.
| Experiment | Training period | Testing period |
|---|---|---|
| Short test period | 2003 to 2011 | 2012 |
| Long test period | 2003 to 2009 | 2010 to 2012 |
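The two experiments differ only in the year at which the data are cut. A small sketch of such a chronological split follows; the row layout `(trading day, closing price)` and the names `split_by_year`, `SHORT_LAST_TRAIN_YEAR`, and `LONG_LAST_TRAIN_YEAR` are assumptions for illustration.

```python
from datetime import date
from typing import List, Tuple

Row = Tuple[date, float]  # (trading day, closing price); hypothetical layout

SHORT_LAST_TRAIN_YEAR = 2011  # short test period: train 2003-2011, test 2012
LONG_LAST_TRAIN_YEAR = 2009   # long test period: train 2003-2009, test 2010-2012


def split_by_year(rows: List[Row], last_train_year: int) -> Tuple[List[Row], List[Row]]:
    """Chronological split: rows up to `last_train_year` train, later rows test."""
    train = [r for r in rows if r[0].year <= last_train_year]
    test = [r for r in rows if r[0].year > last_train_year]
    return train, test
```

Splitting by date rather than at random keeps every test observation later than every training observation, which is the setting both experiments in the table describe.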
The initial performance comparisons for three companies in RMSE.
| Company | Model | Test period: 1 year | Test period: 3 years |
|---|---|---|---|
| | SR | 0.0662 | 0.7664 |
| | KRR | 15.08679 | 34.3449785 |
| | MARS | 0.1435 | 2.3455 |
| | GA-SVR | 0.001162354 | 0.00329521 |
| China Steel | SR | 0.25348 | 0.34462 |
| | KRR | 3.91307785 | 4.16090037 |
| | MARS | 0.24370 | 0.32520 |
| | GA-SVR | 0.000651508 | 0.000922785 |
| | SR | 1.89469 | 2.20621 |
| | KRR | 4.27619926 | 12.5219447 |
| | MARS | 1.9228 | 2.151 |
| | GA-SVR | 0.002353294 | 0.006689822 |
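The comparisons above are reported as root mean squared error (RMSE). For reference, a minimal sketch of the metric; the function name and argument order are choices made for this example.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between observed and forecast values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because the errors are squared before averaging, a few large forecast misses raise RMSE more than many small ones, and the result is in the same units as the price series.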
Selected features by MARS, SR and KRR for five companies.
| Company | Method | Indicator1 | Indicator2 | Indicator3 | Indicator4 |
|---|---|---|---|---|---|
| Chunghwa Telecom | MARS | CDP value | MO2 | | |
| | SR | CDP value | MO1 | | |
| | KRR | Sale Month | P/E | | |
| China Steel | MARS | Daily price change | TAIEX index | FBAQ | FBBQ |
| | SR | CDP value | MO2 | MO1 | - |
| | KRR | 10BIAS | FBBQ | P/E | - |
| Hon Hai | MARS | FBBQ | ln(ROI) | Transaction volume | |
| | SR | EMA | MACD | ln(ROI) | |
| | KRR | Sale Month | RSI6 | MO1 | |
| Cathay Financial Holdings | MARS | ROI | FBBQ | CDP value | |
| | SR | 5BIAS | MA5 | MO2 | |
| | KRR | ROI | TAIEX-Daily index change | TAIEX index | |
| TSMC | MARS | MO2 | FBAQ | FBBQ | |
| | SR | CDP value | MO2 | MO1 | |
| | KRR | 10W%R | MA15 | EMA | |
Note: FBAQ denotes The Final Best Ask Quote; FBBQ represents The Final Best Bid Quote
The optimal parameters found by the GA search for five companies.
| Company | Method | Training period (year) | ε | C | | Training RMSE |
|---|---|---|---|---|---|---|
| Chunghwa Telecom | Proposed model A | 9 | 0.2 | 85.115 | 621.018 | 0.10075 |
| | | 7 | 0.2 | 85.115 | 621.018 | 0.08261 |
| | Proposed model B | 9 | 0.4 | 49.157 | 332.041 | 0.09963 |
| | | 7 | 0.5 | 95.012 | 525.560 | 0.08395 |
| | KRR-GA-SVR | 9 | 0.1 | 94.601 | 687.127 | 1.11637 |
| | | 7 | 0.2 | 96.738 | 988.762 | 1.10806 |
| China Steel | Proposed model A | 9 | 0.5 | 60.4713 | 0.0253 | 0.25387 |
| | | 7 | 0.2 | 87.2495 | 9.356 | 0.20543 |
| | Proposed model B | 9 | 0.4 | 53.6446 | 6.1135 | 0.30688 |
| | | 7 | 0.1 | 6.5271 | 0.6631 | 0.58843 |
| | KRR-GA-SVR | 9 | 0.5 | 51.7702 | 2.7261 | 0.504085 |
| | | 7 | 0.3 | 25.0826 | 52.8511 | 0.31624 |
| Hon Hai | Proposed model A | 9 | 0.1 | 13.2041 | 134.089 | 0.26024 |
| | | 7 | 0.4 | 74.9381 | 836.899 | 0.09486 |
| | Proposed model B | 9 | 0.3 | 17.8872 | 328.5681 | 0.11489 |
| | | 7 | 0.5 | 67.8872 | 782.3546 | 0.09896 |
| | KRR-GA-SVR | 9 | 0.3 | 26.6437 | 44.2146 | 10.8103 |
| | | 7 | 0.3 | 45.8883 | 20.0447 | 13.0136 |
| Cathay Financial Holdings | Proposed model A | 9 | 0.5 | 58.5778 | 3.428 | 0.97916 |
| | | 7 | 0.1 | 60.9315 | 2.9168 | 1.04814 |
| | Proposed model B | 9 | 0.5 | 25.0216 | 0.834 | 1.04227 |
| | | 7 | 0.2 | 37.1928 | 2.604 | 0.94704 |
| | KRR-GA-SVR | 9 | 0.2 | 15.4725 | 42.8472 | 1.69138 |
| | | 7 | 0.3 | 5.9417 | 38.8681 | 1.29369 |
| TSMC | Proposed model A | 9 | 0.3 | 12.3139 | 4.3282 | 0.99827 |
| | | 7 | 0.4 | 25.2229 | 3.4814 | 1.0201 |
| | Proposed model B | 9 | 0.3 | 72.1026 | 1.1468 | 0.98032 |
| | | 7 | 0.5 | 26.0217 | 1.3375 | 0.98209 |
| | KRR-GA-SVR | 9 | 0.3 | 97.3413 | 731.4384 | 3.55611 |
| | | 7 | 0.4 | 88.6254 | 48.0336 | 3.72557 |
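The table above lists parameters found by a genetic algorithm. As a hedged sketch of how such a search can work (not the authors' implementation), the block below runs a small real-coded GA with tournament selection, uniform crossover, random-reset mutation, and elitism. The function `genetic_search`, the example bounds, and `toy_validation_rmse` are all hypothetical; in a real setting the fitness function would train an SVR with the candidate parameters and return its validation RMSE.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def genetic_search(
    fitness: Callable[[List[float]], float],
    bounds: Bounds,
    pop_size: int = 30,
    generations: int = 40,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimise `fitness` over box-bounded real parameters with a simple GA."""
    rng = random.Random(seed)

    def random_individual() -> List[float]:
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def tournament(scored):
        # Pick two candidates at random and keep the one with the lower error.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] <= b[0] else b[1]

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in population), key=lambda s: s[0])
        next_gen = [list(scored[0][1])]  # elitism: the best individual survives
        while len(next_gen) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            # Uniform crossover: each gene is copied from one of the two parents.
            child = [g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(p1, p2)]
            # Mutation: occasionally resample a gene inside its bounds.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            next_gen.append(child)
        population = next_gen
    best_score, best = min(((fitness(ind), ind) for ind in population), key=lambda s: s[0])
    return best, best_score


def toy_validation_rmse(params: List[float]) -> float:
    """Stand-in for 'train SVR with (C, epsilon), return validation RMSE'."""
    c, epsilon = params
    return ((c - 50.0) / 100.0) ** 2 + (epsilon - 0.2) ** 2
```

A call such as `genetic_search(toy_validation_rmse, [(1.0, 100.0), (0.1, 0.5)])` returns the best `(C, epsilon)` pair found together with its score. Because the elite individual is carried into every new generation, the best score never gets worse from one generation to the next; further kernel parameters can be searched by appending more entries to `bounds`.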
Fig 2. Results of forecasting the short and long test periods for the Chunghwa Telecom dataset.
Performance comparisons for Chunghwa Telecom.
| Model | Test period: 1 year | Test period: 3 years |
|---|---|---|
| KRR-SR | 25.0596 | 28.4000 |
| KRR-MARS | 25.4612 | 37.0234 |
| KRR-GA-SVR | 2.5085 | 1.6955 |
| MARS-SR | 0.1527 | 0.1801 |
| MARS-KRR | 8.1484 | 27.6530 |
| SR-MARS | 0.1434 | 0.6356 |
| SR-KRR | 7.3393 | 75.7406 |
| Proposed model A | 0.0354 | 0.0586 |
| Proposed model B | 0.0342 | 0.0608 |
Proposed model A: Integrated MARS and GA-SVR model
Proposed model B: Integrated SR and GA-SVR model
Performance comparisons for China Steel.
| Model | Test period: 1 year | Test period: 3 years |
|---|---|---|
| KRR-SR | 25.0596 | 28.4001 |
| KRR-MARS | 7.6083 | 0.3205 |
| KRR-GA-SVR | 1.634339328 | 1.01520238 |
| MARS-SR | 3.7657 | 3.9520 |
| MARS-KRR | 0.4634 | 0.4533 |
| SR-MARS | 0.2468 | 0.3250 |
| SR-KRR | 0.2452 | 0.3250 |
| Proposed model A | 0.2265 | 0.0381 |
| Proposed model B | 0.0370 | 0.3161 |
Proposed model A: Integrated MARS and GA-SVR model. Proposed model B: Integrated SR and GA-SVR model
Fig 3. Results of forecasting the short and long test periods for the China Steel dataset.
Performance comparisons for Hon Hai.
| Model | Test period: 1 year | Test period: 3 years |
|---|---|---|
| KRR-SR | 22.6070 | 76.1505 |
| KRR-MARS | 17.8506 | 339.1292 |
| KRR-GA-SVR | 0.4686 | 0.3136 |
| MARS-SR | 1.9738 | 2.1778 |
| MARS-KRR | 1.8837 | 2.2246 |
| SR-MARS | 1.8768 | 2.1328 |
| SR-KRR | 1.8528 | 2.1303 |
| Proposed model A | 0.0040 | 0.0087 |
| Proposed model B | 0.0030 | 0.0914 |
Proposed model A: Integrated MARS and GA-SVR model. Proposed model B: Integrated SR and GA-SVR model
Fig 4. Results of forecasting the short and long test periods for the Hon Hai dataset.
Performance comparisons for Cathay Financial Holdings.
| Model | Test period: 1 year | Test period: 3 years |
|---|---|---|
| KRR-SR | 27.9581 | 30.3242 |
| KRR-MARS | 13.7496 | 24.5962 |
| KRR-GA-SVR | 0.0010 | 1.0663 |
| MARS-SR | 2.5218 | 0.8007 |
| MARS-KRR | 0.4663 | 0.7296 |
| SR-MARS | 0.4842 | 14.5722 |
| SR-KRR | 0.4831 | 0.7367 |
| Proposed model A | 0.2859 | 0.6552 |
| Proposed model B | 0.1671 | 0.5280 |
Proposed model A: Integrated MARS and GA-SVR model. Proposed model B: Integrated SR and GA-SVR model
Fig 5. Results of forecasting the short and long test periods for the Cathay Financial dataset.
Performance comparisons for TSMC.
| Model | Test period: 1 year | Test period: 3 years |
|---|---|---|
| KRR-SR | 2.7440 | 12.3018 |
| KRR-MARS | 9.8224 | 125.9944 |
| KRR-GA-SVR | 5.0519 | 6.1786 |
| MARS-SR | 1.2980 | 0.9255 |
| MARS-KRR | 32.9380 | 29.0172 |
| SR-MARS | 1.2494 | 1.8027 |
| SR-KRR | 30.9380 | 22.4461 |
| Proposed model A | 0.8256 | 0.8395 |
| Proposed model B | 0.7292 | 0.7413 |
Proposed model A: Integrated MARS and GA-SVR model. Proposed model B: Integrated SR and GA-SVR model
Fig 6. Results of forecasting the short and long test periods for the TSMC dataset.
Wilcoxon sign test comparing the different models over the short test period.
| Model | KRR- | KRR- | KRR- | MARS- | MARS- | SR- | SR- | Proposed model B | Proposed model A |
|---|---|---|---|---|---|---|---|---|---|
| KRR- | - | -0.135 | -2.023 | -2.023 | -0.674 | -2.023 | -0.674 | -2.023 | -2.023 |
| KRR- | | - | -1.214 | -1.483 | -0.405 | -1.753 | -0.405 | -2.023 | -2.023 |
| KRR- | | | - | -0.135 | -1.483 | -0.944 | -1.214 | -1.753 | -1.753 |
| MARS- | | | | - | -0.405 | -2.023 | -0.405 | -2.023 | -2.023 |
| MARS- | | | | | - | -1.483 | -1.753 | -2.023 | -2.023 |
| SR- | | | | | | - | -0.405 | -2.023 | -2.023 |
| SR- | | | | | | | - | -2.023 | -2.023 |
| Proposed model B | | | | | | | | - | -0.674 |
Note: The number in parentheses is the corresponding p-value;
*: p < 0.1;
**: p < 0.05
Wilcoxon sign test comparing the different models over the long test period.
| Model | KRR- | KRR- | KRR- | MARS- | MARS- | SR- | SR- | Proposed model B | Proposed model A |
|---|---|---|---|---|---|---|---|---|---|
| KRR- | - | -1.214 | -1.753 | -1.753 | -1.753 | -1.753 | -0.944 | -2.023 | -2.023 |
| KRR- | | - | -1.753 | -1.753 | -0.944 | -1.753 | -0.135 | -2.023 | -2.023 |
| KRR- | | | - | -0.135 | -1.214 | -0.135 | -1.214 | -2.023 | -2.023 |
| MARS- | | | | - | -0.674 | -0.674 | -0.405 | -2.023 | -2.023 |
| MARS- | | | | | - | -1.214 | -0.405 | -2.023 | -2.023 |
| SR- | | | | | | - | -0.674 | -2.023 | -2.023 |
| SR- | | | | | | | - | -2.023 | -2.023 |
| Proposed model B | | | | | | | | - | -1.753 |
Note: The number in parentheses is the corresponding p-value;
*: p < 0.1;
**: p < 0.05
The descriptive statistics for all datasets.
| | TSMC | Cathay | Hon Hai | China Steel | CHT |
|---|---|---|---|---|---|
| Range | 62.8 | 69.75 | 246 | 34.7 | 64 |
| Minimum | 36.8 | 24.05 | 54 | 19.2 | 46 |
| Maximum | 99.6 | 93.8 | 300 | 53.9 | 110 |
| Mean | 61.9997 | 53.6333 | 142.352 | 31.6591 | 66.9004 |
| Std. Deviation | 11.15911 | 14.83258 | 50.2475 | 6.48304 | 15.10203 |
| Variance | 142.526 | 220.006 | 2524.807 | 42.03 | 228.071 |
Note: Cathay denotes Cathay Financial Holdings
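The rows of the table above are related by simple identities: the range is the maximum minus the minimum, and the variance is the square of the standard deviation. A minimal sketch using Python's `statistics` module follows; the function name `describe` is hypothetical, and the use of the sample (n − 1) variance is an assumption, since the source does not say which form was used.

```python
import statistics
from typing import Dict, Sequence


def describe(prices: Sequence[float]) -> Dict[str, float]:
    """Descriptive statistics in the same layout as the table rows."""
    lowest, highest = min(prices), max(prices)
    return {
        "Range": highest - lowest,
        "Minimum": lowest,
        "Maximum": highest,
        "Mean": statistics.mean(prices),
        "Std. Deviation": statistics.stdev(prices),  # sample standard deviation
        "Variance": statistics.variance(prices),     # sample variance (n - 1)
    }
```

Applied to a company's series of daily closing prices, the function returns one column of the table.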
The RMSE of all experiments for the short testing period.
| | TSMC | Cathay | Hon Hai | China Steel | CHT |
|---|---|---|---|---|---|
| KRR-SR | 2.7440 | 27.9581 | 22.6070 | 25.0596 | 25.0596 |
| KRR-MARS | 9.8224 | 13.7496 | 17.8506 | 7.6083 | 25.4612 |
| KRR-GA-SVR | 5.0519 | 0.0010 | 0.4686 | 1.6343 | 2.5085 |
| MARS-SR | 1.2980 | 2.5218 | 1.9738 | 3.7657 | 0.1527 |
| SR-MARS | 1.2494 | 0.4842 | 1.8768 | 0.2468 | 0.1434 |
| MARS-KRR | 32.938 | 0.4663 | 1.8837 | 0.4634 | 8.1484 |
| SR-KRR | 30.938 | 0.4831 | 1.8528 | 0.2452 | 7.3393 |
| Proposed model A | 0.8256 | 0.2859 | 0.0040 | 0.2265 | 0.0354 |
| Proposed model B | 0.7292 | 0.1671 | 0.0030 | 0.0370 | 0.0342 |
Note: Cathay denotes Cathay Financial Holdings
The RMSE of all experiments for the long testing period.
| | TSMC | Cathay | Hon Hai | China Steel | CHT |
|---|---|---|---|---|---|
| KRR-SR | 12.3018 | 30.3242 | 76.1505 | 28.4001 | 28.4000 |
| KRR-MARS | 125.9944 | 24.5962 | 339.1292 | 0.3205 | 37.0234 |
| KRR-GA-SVR | 6.1786 | 1.0663 | 0.3136 | 1.0152 | 1.6955 |
| MARS-SR | 0.9255 | 0.8007 | 2.1778 | 3.9520 | 0.1801 |
| SR-MARS | 1.8027 | 14.5722 | 2.1328 | 0.3250 | 0.6356 |
| MARS-KRR | 29.0172 | 0.7296 | 2.2246 | 0.4533 | 27.6530 |
| SR-KRR | 22.4461 | 0.7367 | 2.1303 | 0.3250 | 75.7406 |
| Proposed model A | 0.8395 | 0.6552 | 0.0087 | 0.0381 | 0.0586 |
| Proposed model B | 0.7413 | 0.5280 | 0.0914 | 0.3161 | 0.0608 |
Note: Cathay denotes Cathay Financial Holdings
Fig 7. The closing prices of the five companies from 2003 to 2012.
The selected features for all datasets.
| | Method | Indicator1 | Indicator2 | Indicator3 | Indicator4 |
|---|---|---|---|---|---|
| CHT | MARS | CDP value | MO2 | - | - |
| | SR | CDP value | MO1 | - | - |
| China Steel | MARS | Daily price change | The Final Best Ask Quote | TAIEX index | The Final Best Bid Quote |
| | SR | CDP value | MO2 | MO1 | - |
| Hon Hai | MARS | The Final Best Bid Quote | ln(ROI) | Transaction volume | - |
| | SR | EMA | MACD | ln(ROI) | - |
| Cathay Financial Holdings | MARS | ROI | The Final Best Bid Quote | CDP value | - |
| | SR | 5BIAS | MA5 | MO2 | - |
| TSMC | MARS | MO2 | The Final Best Ask Quote | The Final Best Bid Quote | - |
| | SR | CDP value | MO2 | MO1 | - |