Samit Bhanja, Abhishek Das.
Abstract
Among all the application areas of time-series prediction, stock market prediction is the most challenging task because of its dynamic nature and its dependence on many volatile factors. Unpredictable, high-impact events known as Black Swan events also strongly influence the stock market. If stock trends can be predicted successfully, investors can adopt a more appropriate trading strategy, which can significantly reduce investment risk. In this work, a time-efficient hybrid stock trends prediction framework (HSTPF) is proposed to predict the future trends of the stock market even during periods of Black Swan events. To improve the prediction accuracy of HSTPF, Black Swan event analysis and feature selection are performed, and the performance of various machine learning classifiers is analyzed. Extensive experiments are conducted on two real-world stock market datasets, S&P BSE SENSEX and Nifty 50, to analyze the performance of the proposed framework. The framework is applied to single-step and multi-step ahead predictions. The experimental results show that the proposed framework achieves over 86% accuracy, and that during Black Swan events its accuracy is almost 80% for single-step ahead predictions. For multi-step ahead predictions, the HSTPF produces satisfactory results. The framework also outperforms existing similar works in prediction accuracy, even during Black Swan events, and its computational time is very low.
Keywords: Black Swan event; Deep learning; Machine learning; Stock trends prediction; Technical indicators
Year: 2022 PMID: 35018169 PMCID: PMC8739707 DOI: 10.1007/s11334-021-00428-0
Source DB: PubMed Journal: Innov Syst Softw Eng ISSN: 1614-5046
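The abstract refers to single-step and multi-step ahead trend prediction, and the result tables use three classes (Positive, Negative, Neutral). The sketch below shows one plausible way such labels could be derived from closing prices. The labeling rule, the `tolerance` parameter, the function name, and the sample prices are all assumptions made for illustration; the source does not state how the authors define the classes.

```python
from typing import List


def trend_labels(close: List[float], horizon: int = 1, tolerance: float = 0.0) -> List[str]:
    """Label each day by the price change `horizon` days ahead.

    Assumed rule (not taken from the source): a rise above `tolerance`
    is Positive, a fall below -`tolerance` is Negative, anything else
    is Neutral.
    """
    labels = []
    for t in range(len(close) - horizon):
        change = close[t + horizon] - close[t]
        if change > tolerance:
            labels.append("Positive")
        elif change < -tolerance:
            labels.append("Negative")
        else:
            labels.append("Neutral")
    return labels


# Made-up closing prices, for illustration only.
closes = [100.0, 101.5, 101.5, 99.0, 102.0]
one_step = trend_labels(closes, horizon=1)  # single-step ahead labels
two_step = trend_labels(closes, horizon=2)  # two-step ahead labels
print(one_step)
print(two_step)
```

With a longer horizon, fewer labeled samples remain, because the last `horizon` days have no future price to compare against.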
Fig. 1 Architecture of the proposed forecasting system
Dataset description
| Dataset | Name | Period | Attributes | Source |
|---|---|---|---|---|
| Historical | S&P BSE SENSEX | 01.01.1991 to 31.12.2019 | Open price, low price, high price, close price | www1.nseindia.com |
| Historical | Nifty 50 | 01.01.1996 to 31.12.2019 | Open price, low price, high price, close price | www.bseindia.com |
| Forecasting | S&P BSE SENSEX | 01.01.2020 to 31.03.2021 | Open price, low price, high price, close price | www1.nseindia.com |
| Forecasting | Nifty 50 | 01.01.2020 to 31.03.2021 | Open price, low price, high price, close price | www.bseindia.com |
Description of Black Swan events
| Event | Year | Duration |
|---|---|---|
| Harshad Mehta Scam | Apr’1992 | 261 days |
| 9/11 Attack | Sep’2001 | 38 days |
| Global Financial Meltdown | Jan’2008 | 418 days |
| China Slowdown | Aug’2015 | 185 days |
| Demonetization | Nov’2016 | 17 days |
| NBFC Crisis | Sep’2018 | 144 days |
| COVID-19 Pandemic | Mar’2020 | 105 days |
Fig. 2 Graphical representation of Black Swan event
Fig. 3 Architecture of the impact prediction model (DLM)
Formula of technical indicators
| Technical indicator | Formula | Technical indicator | Formula |
|---|---|---|---|
Fig. 4 Architecture of the autoencoder
Optimal parameters of the deep learning and machine learning models
| Sl. No. | Model | Block | Optimal parameter values |
|---|---|---|---|
| 1 | DLM | Block 1 | layers=4 |
| | | | Layer 1 (1D-CNN): filters=12, kernel_size=3, strides=1 |
| | | | Layer 2 (1D-CNN): filters=8, kernel_size=3, strides=1 |
| | | | Layer 3 (1D-CNN): filters=4, kernel_size=3, strides=1 |
| | | | Layer 4 (Dense): units=10, activation=None |
| | | Block 2 | layers=3 |
| | | | Layer 1 (Bi-GRU): units=64, activation=tanh, return_sequences=False, dropout=0.0 |
| | | | Layer 2 (Bi-GRU): units=32, activation=tanh, return_sequences=False, dropout=0.4 |
| | | | Layer 3 (Dense): units=2, activation=None |
| 2 | Multinomial Naïve Bayes (MNB) | | alpha: 0.1 |
| 3 | Support Vector Machine (SVM) | | kernel: rbf, C: 0.6 |
| 4 | K-Nearest Neighbor (KNN) | | n_neighbors: 5 |
| 5 | AdaBoost (AB) | | n_estimators: 120, learning_rate: 0.1 |
| 6 | Gradient Boosting Classifier (GBM) | | n_estimators: 200 |
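To make the Block 1 settings more concrete, the sketch below traces how the sequence length would shrink through the three 1D-CNN layers listed above. It assumes unpadded ("valid") convolutions and a hypothetical input window of 20 time steps; neither the padding mode nor the input length is given in the source, and the function names are illustrative.

```python
def conv1d_output_length(input_length: int, kernel_size: int, strides: int = 1) -> int:
    """Output length of a 1D convolution with no padding (assumed)."""
    if input_length < kernel_size:
        raise ValueError("input is shorter than the kernel")
    return (input_length - kernel_size) // strides + 1


# Filter counts, kernel sizes, and strides copied from the table above.
BLOCK1_CONV_LAYERS = [
    {"filters": 12, "kernel_size": 3, "strides": 1},
    {"filters": 8, "kernel_size": 3, "strides": 1},
    {"filters": 4, "kernel_size": 3, "strides": 1},
]


def trace_shapes(input_length: int, layers: list) -> list:
    """Return (time_steps, channels) after each convolutional layer."""
    shapes = []
    length = input_length
    for layer in layers:
        length = conv1d_output_length(length, layer["kernel_size"], layer["strides"])
        shapes.append((length, layer["filters"]))
    return shapes


# 20 time steps is a hypothetical window length, chosen only for illustration.
shapes = trace_shapes(20, BLOCK1_CONV_LAYERS)
print(shapes)
```

Under these assumptions each layer with kernel_size=3 and strides=1 shortens the sequence by two steps, while the channel count follows the `filters` value of that layer.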
Comparison of different machine learning classifiers for HSTPF in terms of precision, recall, and F-measure on the S&P BSE SENSEX dataset
| Class | Metric (%) | MNB | SVM | KNN | AB | GBM |
|---|---|---|---|---|---|---|
| Positive | Precision | 88.0 | 85.89 | 71.68 | 77.00 | 89.38 |
| Positive | Recall | 83.33 | 80.17 | 91.01 | 88.78 | 93.52 |
| Positive | F-measure | 85.84 | 81.92 | 80.20 | 82.47 | 91.40 |
| Negative | Precision | 77.01 | 83.91 | 70.80 | 87.36 | 91.95 |
| Negative | Recall | 83.75 | 82.02 | 78.22 | 77.55 | 86.97 |
| Negative | F-measure | 80.24 | 82.95 | 84.04 | 82.16 | 89.39 |
| Neutral | Precision | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Neutral | Recall | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Neutral | F-measure | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
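The F-measure rows can be related to the precision and recall rows through the usual harmonic-mean definition, F = 2PR / (P + R). The sketch below assumes the authors use this balanced form, which the source does not state explicitly; returning 0 when both inputs are 0 is a convention adopted here for the Neutral class.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        # Convention chosen for this sketch: an undefined F-measure is reported as 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# GBM, Positive class, S&P BSE SENSEX: precision and recall from the table above.
gbm_positive = f_measure(89.38, 93.52)

# Neutral class: precision and recall are both 0.00 for every classifier.
neutral = f_measure(0.0, 0.0)

print(round(gbm_positive, 2), neutral)
```

For the GBM Positive entry, this calculation rounds to the 91.40 reported in the table.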
Comparison of different machine learning classifiers for HSTPF in terms of precision, recall, and F-measure on the Nifty 50 dataset
| Class | Metric (%) | MNB | SVM | KNN | AB | GBM |
|---|---|---|---|---|---|---|
| Positive | Precision | 71.90 | 83.47 | 75.2 | 66.94 | 85.12 |
| Positive | Recall | 84.47 | 87.82 | 85.05 | 90.00 | 87.82 |
| Positive | F-measure | 77.70 | 85.60 | 81.23 | 76.78 | 85.60 |
| Negative | Precision | 82.97 | 85.10 | 64.46 | 90.43 | 88.30 |
| Negative | Recall | 69.64 | 80.00 | 72.22 | 68.00 | 82.17 |
| Negative | F-measure | 75.73 | 82.47 | 68.12 | 77.62 | 85.12 |
| Neutral | Precision | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Neutral | Recall | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Neutral | F-measure | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Fig. 5 Prediction accuracy of various machine learning classifiers on the S&P BSE SENSEX dataset for 1 to 10 days ahead prediction
Fig. 6 Prediction accuracy of various machine learning classifiers on the Nifty 50 dataset for 1 to 10 days ahead prediction
Fig. 7 Training time of various models on the S&P BSE SENSEX dataset
Fig. 8 Training time of various models on the Nifty 50 dataset
Fig. 9 Prediction accuracy of HSTPF with and without Black Swan event analysis on the S&P BSE SENSEX dataset for 1 to 10 days ahead prediction
Fig. 10 Prediction accuracy of HSTPF with and without Black Swan event analysis on the Nifty 50 dataset for 1 to 10 days ahead prediction
Comparison of different forecasting models in terms of precision, recall, and F-measure on the S&P BSE SENSEX dataset for 1 day ahead forecasting
| Class | Metric | CNN | Bi-GRU | MLF | AT-GRU-M | HOMM | HSTPF |
|---|---|---|---|---|---|---|---|
| Positive | Precision (%) | 70.80 | 63.72 | 79.65 | 81.42 | 61.06 | 91.15 |
| Positive | Recall (%) | 65.31 | 73.47 | 90.00 | 85.19 | 90.79 | 91.96 |
| Positive | F-measure (%) | 65.78 | 68.25 | 84.51 | 83.26 | 73.01 | 91.55 |
| Negative | Precision (%) | 50.57 | 70.11 | 88.51 | 79.22 | 91.95 | 89.66 |
| Negative | Recall (%) | 57.06 | 59.8 | 77.78 | 74.39 | 64.52 | 88.64 |
| Negative | F-measure (%) | 53.59 | 64.55 | 82.8 | 76.76 | 75.83 | 89.15 |
| Neutral | Precision (%) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Neutral | Recall (%) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Neutral | F-measure (%) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Comparison of different forecasting models in terms of precision, recall, and F-measure on the Nifty 50 dataset for 1 day ahead forecasting
| Class | Metric | CNN | Bi-GRU | MLF | AT-GRU-M | HOMM | HSTPF |
|---|---|---|---|---|---|---|---|
| Positive | Precision (%) | 48.85 | 48.76 | 90.08 | 79.34 | 83.47 | 92.56 |
| Positive | Recall (%) | 64.00 | 69.41 | 78.99 | 83.48 | 82.11 | 84.85 |
| Positive | F-measure (%) | 55.41 | 57.28 | 84.17 | 81.36 | 82.78 | 88.54 |
| Negative | Precision (%) | 61.67 | 72.34 | 69.15 | 79.79 | 76.6 | 78.72 |
| Negative | Recall (%) | 46.4 | 52.31 | 84.42 | 75.00 | 78.26 | 89.16 |
| Negative | F-measure (%) | 52.97 | 60.72 | 76.03 | 77.32 | 77.42 | 83.62 |
| Neutral | Precision (%) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Neutral | Recall (%) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Neutral | F-measure (%) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
Fig. 11 Prediction accuracy of various models on the S&P BSE SENSEX dataset for 1 to 10 days ahead prediction
Fig. 12 Prediction accuracy of various models on the Nifty 50 dataset for 1 to 10 days ahead prediction
Accuracy (%) comparison of various models in forecasting S&P BSE SENSEX during the COVID-19 pandemic
| Day | CNN | Bi-GRU | MLF | AT-GRU-M | HOMM | HSTPF |
|---|---|---|---|---|---|---|
| Day 1 | 53.21 | 52.81 | 61.03 | 60.93 | 54.37 | 80.36 |
| Day 2 | 53.32 | 53.11 | 60.14 | 60.81 | 51.82 | 80.34 |
| Day 3 | 50.86 | 47.61 | 59.44 | 60.01 | 50.32 | 74.13 |
| Day 4 | 47.58 | 45.83 | 55.39 | 55.87 | 50.07 | 73.31 |
| Day 5 | 45.19 | 40.60 | 50.09 | 49.30 | 44.77 | 70.71 |
| Day 6 | 45.41 | 40.89 | 50.09 | 49.30 | 44.77 | 68.13 |
| Day 7 | 40.32 | 38.42 | 48.19 | 46.73 | 40.21 | 64.24 |
| Day 8 | 35.66 | 34.96 | 4.35 | 44.80 | 40.09 | 60.08 |
| Day 9 | 34.39 | 30.81 | 41.82 | 41.72 | 38.46 | 59.07 |
| Day 10 | 32.06 | 31.03 | 40.09 | 39.11 | 36.98 | 60.01 |
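One way to read the table above is to compute, for each forecasting horizon, how far HSTPF is ahead of the strongest baseline. The sketch below does this for Days 1, 5, and 10 using the values copied from the table; the choice of rows and the helper name are illustrative only.

```python
BASELINES = ["CNN", "Bi-GRU", "MLF", "AT-GRU-M", "HOMM"]

# Accuracy (%) for S&P BSE SENSEX during the COVID-19 pandemic (Days 1, 5, 10).
rows = {
    1: {"CNN": 53.21, "Bi-GRU": 52.81, "MLF": 61.03, "AT-GRU-M": 60.93, "HOMM": 54.37, "HSTPF": 80.36},
    5: {"CNN": 45.19, "Bi-GRU": 40.60, "MLF": 50.09, "AT-GRU-M": 49.30, "HOMM": 44.77, "HSTPF": 70.71},
    10: {"CNN": 32.06, "Bi-GRU": 31.03, "MLF": 40.09, "AT-GRU-M": 39.11, "HOMM": 36.98, "HSTPF": 60.01},
}


def margin_over_best_baseline(row: dict) -> tuple:
    """Return (best baseline name, HSTPF accuracy minus that baseline's accuracy)."""
    best = max(BASELINES, key=lambda name: row[name])
    return best, round(row["HSTPF"] - row[best], 2)


margins = {day: margin_over_best_baseline(row) for day, row in rows.items()}
for day, (name, gap) in margins.items():
    print(f"Day {day}: best baseline {name}, HSTPF ahead by {gap} points")
```

For these three rows, MLF is the strongest baseline, and HSTPF stays roughly 19 to 21 percentage points ahead even as absolute accuracy falls with the horizon.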
Accuracy (%) comparison of various models in forecasting Nifty 50 during the COVID-19 pandemic
| Day | CNN | Bi-GRU | MLF | AT-GRU-M | HOMM | HSTPF |
|---|---|---|---|---|---|---|
| Day 1 | 50.21 | 50.66 | 61.74 | 62.02 | 68.11 | 79.42 |
| Day 2 | 49.38 | 50.372 | 59.10 | 60.38 | 63.21 | 74.26 |
| Day 3 | 46.04 | 48.03 | 57.46 | 58.02 | 63.04 | 72.39 |
| Day 4 | 46.11 | 45.79 | 55.82 | 54.37 | 59.83 | 70.17 |
| Day 5 | 40.82 | 40.33 | 50.31 | 50.63 | 51.96 | 65.32 |
| Day 6 | 40.01 | 38.41 | 48.36 | 45.94 | 43.03 | 61.26 |
| Day 7 | 39.16 | 34.28 | 42.41 | 44.28 | 41.81 | 61.30 |
| Day 8 | 30.70 | 32.61 | 41.57 | 42.07 | 42.04 | 59.44 |
| Day 9 | 30.01 | 31.84 | 40.38 | 41.87 | 40.78 | 58.38 |
| Day 10 | 29.17 | 30.21 | 39.16 | 40.08 | 41.53 | 57.26 |