Suhui Liu, Xiaodong Zhang, Ying Wang, Guoming Feng.
Abstract
Stock price movement prediction plays an important role in investors' decision making. It is usually regarded as a binary classification task. In this paper, a recurrent convolutional neural kernel (RCNK) model was proposed, which learned complementary features from two sources of data, namely historical price data and text data from a message board, to predict stock price movement. It integrated the advantages of technical analysis and sentiment analysis. Unlike previous studies, the text data was treated as sequential data, and the RCNK model was used to train sentiment embeddings with temporal features. In addition, in the classification section of the model, an explicit kernel mapping layer replaced several fully connected layers, which reduced the number of model parameters and the risk of overfitting. To test the impact of treating the sentiment data as sequential data, the effectiveness of the explicit kernel mapping layer, and the usefulness of integrating technical analysis and sentiment analysis, the proposed model was compared with two other deep learning models (a recurrent convolutional neural network model and a convolutional neural kernel model) and with models that took only one source of data as input. The results showed that the proposed model outperformed the other models.
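To make the idea of an explicit kernel mapping layer more concrete, the following is a minimal sketch of one widely used explicit map, random Fourier features for the RBF kernel. This excerpt does not state which map the RCNK model uses, so the choice of random Fourier features, the function names (`make_rff_map`, `rbf_kernel`, `dot`) and all parameter values are assumptions made for illustration only.

```python
import math
import random


def make_rff_map(input_dim, num_features, gamma, seed=0):
    """Build z(x) with z(x).z(y) approximately exp(-gamma * ||x - y||^2).

    Assumption: random Fourier features stand in for the paper's
    explicit kernel mapping layer, whose exact form is not given here.
    """
    rng = random.Random(seed)
    # Frequencies for the RBF kernel are drawn from N(0, 2 * gamma * I).
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(input_dim)]
               for _ in range(num_features)]
    # Random phase offsets, uniform on [0, 2*pi).
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel value, used only to check the approximation."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


z = make_rff_map(input_dim=3, num_features=2000, gamma=0.5, seed=0)
x, y = [0.1, 0.2, 0.3], [0.0, -0.1, 0.4]
print(dot(z(x), z(y)), rbf_kernel(x, y, 0.5))  # the two values should be close
```

A map of this kind has no trainable weights of its own, so a single linear classifier on top of `z(x)` can stand in for a stack of dense layers, which is consistent with the abstract's point about fewer parameters.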
Year: 2020 PMID: 32530923 PMCID: PMC7292408 DOI: 10.1371/journal.pone.0234206
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The process of data preparation.
Input features in Input_1(t).
| Feature | Formula | Description |
|---|---|---|
| | - | Stock open price at date |
| | - | Stock close price at date |
| | - | Highest stock price at date |
| | - | Lowest stock price at date |
| | - | The number of shares traded at date |
| | | A price trend indicator calculated as the average of stock close price over |
| | | A price trend indicator calculated as the exponential average of stock close price over |
| | max( | A price volatility indicator at date |
| | | A price volatility trend indicator over |
| | 100−100( | Relative strength index of price trend over |
| | ( | Price rate-of-change indicator |
| | | A momentum indicator that shows the relationship between the current close price and highest, lowest price over past |
| | | A momentum indicator that gives a signal meaning that a stock is oversold or overbought |
| | | Exponential moving average of |
| | | An oscillator indicator used to indicate whether a stock is oversold or overbought |
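Several of the indicators described in the table have standard textbook definitions. The sketch below implements four of them (simple moving average, exponential moving average, rate of change, and relative strength index) in plain Python. The formulas in the table did not survive extraction, so the definitions, the smoothing factor `2 / (n + 1)`, and all function names here are assumptions based on common usage and may differ from the paper's exact formulas.

```python
def simple_moving_average(closes, n):
    """Average of the last n close prices."""
    if n <= 0 or len(closes) < n:
        raise ValueError("need at least n close prices")
    return sum(closes[-n:]) / n


def exponential_moving_average(closes, n):
    """EMA over the whole series with smoothing factor 2 / (n + 1) (assumed)."""
    if n <= 0 or not closes:
        raise ValueError("need a positive n and at least one close price")
    alpha = 2.0 / (n + 1)
    ema = closes[0]
    for price in closes[1:]:
        ema = alpha * price + (1.0 - alpha) * ema
    return ema


def rate_of_change(closes, n):
    """Percentage change between the latest close and the close n days earlier."""
    if n <= 0 or len(closes) < n + 1:
        raise ValueError("need at least n + 1 close prices")
    past = closes[-1 - n]
    return (closes[-1] - past) / past * 100.0


def relative_strength_index(closes, n):
    """RSI = 100 - 100 / (1 + RS), with RS = total gain / total loss over n changes."""
    if n <= 0 or len(closes) < n + 1:
        raise ValueError("need at least n + 1 close prices")
    window = closes[-(n + 1):]
    changes = [window[i + 1] - window[i] for i in range(n)]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        return 100.0  # no down moves in the window
    return 100.0 - 100.0 / (1.0 + gains / losses)


closes = [10.0, 11.0, 12.0, 11.0, 13.0]
print(simple_moving_average(closes, 3))       # 12.0
print(exponential_moving_average(closes, 3))  # 12.0625
print(relative_strength_index(closes, 4))     # 80.0
```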
Fig 2. The architecture of Recurrent Convolutional Neural Kernel (RCNK) model.
Fig 3. The processing of CNN layer.
Fig 4. The processing of LSTM layer.
Label distribution of datasets.
| Dataset | Positive | Negative |
|---|---|---|
| Training data | 0.5220 | 0.4780 |
| Test data | 0.5297 | 0.4703 |
Abbreviations for different models.
| Abbreviation | Description |
|---|---|
| RCNK | Recurrent convolutional neural kernel model |
| RCNK-T | RCNK model only with time series data as input |
| RCNK-S | RCNK model only with sentiment data as input |
| CNK | Convolutional neural kernel model in [ |
| RCNN | Recurrent convolutional neural network model in [ |
Fig 5. Two strategies for collecting posts.
Fig 6. Accuracy of two strategies with different input window length.
Average accumulated returns for two strategies with different input window length.
| Input window length | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Strategy A | 732 | 1660 | 2910 | 1724 | 648 |
| Strategy B | 3947 | 3833 | 3701 | 4244 | 3886 |
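The table varies the input window length, which is read here as the number of consecutive days fed to the model as one input sequence. Under that reading, samples can be built with a sliding window, as in the sketch below. The function name `make_windows`, the data layout, and the choice to pair each window with the label of its last day are hypothetical and not taken from the paper.

```python
def make_windows(daily_features, labels, window_length):
    """Slide a window of window_length days over the series.

    daily_features: list of per-day feature vectors, oldest first.
    labels: one movement label per day (assumed to describe the move
            that follows that day).
    Returns a list of (window, label) pairs, where the label belongs to
    the last day in the window.
    """
    if window_length <= 0:
        raise ValueError("window_length must be positive")
    if len(daily_features) != len(labels):
        raise ValueError("features and labels must have the same length")
    samples = []
    for end in range(window_length - 1, len(daily_features)):
        window = daily_features[end - window_length + 1:end + 1]
        samples.append((window, labels[end]))
    return samples


features = [[1.0], [2.0], [3.0], [4.0]]
labels = [0, 1, 0, 1]
for window, label in make_windows(features, labels, window_length=2):
    print(window, label)
```

With four days of data and a window of two days, this yields three samples, so longer windows trade a few training samples for more temporal context per sample.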
Predictive performance of different models.
| Metric | RCNK | RCNK-T | RCNK-S | CNK | RCNN |
|---|---|---|---|---|---|
| Accuracy | 66.26% | 59.44% | 63.25% | 65.61% | 63.66% |
| MCC | 0.3918 | 0.2747 | 0.3572 | 0.3613 | 0.3111 |
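For readers unfamiliar with the second metric, the Matthews correlation coefficient (MCC) is computed from the four cells of the binary confusion matrix. The sketch below shows the standard formula next to accuracy. The function names and the toy labels are illustrative and do not reproduce the paper's data.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels encoded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy_score(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc_score(y_true, y_pred):
    """MCC = (tp*tn - fp*fn) / sqrt((tp+fp)(tp+fn)(tn+fp)(tn+fn))."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when a row or column is empty
    return (tp * tn - fp * fn) / denominator


y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
print(accuracy_score(y_true, y_pred), mcc_score(y_true, y_pred))
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which makes it a useful complement to accuracy when the two classes are not perfectly balanced.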
Accumulated returns of individual stocks with different models and buy & hold strategy.
| Stock | RCNK | RCNK-T | RCNK-S | CNK | RCNN | Buy & Hold strategy |
|---|---|---|---|---|---|---|
| 000005 | 997 | 723 | -248 | 554 | 529 | -894 |
| 000060 | 154 | 751 | -355 | 683 | -780 | -1172 |
| 000068 | 3098 | -1077 | 1759 | 2408 | 2101 | -2518 |
| 000157 | 2907 | 1382 | -52 | 1466 | 3765 | 188 |
| 000338 | 5376 | 1884 | 4647 | 5459 | 4297 | 433 |
| 000425 | 6776 | 4554 | 6382 | 6661 | 7919 | 1422 |
| 000547 | 8380 | 203 | 8902 | 7647 | 10697 | 3878 |
| 000630 | -710 | -632 | 233 | 166 | -442 | -787 |
| 000751 | -732 | -187 | -702 | -704 | -287 | -1016 |
| 000876 | 7525 | 572 | 5750 | 6222 | 4716 | 3202 |
| 000921 | 3060 | -612 | 2662 | 2708 | 3839 | -1555 |
| 000998 | 5138 | 1274 | 6278 | 11099 | 7277 | 3784 |
| 002299 | 5963 | -65 | 2778 | 2624 | 3031 | -1689 |
| 002714 | 11479 | -53 | 4218 | 9642 | 3871 | 1848 |
| Average | 4244 | 623 | 3018 | 4045 | 3609 | 366 |
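The trading rule behind these accumulated returns is not given in this excerpt. As a rough illustration of how a prediction-driven strategy can be compared with buy and hold, the sketch below uses a hypothetical rule: hold a fixed number of shares for one day whenever the model predicts an upward move, and stay out of the market otherwise. The rule, the `shares` parameter, and the function names are assumptions, and the toy prices are not the paper's data.

```python
def accumulated_return(closes, predictions, shares=100):
    """Sum of one-day profits on days where an upward move was predicted.

    closes: daily close prices, oldest first.
    predictions: predictions[t] == 1 means close[t + 1] is predicted to be
                 higher than close[t]; one prediction per day except the last.
    Hypothetical rule: buy `shares` at close[t], sell at close[t + 1].
    """
    if len(predictions) != len(closes) - 1:
        raise ValueError("need exactly one prediction per day except the last")
    total = 0.0
    for t, predicted_up in enumerate(predictions):
        if predicted_up == 1:
            total += shares * (closes[t + 1] - closes[t])
    return total


def buy_and_hold_return(closes, shares=100):
    """Profit from buying on the first day and selling on the last day."""
    return shares * (closes[-1] - closes[0])


closes = [10.0, 11.0, 10.5, 12.0]
predictions = [1, 0, 1]
print(accumulated_return(closes, predictions))  # 250.0
print(buy_and_hold_return(closes))              # 200.0
```

In this toy series the strategy beats buy and hold because it sits out the one falling day, which is the kind of effect the table measures per stock.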
Fig 7. The t-SNE projections of test data.
(a) The t-SNE projections of the test data after the LSTM layer; (b) the t-SNE projections of the test data after the explicit kernel mapping layer.