Xiaoming Yu, Wenjun Wu, Xingchuang Liao, Yong Han.
Abstract
In a complex and volatile stock market, it is important to design a trading agent that can benefit investors. In this paper, we propose two stock trading decision-making methods. First, we propose a nested reinforcement learning (Nested RL) method built on three deep reinforcement learning models: Advantage Actor Critic (A2C), Deep Deterministic Policy Gradient (DDPG), and Soft Actor Critic (SAC). It adopts an integration strategy that nests a further layer of reinforcement learning on top of these basic decision-makers, so that it can dynamically select an agent according to the current situation and generate trading decisions suited to different market environments. Second, to inherit the advantages of the three basic decision-makers, we take their confidence into account and propose a weight random selection with confidence (WRSC) strategy, which lets investors gain more profit by integrating the advantages of all agents. All algorithms are validated on U.S., Japanese, and British stocks and evaluated with several performance indicators. The experimental results show that the annualized return, cumulative return, and Sharpe ratio of our ensemble strategies are higher than those of the baselines, which indicates that the Nested RL and WRSC methods can assist investors in portfolio management, yielding more profit under the same level of investment risk.
Keywords: Deep reinforcement learning; Investment market; Real-time decision-making; Stock trading
Year: 2022 PMID: 35572052 PMCID: PMC9082989 DOI: 10.1007/s10489-022-03606-0
Source DB: PubMed Journal: Appl Intell (Dordr) ISSN: 0924-669X Impact factor: 5.019
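The abstract describes WRSC only at a high level, so the exact weighting rule is not available in this record. As a minimal sketch of the general idea, assuming each of the three base agents reports a non-negative confidence score and that an agent is drawn with probability proportional to that score, one decision step could look as follows. The function name `select_agent`, the example scores, and the uniform fallback are hypothetical and are not taken from the paper.

```python
import random
from typing import Optional, Sequence


def select_agent(confidences: Sequence[float],
                 rng: Optional[random.Random] = None) -> int:
    """Return the index of one agent, drawn with probability proportional
    to its confidence score (an assumed rule, not the paper's)."""
    if len(confidences) == 0:
        raise ValueError("need at least one agent")
    if any(c < 0 for c in confidences):
        raise ValueError("confidence scores must be non-negative")
    if rng is None:
        rng = random.Random()
    if sum(confidences) == 0:
        # No agent is confident: fall back to a uniform draw.
        return rng.randrange(len(confidences))
    return rng.choices(range(len(confidences)), weights=confidences, k=1)[0]


AGENTS = ("A2C", "DDPG", "SAC")
rng = random.Random(0)                 # fixed seed for reproducibility
scores = [0.2, 0.3, 0.5]               # hypothetical confidence scores
picked = AGENTS[select_agent(scores, rng)]
```

Here the selected agent would supply the trading action for the current step. A random draw keeps every agent with non-zero confidence in play, whereas always taking the most confident agent would ignore the others.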
Fig. 1. The total value of an investor’s stock changes after 3 different actions (buy, hold, and sell)
Fig. 2. A2C model framework in the stock environment
Fig. 3. Framework of the DDPG model
Fig. 4. Framework of nested A2C/DDPG/SAC RL
Fig. 5. Workflow of the Weighted Random Strategy with Confidence (WRSC) strategy
Company symbols or names of the 30 stocks selected in each of the U.S., Japanese, and British markets
| Market | Company Symbol |
|---|---|
| The U.S. stock (30 stocks) | AXP, AMGN, AAPL, BA, CAT, CSCO, CVX, GS, HD, HON, IBM, INTC, JNJ, KO, JPM, MCD, MMM, MRK, MSFT, NKE, PG, TRV, UNH, CRM, VZ, V, WBA, WMT, DIS, DOW |
| The Japanese stock (30 stocks) | Advantest Corp, Alps Electric, Amada, Chiba Bank, Chugai Pharmaceutical, Concordia Financial Group, Dainippon Screen Mfg., DIC Corp, Eneos Holdings Inc, Fujikura Ltd., GS Yuasa Corp., Hitachi Zosen Corp., J.Front Retailing Co., Ltd., Japan Steel Works Ltd, JR West Japan, KDDI, Keisei Electric Railway Co.,Ltd., Konami Corp., Maruha Nichiro Corp, Minebea Mitsumi Inc, NEC, Tokyo Electric Power, Casio, Mazda, Hitachi, Mitsui chemical, Mitsubishi Motors, Sony, Isuzu Motors Ltd., Yamaha |
| The British stock (30 stocks) | ABDN, EDEV, BGUK, BLND, CRH, ENT, FLTRF, HIK, HLMA, ICP, JETJ, MRON, NWG, OCDO, POLYP, PSHP, RMV, RTO, SGE, SMDS, SMT, AZN, CPG, RR, LSEG, SSE, TW, NG, BP, BATS |
Fig. 6. Data splitting of stocks on the U.S./Japanese/British markets
Index values of Nested RL, WRSC and other baselines on 30 U.S. stocks
| 2015/01/01-2021/01/01 | WRSC (ours) | Nested A2C RL (ours) | Nested DDPG RL (ours) | Nested SAC RL (ours) | Multi-DQN | AUST | Random Weight-R | Weight-R | Average-W |
|---|---|---|---|---|---|---|---|---|---|
| Annual return | 12.27% | | 9.23% | 7.57% | 0.326% | 8.51% | 6.36% | 8.77% | 6.28% |
| Cumulative return | 100.69% | | 70.15% | 55.11% | 2.54% | 63.43% | 44.95% | 65.86% | 44.24% |
| Sharpe ratio | 0.81 | | 0.68 | 0.58 | 0.12 | 0.58 | 0.43 | 0.66 | 0.44 |
| Annual volatility | 15.8% | 15.76% | 14.46% | 14.24% | 15.23% | 16.79% | 17.79% | 14.22% | 16.97% |
| Max drawdown | -17.76% | -27.09% | -20.04% | -18.08% | -28.96% | -25.2% | -32.33% | -19.79% | -28.22% |
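The five indicators in these tables can be computed from a series of portfolio values. The sketch below uses common conventions that are assumptions here, since this record does not give the paper's formulas: 252 trading days per year, a zero risk-free rate for the Sharpe ratio, and the sample standard deviation of daily returns. All function names are illustrative.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed number of trading periods per year


def daily_returns(values):
    """Simple period-over-period returns of a portfolio value series."""
    return [values[i] / values[i - 1] - 1.0 for i in range(1, len(values))]


def cumulative_return(values):
    return values[-1] / values[0] - 1.0


def annual_return(values, periods=TRADING_DAYS):
    """Geometric (compound) return per year."""
    n = len(values) - 1  # number of return periods
    return (values[-1] / values[0]) ** (periods / n) - 1.0


def annual_volatility(values, periods=TRADING_DAYS):
    return stdev(daily_returns(values)) * math.sqrt(periods)


def sharpe_ratio(values, periods=TRADING_DAYS):
    """Annualized Sharpe ratio with an assumed risk-free rate of zero."""
    r = daily_returns(values)
    return mean(r) / stdev(r) * math.sqrt(periods)


def max_drawdown(values):
    """Largest peak-to-trough loss, reported as a negative fraction."""
    peak, worst = values[0], 0.0
    for v in values:
        peak = max(peak, v)
        worst = min(worst, v / peak - 1.0)
    return worst


def annualize(cum_return, years):
    """Convert a cumulative return over `years` into a compound annual rate."""
    return (1.0 + cum_return) ** (1.0 / years) - 1.0


toy_values = [100.0, 110.0, 99.0, 120.0]   # made-up portfolio values
toy_drawdown = max_drawdown(toy_values)    # drop from 110 to 99
wrsc_check = annualize(1.0069, 6)          # WRSC cumulative return, U.S. table
```

Under these conventions, the 100.69% cumulative return reported for WRSC over the six-year window corresponds to roughly 12.3% per year, close to the 12.27% annual return in the table. The small gap is consistent with a different day-count convention.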
Index values of Nested RL, WRSC and other baselines on 30 Japanese stocks
| 2015/01/01-2021/01/01 | WRSC (ours) | Nested A2C RL (ours) | Nested DDPG RL (ours) | Nested SAC RL (ours) | Multi-DQN | AUST | Random Weight-R | Weight-R | Average-W |
|---|---|---|---|---|---|---|---|---|---|
| Annual return | 0.64% | | -1.98% | -1.17% | -6.18% | -2.67% | -1.90% | -4.72% | -3.56% |
| Cumulative return | 3.76% | | -10.99% | -6.62% | -33.36% | -14.53% | -10.57% | -24.48% | -18.99% |
| Sharpe ratio | 0.149 | | -0.404 | -0.342 | -0.652 | -0.316 | -0.583 | 0.27 | -0.538 |
| Annual volatility | 5.19% | 45.7% | 4.68% | 3.28% | 10.15% | 7.6% | 3.21% | 69.85% | 6.36% |
| Max drawdown | -6.72% | -37.5% | -10.99% | -11.8% | -34.69% | -19.7% | -16.97% | -62.11% | -21.4% |
Index values of Nested RL, WRSC and other baselines on 30 British stocks
| 2015/01/01-2021/01/01 | WRSC (ours) | Nested A2C RL (ours) | Nested DDPG RL (ours) | Nested SAC RL (ours) | Multi-DQN | AUST | Random Weight-R | Weight-R | Average-W |
|---|---|---|---|---|---|---|---|---|---|
| Annual return | 25.32% | | 23.11% | 22.81% | 9.42% | 25.65% | 17.77% | 25.21% | 14.42% |
| Cumulative return | 289.53% | | 249.87% | 244.82% | 84.74% | 295.74% | 167.79% | 287.36% | 125.16% |
| Sharpe ratio | 0.65 | | 0.62 | 0.61 | 0.51 | 0.65 | 0.54 | 0.65 | 0.50 |
| Annual volatility | 62.47% | 62.11% | 62.80% | 60.17% | 63.15% | 62.93% | 59.73% | 59.87% | 55.48% |
| Max drawdown | -44.23% | -43.40% | -42.14% | -43.51% | -51.67% | -43.14% | -42.55% | -40.03% | -39.65% |
Fig. 7. Trends of cumulative returns of nested RL, WRSC and baselines for 2015/01/01-2020/12/30 on 30 U.S. stocks
Fig. 8. Trends of cumulative returns of nested RL, WRSC and baselines for 2015/01/01-2020/12/30 on 30 Japanese stocks
Fig. 9. Trends of cumulative returns of nested RL, WRSC and baselines for 2015/01/01-2020/12/30 on 30 British stocks
Fig. 10. Trends of cumulative returns of nested A2C RL, WRSC, and the individual DDPG, SAC, and A2C agents on the U.S. stocks
Fig. 11. Decision-making process of AUST on CSCO stock for 2020/09/30-2020/12/30
Fig. 12. Decision-making process of nested A2C RL on CSCO stock for 2020/09/30-2020/12/30