Yi Li, Zichuan Mi, Wenjun Jing.
Abstract
This study uses a textual network to describe how words interact: nodes represent words, and two nodes are connected if the corresponding words co-occur across documents. To study stock movements, we propose the sparse Laplacian shrinkage logistic model (SLS_L), which takes the connectivity structure of the network into account. Using this approach, we investigate the relationship between the Shenwan index and securities analysts' research reports. The reports are crawled from EastMoney, a well-known financial website in China, and then parsed into time-series textual data. The empirical results show that the proposed SLS_L model achieves better prediction performance than the alternatives, including the Lasso-Logistic (L_L) and MCP-Logistic (MCP_L) models. In addition, a search of the published literature shows that the identified keywords have more lucid interpretations. Our study indicates that efficient use of the textual network is important for improving both predictive power and semantic interpretability in stock market analysis.
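
The network construction described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the toy documents, the `min_docs` threshold, and the function names are all assumptions. The quadratic term shown is the generic unsigned graph-Laplacian penalty, which rewards similar coefficients on connected words; the abstract does not state the exact penalty used in SLS_L, so it stands in only as an example of how a model can use network connectivity.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(docs, min_docs=2):
    """Return {(word_a, word_b): n_docs} for pairs that co-occur in >= min_docs documents.

    `docs` is a list of token lists; `min_docs` is a hypothetical threshold.
    """
    counts = Counter()
    for tokens in docs:
        # Count each unordered word pair at most once per document.
        for pair in combinations(sorted(set(tokens)), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_docs}


def laplacian_penalty(edges, beta):
    """Sum of w * (beta_u - beta_v)^2 over edges, i.e. the quadratic form beta' L beta."""
    return sum(
        w * (beta.get(u, 0.0) - beta.get(v, 0.0)) ** 2
        for (u, v), w in edges.items()
    )


# Hypothetical tokenized documents, for illustration only.
docs = [
    ["pessimism", "volume", "listing"],
    ["pessimism", "volume"],
    ["listing", "securitization"],
]
edges = cooccurrence_edges(docs, min_docs=2)
penalty = laplacian_penalty(edges, {"pessimism": 1.0, "volume": 0.5})
```

With these toy inputs only the pair ("pessimism", "volume") survives the threshold, so the penalty depends solely on the gap between those two coefficients.
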
Year: 2020; PMID: 33262381; PMCID: PMC7708485; DOI: 10.1038/s41598-020-77823-3
Source DB: PubMed; Journal: Sci Rep; ISSN: 2045-2322; Impact factor: 4.379
Figure 1. The flowchart of the text mining process.
Figure 2. Keywords network. (a) The simple grid layout of the graph; (b) the nodes are scaled using the Fruchterman–Reingold layout algorithm.
Results of the K-core collapse sequence analysis.
| k-core | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| k-remainder | 5 | 3 | 7 | 3 | 5 | 3 | 3 | 5 | 3 | 1 | 18 |
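
A k-core decomposition of the kind summarized in the table above can be sketched with a simple peeling procedure. This is a minimal sketch under two assumptions: that "k-remainder" means the number of nodes whose core number is exactly k, and that the graph is undirected with no self-loops. The toy graph and the function name are hypothetical and unrelated to the paper's keyword network.

```python
from collections import Counter


def core_numbers(adj):
    """Core number of every node, found by repeatedly peeling a minimum-degree node.

    `adj` maps each node to the set of its neighbours (symmetric, no self-loops).
    """
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Pick the node with the smallest remaining degree.
        u = min(remaining, key=degree.__getitem__)
        # The core level never decreases during peeling.
        k = max(k, degree[u])
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1
    return core


# Toy graph: a triangle a-b-c, a pendant node d attached to a, and an isolated node e.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": set(),
}
core = core_numbers(adj)
# Number of nodes at each core level (the assumed meaning of "k-remainder").
shell_sizes = Counter(core.values())
```

In the toy graph the triangle forms the 2-core, the pendant node drops out at k = 1, and the isolated node at k = 0.
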
Keyword coefficients estimated by the MCP_L, SLS_L, and L_L models.
| Keywords | Coefficient (MCP_L) | Coefficient (SLS_L) | Coefficient (L_L) |
|---|---|---|---|
| Fuzhou (福州) | −1.0156 | −0.3635 | −0.0823 |
| Design (设计) | −6.2383 | −0.9961 | −0.2950 |
| Demographic dividend (人口红利) | −1.4596 | −1.4300 | −0.7243 |
| Intensify (加剧) | −2.3491 | −0.8559 | −0.3031 |
| Input market (入市) | −1.4538 | −0.4034 | −0.1640 |
| Pessimism (悲观) | 6.6403 | 1.0625 | 0.3359 |
| Concern (关注度) | −10.1856 | −1.9920 | −0.8522 |
| Anew (重新) | −1.8171 | −0.3550 | −0.0789 |
| Own funds (自有资金) | −4.1016 | −0.7022 | −0.2374 |
| Zhejiang (浙江) | 0 | −0.2541 | −0.0970 |
| Hotspot (热点) | −0.2639 | 0 | −0.0019 |
| Prefer (首选) | −0.7123 | −0.2818 | −0.0752 |
| Consistent (一致) | 0 | 0 | −0.0087 |
| Listing (上市) | −0.3414 | −0.1252 | −0.0275 |
| Principle (原则) | −3.5691 | 0 | 0 |
| Entrepreneurship (创业) | −4.4760 | −0.9795 | −0.2328 |
| Cool down (冷却) | −4.3086 | −1.0358 | −0.0480 |
| Rise (升至) | −0.4476 | −0.4925 | −0.2201 |
| Rise range (上涨幅度) | 4.5206 | 0.9423 | 0.1197 |
| Chengyu stock (成渝) | −5.2876 | −1.9413 | −0.2541 |
| Real economy (实体经济) | −3.7955 | −1.0270 | −0.1717 |
| Federal Reserve (美联储) | −1.9805 | −0.6351 | 0 |
| Circulation (流通) | −1.1935 | −0.5240 | −0.1364 |
| Sign (签署) | −1.3929 | −0.3504 | −0.0544 |
| Volume of trade (放量) | −2.0652 | −0.3127 | −0.0281 |
| Securitization (证券化) | −1.7006 | −0.4444 | −0.1080 |
| Explore (探索) | 0 | 0.4535 | 0 |
| Imagine (想象) | 7.2612 | 1.9061 | 0.4485 |
| Effectiveness (效用) | 6.5592 | 1.9769 | 0.4402 |
| Multiple (倍数) | 2.9660 | 1.1980 | 0.0678 |
| Comprehensive planning (总体规划) | 5.9179 | 0.9804 | 0.2849 |