| Literature DB >> 34925482 |
Sumei Ruan¹, Xusheng Sun¹, Ruanxingchen Yao², Wei Li¹.
Abstract
To detect comprehensive clues and provide more accurate forecasting in the early stage of financial distress, researchers have emphasized, in addition to financial indicators, the digitalization of lengthy but indispensable textual disclosures such as Management Discussion and Analysis (MD&A). However, most studies split the long text into words and represent it as word-count vectors, which introduces massive invalid information while ignoring meaningful context. To represent large-scale text efficiently, this study proposes an end-to-end neural network model based on hierarchical self-attention, after a state-of-the-art pretrained model is introduced for context-aware text embedding. The proposed model has two notable characteristics. First, the hierarchical self-attention assigns high weights only to essential content at the word and sentence levels and automatically neglects the large amount of information that is irrelevant to risk prediction, making it suitable for extracting the effective parts of large-scale text. Second, after fine-tuning, the word embeddings adapt to the specific contexts of the samples and convey the original text more accurately without excessive manual processing. Experiments confirm that adding text improves the accuracy of financial distress forecasting and that the proposed model outperforms benchmark models on AUC and F2-score. For visualization, the elements of the hierarchical self-attention weight matrix act as scalars that estimate the importance of each word and sentence. In this way, "red-flag" statements that imply financial risk are identified and highlighted in the original text, providing effective references for decision-makers.
Year: 2021 PMID: 34925482 PMCID: PMC8683239 DOI: 10.1155/2021/1165296
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Flow chart of the proposed deep learning model.
Figure 2. The architecture of hierarchical attention networks (HAN).
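The hierarchical structure shown in Figure 2 can be sketched as two stacked attention-pooling steps: word vectors are pooled into a sentence vector, and sentence vectors are pooled into a document vector. A minimal NumPy sketch with random toy vectors (the names `attention_pool`, `u_word`, and `u_sent` are illustrative, not the paper's code):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(H, w):
    """Score each row of H against context vector w, then return the
    attention-weighted sum of rows along with the weights."""
    scores = H @ w                # (n,)
    alpha = softmax(scores)       # attention weights, sum to 1
    return alpha @ H, alpha       # pooled vector (d,), weights (n,)

rng = np.random.default_rng(0)
d = 8                             # toy embedding size
doc = [rng.normal(size=(5, d)),   # sentence 1: 5 word vectors
       rng.normal(size=(3, d)),   # sentence 2: 3 word vectors
       rng.normal(size=(7, d))]   # sentence 3: 7 word vectors

u_word = rng.normal(size=d)       # word-level context vector
u_sent = rng.normal(size=d)       # sentence-level context vector

# Word-level attention: one pooled vector per sentence.
sent_vecs = np.stack([attention_pool(S, u_word)[0] for S in doc])

# Sentence-level attention: one pooled vector per document.
doc_vec, sent_alpha = attention_pool(sent_vecs, u_sent)
```

The same per-sentence weights returned by `attention_pool` are what the visualization step later reads off to rank words and sentences by importance.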
The sample distribution of the original dataset.
| Class | Number |
|---|---|
| Positive samples (titled “ST” in the next 2 years) | 862 |
| Negative samples | 11142 |
| Total samples | 12004 |
The sample distribution of the dataset after reducing the negative samples.
| Class | Number |
|---|---|
| Positive samples (titled “ST” in the next 2 years) | 862 |
| Negative samples | 2978 |
| Total samples | 3840 |
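The drop from 11142 to 2978 negative samples, with the 862 positives unchanged, suggests the negatives were randomly undersampled to ease class imbalance. A hypothetical sketch of that step (the paper's actual sampling procedure is not given here, so the function below is only an assumption):

```python
import random

def undersample(pos, neg, n_neg, seed=42):
    # Keep all positives; randomly keep only n_neg negatives.
    rng = random.Random(seed)
    return pos, rng.sample(neg, n_neg)

pos = list(range(862))      # 862 positive ("ST") sample ids
neg = list(range(11142))    # 11142 negative sample ids
pos_kept, neg_kept = undersample(pos, neg, 2978)
# totals: 862 positives + 2978 negatives = 3840 samples
```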
Evaluation of models on 48 financial ratios.
| Features | Model | AUC | Precision | Recall | F1-score | F2-score |
|---|---|---|---|---|---|---|
| FIN | LR | 0.6768 |  | 0.372 | 0.5166 | 0.4189 |
|  | SVM | 0.7506 | 0.7768 | 0.5465 | 0.6416 | 0.5809 |
|  | XGB | 0.7222 |  |  |  |  |
|  | RF | 0.7829 | 0.7448 | 0.6279 | 0.6814 | 0.6482 |
|  | ANN | 0.7337 | 0.644 | 0.5581 | 0.5980 | 0.5734 |
|  | AdaBoost | 0.7933 | 0.7604 | 0.6453 | 0.6981 | 0.6654 |
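The F1 and F2 values in the table follow from precision and recall via the F-beta score, F_beta = (1 + β²)·P·R / (β²·P + R), where β > 1 weights recall more heavily (the abstract's F2-score uses β = 2). A quick check against the SVM row above:

```python
def f_beta(precision, recall, beta=1.0):
    # F-beta: weighted harmonic mean of precision and recall;
    # beta > 1 favours recall, beta < 1 favours precision.
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# SVM row of the table above: precision 0.7768, recall 0.5465
f1 = f_beta(0.7768, 0.5465, beta=1)   # ≈ 0.6416, matching the F1 column
f2 = f_beta(0.7768, 0.5465, beta=2)   # ≈ 0.5809, matching the F2 column
```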
Evaluation of models on both 48 financial ratios and text.
| Features | Model | AUC | Precision | Recall | F1-score | F2-score |
|---|---|---|---|---|---|---|
| FIN + BOW | LR | 0.7203 | 0.8515 | 0.4826 | 0.6160 | 0.5284 |
|  | SVM | 0.7729 |  | 0.5258 | 0.6594 | 0.5708 |
|  | XGB | 0.8115 | 0.7356 | 0.7035 | 0.7192 | 0.7097 |
|  | RF | 0.7634 | 0.6357 | 0.6121 | 0.6237 | 0.6167 |
|  | ANN | 0.7636 | 0.5720 | 0.6962 | 0.6280 | 0.6672 |
|  | AdaBoost | 0.8071 | 0.7214 | 0.6860 |  | 0.6986 |
| FIN + TXT |  |  | 0.6656 | 0.6951 |  |  |
Figure 3. A page from the MD&A of a positive sample. In the sentence-level attention, the three sentences corresponding to the top three column-weight totals, which best summarize the article, are highlighted. Similarly, the keywords in each sentence are marked according to the word-level attention (the word-level attention weights themselves are not depicted here).
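The highlighting described in the caption amounts to ranking sentences by the total attention they receive, i.e., the column sums of the sentence-level attention weight matrix, and keeping the top three. A toy sketch (the 4×5 matrix below is invented for illustration; each row sums to 1 as attention weights do):

```python
import numpy as np

def top_k_sentences(weights, k=3):
    """Rank sentences by the total (column-sum) attention they receive.
    weights: (n_queries, n_sentences) attention matrix, rows sum to 1."""
    totals = weights.sum(axis=0)                     # column sums
    return sorted(np.argsort(totals)[::-1][:k].tolist())

# Toy attention matrix: columns 1 and 3 receive most of the weight.
W = np.array([[0.05, 0.40, 0.05, 0.45, 0.05],
              [0.10, 0.35, 0.05, 0.40, 0.10],
              [0.05, 0.45, 0.10, 0.30, 0.10],
              [0.05, 0.30, 0.05, 0.50, 0.10]])
flagged = top_k_sentences(W, k=3)   # indices of the 3 highest-scoring sentences
```

The indices in `flagged` would then be used to highlight the corresponding sentences in the original MD&A page; the same column-sum ranking applied to a word-level weight matrix picks out the keywords within each sentence.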