Lei Zhang, Qiankun Song.
Abstract
Because credit risk assessment is difficult for small- and medium-sized enterprises (SMEs), their financing and loan difficulties are particularly prominent, which hinders the operation and development of these enterprises. Building on previous research, this paper first selects features with the correlation coefficient method and a gradient boosting decision tree (GBDT). Then, with the help of SE-Block, an attention mechanism is applied to the feature tensors of the subsets separated from the metadata. On this foundation, two models, XGBoost and LightGBM, are trained on four subsets, respectively, and Bayesian ridge regression fuses the training results of the single models across subsets. In the simulation experiment, the AUC of the NN-ATT-Bayesian-Stacking model reaches 0.9675 and the distribution of its predictions is ideal. The model shows good robustness and can provide a reliable assessment for the financing and loans of SMEs.
Year: 2022 PMID: 35237312 PMCID: PMC8885257 DOI: 10.1155/2022/8612759
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Network construction of the SE-Block module.
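The squeeze-and-excitation (SE) block in Figure 1 can be sketched in plain NumPy: global-average-pool the feature axis ("squeeze"), pass the result through a bottleneck MLP gated by a sigmoid ("excitation"), and rescale the channels of the input tensor. The tensor layout, weight shapes, and reduction ratio below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation over a (batch, length, channels) feature tensor.

    w1: (channels, channels // r) bottleneck reduction weights
    w2: (channels // r, channels) expansion weights
    """
    # Squeeze: global average pool over the feature axis -> (batch, channels)
    z = x.mean(axis=1)
    # Excitation: bottleneck MLP, ReLU then sigmoid gate in (0, 1)
    s = sigmoid(np.maximum(z @ w1, 0.0) @ w2)
    # Scale: reweight each channel of the original tensor
    return x * s[:, None, :]
```

Because the gate lies in (0, 1), the block can only attenuate channels, never amplify them, which is what lets it act as a soft attention weight over features.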
Correlation coefficient selection.
| Accuracy | Metadata (%) | Subset 1 (%) | Subset 2 (%) |
|---|---|---|---|
| Random forest | 91.139 | 89.255 | 91.051 |
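The correlation-coefficient screening step above can be sketched as follows: compute the Pearson correlation of each feature column with the label and keep those above a threshold. The threshold value and the data are illustrative assumptions; the paper does not publish its cutoff.

```python
import numpy as np

def select_by_correlation(X, y, threshold=0.1):
    """Return indices of feature columns whose absolute Pearson correlation
    with the label exceeds `threshold` (an assumed example value)."""
    corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    keep = np.where(np.abs(corrs) > threshold)[0]
    return keep, corrs
```

Columns that are nearly constant should be removed beforehand, since `np.corrcoef` returns NaN when a column has zero variance.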
Tree model selection.
| Accuracy | Metadata (%) | Subset 3 (%) | Subset 4 (%) |
|---|---|---|---|
| Random forest | 91.139 | 91.139 | 88.911 |
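The GBDT-based half of the feature screening can be sketched with scikit-learn's impurity-based importances: fit a gradient boosting classifier and keep the top-ranked features. The synthetic dataset and the "top half" cutoff are assumptions standing in for the SME credit data, which the paper does not publish.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Assumed illustrative setup: a synthetic binary-classification task.
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           random_state=0)
gbdt = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)

# Rank features by impurity-based importance and keep the top half.
order = np.argsort(gbdt.feature_importances_)[::-1]
selected = order[:5]
```

Impurity-based importances are normalized to sum to 1, so the ranking can be read directly off `feature_importances_` without further scaling.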
Figure 2. Model structure.
Experimental environment configuration.
| OS | Ubuntu 16.04.5 LTS 64 bits |
|---|---|
| CPU | E5-2620 v3 @ 2.40 GHz, ×2 |
| GPU | GTX TITAN |
| CUDA | 9.0.176 |
| TensorFlow | 2.0.0 |
| Scikit-learn | 0.23.1 |
AUC values of the base models.
| Classifier | Raw | Subset 1 | Subset 2 | Subset 3 | Subset 4 |
|---|---|---|---|---|---|
| Logistic | 0.8980 | 0.8992 | 0.8925 | 0.8956 | 0.8962 |
| Linear | 0.9034 | 0.8880 | 0.8855 | 0.8868 | 0.8869 |
| Bayesian | 0.8921 | 0.8942 | 0.8892 | 0.8908 | 0.8986 |
| LGB | 0.9266 | 0.9248 | 0.9155 | 0.9255 | 0.9231 |
| XGB | 0.9285 | 0.9304 | 0.9238 | 0.9294 | 0.9185 |
| NN | 0.9198 | 0.9109 | 0.9123 | 0.9114 | 0.9044 |
| NN-ATT | 0.9035 | 0.9101 | 0.9003 | 0.9017 | 0.9019 |
| CNN | 0.9257 | 0.9214 | 0.9158 | 0.9101 | 0.9137 |
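The AUC values reported in the tables can be computed with scikit-learn's `roc_auc_score` from predicted probabilities. The labels and scores below are a made-up toy example, not the paper's data.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical true labels and predicted positive-class probabilities.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc_score(y_true, y_score)
```

AUC is the probability that a randomly chosen positive sample is scored above a randomly chosen negative one; here three of the four positive–negative pairs are ordered correctly, giving 0.75.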
AUC values of the stacking models.
| Classifier | Raw | Subset 1 | Subset 2 | Subset 3 | Subset 4 |
|---|---|---|---|---|---|
| NN-XGB | 0.9299 | 0.9176 | 0.9287 | 0.9191 | 0.9203 |
| NN-ATT-XGB | 0.9299 | 0.9217 | 0.9318 | 0.9128 | 0.9194 |
| CNN-XGB | 0.9299 | 0.9275 | 0.9267 | 0.9188 | 0.9232 |
| NN-LGB | 0.9266 | 0.9195 | 0.9167 | 0.9128 | 0.9225 |
| NN-ATT-LGB | 0.9266 | 0.9127 | 0.9173 | 0.9208 | 0.9229 |
| CNN-LGB | 0.9266 | 0.9215 | 0.9161 | 0.9292 | 0.9228 |
AUC values of the fusion models.
| Classifier | Raw | Subset |
|---|---|---|
| Bayesian-stacking | 0.9475 | 0.9645 |
| Linear-stacking | 0.9644 | 0.9644 |
| Logistic-stacking | 0.9673 | 0.9673 |
| CNN-Bayesian-stacking | 0.9489 | 0.9675 |
| CNN-Linear-stacking | 0.9489 | 0.9674 |
| CNN-Logistic-stacking | | |
| NN-Bayesian-stacking | 0.9489 | 0.9709 |
| NN-Linear-stacking | 0.9489 | 0.9658 |
| NN-Logistic-stacking | 0.9477 | 0.9775 |
| NN-ATT-Bayesian-stacking | 0.9489 | 0.9675 |
| NN-ATT-Linear-stacking | 0.9489 | 0.9675 |
| NN-ATT-Logistic-stacking | 0.9492 | 0.9711 |
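The Bayesian-ridge fusion step behind these models can be sketched as a second-level learner fitted on the base models' predictions. The out-of-fold probabilities below are hypothetical placeholders for XGBoost/LightGBM outputs, since the paper's intermediate predictions are not published.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

# Hypothetical out-of-fold probabilities from two base models
# (columns, e.g. XGBoost and LightGBM) on four samples.
base_preds = np.array([[0.2, 0.3],
                       [0.8, 0.7],
                       [0.1, 0.2],
                       [0.9, 0.6]])
y = np.array([0, 1, 0, 1])

# Fit the Bayesian ridge meta-learner on the stacked predictions.
fuser = BayesianRidge().fit(base_preds, y)
fused = fuser.predict(base_preds)
```

In a real stacking pipeline the meta-learner must be fitted on out-of-fold predictions rather than in-sample ones, otherwise the fusion stage overfits to the base models' training leakage.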
Figure 3. Distribution of predictions by the evaluation model.