Hendra Suryanto1, Ashesh Mahidadia1,2, Michael Bain2, Charles Guan1, Ada Guan1.
Abstract
In credit risk assessment, lenders may have limited or no data on the historical lending outcomes of credit applicants. This typically and disproportionately affects Micro, Small, and Medium Enterprises (MSMEs), for which credit may be restricted or too costly because of the difficulty of predicting the Probability of Default (PD). However, if data from other related credit risk domains is available, Transfer Learning may be applied to successfully train models, e.g., from the credit card lending and debt consolidation (CD) domains to predict in the small business lending domain. In this article, we report successful results from an approach using transfer learning to predict the probability of default based on the novel concept of Progressive Shift Contribution (PSC) from source to target domain. Toward real-world application of this approach by lenders, we further address two key questions. The first is how to explain transfer learning models, and the second is how to adjust features when the source and target domains differ. To address the first question, we apply Shapley values to investigate how and why transfer learning improves model accuracy; to address the second, we propose and test a domain adaptation approach. The results show that adaptation improves model accuracy in addition to the improvement from transfer learning. We extend this by proposing and testing a combined strategy of feature selection and adaptation to convert values of source domain features to better approximate values of target domain features. Our approach includes a strategy to choose features for adaptation and an algorithm to adapt the values of these features. In this setting, transfer learning appears to improve model accuracy by increasing the contribution of less predictive features. Although the percentage improvements are small, such improvements in real-world lending could be of significant economic importance.
Keywords: credit risk; deep learning; domain adaptation; explainable AI; transfer learning
Year: 2022 PMID: 35592649 PMCID: PMC9110803 DOI: 10.3389/frai.2022.868232
Source DB: PubMed Journal: Front Artif Intell ISSN: 2624-8212
Figure 1. Network u: the base model.
Figure 2. Network v.
Figure 3. Network N1N2N3N4.
Figure 4. Network N1 and Network N1N2.
Figure 5. Network N1N2N3.
The datasets used in the transfer learning studies are listed below.
| Dataset | Period | Size | Type | Result, mean (s.d.) |
|---|---|---|---|---|
| CD1 | 2007–2011 | 23,813 | Source | 0.364(0.023) |
| SB1 | 2007–2011 | 1,831 | Target | 0.272(0.067) |
| CD2 | 2007–2014 | 100,000 | Source | 0.417(0.016) |
| SB2 | 2007–2014 | 6,686 | Target | 0.274(0.040) |
| CD3 | 2007–2016 | 100,000 | Source | 0.447(0.013) |
| SB3 | 2007–2016 | 12,114 | Target | 0.331(0.032) |
| CD4 | 2007–2018 | 100,000 | Source | 0.448(0.012) |
| SB4 | 2007–2018 | 13,794 | Target | 0.351(0.024) |
| CCD | 2007–2018 | 100,000 | Source | 0.463(0.014) |
| CAR | 2007–2018 | 12,734 | Target | 0.436(0.036) |
The “Type” column shows whether the dataset is used as the source or the target in the transfer learning process; the results are means and standard deviations (s.d.) over cross-validation runs.
Experimental results for six models with progressively shifted contribution, built on the source and target datasets described in Table 1 (all of the source: Credit Card/Debt Consolidation (CD), target: Small Business (SB) Loan datasets, plus the source: Credit Card, target: Car Loan datasets). Results are shown as means (s.d.); the model with the highest performance in each column is marked with *.
| Model | SB1 | SB2 | SB3 | SB4 | CAR |
|---|---|---|---|---|---|
| | 0.157(0.022) | 0.236(0.051) | −0.191(0.260) | 0.196(0.026) | 0.262(0.355) |
| | *0.301(0.097) | *0.287(0.051) | 0.334(0.029) | 0.350(0.029) | *0.447(0.037) |
| | 0.292(0.091) | 0.272(0.054) | *0.337(0.030) | 0.350(0.028) | 0.434(0.035) |
| | 0.230(0.087) | 0.217(0.057) | 0.300(0.032) | 0.310(0.030) | 0.376(0.040) |
| | 0.174(0.010) | 0.172(0.051) | 0.254(0.029) | 0.273(0.030) | 0.310(0.050) |
| | 0.272(0.067) | 0.274(0.040) | 0.331(0.032) | *0.351(0.024) | 0.436(0.036) |
| % improvement | 10.7% | 4.7% | 1.8% | 0.0% | 2.5% |
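The “% improvement” row appears consistent with the relative gain of the best (starred) result in each column over the last model row, whose values match the target dataset results listed in the dataset table. A minimal Python check of that arithmetic, assuming this reading of the row and assuming the columns follow the order SB1, SB2, SB3, SB4, CAR (the variable names are illustrative and not from the article):

```python
# Best (starred) result and last-model-row result per column, copied from the table.
best = {"SB1": 0.301, "SB2": 0.287, "SB3": 0.337, "SB4": 0.351, "CAR": 0.447}
last_row = {"SB1": 0.272, "SB2": 0.274, "SB3": 0.331, "SB4": 0.351, "CAR": 0.436}

# Assumed definition: relative gain of the best model over the last-row model, in percent.
improvement = {
    name: round(100 * (best[name] - last_row[name]) / last_row[name], 1)
    for name in best
}
print(improvement)
```

Under this assumption the computed percentages reproduce the published row for all five columns.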
Figure 6. Feature contribution comparison for transferred, target, and source models using SHAP (transferring from CD to MD).
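Shapley values attribute a prediction to individual input features by averaging each feature's marginal contribution over all orders in which the features can be added. The sketch below computes exact Shapley values for a toy scorer by enumerating permutations. The scorer, its weights, the feature names, and the all-zero baseline are illustrative assumptions; this is not the article's network and not the estimator used by the SHAP library.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]  # switch feature i from its baseline to its actual value
            updated = predict(current)
            phi[i] += updated - previous  # marginal contribution in this order
            previous = updated
    return [total / len(orders) for total in phi]

def toy_score(features):
    # Hypothetical scorer: linear terms plus one interaction between grade and dti.
    grade, dti, cover = features
    return 2.0 * grade + 3.0 * dti - 1.0 * cover + 0.5 * grade * dti

x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
print(dict(zip(["grade", "dti", "cover"], phi)))
```

The attributions sum to the difference between the score at `x` and the score at the baseline, and the interaction term is split equally between the two features involved. Exact enumeration grows factorially with the number of features, so it is only practical for very small examples.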
Target model vs. transferred model.
| Transfer | Training | Result, mean (s.d.) | Improvement | p-value |
|---|---|---|---|---|
| CD to MD | Training using Target only | 0.5971(0.0823) | | |
| CD to MD | Training using Source then retraining the last layer using Target | 0.6391(0.0856) | 0.0420 (7.0%) | <0.01 |
| CD to SB | Training using Target only | 0.6194(0.0456) | | |
| CD to SB | Training using Source then retraining the last layer using Target | 0.6419(0.0509) | 0.0224 (3.6%) | <0.01 |
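The transferred models in the table are trained on the source data and then have only their last layer retrained on the target data. A minimal pure-Python sketch of that freeze-and-retrain pattern follows. The tiny network, the synthetic data, and every hyperparameter are assumptions chosen for illustration; they do not reproduce the article's architecture or datasets.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_data(n, shift, rng):
    """Synthetic binary-outcome rows; `shift` moves the decision boundary."""
    rows = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        rows.append((x, 1 if x[0] + x[1] + shift > 0 else 0))
    return rows

class TinyNet:
    """One tanh hidden layer followed by a logistic output unit."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self.hidden(x)
        return sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)

    def fit(self, data, epochs, lr, last_layer_only=False):
        for _ in range(epochs):
            for x, y in data:
                h = self.hidden(x)
                p = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
                err = p - y  # gradient of the log loss with respect to the output logit
                if not last_layer_only:
                    # Hidden-layer update, skipped when the layer is frozen.
                    for j, row in enumerate(self.w1):
                        g = err * self.w2[j] * (1.0 - h[j] ** 2)
                        for i in range(len(row)):
                            row[i] -= lr * g * x[i]
                        self.b1[j] -= lr * g
                for j in range(len(self.w2)):
                    self.w2[j] -= lr * err * h[j]
                self.b2 -= lr * err

rng = random.Random(0)
source = make_data(400, 0.0, rng)   # large source dataset
target = make_data(40, 0.3, rng)    # small, shifted target dataset

net = TinyNet(2, 4, rng)
net.fit(source, epochs=20, lr=0.1)                  # step 1: train on source
frozen_w1 = [row[:] for row in net.w1]
output_before = net.w2[:]
net.fit(target, epochs=50, lr=0.1, last_layer_only=True)  # step 2: retrain last layer on target
accuracy = sum((net.predict(x) > 0.5) == (y == 1) for x, y in target) / len(target)
print(f"target accuracy after retraining the last layer: {accuracy:.2f}")
```

Only the output weights change in the second step; the hidden layer keeps the representation learned from the source data, which is the point of the approach when target data is scarce.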
Kolmogorov-Smirnov (KS) statistics of input features.
| # | Feature | KS (CD to MD) | p-value (CD to MD) | KS (CD to SB) | p-value (CD to SB) |
|---|---|---|---|---|---|
| 1 | term_36m | 0.0407 | <0.24 | 0.0357 | <0.20 |
| 2 | term_60m | 0.0407 | <0.24 | 0.0357 | <0.20 |
| 3 | grade_n | 0.0792 | <0.01 | 0.0984 | <0.01 |
| 4 | sub_grade_n | 0.0884 | <0.01 | 0.1069 | <0.01 |
| 5 | int_rate_n | 0.0941 | <0.01 | 0.1033 | <0.01 |
| 6 | revol_util_n | | <0.01 | **0.2292** | <0.01 |
| 7 | emp_length_n | 0.0242 | <0.85 | 0.0749 | <0.01 |
| 8 | dti_n | | <0.01 | **0.2295** | <0.01 |
| 9 | installment_n | **0.3005** | <0.01 | 0.0671 | <0.01 |
| 10 | annual_inc_n | 0.0670 | <0.01 | 0.0906 | <0.01 |
| 11 | loan_amnt_n | **0.2899** | <0.01 | 0.0813 | <0.01 |
| 12 | cover | **0.2736** | <0.01 | 0.0585 | <0.01 |
Values in bold represent large differences that are statistically significant.
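The two-sample KS statistic is the largest vertical gap between the empirical cumulative distribution functions of two samples, so a larger value indicates a bigger difference between the source and target distributions of a feature. A minimal pure-Python sketch of the statistic (the function name and sample values are illustrative; it does not compute the p-values reported in the table, for which a library routine such as `scipy.stats.ks_2samp` would normally be used):

```python
from bisect import bisect_right

def ks_statistic(a, b):
    """Two-sample KS statistic: the maximum gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # Both empirical CDFs are step functions, so the maximum gap occurs at a data point.
    for v in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, v) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

identical = ks_statistic([1, 2, 3], [1, 2, 3])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
overlapping = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
print(identical, disjoint, overlapping)
```

Identical samples give 0, fully separated samples give 1, and partially overlapping samples fall in between.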
Adapted model vs. transferred model.
| Transfer | Model | Result, mean (s.d.) | Improvement | p-value |
|---|---|---|---|---|
| CD to MD | Transfer only | 0.6391(0.0856) | | |
| CD to MD | Transfer with cover adapted | 0.6491(0.0824) | 0.0100 (1.6%) | <0.01 |
| CD to SB | Transfer only | 0.6419(0.0509) | | |
| CD to SB | Transfer with cover adapted | 0.6361(0.0502) | –0.0058 (–0.9%) | <0.01 |
Adapted model vs. transferred model in CD to MD transfer.
| Model | Result, mean (s.d.) | Improvement | p-value |
|---|---|---|---|
| Transfer only | 0.6391(0.0856) | | |
| Adapt all features | 0.4620(0.3048) | –0.1771 (–27.7%) | <0.01 |
| Adapt credit grade and related features, i.e., grade, sub-grade, interest rate | 0.4635(0.3052) | –0.1756 (–27.5%) | <0.01 |
| Adapt features with high KS, i.e., revolving utility, debt to income ratio, installment, loan amount and cover | 0.6563(0.0806) | 0.0172 (2.7%) | <0.01 |
| Adapt features with high KS and related features, i.e., revolving utility, debt to income ratio, installment, loan amount, cover and annual income | 0.6600(0.07417) | 0.0209 (3.3%) | <0.01 |
| Adapt features with high KS and related features less credit history features, i.e., installment, loan amount, cover and annual income | 0.6649(0.0731) | 0.0257 (4.0%) | <0.01 |
Adapted model vs. transferred model in CD to SB transfer.
| Model | Result, mean (s.d.) | Improvement | p-value |
|---|---|---|---|
| Transfer only | 0.6419(0.0509) | | |
| Adapt all features | 0.5189(0.1666) | –0.123 (–19.2%) | <0.01 |
| Adapt credit grade and related features, i.e., grade, sub-grade, interest rate | 0.5313(0.1624) | –0.1106 (–17.2%) | <0.01 |
| Adapt features used in CD to MD transfer, i.e., installment, loan amount, cover, annual income | 0.6404(0.0475) | –0.0015 (–0.2%) | <0.01 |
| Adapt features with high KS, i.e., revolving utility and debt to income ratio | 0.6437(0.0495) | 0.0018 (0.3%) | <0.01 |
Kolmogorov-Smirnov test comparing source data and target data before and after the source data is adapted. ACD is the abbreviation for Adapted Credit card and Debt consolidation data.
| Feature | Transfer (before) | KS (before) | p-value (before) | Transfer (after) | KS (after) | p-value (after) | KS reduction |
|---|---|---|---|---|---|---|---|
| installment | CD to MD | 0.3005 | <0.01 | ACD to MD | 0.0293 | <0.64 | 90.3% |
| annual_inc | CD to MD | 0.0670 | <0.01 | ACD to MD | 0.0369 | <0.34 | 44.8% |
| loan_amnt | CD to MD | 0.2899 | <0.01 | ACD to MD | 0.0681 | <0.01 | 76.5% |
| cover | CD to MD | 0.2736 | <0.01 | ACD to MD | 0.0892 | <0.01 | 67.4% |
| revol_util | CD to SB | 0.2292 | <0.01 | ACD to SB | 0.0536 | <0.01 | 76.6% |
| dti | CD to SB | 0.2295 | <0.01 | ACD to SB | 0.0301 | <0.39 | 86.9% |
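The last column of the table appears consistent with the relative reduction in the KS statistic after adaptation, up to rounding of the published values. The sketch below illustrates how adapting a source feature can shrink the KS gap to the target. It uses empirical quantile mapping as a stand-in technique, since the adaptation algorithm itself is not described in this record; the function names, the synthetic "loan amount" samples, and the sample sizes are all assumptions for illustration.

```python
import random
from bisect import bisect_right

def ks_statistic(a, b):
    """Two-sample KS statistic: the maximum gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    for v in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, v) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

def quantile_map(source_values, target_values):
    """Stand-in adaptation: replace each source value by the target value
    at the same empirical quantile."""
    src_sorted = sorted(source_values)
    tgt_sorted = sorted(target_values)
    adapted = []
    for v in source_values:
        q = bisect_right(src_sorted, v) / len(src_sorted)  # quantile in (0, 1]
        idx = max(0, min(len(tgt_sorted) - 1, round(q * len(tgt_sorted)) - 1))
        adapted.append(tgt_sorted[idx])
    return adapted

rng = random.Random(0)
source = [rng.gauss(10_000, 3_000) for _ in range(2_000)]  # synthetic source feature
target = [rng.gauss(14_000, 4_000) for _ in range(300)]    # synthetic target feature

adapted = quantile_map(source, target)
ks_before = ks_statistic(source, target)
ks_after = ks_statistic(adapted, target)
reduction = (ks_before - ks_after) / ks_before
print(f"KS before: {ks_before:.4f}, after: {ks_after:.4f}, reduction: {reduction:.1%}")

# Reading the table's last column as relative KS reduction, checked on the loan_amnt row.
loan_amnt_reduction = round(100 * (0.2899 - 0.0681) / 0.2899, 1)
```

Quantile mapping forces the adapted source distribution to match the target almost exactly, which is a stronger alignment than the residual KS values in the table suggest, so the article's method evidently differs from this sketch.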