Jong Wook Lee, So Young Sohn.
Abstract
Potential relationships among loan applicants can provide valuable information for evaluating default risk. However, most existing credit scoring models either ignore these relationships or consider only simple connection information. This study assesses the relationships among applicants in terms of distances estimated from their characteristics. This information is then used in a proposed spatial probit model so that the differing degrees of relatedness among borrowers are reflected in the default prediction for each loan applicant. We apply this method to peer-to-peer Lending Club loan data. Empirical results show that accounting for spatial autocorrelation among loan applicants can provide high predictive power for defaults.
Year: 2021 PMID: 34972129 PMCID: PMC8719753 DOI: 10.1371/journal.pone.0261737
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
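As a rough illustration of the idea in the abstract, the sketch below builds a spatial weight matrix from distances between applicants' characteristics and propagates a linear risk score through it, following a spatial autoregressive latent equation of the form z = ρWz + Xβ. The inverse-distance weighting, the row normalization, the fixed-point solver, and all function names are assumptions made for illustration; the abstract does not specify how the authors construct the weights or estimate the model.

```python
import math

def row_normalized_weights(features):
    """Build a row-normalized inverse-distance weight matrix W (an assumed scheme)."""
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(features[i], features[j])
                W[i][j] = 1.0 / (d + 1e-9)  # closer applicants get larger weights
        total = sum(W[i])
        if total > 0:
            W[i] = [w / total for w in W[i]]  # each row sums to 1
    return W

def sar_latent_mean(W, xb, rho, iters=200):
    """Solve z = rho * W z + xb by fixed-point iteration.

    Converges for |rho| < 1 because W is row-normalized.
    """
    z = list(xb)
    for _ in range(iters):
        z = [rho * sum(w * zj for w, zj in zip(row, z)) + b
             for row, b in zip(W, xb)]
    return z

# Hypothetical applicants described by two standardized characteristics
features = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
xb = [0.2, -0.1, 0.4]  # hypothetical linear predictors X @ beta
W = row_normalized_weights(features)
z_independent = sar_latent_mean(W, xb, rho=0.0)  # no spatial dependence
z_spatial = sar_latent_mean(W, xb, rho=0.5)      # scores pulled toward similar applicants
```

With `rho=0.0` the latent scores equal the ordinary linear predictors; with a positive `rho`, each applicant's score also depends on the scores of applicants with similar characteristics.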
Description of attributes used in this study.
| Type | Variable | Definition |
|---|---|---|
| Numeric | Annual income | The annual income provided by the borrower during registration |
| | Debt to income | The borrower’s debt-to-income ratio: monthly payments on the total debt obligations, excluding mortgage, divided by self-reported monthly income |
| | Inquiries in the last six months | The number of inquiries by creditors during the past 6 months |
| | Loan amount | The listed amount of the loan applied for by the borrower |
| | Open accounts | The number of open credit lines in the borrower’s credit file |
| | Revolving balance | The total credit revolving balance |
| | Revolving utilization rate | The amount of credit the borrower is using relative to all available revolving credit |
| Categorical | Employment length | Employment length in years: integers between 0 and 10, with 0 meaning less than one year and 10 meaning ten or more years |
| | Grade | Lending Club categorizes borrowers into seven loan grades from A down to G, with A being the safest |
| | Home ownership | The home ownership status provided by the borrower during registration: rent, own, or mortgage |
| | Loan length | The term of the loan: 36 months or 60 months |
| | Loan purpose | Includes 14 loan purposes: wedding, credit card, car loan, major purchase, home improvement, debt consolidation, house, vacation, medical, moving, renewable energy, educational, small business, and other |
Fig 1. The distribution of categories for each categorical attribute.
Result of Welch's t-test for numeric attributes.
| Attributes | Fully paid loans | Defaulted loans | P-value |
|---|---|---|---|
| Annual income (log) | 11.0397 | 10.9587 | <0.0001 |
| Debt to income | 16.7235 | 18.2408 | <0.0001 |
| Inquiries in the last six months | 0.7908 | 0.9697 | <0.0001 |
| Loan amount (log) | 9.2938 | 9.4174 | <0.0001 |
| Open accounts | 11.1021 | 10.757 | <0.0001 |
| Revolving balance (log) | 9.2420 | 9.2568 | 0.29 |
| Revolving utilization rate | 57.4090 | 62.2409 | <0.0001 |
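The comparison behind a table like this can be sketched as follows: Welch's t-test compares two group means without assuming equal variances. The function name and the toy samples are illustrative assumptions and are not taken from the study's data; the sketch returns only the test statistic and the Welch-Satterthwaite degrees of freedom, since a p-value additionally requires the t-distribution CDF.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard errors of each group mean (sample variance / n)
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Toy values for illustration only (NOT from the paper)
paid = [1.0, 2.0, 3.0, 4.0, 5.0]
defaulted = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(paid, defaulted)
print(round(t_stat, 4), round(dof, 4))
```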
Result of the chi-squared test for categorical attributes.
| Attribute | Category | Fully Paid Loans | Defaulted Loans | Defaulted / Fully Paid Loans | Chi-squared statistic (p-value) |
|---|---|---|---|---|---|
| Employment length | Short | 14,592 | 2,713 | 0.19 | 4.5902 (0.1) |
| | Middle | 11,148 | 2,216 | 0.2 | |
| | Long | 11,272 | 2,151 | 0.19 | |
| Grade | A | 7,757 | 561 | 0.07 | 1589.9 (<0.0001) |
| | B | 13,493 | 1,918 | 0.14 | |
| | C | 8,251 | 1,863 | 0.23 | |
| | D or less | 7,511 | 2,738 | 0.36 | |
| Home ownership | Mortgage | 17,935 | 3,153 | 0.18 | 37.839 (<0.0001) |
| | Own | 2,762 | 543 | 0.2 | |
| | Rent | 16,315 | 3,384 | 0.21 | |
| Loan length | 36 months | 31,030 | 4,815 | 0.16 | 978.29 (<0.0001) |
| | 60 months | 5,982 | 2,265 | 0.38 | |
| Loan purpose | Credit card | 7,365 | 1,074 | 0.15 | 99.942 (<0.0001) |
| | Debt consolidation | 21,432 | 4,481 | 0.21 | |
| | Other | 8,215 | 1,525 | 0.19 | |
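As a worked check of the table above, the following sketch computes a Pearson chi-squared statistic from the Employment length counts. The function name is an illustrative assumption; the counts are copied from the table. The p-value line relies on the fact that, for exactly two degrees of freedom, the chi-squared survival function is exp(-x/2), so it applies only to that case.

```python
import math

def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Employment length rows from the table: (fully paid, defaulted)
employment = [(14592, 2713), (11148, 2216), (11272, 2151)]
stat, df = chi_squared(employment)
# Closed form valid only when df == 2
p_value = math.exp(-stat / 2) if df == 2 else None
print(round(stat, 2), df, round(p_value, 2))
```

The result agrees with the reported 4.5902 (0.1) up to rounding, which suggests the table's final column lists the statistic followed by its p-value.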
Fig 2. Test AUC variation with initial ρ0.
Result of the estimation of the baseline and SAR models.
| Variable | Baseline model: Estimate | Baseline model: Std. Error | Baseline model: Pr(>\|Z\|) | Spatial probit model: Estimate | Spatial probit model: Std. Error | Spatial probit model: Pr(>\|Z\|) |
|---|---|---|---|---|---|---|
| Intercept | -0.076 | 0.563 | 0.893 | -0.481 | 0.867 | 0.579 |
| log(Annual income) | -1.714 | 0.546 | 0.002 | -0.417 | 0.559 | 0.455 |
| Debt to income | 0.512 | 0.301 | 0.089 | 0.963 | 0.312 | 0.002 |
| Inquiries in the last 6 months | 0.192 | 0.176 | 0.276 | -0.007 | 0.180 | 0.969 |
| log(Loan amount) | 0.584 | 0.384 | 0.128 | -0.905 | 0.404 | 0.025 |
| Open accounts | 0.444 | 0.361 | 0.219 | -0.557 | 0.371 | 0.133 |
| log(Revolving balance) | -2.106 | 0.786 | 0.007 | 0.368 | 0.813 | 0.651 |
| Revolving utilization rate | 0.591 | 0.327 | 0.071 | -0.766 | 0.334 | 0.022 |
| Employment length (short) | -0.006 | 0.131 | 0.963 | -0.039 | 0.131 | 0.768 |
| Employment length (long) | 0.155 | 0.142 | 0.275 | 0.104 | 0.143 | 0.469 |
| Grade (B) | 0.457 | 0.194 | 0.018 | 0.677 | 0.328 | 0.038 |
| Grade (C) | 0.805 | 0.213 | <0.001 | 1.085 | 0.362 | 0.003 |
| Grade (D or less) | 1.081 | 0.235 | <0.001 | 1.394 | 0.393 | <0.001 |
| Home ownership (Own) | -0.062 | 0.213 | 0.771 | -0.179 | 0.213 | 0.401 |
| Home ownership (Rent) | 0.111 | 0.124 | 0.391 | 0.094 | 0.124 | 0.446 |
| Loan length (60 months) | 0.581 | 0.163 | <0.001 | 0.488 | 0.182 | 0.007 |
| Loan purpose (debt consolidation) | 0.269 | 0.155 | 0.083 | 0.129 | 0.154 | 0.404 |
| Loan purpose (other) | 0.395 | 0.189 | 0.036 | 0.118 | 0.187 | 0.529 |
| Spatial component (ρ) | | | | Estimate | | p-value |
| | | | | 0.505 | 273.282 | <0.001 |
| Accuracy | 0.624 | | | 0.652 | | |
| Precision | 0.63 | | | 0.619 | | |
| Recall | 0.6 | | | 0.792 | | |
| F1 score | 0.615 | | | 0.695 | | |
| AUC | 0.696 | | | 0.713 | | |
*, **, and *** represent significance at the 10%, 5%, and 1% levels, respectively.
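The F1 scores in the table above follow from the reported precision and recall, since F1 is their harmonic mean. The short check below uses the rounded values from the table, so agreement is only expected to three decimal places; the function name is an illustrative assumption.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported (already rounded) in the table
baseline_f1 = f1_score(0.63, 0.6)
spatial_f1 = f1_score(0.619, 0.792)
print(round(baseline_f1, 3), round(spatial_f1, 3))
```

The gain in F1 for the spatial probit model comes mainly from its higher recall (0.792 versus 0.6), while its precision is slightly lower than the baseline's.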
Result of the estimation of the SAR model with 500 repetitions.
| Initial Rho | 0.2 | 0.5 | 0.8 |
|---|---|---|---|
| | Mean | Mean | Mean |
| Accuracy (Baseline model: 0.614) | 0.606 | 0.613 | 0.592 |
| Precision (Baseline model: 0.612) | 0.598 | 0.590 | 0.564 |
| Recall (Baseline model: 0.622) | 0.647 | 0.745 | 0.809 |
| F1 score (Baseline model: 0.617) | 0.621 | 0.658 | 0.664 |
| AUC (Baseline model: 0.660) | 0.650 | 0.665 | 0.652 |