Cagan Urkup, Burcin Bozkaya, F Sibel Salman.
Abstract
The rapid growth of mobile payment and geo-aware systems, as well as the resulting emergence of Big Data, presents opportunities to explore individual consumption patterns across space and time. Here we analyze a one-year transaction dataset of a leading commercial bank to understand to what extent customer mobility behavior and financial indicators can predict the use of a target product, namely the Individual Consumer Loan (ICL) product. After data preprocessing, we generate 13 datasets covering different time intervals and feature groups, and test combinations of 3 feature selection methods and 10 classification algorithms to determine, for each dataset, the best feature selection method, the most influential features, and the best classification algorithm. We observe the importance of spatio-temporal mobility features and financial features, in addition to demography, in predicting the use of this exemplary product with high accuracy (AUC = 0.942). Finally, we analyze the classification results and report on the most interesting customer characteristics and product usage implications. Our findings can potentially be used to increase the success rates of product recommendation systems.
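The abstract reports performance as AUC. As a reminder of what that number measures, the following minimal sketch computes AUC as the fraction of positive/negative pairs in which the positive example receives the higher score (ties count as one half). The function name `auc_from_scores` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def auc_from_scores(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 class labels (1 = product user, by assumption).
    scores: iterable of classifier scores, higher = more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 7 of the 9 positive/negative pairs
# are ordered correctly, so the AUC is 7/9.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.7]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

The quadratic loop is fine for illustration; on a dataset of realistic size a rank-based implementation from a statistics library would be the usual choice.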
Year: 2018 PMID: 30052681 PMCID: PMC6063431 DOI: 10.1371/journal.pone.0201197
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Descriptive statistics for the sample data.
| 75.2 | |
| 46.8 | |
| 34.0 | |
| 44.7 | |
| 65.1 | |
| 100.9 |
Fig 1. Percentile distribution of top 10,000 customers’ transaction counts.
Dataset explanations.
| Dataset | Transactions Spanning Months | Feature Types |
|---|---|---|
| | Does not Depend on Month | Demographic |
| | 1-11 | Demographic & Mobility |
| | 1-11 | Demographic & Financial |
| | 1-11 | Demographic & Financial & Mobility |
| | 3-11 | Demographic & Mobility |
| | 3-11 | Demographic & Financial |
| | 3-11 | Demographic & Financial & Mobility |
| | 6-11 | Demographic & Mobility |
| | 6-11 | Demographic & Financial |
| | 6-11 | Demographic & Financial & Mobility |
| | 9-11 | Demographic & Mobility |
| | 9-11 | Demographic & Financial |
| | 9-11 | Demographic & Financial & Mobility |
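The 13 datasets follow a simple pattern: one demographic-only dataset plus every pairing of four month windows with three feature groups. The sketch below enumerates them in table order and counts the experimental configurations implied by the abstract (3 feature selection methods and 10 classifiers per dataset). The dictionary layout and the names `WINDOWS`, `FEATURE_GROUPS`, and `build_dataset_specs` are assumptions made for illustration; the source does not name its methods or describe its code.

```python
from itertools import product

# Month windows and feature groups as listed in the dataset table.
WINDOWS = ["1-11", "3-11", "6-11", "9-11"]
FEATURE_GROUPS = [
    ("Demographic", "Mobility"),
    ("Demographic", "Financial"),
    ("Demographic", "Financial", "Mobility"),
]

# Counts taken from the abstract; the methods themselves are not named here.
N_FEATURE_SELECTION_METHODS = 3
N_CLASSIFIERS = 10


def build_dataset_specs():
    """Return one spec per dataset, in the same order as the table."""
    # The demographic-only dataset does not depend on a month window.
    specs = [{"months": None, "features": ("Demographic",)}]
    for months, features in product(WINDOWS, FEATURE_GROUPS):
        specs.append({"months": months, "features": features})
    return specs


specs = build_dataset_specs()
n_configs = len(specs) * N_FEATURE_SELECTION_METHODS * N_CLASSIFIERS
print(len(specs), "datasets,", n_configs, "configurations")
```

Under these assumptions the grid comes to 13 × 3 × 10 = 390 dataset/selector/classifier combinations.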
Performance measures for different datasets.
| Dataset | AUC | Accuracy (%) | Precision | Recall | F1 |
|---|---|---|---|---|---|
| | 0.669 | 62.431 | 0.537 | 0.646 | 0.587 |
| | 0.835 | 80.233 | 0.799 | 0.636 | 0.696 |
| | 0.918 | 82.856 | 0.874 | 0.767 | 0.817 |
| | 0.924 | 86.922 | 0.858 | 0.828 | 0.843 |
| | 0.847 | 79.067 | 0.739 | 0.713 | 0.715 |
| | 0.924 | 86.056 | 0.850 | 0.787 | 0.817 |
| | 0.931 | 87.211 | 0.846 | 0.802 | 0.823 |
| | 0.869 | 79.733 | 0.761 | 0.683 | 0.709 |
| | 0.931 | 85.456 | 0.843 | 0.821 | 0.832 |
| | 0.940 | 85.651 | 0.838 | 0.854 | 0.846 |
| | 0.878 | 79.500 | 0.767 | 0.715 | 0.730 |
| | 0.932 | 86.056 | 0.855 | 0.812 | 0.833 |
| | 0.942 | 85.556 | 0.868 | 0.830 | 0.849 |
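For readers less familiar with the columns above, the sketch below shows the standard definitions of accuracy, precision, recall, and F1 in terms of confusion-matrix counts. The function name and the example counts are hypothetical; the source does not report its confusion matrices, so nothing here reproduces a row of the table.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives. Accuracy is returned in percent to match the table.
    """
    total = tp + fp + fn + tn
    if total == 0 or (tp + fp) == 0 or (tp + fn) == 0:
        raise ValueError("counts must allow precision and recall to be defined")

    precision = tp / (tp + fp)          # share of predicted users who are users
    recall = tp / (tp + fn)             # share of actual users who were found
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = 100.0 * (tp + tn) / total
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = classification_metrics(tp=80, fp=20, fn=20, tn=80)
```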
Fig 2. ICL 3-month AUC results.
Fig 5. ICL 11-month AUC results.
Results of t-tests (p-values) for comparison of mean performance measures.
| 2.76E-40 | 1.33E-47 | 1.33E-50 | |
| 2.21E-40 | 2.96E-39 | ||
| 7.85E-10 | |||
| 8.20E-21 | 6.14E-25 | 1.78E-27 | |
| 5.21E-05 | 5.78E-11 | ||
| 2.23E-06 | |||
| 2.25E-49 | 4.87E-50 | 4.18E-53 | |
| 3.68E-31 | 2.32E-34 | ||
| 1.30E-10 | |||
| 7.07E-13 | 1.21E-39 | 4.99E-45 | |
| 4.28E-46 | 9.78E-49 | ||
| 3.98E-29 | |||
| 4.21E-44 | 6.36E-51 | 4.13E-57 | |
| 1.27E-43 | 1.68E-48 | ||
| 1.94E-25 |
Feature importances for prediction of ICL product usage.
| Feature Name | Importance | Effect |
|---|---|---|
| No. of Active Products | 25.74% | Positive |
| Mean Credit Score | 13.25% | Positive |
| cluster+d−l−r | 9.86% | Negative |
| Income | 9.84% | Positive |
| radial+d−l−r | 9.18% | Negative |
| spatial_cluster_regularity | 8.46% | Positive |
| Average Balance | 7.59% | Positive |
| spatial_radial_regularity | 5.86% | Positive |
| weekly_regularity | 5.71% | Positive |
| week+d+l+r | 4.51% | Positive |
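The importance shares in this table can be checked and aggregated with a few lines of arithmetic. The sketch below uses the values exactly as listed and confirms that the ten shares sum to 100%, then computes the cumulative share of the top-ranked features. Grouping the six features whose names refer to clusters, radial distance, or weekly patterns as "mobility" features is our assumption based on the feature names alone; the table itself does not label feature types.

```python
from decimal import Decimal

# Importance shares (percent), copied from the table above.
IMPORTANCES = {
    "No. of Active Products": Decimal("25.74"),
    "Mean Credit Score": Decimal("13.25"),
    "cluster+d−l−r": Decimal("9.86"),
    "Income": Decimal("9.84"),
    "radial+d−l−r": Decimal("9.18"),
    "spatial_cluster_regularity": Decimal("8.46"),
    "Average Balance": Decimal("7.59"),
    "spatial_radial_regularity": Decimal("5.86"),
    "weekly_regularity": Decimal("5.71"),
    "week+d+l+r": Decimal("4.51"),
}

# ASSUMPTION: these six are mobility features, judged by name only.
ASSUMED_MOBILITY = {
    "cluster+d−l−r",
    "radial+d−l−r",
    "spatial_cluster_regularity",
    "spatial_radial_regularity",
    "weekly_regularity",
    "week+d+l+r",
}


def cumulative_share(importances, k):
    """Sum of the k largest importance shares."""
    ranked = sorted(importances.values(), reverse=True)
    return sum(ranked[:k], Decimal("0"))


total_share = sum(IMPORTANCES.values(), Decimal("0"))
top3_share = cumulative_share(IMPORTANCES, 3)
mobility_share = sum(
    (v for name, v in IMPORTANCES.items() if name in ASSUMED_MOBILITY),
    Decimal("0"),
)
print(total_share, top3_share, mobility_share)
```

`Decimal` is used instead of floats so that the two-decimal percentages add up exactly.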
Final testing performance measures for different datasets.
| Dataset | AUC | Accuracy (%) | Precision | Recall | F1 |
|---|---|---|---|---|---|
| | 0.681 | 62.421 | 0.527 | 0.632 | 0.575 |
| | 0.835 | 79.077 | 0.721 | 0.724 | 0.723 |
| | 0.940 | 86.040 | 0.834 | 0.802 | 0.817 |
| | 0.942 | 87.199 | 0.860 | 0.786 | 0.821 |
| | 0.821 | 80.246 | 0.787 | 0.618 | 0.693 |
| | 0.928 | 82.872 | 0.858 | 0.777 | 0.815 |
| | 0.938 | 86.906 | 0.869 | 0.814 | 0.840 |
| | 0.853 | 79.747 | 0.749 | 0.699 | 0.723 |
| | 0.914 | 85.467 | 0.827 | 0.805 | 0.816 |
| | 0.955 | 85.635 | 0.853 | 0.844 | 0.848 |
| | 0.893 | 79.488 | 0.751 | 0.705 | 0.727 |
| | 0.916 | 86.072 | 0.845 | 0.802 | 0.823 |
| | 0.955 | 85.571 | 0.881 | 0.845 | 0.863 |