Manjeevan Seera, Chee Peng Lim, Ajay Kumar, Lalitha Dhamotharan, Kim Hua Tan.
Abstract
Payment cards offer a simple and convenient method for making purchases. Owing to the increase in the usage of payment cards, especially in online purchases, fraud cases are on the rise. This rise creates financial risk and uncertainty; in the commercial sector, fraud incurs billions in losses each year. However, real transaction records that can facilitate the development of effective predictive models for fraud detection are difficult to obtain, mainly because of issues related to the confidentiality of customer information. In this paper, we apply a total of 13 statistical and machine learning models for payment card fraud detection using both publicly available and real transaction records. The results from both original features and aggregated features are analyzed and compared. A statistical hypothesis test is conducted to evaluate whether the aggregated features identified by a genetic algorithm offer better discriminative power than the original features in fraud detection. The outcomes confirm the effectiveness of using aggregated features for real-world payment card fraud detection problems.
Keywords: Classification; Feature aggregation; Fraud detection; Payment card; Predictive modeling
Year: 2021 PMID: 34121790 PMCID: PMC8186361 DOI: 10.1007/s10479-021-04149-2
Source DB: PubMed Journal: Ann Oper Res ISSN: 0254-5330 Impact factor: 4.854
Summary of review
| Features | References | Data set | Classifier | Remarks |
|---|---|---|---|---|
| Original | de Sá et al. ( | Brazilian company | Bayesian | Improved efficiency by 72.64% |
| | Van Vlasselaer et al. ( | Worldline Belgium | APATE | Best AUC acquired with addition of customer spending history |
| | Russac et al. ( | | Word2Vec | Performance improved by 3% |
| | Gómez et al. ( | BBVA bank | MLP | Solution comparable with costly ones |
| | Jurgovsky et al. ( | Credit card data | RF, LSTM | RF + LSTM could result in a better fraud detection system |
| | Robinson and Aria ( | CardCom | HMM | Able to detect fraudulent cases in real-time |
| | Rtayli and Enneya ( | Credit card data | Hybrid SVM | Recursive feature elimination and hyper-parameters optimization |
| | Zhu et al. ( | Benchmark data | WELM | Dandelion algorithm with probability-based mutation outperforms particle swarm optimization |
| | Forough and Momtazi ( | Credit card data | Deep learning | Efficient real-time performance |
| Aggregated | Bahnsen et al. ( | European bank | DT, LOR, RF | Average saving of 13% was achieved |
| | Dal Pozzolo et al. ( | European bank | Ensemble | Lower degree of influence of feedback led to less precise alerts |
| | Fu et al. ( | Commercial bank | CNN | Able to identify the patterns of fraud and produce better performance |
| | Jiang et al. ( | Simulated data | RF | 80% detection accuracy |
| | Lim et al. ( | | RF, | Aggregation-based methods outperformed transaction-based methods |
| | Lucas et al. ( | Credit card data | HMM | Useful feature engineering for classification |
| | Zhang et al. ( | Commercial bank in China | HOBA | Aggregation based on previous and incoming transactions |
Distribution of data samples
| Data set | Features | Class 1 | Class 2 | Total |
|---|---|---|---|---|
| Statlog (German credit) | 24 | 300 | 700 | 1000 |
| Statlog (Australian credit) | 14 | 307 | 383 | 690 |
| Default of credit card | 23 | 6636 | 23,364 | 30,000 |
Accuracy results (best in bold)
| Model | German (%) | Australia (%) | Card (%) |
|---|---|---|---|
| NB | 72.700 | 80.000 | 70.700 |
| DT | 70.000 | 83.188 | 81.973 |
| RF | 70.000 | 81.304 | 77.950 |
| GBT | 74.000 | | |
| DS | 69.900 | 85.507 | 81.960 |
| RT | 70.000 | 70.725 | 79.153 |
| ANN | 70.200 | 83.188 | 81.827 |
| MLP | 73.900 | 85.652 | 81.963 |
| LIR | | 85.797 | 79.920 |
| LOR | 76.500 | 86.087 | 81.050 |
| SVM | 75.700 | 85.362 | 80.863 |
| RI | 71.900 | 85.072 | 78.740 |
| DL | 72.600 | 85.217 | 81.737 |
AUC results (best in bold)
| Model | German | Australia | Card |
|---|---|---|---|
| NB | 0.763 | 0.901 | 0.736 |
| DT | 0.500 | 0.860 | 0.643 |
| RF | 0.590 | 0.908 | 0.678 |
| GBT | 0.767 | | |
| DS | 0.500 | 0.862 | 0.644 |
| RT | 0.506 | 0.682 | 0.545 |
| ANN | 0.725 | 0.905 | 0.745 |
| MLP | 0.777 | 0.926 | 0.742 |
| LIR | | 0.932 | 0.717 |
| LOR | 0.789 | 0.934 | 0.723 |
| SVM | 0.786 | 0.929 | 0.709 |
| RI | 0.653 | 0.878 | 0.662 |
| DL | 0.770 | 0.934 | 0.772 |
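
The AUC values in this table can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes AUC that way (the Mann-Whitney formulation) in plain Python. It is an illustration only: the function name and the toy labels and scores are made up here, and the source does not state which tool produced its AUC figures.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three of the four positive/negative pairs are ordered correctly.
toy_auc = auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# A model that gives every sample the same score cannot rank anything.
constant_auc = auc_from_scores([0, 0, 1, 1], [0.5, 0.5, 0.5, 0.5])
```

A constant score yields an AUC of exactly 0.5, which is one possible reading of the 0.500 entries in the table, although the source does not explain them.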
Comparison of accuracy and AUC using the German data set (best in bold)
| Model | Accuracy (%) | AUC |
|---|---|---|
| DT (Feng et al., | 67.00 | 0.610 |
| NN (Feng et al., | 70.00 | 0.620 |
| SVM (Feng et al., | 71.50 | 0.550 |
| BagDT (Feng et al., | 73.20 | 0.622 |
| BagNN (Feng et al., | 76.00 | 0.672 |
| BagSVM (Feng et al., | 75.00 | 0.651 |
| RF (Feng et al., | 74.00 | 0.635 |
| | 75.20 | 0.759 |
| NB (Jadhav et al., | 73.70 | 0.767 |
| LIR | | |
Comparison of accuracy and AUC using the Australian data set (best in bold)
| Model | Accuracy (%) | AUC |
|---|---|---|
| DT (Feng et al., | 82.10 | 0.820 |
| NN (Feng et al., | 85.30 | 0.855 |
| SVM (Feng et al., | 85.40 | 0.860 |
| BagDT (Feng et al., | 85.50 | 0.860 |
| BagNN (Feng et al., | 86.00 | 0.860 |
| BagSVM (Feng et al., | 85.50 | 0.870 |
| RF (Feng et al., | | 0.860 |
| | 85.70 | 0.878 |
| NB (Jadhav et al., | 80.43 | 0.913 |
| GBT | 86.23 | |
Comparison of accuracy and AUC using the Card data set (best in bold)
| Model | Accuracy (%) | AUC |
|---|---|---|
| DT (Feng et al., | 82.00 | 0.665 |
| NN (Feng et al., | 82.05 | 0.660 |
| SVM (Feng et al., | 82.00 | 0.643 |
| BagDT (Feng et al., | 82.00 | 0.665 |
| BagNN (Feng et al., | 82.00 | 0.660 |
| BagSVM (Feng et al., | 81.00 | 0.620 |
| RF (Feng et al., | 82.00 | 0.625 |
| | 80.80 | 0.627 |
| NB (Jadhav et al., | 71.36 | 0.699 |
| GBT | | |
List of features
| Features | D1 | D2 | D3 | D4 |
|---|---|---|---|---|
| Account no | ✓ | ✓ | ✓ | ✓ |
| Transaction amount | ✓ | ✓ | ✓ | |
| Transaction date | ✓ | ✓ | ✓ | ✓ |
| Transaction time | ✓ | ✓ | ✓ | ✓ |
| Device type | ✓ | ✓ | ✓ | |
| MCC | ✓ | |||
| Acquiring country | ✓ | ✓ | ✓ | |
| For country | ✓ | |||
| Transaction type | ✓ | ✓ | ||
| Transaction amount no | ✓ | ✓ | ✓ | |
| Transaction amount sum | ✓ | ✓ | ✓ | |
| Acquiring country no | ✓ | ✓ | ✓ | |
| Acquiring country sum | ✓ | ✓ | ✓ | |
| MCC no | ✓ | ✓ | ||
| MCC sum | ✓ | ✓ | ||
| Device type no | ✓ | ✓ | ||
| Device type sum | ✓ | ✓ |
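
The "no" and "sum" rows in the list above read like per-account aggregates of the raw fields. The following is a minimal sketch of one way such features could be derived, assuming that "no" is a count of an account's earlier transactions and "sum" is their total amount. The source does not define the aggregation window or the exact computation, so the record keys (`account`, `timestamp`, `amount`), the output names and the function itself are hypothetical.

```python
from collections import defaultdict


def add_amount_aggregates(transactions):
    """Attach running per-account count and sum of earlier amounts.

    Assumption: each transaction is a dict with the hypothetical keys
    'account', 'timestamp' and 'amount'. The aggregates cover all earlier
    transactions of the same account, with no time window.
    """
    count = defaultdict(int)
    total = defaultdict(float)
    enriched = []
    for tx in sorted(transactions, key=lambda t: t["timestamp"]):
        acct = tx["account"]
        row = dict(tx)
        # Only history before the current transaction is used, so the
        # features never include the transaction being scored.
        row["amount_no"] = count[acct]
        row["amount_sum"] = total[acct]
        enriched.append(row)
        count[acct] += 1
        total[acct] += tx["amount"]
    return enriched


# Made-up sample records for illustration.
sample = [
    {"account": "A", "timestamp": 1, "amount": 10.0},
    {"account": "B", "timestamp": 2, "amount": 5.0},
    {"account": "A", "timestamp": 3, "amount": 20.0},
    {"account": "A", "timestamp": 4, "amount": 2.5},
]
rows = add_amount_aggregates(sample)
```

Using only earlier transactions is a design choice made for this sketch; a windowed variant (for example, the last N transactions) would follow the same pattern.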
Accuracy (ACC) results (best in bold)
| Methods | D1 (%) | D2 (%) | D3 (%) | D4 (%) |
|---|---|---|---|---|
| NB | 32.878 | 97.615 | 97.472 | 97.971 |
| DT | | | | |
| RF | | 99.996 | 99.993 | 99.996 |
| GBT | 99.847 | 99.990 | 99.990 | 99.988 |
| DS | | | | |
| RT | | 99.968 | 99.969 | 99.985 |
| ANN | | 99.990 | 99.988 | 99.990 |
| MLP | | 99.966 | 99.969 | 99.975 |
| LIR | | 99.959 | 99.959 | 99.959 |
| LOR | 99.958 | 99.972 | 99.971 | 99.975 |
| SVM | | 99.965 | 99.966 | 99.969 |
| RI | | 99.959 | 99.959 | 99.959 |
| DL | 95.113 | 99.984 | 99.985 | 99.993 |
MCC rates (best in bold)
| Methods | D1 | D2 | D3 | D4 |
|---|---|---|---|---|
| NB | 0.013 | 0.119 | 0.115 | 0.129 |
| DT | – | | | |
| RF | – | 0.945 | 0.909 | 0.946 |
| GBT | | 0.882 | 0.882 | 0.869 |
| DS | – | | | |
| RT | – | 0.648 | 0.591 | 0.835 |
| ANN | – | 0.882 | 0.863 | 0.882 |
| MLP | – | 0.704 | 0.732 | 0.767 |
| LIR | – | – | – | – |
| LOR | – | 0.749 | 0.729 | 0.756 |
| SVM | – | 0.709 | 0.717 | 0.732 |
| RI | – | – | – | – |
| DL | 0.032 | 0.794 | 0.821 | 0.906 |
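
On heavily imbalanced data, very high accuracy can coexist with a low or undefined Matthews correlation coefficient (MCC), which may be what the "–" entries reflect. The sketch below computes both measures from confusion-matrix counts using the standard MCC formula. The counts are invented for illustration and are not taken from the paper's data; the function name is hypothetical.

```python
import math


def accuracy_and_mcc(tp, fp, tn, fn):
    """Return (accuracy, MCC) from confusion-matrix counts.

    MCC is returned as None when its denominator is zero, for example
    when a model never predicts the positive (fraud) class.
    """
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = None if denom == 0 else (tp * tn - fp * fn) / denom
    return acc, mcc


# Invented example: 100,000 transactions, 41 of them fraudulent.
# A model that labels everything as non-fraud is almost always "right".
acc_majority, mcc_majority = accuracy_and_mcc(tp=0, fp=0, tn=99959, fn=41)

# A model that catches 30 of the 41 frauds at the cost of 5 false alarms.
acc_useful, mcc_useful = accuracy_and_mcc(tp=30, fp=5, tn=99954, fn=11)
```

The two invented models differ by only a few hundredths of a percentage point in accuracy, while MCC separates them clearly. That is why MCC is a useful companion to accuracy on data like this.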
Fig. 1 Fraud detection rates of data sets 1–4
Fig. 2 Non-fraud detection rates of data sets 1–4
AUC results (best in bold)
| Methods | D1 | D2 | D3 | D4 |
|---|---|---|---|---|
| NB | 0.862 | 0.965 | 0.954 | 0.948 |
| DT | 0.500 | 0.500 | 0.500 | 0.500 |
| RF | 0.500 | 0.958 | 0.958 | 0.958 |
| GBT | | | | |
| DS | 0.499 | 0.500 | 0.500 | 0.500 |
| RT | 0.620 | 0.584 | 0.633 | 0.650 |
| ANN | 0.809 | 0.943 | 0.955 | 0.950 |
| MLP | 0.784 | 0.960 | 0.958 | 0.959 |
| LIR | 0.847 | 0.964 | 0.961 | 0.965 |
| LOR | 0.833 | 0.952 | 0.937 | 0.939 |
| SVM | 0.573 | 0.942 | 0.963 | 0.959 |
| RI | 0.500 | 0.500 | 0.500 | 0.500 |
| DL | 0.845 | 0.951 | 0.929 | 0.932 |
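
The abstract mentions a statistical hypothesis test, but this record does not say which test was used. As a stand-in, the sketch below runs a one-sided exact sign test on paired AUC values from two columns of the table above (D1 and D4; GBT is left out because its values are missing here). This is an illustrative alternative, not the authors' procedure, and the function name is hypothetical.

```python
from math import comb


def sign_test_one_sided(baseline, candidate):
    """Exact one-sided sign test: is candidate larger than baseline more often?

    Ties are dropped. Returns (wins, losses, p_value), where p_value is the
    probability of at least `wins` successes in wins + losses fair coin flips.
    """
    wins = sum(c > b for b, c in zip(baseline, candidate))
    losses = sum(c < b for b, c in zip(baseline, candidate))
    n = wins + losses
    if n == 0:
        return wins, losses, 1.0
    p_value = sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n
    return wins, losses, p_value


# AUC values copied from the table above, in row order:
# NB, DT, RF, DS, RT, ANN, MLP, LIR, LOR, SVM, RI, DL (GBT omitted).
d1 = [0.862, 0.500, 0.500, 0.499, 0.620, 0.809, 0.784, 0.847, 0.833, 0.573, 0.500, 0.845]
d4 = [0.948, 0.500, 0.958, 0.500, 0.650, 0.950, 0.959, 0.965, 0.939, 0.959, 0.500, 0.932]

wins, losses, p_value = sign_test_one_sided(d1, d4)
```

The sign test ignores the size of each difference, so it is a conservative choice; a test that uses magnitudes (such as a signed-rank test) would be a reasonable alternative under the same pairing.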