Zeinab Zare Hosseini, Mahdi Mohammadzadeh.
Abstract
The rapid growth of information technology (IT) creates competitive advantages in the health care industry. Many hospitals now try to build successful customer relationship management (CRM) to identify target and potential patients, increase patient loyalty and satisfaction, and ultimately maximize profitability. Many hospitals have large data warehouses containing customer demographic and transaction information. Data mining techniques can be used to analyze these data and discover hidden knowledge about customers. This research develops an extended RFM model, namely RFML (added parameter: Length), based on health care services for a public sector hospital in Iran, on the premise that patient loyalty differs from customer loyalty, to estimate the customer lifetime value (CLV) of each patient. We used the Two-step and K-means algorithms as clustering methods and a decision tree (CHAID) as the classification technique to segment the patients and identify target, potential and loyal customers in order to strengthen CRM. Two approaches are used for classification: in the first, the result of clustering is taken as the decision attribute; in the second, the segmentation based on each patient's CLV (estimated by RFML) is taken as the decision attribute. Finally, the results of the CHAID algorithm reveal significant hidden rules and identify existing patterns among hospital consumers.
Keywords: CRM; Data mining; Hospital; Knowledge discovery; Patient behavior; RFM
Year: 2016 PMID: 27610177 PMCID: PMC4986115
Source DB: PubMed Journal: Iran J Pharm Res ISSN: 1726-6882 Impact factor: 1.696
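The four RFML attributes named in the abstract can be derived from a plain visit log. The following is a minimal sketch, not the authors' implementation: it assumes Recency is the number of days between a patient's last visit and a chosen reference date, Frequency is the number of visits, Monetary is the total amount paid, and Length is the number of days between the first and last visit. The function name `compute_rfml`, the record layout and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date


def compute_rfml(visits, reference_date):
    """Aggregate (patient_id, visit_date, amount) records into RFML attributes.

    Assumed definitions (not confirmed by the source):
      recency   = days from the last visit to reference_date
      frequency = number of visits
      monetary  = total amount paid
      length    = days from the first visit to the last visit
    """
    by_patient = defaultdict(list)
    for patient_id, visit_date, amount in visits:
        by_patient[patient_id].append((visit_date, amount))

    rfml = {}
    for patient_id, rows in by_patient.items():
        dates = [visit_date for visit_date, _ in rows]
        first, last = min(dates), max(dates)
        rfml[patient_id] = {
            "recency": (reference_date - last).days,
            "frequency": len(rows),
            "monetary": sum(amount for _, amount in rows),
            "length": (last - first).days,
        }
    return rfml


# Hypothetical sample data, for illustration only.
visits = [
    ("p1", date(2012, 1, 10), 50000),
    ("p1", date(2012, 3, 10), 30000),
    ("p2", date(2012, 2, 1), 72400),
]
result = compute_rfml(visits, reference_date=date(2012, 12, 31))
```

Under these definitions a patient with a single visit has a Length of 0 and a Frequency of 1, which agrees with the minimum values reported in the descriptive statistics.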
Recent research on RFM methods and data mining techniques in different industries
| Model | Purpose | Industry | Data mining technique | Authors (year) |
|---|---|---|---|---|
| WRFM | Product recommendations for each customer group | Marketing in a business | Clustering (K-means) and association rules | Duen-Ren Liu et al. (2005) |
| RFM | Identify customer characteristics in order to strengthen CRM | Electronics industry | Clustering (K-means) and classification (rough set theory, LEM2) | Ching-Hsue Cheng et al. (2009) |
| RFM | Sequential pattern mining to discover customers' purchasing patterns over time | Online retailers (electronic commerce) | Sequential pattern mining | Yen-Liang Chen et al. (2010) |
| RFM_WRFM | Cluster analysis to assess customer loyalty | SAPCO Company | Clustering (K-means) | M. Seyed Hosseini et al. (2010) |
| RFM_CRFM (C: count) | Estimating customer lifetime value | A health and beauty company | Clustering (K-means) | M. Khajvand et al. (2011) |
| GRFM (G: group, refers to the customers' purchase patterns) | Discover better customers | Computer products | Clustering (PICC, Purchased Items Constrained Clustering) | Hui-Chu Chang et al. (2011) |
| RFM_RFMDR (D: discount and price, R: return times) | Mine association rules of customer values | Online shopping | Association rules (supervised Apriori algorithm) | Wen-Yu Chiang et al. (2011) |
| RFM | Identifying patients in target customer segments | Hospital | Clustering (K-means) and classification (rough set theory) | You-Shyang Chen et al. (2012) |
| LRFM (L: length, refers to the first and the last visit dates) | Market segmentation of a children's dental clinic | Dental clinic | Clustering (SOM technique) | Jo-Ting Wei et al. (2012) |
| WRFM | Discover sequential patterns | Synthetic data (applying the synthetic data generation algorithm) | PrefixSpan algorithm (the conventional sequential pattern mining method) | Ya-Han Hu (2012) |
Descriptive statistics of the model variables
| Variable | Minimum | Maximum | Mean | Standard deviation | Variance | Median | Mode | N |
|---|---|---|---|---|---|---|---|---|
| Recency | 1 | 1511 | 924.806 | 394.597 | 155706.876 | 972 | 1506 | 234674 |
| Frequency | 1 | 50 | 2.612 | 3.933 | 15.468 | 1 | 1 | 234674 |
| Monetary | 1000 | 62272600 | 440167.1 | 1776964.302 | 3.1576E+12 | 72400 | 34000 | 234674 |
| Length | 0 | 1507 | 136.054 | 297.359 | 88422.609 | 0 | 0 | 234674 |
Figure 1. Flowchart of the proposed classification model
Descriptive statistics of five clusters
| Cluster | Total N | % | Frequency mean | Frequency SD | Length mean | Length SD | Monetary mean | Monetary SD | Recency mean | Recency SD |
|---|---|---|---|---|---|---|---|---|---|---|
| Cluster 1 | 96609 | 41.17 | 1.58 | 1.43 | 21.08 | 66.82 | 225826.7 | 1190472 | 523.2 | 220.83 |
| Cluster 2 | 3396 | 1.45 | 25.84 | 8.66 | 1134.49 | 254.3 | 3957069 | 5407150 | 1407.77 | 118.36 |
| Cluster 3 | 99675 | 42.47 | 1.67 | 1.47 | 18.47 | 49.11 | 377182.2 | 1639632 | 1191.51 | 187.53 |
| Cluster 4 | 14060 | 5.99 | 6.87 | 3.91 | 987.44 | 188.3 | 842459.5 | 1780686 | 1345.89 | 133.05 |
| Cluster 5 | 20934 | 8.92 | 5.23 | 3.95 | 492.73 | 149.13 | 888509.2 | 2576546 | 1147.11 | 222.46 |
| Total | 234674 | 100 | 2.61 | 3.93 | 136.05 | 297.36 | 440167.1 | 1776964 | 924.81 | 394.6 |
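As a quick consistency check on the cluster table, the cluster sizes should sum to the total N, and the size-weighted average of the per-cluster means should reproduce the mean in the total row. The sketch below does this for the first Mean column, read here as Frequency (an assumption, made because its total of 2.61 matches the Frequency mean of 2.612 in the descriptive statistics). The variable names are illustrative.

```python
# (cluster size, mean of the first "Mean" column), copied from the five cluster rows
clusters = [
    (96609, 1.58),
    (3396, 25.84),
    (99675, 1.67),
    (14060, 6.87),
    (20934, 5.23),
]

# Cluster sizes should add up to the N reported in the total row.
total_n = sum(n for n, _ in clusters)

# Size-weighted mean across clusters, to compare with the total-row mean.
weighted_mean = sum(n * mean for n, mean in clusters) / total_n

print(total_n)
print(round(weighted_mean, 2))
```

The sizes sum to 234674 and the weighted mean rounds to 2.61, both in agreement with the total row.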
Discrete values of continuous attributes in RFML model
| Variable | Range | Discrete value |
|---|---|---|
| Recency_TILE5 | [1-543), [543-826), [826-1097), [1097-1323), [1323-1511] | R1, R2, R3, R4, R5 |
| Frequency_TILE5 | [1-2), [2-3), [3-4), [4-5), [5-50] | F1, F2, F3, F4, F5 |
| Monetary_TILE5 | [1000-34100), [34100-51002), [51002-111101), [111101-410300), [410300-62272600] | M1, M2, M3, M4, M5 |
| Length_TILE5 | [0-1), [1-2), [2-3), [3-167), [167-1507] | L1, L2, L3, L4, L5 |
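The mapping in the table above can be sketched as a lookup over the lower bounds of bins 2 to 5. This is a minimal illustration, assuming the half-open intervals shown (the lower bound belongs to the bin, the upper bound does not, except in the last bin); the names `CUT_POINTS` and `discretize` are hypothetical, and the function does not check that a value lies inside the overall range.

```python
from bisect import bisect_right

# Lower bounds of bins 2..5 for each variable, taken from the table.
CUT_POINTS = {
    "R": [543, 826, 1097, 1323],
    "F": [2, 3, 4, 5],
    "M": [34100, 51002, 111101, 410300],
    "L": [1, 2, 3, 167],
}


def discretize(variable, value):
    """Return the discrete label (for example "R3") for a continuous value.

    bisect_right counts how many cut points are <= value, so a value equal
    to a cut point falls into the higher bin, matching the [low-high) ranges.
    """
    bin_index = bisect_right(CUT_POINTS[variable], value) + 1
    return f"{variable}{bin_index}"


# Hypothetical patient record, for illustration only.
patient = {"R": 972, "F": 1, "M": 72400, "L": 0}
labels = [discretize(name, value) for name, value in patient.items()]
```

For the illustrative record above, built from the median values in the descriptive statistics, the labels are R3, F1, M3 and L1.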
Figure 3. Efficacy of the variables in the classification by CLV, in descending order: R = 0.7766, L = 0.2046, M = 0.0096 and F = 0.0092
Comparison of the accuracy of the proposed models and other algorithms
| Method | Accuracy (%) |
|---|---|
| Classification by clustering (decision tree, CHAID) | 89.69 |
| Classification by CLV (decision tree, CHAID) | 80.98 |
| Decision tree (C5) | 89.58 |
| Neural network | 89.31 |
Extracted rules from classification model based on k-means clustering