Identifying high-cost patients using data mining techniques and a small set of non-trivial attributes.

Seyed Abdolmotalleb Izad Shenas¹, Bijan Raahemi², Mohammad Hossein Tekieh³, Craig Kuziemsky⁴.

Abstract

In this paper, we use data mining techniques, namely neural networks and decision trees, to build predictive models that identify very high-cost patients in the top 5 percent of the general population. A large empirical dataset from the Medical Expenditure Panel Survey with 98,175 records was used in our study. After pre-processing, partitioning and balancing the data, the refined dataset of 31,704 records was modeled by Decision Trees (including C5.0 and CHAID) and Neural Networks. The performance of the models is analyzed using various measures including accuracy, G-mean, and area under the ROC curve (AUC). We conclude that the CHAID classifier returns the best G-mean and AUC measures among the top-performing predictive models, ranging from 76% to 85% and from 0.812 to 0.942, respectively. We also identify a small set of 5 non-trivial attributes, among a primary set of 66 attributes, to identify the top 5% high-cost population. The attributes are the individual's overall health perception, age, history of blood cholesterol check, history of physical/sensory/mental limitations, and history of colonic prevention measures. We call this small set of attributes non-trivial because it does not include visits to care providers, doctors or hospitals, which are highly correlated with expenditures and do not offer new insight into the data. The results of this study can be used by healthcare data analysts, policy makers, insurers, and healthcare planners to improve the delivery of health services.
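The abstract reports G-mean alongside accuracy because the target class (the top 5% of spenders) is rare. The following is a minimal sketch, not the authors' code, of how such a label and the G-mean could be computed, assuming G-mean means the geometric mean of sensitivity and specificity. The function names `label_top_percent` and `g_mean` and the toy data are hypothetical.

```python
import math


def label_top_percent(costs, percent=5.0):
    """Return 1 for records whose cost falls in the top `percent` of the sample, else 0.

    Hypothetical helper: ties at the threshold are all labeled 1.
    """
    k = max(1, round(len(costs) * percent / 100))
    threshold = sorted(costs, reverse=True)[k - 1]
    return [1 if c >= threshold else 0 for c in costs]


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (1 = high-cost, 0 = other)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy illustration only: these values are not taken from the study.
toy_true = [1, 1, 0, 0, 0, 0]
toy_pred = [1, 0, 0, 0, 0, 1]
toy_score = g_mean(toy_true, toy_pred)
```

Unlike plain accuracy, this score drops to zero for a classifier that labels every record as low-cost, which is why it suits a problem where only 5% of records are positive.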

Keywords: Data mining; Decision tree; Healthcare expenditures; Medical Expenditure Panel Survey; Predictive models

Year: 2014 PMID: 25105749 DOI: 10.1016/j.compbiomed.2014.07.005

Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 4.589

Cited

5 in total

1. Exploring the characteristics of the high-cost population from the family perspective: a cross-sectional study in Jiangsu Province, China.

Authors: Yudong Miao; Dongfu Qian; Sandeep Sandeep; Ting Ye; Yadong Niu; Dan Hu; Liang Zhang
Journal: BMJ Open Date: 2017-11-09 Impact factor: 2.692

2. Systematic review of high-cost patients' characteristics and healthcare utilisation.

Authors: Joost Johan Godert Wammes; Philip J van der Wees; Marit A C Tanke; Gert P Westert; Patrick P T Jeurissen
Journal: BMJ Open Date: 2018-09-08 Impact factor: 2.692

3. Characterising and predicting persistent high-cost utilisers in healthcare: a retrospective cohort study in Singapore.

Authors: Sheryl Hui Xian Ng; Nabilah Rahman; Ian Yi Han Ang; Srinath Sridharan; Sravan Ramachandran; Debby Dan Wang; Astrid Khoo; Chuen Seng Tan; Mengling Feng; Sue-Anne Ee Shiow Toh; Xin Quan Tan
Journal: BMJ Open Date: 2020-01-06 Impact factor: 2.692

4. The Efficacy of Machine-Learning-Supported Smart System for Heart Disease Prediction.

Authors: Nurul Absar; Emon Kumar Das; Shamsun Nahar Shoma; Mayeen Uddin Khandaker; Mahadi Hasan Miraz; M R I Faruque; Nissren Tamam; Abdelmoneim Sulieman; Refat Khan Pathan
Journal: Healthcare (Basel) Date: 2022-06-18

5. Neural Network-Based Coronary Heart Disease Risk Prediction Using Feature Correlation Analysis.

Authors: Jae Kwon Kim; Sanggil Kang
Journal: J Healthc Eng Date: 2017-09-06 Impact factor: 2.682
