Yang Xie, Günter Schreier, Michael Hoy, Ying Liu, Sandra Neubauer, David C W Chang, Stephen J Redmond, Nigel H Lovell.
Abstract
Health insurers maintain large databases containing information on medical services utilized by claimants, often spanning several healthcare services and providers. Proper use of these databases could facilitate better clinical and administrative decisions. These data sets contain many unequally spaced events, such as hospital visits. However, data mining of temporal data and point processes is still a developing research area, and extracting useful information from such data series is a challenging task. In this paper, we developed a time series data mining approach to predict the number of days in hospital in the coming year for individuals from a general insured population based on their insurance claim data. In the proposed method, the data were windowed at four different timescales (bi-monthly, quarterly, half-yearly and yearly) to construct regularly spaced time series of features extracted from such events, resulting in four associated prediction models. A comparison of these models indicates that models using a half-yearly windowing scheme deliver the best performance on all three populations (the whole population, a senior sub-population and a non-senior sub-population). The superiority of the half-yearly model was found to be particularly pronounced in the senior sub-population. A bagged decision tree approach was able to predict 'no hospitalization' versus 'at least one day in hospital' with a Matthews correlation coefficient (MCC) of 0.426. This was significantly better than the corresponding yearly model, which achieved 0.375 for this group of customers. Further reducing the length of the analysis windows to three or two months did not produce further improvements.
Keywords: Health care; Health insurance claims; Predictive modeling; Temporal data mining
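As a rough illustration of the two technical ideas in the abstract, the sketch below bins unequally spaced event dates into equal-length windows and computes the Matthews correlation coefficient from a binary confusion matrix. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `window_counts` and `mcc`, the example dates, and the confusion-matrix counts are all hypothetical, and the paper's actual features are likely richer than simple event counts.

```python
import math
from datetime import date


def window_counts(event_dates, start, end, n_windows):
    """Count events falling in each of n_windows equal-length windows
    covering the half-open interval [start, end).

    Hypothetical helper: n_windows=1 corresponds to yearly windowing over
    a one-year span, 2 to half-yearly, 4 to quarterly, 6 to bi-monthly.
    """
    total_days = (end - start).days
    counts = [0] * n_windows
    for d in event_dates:
        offset = (d - start).days
        if 0 <= offset < total_days:
            # Integer arithmetic maps the day offset to a window index.
            counts[offset * n_windows // total_days] += 1
    return counts


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Made-up hospital-visit dates for one claimant over one observation year.
visits = [date(2013, 1, 15), date(2013, 2, 3), date(2013, 8, 20)]
start, end = date(2013, 1, 1), date(2014, 1, 1)

yearly = window_counts(visits, start, end, 1)
half_yearly = window_counts(visits, start, end, 2)
quarterly = window_counts(visits, start, end, 4)
bi_monthly = window_counts(visits, start, end, 6)

# Made-up confusion-matrix counts for 'no hospitalization' versus
# 'at least one day in hospital'.
example_mcc = mcc(tp=30, tn=50, fp=10, fn=10)
```

Shorter windows preserve more of the timing of events but yield sparser features, which is one plausible reason why the finest timescale need not give the best model.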
Year: 2016 PMID: 26827621 DOI: 10.1016/j.jbi.2016.01.002
Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317