Suzanne Tamang, Arnold Milstein, Henrik Toft Sørensen, Lars Pedersen, Lester Mackey, Jean-Raymond Betterton, Lucas Janson, Nigam Shah.
Abstract
OBJECTIVES: To compare the ability of standard versus enhanced models to predict future high-cost patients, especially those who move from a lower to the upper decile of per capita healthcare expenditures within 1 year, that is, 'cost bloomers'.
Keywords: high-cost patients; predictive analytics
Year: 2017 PMID: 28077408 PMCID: PMC5253526 DOI: 10.1136/bmjopen-2016-011580
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692
Year 1 model features for high-cost patient prediction are shown by data source, feature type (ie, traditional/non-traditional) and feature category
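| Data source | Clinical registries | Clinical registries | Clinical registries | Clinical registries | Clinical registries | Clinical registries | Clinical registries | Clinical registries | Clinical registries | Civil reg. system | Civil reg. system |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Feature type | Traditional features (6) | Traditional features (6) | Traditional features (6) | Traditional features (6) | Traditional features (6) | Non-traditional features (1053) | Non-traditional features (1053) | Non-traditional features (1053) | Non-traditional features (1053) | Non-traditional features (1053) | Non-traditional features (1053) |
| Residents | Age | Gender | Disease risk | Costs | Costs | Costs | Clinical | Visits/Tx counts and LOS | Recency | Social relationship status | District |
| ID1 | 45 | F | CCS disease score and CCI chronic condition score | Inpatient and outpatient specialist | Drug (Rx) | Primary care (PC) | CCS (247), CCI (44), ICD10 (211), NOMESCO (171), ATC (221) based categories | Counts by year and quarter: IOS, Rx, PC and surgeries; total inpatient length of stay (LOS) | Moving averages by quarter: diagnoses, costs, visits, Rx, inpatient LOS | Married-Widowed | 1 |
| ID2 | 34 | F | | | | | | | | Single-Married | 4 |
| ID3 | 22 | M | | | | | | | | Single | 2 |
| ID4 | 32 | M | | | | | | | | Married | 2 |
| – | – | – | | | | | | | | – | – |
| IDN | 71 | F | | | | | | | | Widowed | 1 |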
Each row represents a unique resident and example values for a feature category. The number of features and the data type appear below each feature category; for example, Traditional features, ‘Costs 2-numerical’ indicates that there are two traditional cost features in the feature category and each feature represents a numerical value.
Description of alternative standard and enhanced high-cost patient prediction models, presenting the feature types included, the statistical method used for prediction and the number of traditional, non-traditional and total model features
| Model | Feature description | Regression method | Traditional features | Non-traditional features | Total features |
|---|---|---|---|---|---|
| Standard model 1 | Age+gender+disease risk scores | Standard | 4 | 0 | 4 |
| Standard model 2 | Age+gender+disease risk scores+hospital inpatient and specialist+Rx costs | Standard | 6 | 0 | 6 |
| Enhanced model 1 | Age+gender+disease risk scores+hospital inpatient and specialist+Rx costs+primary care costs | Standard | 6 | 1 | 7 |
| Enhanced model 2 | Age+gender+disease risk scores+hospital inpatient and specialist+Rx costs+social relationship status | Penalised | 6 | 71 | 77 |
| Enhanced model 3 | Full feature set without costs | Penalised | 6 | 1028 | 1034 |
| Enhanced model 4 | Full feature set | Penalised | 6 | 1053 | 1059 |
Figure 1. Overview of our model development and evaluation framework. Three independent panel data sets were used for training (model fitting), tuning and testing steps. To evaluate alternative models, we calculated the ratio of predicted high-cost patient expenditures to actual high-cost patient expenditures in year 2.
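The expenditure ratio described in the Figure 1 caption (reported elsewhere in this record as cost capture) can be illustrated with a minimal sketch. It rests on assumptions the caption does not state: the model outputs a risk score per patient, the model flags as many patients as there are actual high-cost patients (the top decile), and the ratio compares the actual year 2 spending of the flagged patients with that of the true top decile. The function and parameter names are hypothetical.

```python
def cost_capture(predicted_risk, actual_cost, top_fraction=0.10):
    """Share of actual high-cost spending captured by the top-ranked predictions.

    Assumption: the model flags the same number of patients as the true
    high-cost group (the top `top_fraction` of year 2 spenders).
    """
    n = len(actual_cost)
    if n == 0 or len(predicted_risk) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    k = max(1, int(n * top_fraction))  # size of the high-cost group

    # Patients the model flags: the k highest predicted risks.
    flagged = sorted(range(n), key=lambda i: predicted_risk[i], reverse=True)[:k]
    # Spending of the patients who actually were high-cost in year 2.
    actual_top_spend = sum(sorted(actual_cost, reverse=True)[:k])
    if actual_top_spend == 0:
        raise ValueError("actual high-cost spending is zero")

    captured_spend = sum(actual_cost[i] for i in flagged)
    return captured_spend / actual_top_spend


# Toy data: ten patients, so the top decile is a single patient.
risk = [0.9, 0.1, 0.8, 0.2, 0.3, 0.4, 0.05, 0.6, 0.7, 0.15]
cost = [100, 5, 500, 10, 20, 30, 1, 40, 50, 8]
print(cost_capture(risk, cost))
```

Under this reading, a value of 1.0 would mean the flagged patients account for as much year 2 spending as the true top decile, and lower values mean the model's top-ranked patients turned out to be cheaper than the actual high-cost group.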
Figure 2. High-cost persistence in Western Denmark (N=2 146 801). Among the 314 989 individuals with any high-cost years, the bars show the per cent of high-cost patients by total high-cost years; colour saturation increases proportionally to the longest duration of consecutive high-cost years for each individual from 2004 to 2011.
Figure 3. Proportion of chronic condition indicators among persistent high-cost patients (N=49 855) and cost bloomers (N=105 904). Bars show the per cent of patients with each indicator in the prior year, 2010; colour identifies the high-cost group.
Figure 4. Age distribution of 2011 high-cost patients by high-cost status (N=155 756). Lines show the per cent of patients by age; colour distinguishes persistent high-cost or cost-bloom status. Persistent high-cost patients and cost bloomers had mean and median interquartile age ranges of 30 and 34, respectively.
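The distinction between persistent high-cost patients and cost bloomers can be sketched as a labelling step over two consecutive years of per capita costs. This is an illustration only: it assumes that "high-cost" means the top decile of spenders in a given year and that ties at the threshold count as high-cost; the function names are hypothetical and not taken from the study.

```python
def decile_threshold(costs, top_fraction=0.10):
    """Smallest cost that still falls within the top `top_fraction` of spenders."""
    k = max(1, int(len(costs) * top_fraction))
    return sorted(costs, reverse=True)[k - 1]


def classify_high_cost(cost_year1, cost_year2, top_fraction=0.10):
    """Label each patient by year 2 high-cost status and year 1 history."""
    t1 = decile_threshold(cost_year1, top_fraction)
    t2 = decile_threshold(cost_year2, top_fraction)
    labels = []
    for c1, c2 in zip(cost_year1, cost_year2):
        if c2 < t2:
            labels.append("not high-cost")
        elif c1 >= t1:
            labels.append("persistent high-cost")  # top decile in both years
        else:
            labels.append("cost bloomer")  # moved up into the top decile
    return labels


# Toy data: 20 patients, so the top decile holds two patients per year.
year1 = [1000, 900] + [10] * 18
year2 = [1000, 50] + [10] * 18
year2[5] = 2000  # patient 5 moves from a lower decile into the top decile
labels = classify_high_cost(year1, year2)
print(labels[0], "|", labels[1], "|", labels[5])
```

Under this definition, a cost-bloom prediction sample would exclude everyone already in the top decile in year 1. That is consistent with the sample sizes in the model comparison table, where the cost-bloom sample (N=1 402 155) is exactly 90% of the whole population (N=1 557 950).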
Comparison of alternative models for predicting future high-cost patients at the population level and cost bloomers
| Prediction sample | Metric | Standard model 1 | Standard model 2 | Enhanced model 1 | Enhanced model 2 | Enhanced model 3 | Enhanced model 4 |
|---|---|---|---|---|---|---|---|
| Number of model features | | 4 (Baseline) | 6 | 7 | 77 | 1034 | 1059 |
| Whole-population analysis (N=1 557 950) | AUC | 0.775 | 0.814 | 0.825 | 0.823 | 0.823 | 0.836 |
| Whole-population analysis (N=1 557 950) | Cost capture | 0.495 | 0.559 | 0.577 | 0.579 | 0.578 | |
| Cost-bloom analysis (N=1 402 155) | AUC | 0.719 | 0.748 | 0.772 | 0.765 | 0.771 | 0.786 |
| Cost-bloom analysis (N=1 402 155) | Cost capture | 0.376 | 0.443 | 0.455 | 0.461 | 0.466 | |
Column headers indicate each model and the number of model features appears in parentheses. Results with the highest cost capture value are shown in bold.
Figure 5. Performance of alternative cost-bloom prediction models by cost capture and relative improvement over the baseline. Bars show cost capture for each model; lines show the per cent increases in predictive power. More details on each model are provided in table 2.
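The per cent increase over the baseline plotted in Figure 5 can be worked through as simple arithmetic. The sketch below uses the five cost-bloom cost capture values that survive in this record and assumes they are listed in model order, starting with the 4-feature baseline (Standard model 1); the value for Enhanced model 4 is not recoverable here and is left out. It also assumes that "per cent increase" means the relative change (value minus baseline, divided by baseline).

```python
def relative_improvement(value, baseline):
    """Per cent increase of `value` over `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (value - baseline) / baseline


# Cost capture in the cost-bloom analysis.
# Assumption: values are in model order, beginning with the baseline model.
bloom_cost_capture = [
    ("Standard model 1", 0.376),  # baseline
    ("Standard model 2", 0.443),
    ("Enhanced model 1", 0.455),
    ("Enhanced model 2", 0.461),
    ("Enhanced model 3", 0.466),
]

baseline = bloom_cost_capture[0][1]
improvements = {
    name: relative_improvement(value, baseline)
    for name, value in bloom_cost_capture
}

for name, pct in improvements.items():
    print(f"{name}: {pct:+.1f}%")
```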