Zhongkai Hu, Shiying Hao, Bo Jin, Andrew Young Shin, Chunqing Zhu, Min Huang, Yue Wang, Le Zheng, Dorothy Dai, Devore S Culver, Shaun T Alfreds, Todd Rogow, Frank Stearns, Karl G Sylvester, Eric Widen, Xuefeng Ling.
Abstract
BACKGROUND: The increasing rate of health care expenditures in the United States has placed a significant burden on the nation's economy. Predicting patients' future health care utilization can provide useful information to better understand and manage overall health care delivery and clinical resource allocation.
Keywords: electronic medical record; health care costs; prospective studies; risk assessment; statistical data analysis
Year: 2015 PMID: 26395541 PMCID: PMC4642374 DOI: 10.2196/jmir.4976
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Cohort characteristics.
| Characteristic | Retrospective cohort (01/01/12-12/31/12) | Prospective cohort (07/01/12-06/30/13) |
| Female, n (%) | 669,021 (52.55) | 710,042 (52.28) |
| Male, n (%) | 604,093 (47.45) | 648,111 (47.72) |
| Age (years), median (IQR) | 43.71 (22.40-60.87) | 43.76 (22.83-61.11) |
| Family income estimate (US $), median (IQR) | 59,209 (49,148-68,589) | 58,984 (49,148-68,082) |
| Percent high-school graduate or higher | 90.50 (87.30-93.20) | 90.40 (87.20-92.80) |
| Percent bachelor’s degree or higher | 24.40 (17.90-33.00) | 23.90 (17.90-31.30) |
Figure 1. Study design to develop the next 6-month health care resource utilization predictive algorithm. Maine HIE data were split into retrospective and prospective cohorts based on different time frames. A decision tree–based model estimated the health care resource utilization risk in the next 6 months by statistically learning from the preceding 12-month clinical histories; it was trained, calibrated, and blind tested with the retrospective cohort. The predictive risk model was then validated with the prospective cohort.
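The time frames in this design (a 12-month clinical history followed by a 6-month outcome period) can be sketched as below. This is an illustration only: the helper names `add_months` and `study_windows` are hypothetical, and the retrospective outcome window (January 1 to June 30, 2013) is inferred from the "next 6 months" wording rather than stated explicitly in the record.

```python
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Return the first day of the month `months` after d's month.

    Assumption: windows start on the first of a month, so d.day is ignored.
    """
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def study_windows(history_start: date, history_months: int = 12,
                  outcome_months: int = 6):
    """Return (history_start, history_end, outcome_start, outcome_end).

    All four dates are inclusive bounds.
    """
    outcome_start = add_months(history_start, history_months)
    history_end = outcome_start - timedelta(days=1)
    outcome_end = add_months(outcome_start, outcome_months) - timedelta(days=1)
    return history_start, history_end, outcome_start, outcome_end


# History windows taken from the cohort table; outcome windows are derived.
retrospective = study_windows(date(2012, 1, 1))
prospective = study_windows(date(2012, 7, 1))
```

Under these assumptions the prospective outcome window comes out as July 1 to December 31, 2013, which agrees with the period given for the prospective results.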
Electronic medical record features used to develop the model.
| Feature group | Feature description (12-month clinical history from January 1, 2012, to December 31, 2012) |
| Encounter history (n=40) | Visit counts of different encounter types (emergency/outpatient/inpatient/preadmission) |
| | Accumulated length of hospital stay |
| | Historical resource utilization |
| | Counts of historical chronic disease diagnoses |
| | Counts of total and nonredundant laboratory tests and outpatient prescriptions |
| Demographics (n=7) | Income, education, payer |
| | Age group, defined by age on January 1, 2013 (0, 1-5, 6-12, 13-18, 19-34, 35-49, 50-65, and ≥65 years) |
| Facility (n=8) | Different facilities |
| Diagnosis (n=14) | Counts for primary diagnosis and secondary diagnosis |
| Outpatient prescriptions (n=1) | Counts for different outpatient prescriptions |
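As an illustration of the age-group feature, the following minimal sketch maps a birth date to the age groups listed in the table. The function names are hypothetical. Because the listed groups "50-65" and "≥65" overlap at age 65, the sketch assumes whole-year ages and assigns 65 to "50-65"; the source does not say how the authors resolved this.

```python
from bisect import bisect_left
from datetime import date

# Inclusive upper bound (whole years) of every group except the last.
# Assumption: age 65 falls in "50-65"; ages 66 and above fall in "≥65".
UPPER_BOUNDS = [0, 5, 12, 18, 34, 49, 65]
LABELS = ["0", "1-5", "6-12", "13-18", "19-34", "35-49", "50-65", "≥65"]


def age_on(birth_date: date, reference: date = date(2013, 1, 1)) -> int:
    """Age in completed years on the reference date."""
    years = reference.year - birth_date.year
    if (reference.month, reference.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday has not yet occurred in the reference year
    return years


def age_group(age_years: int) -> str:
    """Map an age in whole years to its age-group label."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    return LABELS[bisect_left(UPPER_BOUNDS, age_years)]
```

For example, `age_group(age_on(date(1970, 6, 15)))` gives the group for a patient who was 42 on January 1, 2013.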
Prospective results of our risk model predictive of next 6-month resource utilization (from July 1, 2013, to December 31, 2013).
| Result statistics | Predicted risk bin (grouped into low, intermediate, and high) |
| | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 | 80-90 | 90-100 |
| Patients, n (%)^a | 571,538 (42.08) | 220,746 (16.25) | 147,853 (10.89) | 119,242 (8.78) | 79,152 (5.83) | 78,585 (5.79) | 59,134 (4.35) | 41,711 (3.1) | 25,264 (1.86) | 14,928 (1.10) |
| Estimated maximum resource utilization (US $) | 0 | 170 | 340 | 510 | 680 | 925 | 1870 | 2720 | 4625 | 13,301 |
| Resource utilization per person^b (US $), mean (SD) | 353.69 (5539.44) | 425.86 (7034.20) | 449.89 (3194.92) | 690.62 (7982.95) | 868.47 (11,386.03) | 1315.39 (6624.14) | 2087.44 (14,347.49) | 3211.58 (20,286.97) | 4530.99 (12,796.93) | 6823.42 (21,814.22) |
| Confidence level^c | 0.784 | 0.735 | 0.805 | 0.783 | 0.754 | 0.723 | 0.805 | 0.790 | 0.796 | 0.889 |
a Patient percentage of each risk bin is defined as the percentage of patients in that bin of the total prospective population.
b Mean resource utilization per person in each risk bin is defined as the next 6-month mean resource utilization per person in that bin.
c Confidence level of each risk bin is defined as the proportion of patients in that bin with next 6-month resource utilization less than the estimated maximum resource utilization.
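The three footnote definitions can be made concrete with a short sketch that computes the per-bin statistics from patient-level data. Several things here are assumptions rather than the authors' method: the function names are hypothetical, a score on a bin boundary is placed in the higher bin (with 100 kept in the top bin), and the comparison uses "less than or equal to" instead of the footnote's "less than", since the lowest bin has an estimated maximum of US $0.

```python
from statistics import mean

# Estimated maximum resource utilization (US $) per risk bin, from the table.
ESTIMATED_MAX_USD = [0, 170, 340, 510, 680, 925, 1870, 2720, 4625, 13301]


def risk_bin(score: float) -> int:
    """Index (0-9) of the 10-point risk bin for a score in [0, 100].

    Assumption: boundary scores go to the higher bin; 100 stays in bin 9.
    """
    if not 0 <= score <= 100:
        raise ValueError("risk score must lie in [0, 100]")
    return min(int(score // 10), 9)


def bin_statistics(patients):
    """patients: iterable of (risk_score, next_6_month_utilization_usd)."""
    by_bin = {}
    for score, cost in patients:
        by_bin.setdefault(risk_bin(score), []).append(cost)
    total = sum(len(costs) for costs in by_bin.values())
    stats = {}
    for b, costs in sorted(by_bin.items()):
        # Assumption: "<=" so that zero-cost patients count in the lowest bin.
        within = sum(1 for c in costs if c <= ESTIMATED_MAX_USD[b])
        stats[b] = {
            "patients": len(costs),
            "percent": 100 * len(costs) / total,   # footnote a
            "mean_usd": mean(costs),               # footnote b
            "confidence": within / len(costs),     # footnote c
        }
    return stats


# Toy data for illustration only; not taken from the study.
toy = [(5, 0), (5, 0), (5, 100), (5, 0), (95, 1000), (95, 20000)]
toy_stats = bin_statistics(toy)
```

In the toy data, three of the four patients in the lowest bin stay at or below the US $0 maximum, so that bin's confidence level is 0.75.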
Figure 2. The prospective performance of the model. Prospective validation of the model: the mean next 6-month resource utilization distribution (box-and-whisker plot) and the patient counts (gray bar) versus the predicted risks. The resource utilization distributions were calculated per 1000 patients per 6 months.
Figure 3. Prospective analysis of next 6-month resource utilizations stratified by chronic diseases. Bubble chart of all 178 chronic diseases (red for diseases with top 20 patient counts and pink for others) stored in our database together with the nonchronic disease group (green). Each bubble represents a chronic disease group, demonstrating mean values of the next 6-month resource utilization and the risks of the patients diagnosed with that disease. The bubble diameter is proportional to the patient counts. Outliers are marked with black circles.
Figure 4. Close examination of the prospective analysis of next 6-month resource utilizations stratified by the top 20 most common chronic diseases. The relationship between resource utilization and risk score was smoothed by LOESS regression (solid line: the fitted curve; dashed lines: the 0.9 confidence level boundaries), showing good linearity with R-squared=.901 and P<.001.
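For readers unfamiliar with LOESS, the sketch below shows a minimal pure-Python version of the technique (local linear fits with tricube weights) and an R-squared computation. It is not the authors' implementation: the function names, the `frac` span parameter, and the toy data are assumptions, the confidence boundaries are not computed, and the record does not state whether the reported R-squared was measured against the smoothed curve or against a straight line.

```python
def tricube(u: float) -> float:
    """Tricube kernel: weight 1 at u=0, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess(xs, ys, frac=0.5):
    """Fitted values of a local linear regression at every x in xs.

    frac is the share of points used as the neighbourhood of each fit.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    k = min(n, max(2, round(frac * n)))
    fitted = []
    for x0 in xs:
        # Bandwidth: distance to the k-th nearest point (x0 itself included).
        h = sorted(abs(x - x0) for x in xs)[k - 1]
        if h == 0:
            weights = [1.0 if x == x0 else 0.0 for x in xs]
        else:
            weights = [tricube((x - x0) / h) for x in xs]
        sw = sum(weights)  # always > 0 because x0 has weight 1
        mx = sum(w * x for w, x in zip(weights, xs)) / sw
        my = sum(w * y for w, y in zip(weights, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
        sxy = sum(w * (x - mx) * (y - my)
                  for w, x, y in zip(weights, xs, ys))
        # Fall back to the weighted mean if the local fit is degenerate.
        slope = sxy / sxx if sxx > 1e-12 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def r_squared(ys, fitted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined for constant ys")
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return 1 - ss_res / ss_tot


# Toy data lying exactly on a line; illustration only, not study data.
toy_x = [float(i) for i in range(10)]
toy_y = [2 * x + 1 for x in toy_x]
toy_fit = loess(toy_x, toy_y)
```

On data that lie exactly on a straight line, each local linear fit reproduces the line, so the smoothed curve matches the observations and R-squared is 1.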
Figure 5. Schematic demonstration of data flow and communications of a population risk exploration system, which allows online real-time assessment of population resource utilization risk.