Marica Iommi, Savannah Bergquist, Gianluca Fiorentini, Francesco Paolucci.
Abstract
The Italian National Healthcare Service relies on per capita allocation of healthcare funds, despite having a highly detailed and wide range of data that could support a more complex risk-adjustment formula. However, heterogeneity in data availability limits the development of a national model. This paper implements and evaluates machine learning (ML) and standard risk-adjustment models across the different data scenarios that a Region or Country may face, to make the best use of the available information with the most predictive model. We show that, compared with linear regression, ML achieves a small but generally statistically insignificant improvement in adjusted R² and mean squared error with fine data granularity, while no differences were observed in the scenario with coarse granularity and a poor range of variables. The advantage of ML algorithms is greater in the coarse-granularity scenarios with a fair or rich range of variables and limited in the fine-granularity scenarios. The inclusion of detailed morbidity- and pharmacy-based adjustors generally increases fit, although the trade-off of creating adverse economic incentives must be considered.
Keywords: data granularity; formula funding; health expenditure; machine learning; risk-adjustment
Year: 2022 PMID: 35384134 PMCID: PMC9320950 DOI: 10.1002/hec.4512
Source DB: PubMed Journal: Health Econ ISSN: 1057-9230 Impact factor: 2.395
Data setting scenarios to predict total health care expenditure
| Poor range of variables | Fair range of variables | Rich range of variables |
|---|---|---|
| Demographic (DEM) | DEM + hospital discharge records (HDR) | DEM + HDR + pharmacy database (PD) |
Abbreviations: ESRD, End‐Stage Renal Disease; HCC, Hierarchical Condition Category; HDR, hospital discharge records; PD, pharmacy database; RxHCC, Prescription Drug Hierarchical Condition Category.
For example, possible coding classification systems include the Clinical Classifications Software for ICD-9-CM (https://www.hcup-us.ahrq.gov/toolssoftware/ccs/ccs.jsp), the 2015 Risk Adjustment model software (HCC, RxHCC, ESRD) of the Centers for Medicare & Medicaid Services (https://www.cms.gov/Medicare/Health-Plans/MedicareAdvtgSpecRateStats/Risk-Adjustors), or the DCG.
For example, Pharmacy Cost‐Group (PCG).
Description of the study population (n = 4,262,982). Mean, S.D., and percentiles refer to total expenditure in 2016.
| | N | % Col | Mean | S.D. | 25th perc. | Median | 75th perc. |
|---|---|---|---|---|---|---|---|
| Observations | 4,262,982 | 100% | 834.0 | 3306.0 | 13 | 87 | 405 |
| Male | 2,054,695 | 48.2% | 838.2 | 3618.0 | 6 | 59 | 344 |
| Female | 2,208,287 | 51.8% | 829.8 | 2986.9 | 24 | 114 | 460 |
| Age groups | |||||||
| 0 | 48,632 | 1.1% | 1358.1 | 4832.4 | 128 | 340 | 416 |
| 1–4 | 119,804 | 2.8% | 234.6 | 1875.0 | 11 | 30 | 82 |
| 5–9 | 157,598 | 3.7% | 167.7 | 1339.2 | 7 | 21 | 65 |
| 10–14 | 153,215 | 3.6% | 201.9 | 1448.7 | 6 | 29 | 99 |
| 15–19 | 144,999 | 3.4% | 244.1 | 1873.1 | 5 | 24 | 98 |
| 20–24 | 148,695 | 3.5% | 286.8 | 1870.2 | 3 | 22 | 91 |
| 25–29 | 184,774 | 4.3% | 322.5 | 1795.3 | 4 | 26 | 111 |
| 30–34 | 225,961 | 5.3% | 358.4 | 1730.4 | 5 | 31 | 131 |
| 35–39 | 279,342 | 6.6% | 374.3 | 1925.0 | 5 | 31 | 138 |
| 40–44 | 353,249 | 8.3% | 384.8 | 2267.9 | 2 | 32 | 141 |
| 45–49 | 389,168 | 9.1% | 424.2 | 2359.5 | 4 | 48 | 172 |
| 50–54 | 391,461 | 9.2% | 528.0 | 2883.7 | 4 | 57 | 223 |
| 55–59 | 313,161 | 7.3% | 743.2 | 3579.0 | 18 | 103 | 337 |
| 60–64 | 266,900 | 6.3% | 1008.0 | 3978.9 | 40 | 172 | 497 |
| 65–69 | 273,682 | 6.4% | 1378.1 | 4351.7 | 101 | 318 | 809 |
| 70–74 | 224,178 | 5.3% | 1771.5 | 4675.0 | 173 | 459 | 1145 |
| 75–79 | 226,081 | 5.3% | 2077.4 | 5020.8 | 237 | 568 | 1414 |
| 80–84 | 173,644 | 4.1% | 2277.6 | 4929.6 | 287 | 638 | 1690 |
| 85–89 | 116,705 | 2.7% | 2325.1 | 4592.5 | 275 | 617 | 2082 |
| 90+ | 71,733 | 1.7% | 2225.8 | 4002.1 | 205 | 506 | 2713 |
| Italian citizenship | |||||||
| No | 351,815 | 8.3% | 529.9 | 2761.0 | 8 | 50 | 220 |
| Yes | 3,911,167 | 91.7% | 861.2 | 3349.5 | 13 | 92 | 425 |
| Degree of urbanization | |||||||
| Urban | 1,548,231 | 36.3% | 857.8 | 3418.7 | 11 | 84 | 406 |
| Peri‐urban | 1,896,501 | 44.5% | 796.1 | 3191.1 | 13 | 83 | 385 |
| Rural | 818,250 | 19.2% | 875.9 | 3350.3 | 17 | 102 | 450 |
| Income | |||||||
| Exempt low income | 1,746,677 | 41.0% | 1161.1 | 3576.4 | 54 | 226 | 735 |
| Exempt middle‐low income | 583,624 | 13.7% | 720.9 | 2767.8 | 24 | 114 | 378 |
| Exempt middle‐high income | 98,832 | 2.3% | 696.8 | 2777.1 | 17 | 94 | 334 |
| Exempt high income | 397,233 | 9.3% | 797.9 | 3817.2 | 0 | 38 | 235 |
| Exempt other reason | 1,436,616 | 33.7% | 501.2 | 2994.1 | 1 | 22 | 119 |
| No. of hospitalizations | |||||||
| 0 | 3,777,073 | 88.6% | 226.3 | 728.4 | 9 | 61 | 245 |
| 1 | 380,626 | 8.9% | 3768.6 | 5440.2 | 1362 | 2297 | 4272 |
| 2 | 71,425 | 1.7% | 9101.3 | 8904.1 | 3974 | 6841 | 11,434 |
| 3+ | 33,858 | 0.8% | 18,174.6 | 15,304.0 | 9145 | 14,355 | 22,410 |
| No. of pharmaceutical prescriptions | |||||||
| 0 | 931,697 | 21.9% | 205.5 | 1726.9 | 0 | 21 | 75 |
| 1 | 876,850 | 20.6% | 158.3 | 1244.2 | 0 | 5 | 28 |
| 2 | 326,076 | 7.6% | 350.6 | 1895.8 | 15 | 34 | 116 |
| 3+ | 2,128,359 | 49.9% | 1461.3 | 4313.9 | 115 | 322 | 904 |
| Diagnostic cost group (DCG) | |||||||
| No | 3,784,039 | 88.8% | 226.7 | 730.9 | 9 | 61 | 245 |
| Yes | 478,943 | 11.2% | 5631.1 | 8194.2 | 1614 | 3033 | 6466 |
| Pharmacy cost group (PCG) | |||||||
| No | 2,884,260 | 67.7% | 406.0 | 2302.7 | 5 | 34 | 131 |
| Yes | 1,378,722 | 32.3% | 1728.8 | 4639.1 | 186 | 446 | 1109 |
Abbreviations: DCG, Diagnostic cost group; PCG, Pharmacy cost group.
Adjusted R² of the models compared to OLS
| Models ranking | Poor range: Adj. R² | Poor range: RE | Fair range: Adj. R² | Fair range: RE | Rich range: Adj. R² | Rich range: RE |
|---|---|---|---|---|---|---|
| Coarse granularity | | | | | | |
| 1 | 4.3% | 1.0 | 48.8% | 1.0 | 49.9% | 1.0 |
| 2 | 4.3% | 1.0 | 48.0% | 1.0 | 49.1% | 1.0 |
| 3 | 4.3% | 1.0 | | | 48.9% | 1.0 |
| 4 | 4.3% | 1.0 | 47.6% | 1.0 | | |
| 5 | 4.3% | 1.0 | 47.6% | 1.0 | 48.5% | 1.0 |
| 6 | 4.3% | 1.0 | 47.5% | 1.0 | 48.5% | 1.0 |
| 7 | | | 47.3% | 1.0 | 48.2% | 1.0 |
| 8 | 4.3% | 1.0 | Inc. value | - | Inc. value | - |
| 9 | 4.3% | 1.0 | Inc. value | - | Inc. value | - |
Abbreviations: GAM, generalized additive model; MSE, mean squared error; OLS, Ordinary least squares; RF, random forest; SL, super learner.
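The adjusted R² values in the table above penalize the ordinary R² for the number of predictors in the model. The following is a minimal sketch that assumes the textbook definition, since the page does not spell out the formula the authors used. The function name `adjusted_r2` and the toy numbers are illustrative and not taken from the study.

```python
def adjusted_r2(y_true, y_pred, n_predictors):
    """Textbook adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    Assumption: this is the standard definition, not necessarily the
    exact variant used in the paper.
    """
    n = len(y_true)
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    mean_y = sum(y_true) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Made-up expenditures and predictions, for illustration only.
y = [10.0, 20.0, 30.0, 40.0, 50.0]
pred = [12.0, 18.0, 33.0, 37.0, 50.0]
print(round(adjusted_r2(y, pred, n_predictors=1), 4))
```

With millions of observations and a modest number of adjustors, as in the study population, the penalty term is close to 1, so adjusted and unadjusted R² would be nearly identical.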
Average cross‐validated MSE of the models compared to OLS
| Models ranking | Poor range: MSE | Poor range: RE | Fair range: MSE | Fair range: RE | Rich range: MSE | Rich range: RE |
|---|---|---|---|---|---|---|
| Coarse granularity | | | | | | |
| 1 | 10,657,793 | 1.0 | 5,705,097 | 1.0 | 5,583,558 | 1.0 |
| 2 | 10,657,794 | 1.0 | 5,791,326 | 1.0 | 5,663,559 | 1.0 |
| 3 | 10,657,813 | 1.0 | | | 5,689,220 | 1.0 |
| 4 | 10,657,862 | 1.0 | 5,830,718 | 1.0 | | |
| 5 | | | 5,830,867 | 1.0 | 5,735,406 | 1.0 |
| 6 | 10,657,875 | 1.0 | 5,842,105 | 1.0 | 5,735,468 | 1.0 |
| 7 | 10,657,875 | 1.0 | 5,863,916 | 1.0 | 5,766,763 | 1.0 |
| 8 | 10,657,875 | 1.0 | 13,446,690 | 0.4 | 41,287,682 | 0.1 |
| 9 | 10,658,065 | 1.0 | Inc. value | - | Inc. value | - |
Abbreviations: GAM, generalized additive model; MSE, mean squared error; OLS, Ordinary least squares; RF, random forest; SL, super learner.
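An average cross-validated MSE, as reported in the table above, is obtained by fitting a model on all folds but one, scoring it on the held-out fold, and averaging the held-out errors. The sketch below illustrates this with a one-predictor least-squares fit. The number of folds, the deterministic interleaved fold assignment, the helper names, and the toy data are all assumptions made for illustration and are not details taken from the study.

```python
def fit_simple_ols(xs, ys):
    """Closed-form least squares for y = a + b * x (one predictor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cv_mse(xs, ys, k=5):
    """Average of the per-fold held-out MSEs over k interleaved folds."""
    fold_mses = []
    for fold in range(k):
        # Observation i is held out when i % k == fold (hypothetical scheme).
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        a, b = fit_simple_ols([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in test]
        fold_mses.append(sum(errors) / len(errors))
    return sum(fold_mses) / k


# Made-up data in which the outcome is exactly linear in the predictor.
xs = [float(x) for x in range(20)]
ys = [3.0 + 2.0 * x for x in xs]
print(cv_mse(xs, ys, k=5))
```

Because the toy outcome is exactly linear in the predictor, the held-out error is essentially zero. With real expenditure data the folds would normally be assigned at random.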
FIGURE 1. Under/overcompensation by quintiles for each data scenario. The ridge and elastic net mean compensations are not shown because they were very similar to those of lasso. GLMs are not shown because some of their mean compensation values were completely out of range. The top 1% of spenders were not included in the graph because their range of values is much larger than that of the quintiles (mean undercompensation ranges from −24,971 to −14,606 €).
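The quantity plotted in Figure 1 can be illustrated with a short sketch. It assumes that compensation is predicted minus actual expenditure, which matches the negative undercompensation values quoted in the caption, and that quintiles are formed on actual expenditure, which the caption does not state. The function name and the data are hypothetical.

```python
def mean_compensation_by_quintile(actual, predicted):
    """Mean of (predicted - actual) within each quintile of actual spending.

    Negative values indicate undercompensation, positive values
    overcompensation. Assumes at least five observations.
    """
    order = sorted(range(len(actual)), key=lambda i: actual[i])
    n = len(order)
    result = []
    for q in range(5):
        idx = order[q * n // 5:(q + 1) * n // 5]
        diffs = [predicted[i] - actual[i] for i in idx]
        result.append(sum(diffs) / len(diffs))
    return result


# Made-up data: ten people, every prediction equal to the overall mean.
actual = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0]
predicted = [45.0] * 10
print(mean_compensation_by_quintile(actual, predicted))
# → [40.0, 20.0, 0.0, -20.0, -40.0]
```

A flat prediction overcompensates low spenders and undercompensates high spenders. Risk adjustors are meant to reduce this pattern.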