Hong J Kan, Hadi Kharrazi, Hsien-Yen Chang, Dave Bodycombe, Klaus Lemke, Jonathan P Weiner.
Abstract
BACKGROUND: Payers and providers still primarily use ordinary least squares (OLS) to estimate expected economic and clinical outcomes for risk adjustment purposes. Penalized linear regression represents a practical and incremental step forward that provides transparency and interpretability within the familiar regression framework. This study conducted an in-depth comparison of prediction performance of standard and penalized linear regression in predicting future health care costs in older adults.
Year: 2019 PMID: 30840682 PMCID: PMC6402678 DOI: 10.1371/journal.pone.0213258
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Prediction performance of models using 2012 predictors in predicting 2013 costs in the test set (n = 20,369).
| Model | Mean predicted costs ($) | Mean actual costs ($) | R² | RMSE ($) | MAPE ($) | PR |
|---|---|---|---|---|---|---|
| OLS with all 2012 predictors | 16,299 | 16,284 | 16.3% | 35,801 | 15,331 | 1.001 |
| OLS with lasso selected variables | 16,307 | 16,284 | 16.6% | 35,749 | 15,237 | 1.001 |
| Ridge regression | 16,320 | 16,284 | 16.9% | 35,680 | 15,260 | 1.002 |
| Elastic net regression | | | | | | |
| 0.1 | 16,337 | 16,284 | 16.9% | 35,669 | 15,244 | 1.003 |
| 0.2 | 16,337 | 16,284 | 16.9% | 35,679 | 15,250 | 1.003 |
| 0.3 | 16,337 | 16,284 | 16.9% | 35,683 | 15,249 | 1.003 |
| 0.4 | 16,336 | 16,284 | 16.9% | 35,686 | 15,249 | 1.003 |
| 0.5 | 16,336 | 16,284 | 16.9% | 35,687 | 15,249 | 1.003 |
| 0.6 | 16,336 | 16,284 | 16.8% | 35,688 | 15,249 | 1.003 |
| 0.7 | 16,336 | 16,284 | 16.8% | 35,689 | 15,249 | 1.003 |
| 0.8 | 16,336 | 16,284 | 16.8% | 35,689 | 15,249 | 1.003 |
| 0.9 | 16,336 | 16,284 | 16.8% | 35,690 | 15,249 | 1.003 |
| Lasso regression | 16,336 | 16,284 | 16.8% | 35,690 | 15,249 | 1.003 |
OLS: ordinary least squares; RMSE: root mean squared error; MAPE: mean absolute prediction error; PR: prediction ratio; lasso: least absolute shrinkage and selection operator
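The four summary measures in the table can be computed from paired actual and predicted costs. The sketch below is illustrative only: the function name and toy data are hypothetical, and the R² definition (1 − SSE/SST) is an assumption, since this record does not state which definition the authors used. The PR definition (mean predicted cost divided by mean actual cost) is consistent with the reported values, for example 16,299 / 16,284 ≈ 1.001.

```python
import math


def cost_prediction_metrics(actual, predicted):
    """Return R2, RMSE, MAPE and PR for paired actual/predicted costs."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal length")

    mean_actual = sum(actual) / n
    mean_predicted = sum(predicted) / n

    # Sum of squared prediction errors and total sum of squares.
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    if sst == 0 or mean_actual == 0:
        raise ValueError("actual costs must vary and have a non-zero mean")

    return {
        "r2": 1.0 - sse / sst,  # assumption: R2 defined as 1 - SSE/SST
        "rmse": math.sqrt(sse / n),  # root mean squared error, in dollars
        "mape": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,  # mean absolute error, in dollars
        "pr": mean_predicted / mean_actual,  # prediction ratio; 1.0 means no aggregate bias
    }


# Hypothetical toy data, not taken from the study.
actual = [0.0, 1000.0, 5000.0, 20000.0]
predicted = [500.0, 1500.0, 4000.0, 18000.0]
metrics = cost_prediction_metrics(actual, predicted)
print(metrics)
```

A PR below 1 indicates that the model under-predicts in aggregate and a PR above 1 that it over-predicts, which is how the PR column in the tables can be read.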
Prediction performance of models using 2012 predictors in predicting 2013 costs in the test set, by deciles of predicted costs.
| Decile | N | Mean predicted costs ($) | Mean actual costs ($) | RMSE ($) | MAPE ($) | PR | Mean predicted costs ($) | Mean actual costs ($) | RMSE ($) | MAPE ($) | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | OLS with all 2012 predictors | | | | | OLS with lasso selected predictors | | | | |
| 1 | 2,037 | 1,998 | 4,616 | 19,838 | 5,059 | 0.433 | 2,104 | 3,905 | 16,269 | 4,544 | 0.539 |
| 2 | 2,037 | 4,343 | 5,121 | 16,931 | 5,768 | 0.848 | 4,420 | 4,889 | 18,801 | 5,618 | 0.904 |
| 3 | 2,037 | 6,288 | 6,826 | 17,929 | 7,205 | 0.921 | 6,395 | 7,768 | 23,361 | 7,821 | 0.823 |
| 4 | 2,037 | 8,415 | 8,945 | 23,073 | 8,686 | 0.941 | 8,492 | 8,530 | 18,981 | 8,400 | 0.996 |
| 5 | 2,037 | 10,703 | 11,527 | 24,208 | 11,090 | 0.928 | 10,766 | 11,550 | 24,658 | 11,113 | 0.932 |
| 6 | 2,037 | 13,336 | 14,100 | 28,423 | 13,523 | 0.946 | 13,396 | 13,636 | 25,046 | 12,955 | 0.982 |
| 7 | 2,037 | 16,530 | 17,403 | 30,072 | 16,208 | 0.950 | 16,536 | 17,979 | 32,054 | 16,577 | 0.920 |
| 8 | 2,037 | 20,843 | 19,854 | 33,412 | 18,371 | 1.050 | 20,788 | 19,325 | 30,683 | 17,695 | 1.076 |
| 9 | 2,037 | 27,689 | 25,201 | 37,021 | 22,821 | 1.099 | 27,618 | 26,374 | 40,338 | 23,550 | 1.047 |
| 10 | 2,036 | 52,867 | 49,260 | 80,627 | 44,593 | 1.073 | 52,576 | 48,898 | 80,165 | 44,117 | 1.075 |
| | | Ridge regression | | | | | Lasso regression | | | | |
| 1 | 2,037 | 2,750 | 3,648 | 15,132 | 4,505 | 0.754 | 3,504 | 3,579 | 15,115 | 4,894 | 0.979 |
| 2 | 2,037 | 4,844 | 6,045 | 23,024 | 6,852 | 0.801 | 5,439 | 5,097 | 18,625 | 6,338 | 1.067 |
| 3 | 2,037 | 6,724 | 6,298 | 13,349 | 6,823 | 1.068 | 7,202 | 6,987 | 17,470 | 7,521 | 1.031 |
| 4 | 2,037 | 8,778 | 9,244 | 23,858 | 9,144 | 0.950 | 9,083 | 8,939 | 23,624 | 8,992 | 1.016 |
| 5 | 2,037 | 10,967 | 11,545 | 23,703 | 11,137 | 0.950 | 11,125 | 11,666 | 23,490 | 11,205 | 0.954 |
| 6 | 2,037 | 13,544 | 13,719 | 27,210 | 13,296 | 0.987 | 13,566 | 14,144 | 28,889 | 13,652 | 0.959 |
| 7 | 2,037 | 16,625 | 17,426 | 30,711 | 16,126 | 0.954 | 16,525 | 17,439 | 31,803 | 16,352 | 0.948 |
| 8 | 2,037 | 20,799 | 19,520 | 30,661 | 17,895 | 1.066 | 20,395 | 19,547 | 29,879 | 17,442 | 1.043 |
| 9 | 2,037 | 27,419 | 25,741 | 39,450 | 23,129 | 1.065 | 26,640 | 26,482 | 39,961 | 23,066 | 1.006 |
| 10 | 2,036 | 50,772 | 49,667 | 80,528 | 43,708 | 1.022 | 49,893 | 48,971 | 80,089 | 43,036 | 1.019 |
OLS: ordinary least squares; RMSE: root mean squared error; MAPE: mean absolute prediction error; PR: prediction ratio; lasso: least absolute shrinkage and selection operator
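A decile breakdown like the one above can be sketched as follows: sort the test set by each model's predicted cost, cut it into ten near-equal groups, and summarize each group. The function names and toy data are hypothetical. The rule for placing the remainder (earlier groups get the extra record) is an assumption; it reproduces the group sizes in the table (nine groups of 2,037 and one of 2,036), but the source does not describe how ties or remainders were handled.

```python
def group_sizes(n, groups=10):
    """Split n records into near-equal groups; earlier groups absorb the remainder."""
    base, extra = divmod(n, groups)
    return [base + (1 if g < extra else 0) for g in range(groups)]


def decile_summary(actual, predicted, groups=10):
    """Summarize mean predicted cost, mean actual cost and PR by predicted-cost group."""
    n = len(actual)
    if n != len(predicted) or n < groups:
        raise ValueError("need equal-length inputs with at least one record per group")

    # Rank records from lowest to highest predicted cost.
    order = sorted(range(n), key=lambda i: predicted[i])

    rows, start = [], 0
    for decile, size in enumerate(group_sizes(n, groups), start=1):
        idx = order[start:start + size]
        start += size
        mean_pred = sum(predicted[i] for i in idx) / size
        mean_act = sum(actual[i] for i in idx) / size
        rows.append({
            "decile": decile,
            "n": size,
            "mean_predicted": mean_pred,
            "mean_actual": mean_act,
            # PR is undefined when the group's mean actual cost is zero.
            "pr": mean_pred / mean_act if mean_act != 0 else float("nan"),
        })
    return rows


# Hypothetical toy data: 23 records where actual costs are 25% above predictions.
predicted = [100.0 * k for k in range(23, 0, -1)]
actual = [1.25 * p for p in predicted]
rows = decile_summary(actual, predicted)
for row in rows:
    print(row)
```

Because the groups are formed from each model's own predictions, the same person can fall in different deciles under different models, which is why the mean actual costs per decile differ between the models in the table.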
Prediction performance of models using 2009–2012 predictors in predicting 2013 costs in the test set (n = 20,369).
| Model | Mean predicted costs ($) | Mean actual costs ($) | R² | RMSE ($) | MAPE ($) | PR |
|---|---|---|---|---|---|---|
| OLS with all 2009–2012 predictors | 16,299 | 16,284 | 15.0% | 36,077 | 16,111 | 1.001 |
| OLS with lasso selected variables | 16,298 | 16,284 | 17.4% | 35,563 | 15,307 | 1.001 |
| Ridge regression | 16,347 | 16,284 | 18.0% | 35,448 | 15,279 | 1.004 |
| Elastic net regression | | | | | | |
| 0.1 | 16,351 | 16,284 | 18.2% | 35,402 | 15,208 | 1.004 |
| 0.2 | 16,348 | 16,284 | 18.1% | 35,419 | 15,208 | 1.004 |
| 0.3 | 16,347 | 16,284 | 18.1% | 35,427 | 15,207 | 1.004 |
| 0.4 | 16,347 | 16,284 | 18.0% | 35,431 | 15,207 | 1.004 |
| 0.5 | 16,347 | 16,284 | 18.0% | 35,434 | 15,207 | 1.004 |
| 0.6 | 16,347 | 16,284 | 18.0% | 35,435 | 15,207 | 1.004 |
| 0.7 | 16,346 | 16,284 | 18.0% | 35,437 | 15,207 | 1.004 |
| 0.8 | 16,346 | 16,284 | 18.0% | 35,438 | 15,207 | 1.004 |
| 0.9 | 16,346 | 16,284 | 18.0% | 35,438 | 15,207 | 1.004 |
| Lasso regression | 16,346 | 16,284 | 18.0% | 35,439 | 15,207 | 1.004 |
OLS: ordinary least squares; RMSE: root mean squared error; MAPE: mean absolute prediction error; PR: prediction ratio; lasso: least absolute shrinkage and selection operator
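For orientation, ridge, lasso and elastic net differ only in the penalty added to the least-squares objective. One common parameterization (the one used by the glmnet package) is sketched below. Treating the row labels 0.1 to 0.9 under "Elastic net regression" as the mixing weight $\alpha$ in this formula is an assumption: this record does not say what those values denote, although the ordering of the rows (ridge first, lasso last) is consistent with that reading.

$$
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\; \frac{1}{2N}\sum_{i=1}^{N}\left(y_i - \beta_0 - x_i^{\top}\beta\right)^2 + \lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 + \alpha\,\lVert\beta\rVert_1\right]
$$

Here $y_i$ is the cost for person $i$, $x_i$ the predictor vector, $\lambda \ge 0$ the overall penalty strength, and $\alpha \in [0, 1]$ the mixing weight. Setting $\alpha = 0$ gives ridge regression, $\alpha = 1$ gives the lasso, and $\lambda = 0$ recovers OLS.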
Prediction performance of models using 2009–2012 predictors in predicting 2013 costs in the test set, by deciles of predicted costs.
| Decile | N | Mean predicted costs ($) | Mean actual costs ($) | RMSE ($) | MAPE ($) | PR | Mean predicted costs ($) | Mean actual costs ($) | RMSE ($) | MAPE ($) | PR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | OLS with all 2009–2012 predictors | | | | | OLS with lasso selected predictors | | | | |
| 1 | 2,037 | -1,287 | 7,272 | 26,095 | 8,893 | -0.177 | 1,237 | 3,917 | 14,587 | 4,367 | 0.316 |
| 2 | 2,037 | 3,028 | 4,837 | 15,373 | 4,968 | 0.626 | 3,745 | 5,420 | 21,244 | 5,594 | 0.691 |
| 3 | 2,037 | 5,254 | 7,973 | 23,957 | 7,898 | 0.659 | 5,858 | 7,645 | 23,731 | 7,459 | 0.766 |
| 4 | 2,037 | 7,666 | 8,913 | 20,208 | 8,693 | 0.860 | 7,992 | 9,594 | 22,797 | 9,142 | 0.833 |
| 5 | 2,037 | 10,317 | 10,886 | 20,023 | 10,769 | 0.948 | 10,407 | 10,919 | 21,154 | 10,371 | 0.953 |
| 6 | 2,037 | 13,321 | 15,062 | 33,175 | 14,740 | 0.884 | 13,257 | 14,564 | 28,606 | 13,692 | 0.910 |
| 7 | 2,037 | 17,065 | 16,322 | 29,289 | 15,840 | 1.046 | 16,610 | 16,510 | 28,610 | 15,636 | 1.006 |
| 8 | 2,037 | 21,919 | 18,787 | 30,680 | 18,561 | 1.167 | 21,223 | 19,620 | 33,482 | 18,520 | 1.082 |
| 9 | 2,037 | 29,665 | 24,644 | 37,632 | 24,453 | 1.204 | 28,542 | 25,854 | 37,516 | 23,327 | 1.104 |
| 10 | 2,036 | 56,062 | 48,156 | 80,008 | 46,308 | 1.164 | 54,126 | 48,809 | 79,309 | 44,979 | 1.109 |
| | | Ridge regression | | | | | Lasso regression | | | | |
| 1 | 2,037 | 2,240 | 3,665 | 12,741 | 4,215 | 0.611 | 3,495 | 3,161 | 12,718 | 4,496 | 1.106 |
| 2 | 2,037 | 4,539 | 5,915 | 23,495 | 6,594 | 0.767 | 5,405 | 5,414 | 19,284 | 6,429 | 0.998 |
| 3 | 2,037 | 6,588 | 6,757 | 18,959 | 7,124 | 0.975 | 7,176 | 7,123 | 21,430 | 7,607 | 1.007 |
| 4 | 2,037 | 8,745 | 9,346 | 21,572 | 9,278 | 0.936 | 9,066 | 9,359 | 21,522 | 9,478 | 0.969 |
| 5 | 2,037 | 11,037 | 11,671 | 25,112 | 11,403 | 0.946 | 11,175 | 11,106 | 23,795 | 10,759 | 1.006 |
| 6 | 2,037 | 13,735 | 13,693 | 25,770 | 13,591 | 1.003 | 13,597 | 14,533 | 28,503 | 13,829 | 0.936 |
| 7 | 2,037 | 16,956 | 17,188 | 31,462 | 16,249 | 0.986 | 16,511 | 16,988 | 30,171 | 15,938 | 0.972 |
| 8 | 2,037 | 21,291 | 19,022 | 30,039 | 17,920 | 1.119 | 20,516 | 19,904 | 33,420 | 17,862 | 1.031 |
| 9 | 2,037 | 28,218 | 25,718 | 38,088 | 23,113 | 1.097 | 26,748 | 25,654 | 37,234 | 22,586 | 1.043 |
| 10 | 2,036 | 50,141 | 49,877 | 79,946 | 43,321 | 1.005 | 49,789 | 49,611 | 79,458 | 43,105 | 1.004 |
OLS: ordinary least squares; RMSE: root mean squared error; MAPE: mean absolute prediction error; PR: prediction ratio; lasso: least absolute shrinkage and selection operator