Wei-Hsuan Lo-Ciganic, James L Huang, Hao H Zhang, Jeremy C Weiss, Yonghui Wu, C Kent Kwoh, Julie M Donohue, Gerald Cochran, Adam J Gordon, Daniel C Malone, Courtney C Kuza, Walid F Gellad.
Abstract
Importance: Current approaches to identifying individuals at high risk for opioid overdose target many patients who are not truly at high risk.

Objective: To develop and validate a machine-learning algorithm to predict opioid overdose risk among Medicare beneficiaries with at least 1 opioid prescription.

Design, Setting, and Participants: A prognostic study was conducted between September 1, 2017, and December 31, 2018. Participants (n = 560 057) included fee-for-service Medicare beneficiaries without cancer who filled 1 or more opioid prescriptions from January 1, 2011, to December 31, 2015. Beneficiaries were randomly and equally divided into training, testing, and validation samples.

Exposures: Potential predictors (n = 268), including sociodemographics, health status, patterns of opioid use, and practitioner-level and regional-level factors, were measured in 3-month windows, starting 3 months before initiating opioids until loss of follow-up or the end of observation.

Main Outcomes and Measures: Opioid overdose episodes from inpatient and emergency department claims were identified. Multivariate logistic regression (MLR), least absolute shrinkage and selection operator-type regression (LASSO), random forest (RF), gradient boosting machine (GBM), and deep neural network (DNN) were applied to predict overdose risk in the subsequent 3 months after initiation of treatment with prescription opioids. Prediction performance was assessed using the C statistic and other metrics (eg, sensitivity, specificity, and number needed to evaluate [NNE] to identify one overdose). The Youden index was used to identify the optimized threshold of predicted score that balanced sensitivity and specificity.
Year: 2019 PMID: 30901048 PMCID: PMC6583312 DOI: 10.1001/jamanetworkopen.2019.0968
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Figure 1. Performance Matrix of Machine-Learning Models for Predicting Opioid Overdose in Medicare Beneficiaries
The 4 prediction performance matrixes in the validation sample are the area under the receiver operating characteristic curve (AUC) or C statistic (A); the precision-recall curves, which have improved performance if they are closer to the upper right corner or above the other method (B); the number needed to evaluate (NNE) by different cutoffs of sensitivity (C); and alerts per 100 patients by different cutoffs of sensitivity (D).
DNN indicates deep neural network; GBM, gradient boosting machine; LASSO, least absolute shrinkage and selection operator–type regularized regression; MLR, multivariate logistic regression; and RF, random forest.
Prediction Performance of Gradient Boosting Machine and Deep Neural Network Models in the Validation Sample Divided Into Risk Subgroups
| Performance Metric | GBM: Low Risk | GBM: Medium Risk | GBM: High Risk | DNN: Low Risk | DNN: Medium Risk | DNN: High Risk |
|---|---|---|---|---|---|---|
| Total, No. (%) | 144 860 (77.6) | 32 415 (17.4) | 9411 (5.0) | 142 180 (76.2) | 34 759 (18.6) | 9747 (5.2) |
| Predicted score, median (range) | 14.6 (1.4-39.0) | 55.4 (39.0-77.7) | 83.8 (77.7-93.8) | 14.2 (2.1-46.5) | 61.6 (46.5-81.9) | 88.7 (81.9-99.7) |
| No. of actual overdose episodes (% of each subgroup) | 11 (0.01) | 26 (0.08) | 54 (0.57) | 9 (0.01) | 26 (0.07) | 56 (0.57) |
| No. of actual nonoverdose episodes (% of each subgroup) | 144 849 (99.99) | 32 389 (99.92) | 9357 (99.43) | 142 171 (99.99) | 34 733 (99.93) | 9691 (99.43) |
| Sensitivity, % | 0 | 100 | 100 | 0 | 100 | 100 |
| PPV, % | NA | 0.08 | 0.57 | NA | 0.07 | 0.57 |
| NNE | NA | 1247 | 174 | NA | 1337 | 174 |
| Specificity, % | 100 | 0 | 0 | 100 | 0 | 0 |
| NPV, % | 99.99 | NA | NA | 99.99 | NA | NA |
| Overall No. of misclassified overdose episodes (% of overall cohort) | 11 (0.006) | 32 389 (17.4) | 9357 (5.0) | 9 (0.005) | 34 733 (18.6) | 9691 (5.2) |
| % of all overdose episodes captured over 3 mo (n = 91) | 12.1 | 28.6 | 59.3 | 9.9 | 28.6 | 61.5 |
Abbreviations: DNN, deep neural network; GBM, gradient boosting machine; NA, not able to be calculated owing to 0 denominator; NNE, number needed to evaluate; NPV, negative predictive value; PPV, positive predictive value.
Risk subgroups were classified into low risk (score below the optimized threshold), medium risk (predicted score between the optimized threshold and the top fifth percentile score), and high risk (predicted score in the top fifth percentile). The optimized thresholds were 39 (or probability of 0.39) for GBM and 46.5 (or probability of 0.465) for DNN.
Predicted scores were calculated by the predicted probability of overdose multiplied by 100.
When the medium- and high-risk groups are classified as overdose and the low-risk group as nonoverdose, the PPV and NNE cannot be calculated for the low-risk group because this group is considered nonoverdose. Similarly, the NPV cannot be calculated for the medium- and high-risk groups because these groups are considered overdose. Detailed definitions of prediction performance metrics are provided in eFigure 3 in the Supplement.
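The PPV and NNE rows in the table above can be reproduced from the subgroup counts. The following is a minimal sketch, assuming that NNE is the reciprocal of PPV (the formal definitions are in eFigure 3 in the Supplement, which is not reproduced here); the function name `ppv_and_nne` is illustrative and not from the study.

```python
def ppv_and_nne(n_overdose, n_total):
    """Return (PPV in percent, NNE) for a group classified as overdose.

    Assumption: NNE = 1 / PPV, ie, the number of flagged patients
    that must be evaluated to identify one overdose.
    """
    if n_overdose == 0:
        raise ValueError("PPV and NNE are undefined when no overdoses are flagged")
    ppv = n_overdose / n_total
    return 100 * ppv, 1 / ppv


# GBM high-risk subgroup: 54 overdose episodes among 9411 beneficiaries
ppv_pct, nne = ppv_and_nne(54, 9411)
print(f"PPV = {ppv_pct:.2f}%, NNE = {nne:.0f}")
```

With the GBM medium-risk counts (26 episodes among 32 415 beneficiaries), the same calculation yields an NNE of about 1247, in line with the tabulated value.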
Figure 2. Calibration Performance of Gradient Boosting Machine (GBM) and Deep Neural Network (DNN) by Risk Group
Risk subgroups were classified into 3 groups using the optimized threshold in the validation sample (n = 186 686): low risk (score below the optimized threshold), medium risk (predicted score between the optimized threshold, identified by the Youden index, and the top fifth percentile score), and high risk (predicted score in the top fifth percentile). The dashed line indicates the overall observed overdose rate without risk stratifications.
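The threshold selection and 3-level stratification described in this caption can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: the function names, the toy scores and labels, the use of `numpy.percentile` for the top fifth percentile cutoff, and the treatment of a score equal to a cutoff as belonging to the higher group are all choices made here.

```python
import numpy as np


def youden_threshold(scores, labels):
    """Return the cutoff maximizing J = sensitivity + specificity - 1."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_j, best_cut = -1.0, None
    for cut in np.unique(scores):
        flagged = scores >= cut  # classified as overdose at this cutoff
        sensitivity = (flagged & labels).sum() / labels.sum()
        specificity = (~flagged & ~labels).sum() / (~labels).sum()
        j = sensitivity + specificity - 1
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut


def risk_group(scores, threshold):
    """Label each score low, medium, or high risk."""
    scores = np.asarray(scores, dtype=float)
    top5_cut = np.percentile(scores, 95)  # start of the top fifth percentile
    return np.where(scores >= top5_cut, "high",
                    np.where(scores >= threshold, "medium", "low"))


# Toy predicted scores (probability x 100) and overdose labels, not study data
scores = [5, 10, 20, 40, 60, 80, 90, 95]
labels = [0, 0, 0, 0, 1, 0, 1, 1]
cut = youden_threshold(scores, labels)
print(cut, risk_group(scores, cut).tolist())
```

The exhaustive scan over unique scores is quadratic and only suitable for small examples; on a sample of the study's size, a sorted cumulative-sum approach or a library ROC routine would be the more practical choice.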
Figure 3. Top 50 Important Predictors for Opioid Overdose Selected by Gradient Boosting Machine
Rather than P values or coefficients, the gradient boosting machine reports the importance of predictors included in a model. Importance is a measure of each variable’s cumulative contribution toward reducing squared error, or heterogeneity within the subset, after the data set is sequentially split according to that variable. Thus, importance reflects a variable’s significance in prediction. Absolute importance is then scaled to give relative importance, with a maximum importance of 100. For example, the top 10 important predictors identified from the gradient boosting machine model included total opioid dose (eg, >1500 morphine milligram equivalent [MME] during 3 months), diagnosis of alcohol use disorders or substance use disorders (AUD/SUD), mean daily opioid dose (eg, >32 MME), age, disability status, total number of opioid prescriptions (eg, >4), beneficiary’s state residency (eg, Florida, Kentucky, or New Jersey), type of opioid use (eg, with mixed schedules), total number of benzodiazepine prescription fills (eg, >3), and cumulative days of early prescription refills (eg, >19 days). ED indicates emergency department; FFS, fee-for-service.
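The scaling from absolute to relative importance can be illustrated with a short sketch. The predictor names and absolute importance values below are hypothetical placeholders chosen for the example; they are not values reported by the study.

```python
def relative_importance(absolute):
    """Scale absolute importances so that the largest equals 100."""
    top = max(absolute.values())
    return {name: 100 * value / top for name, value in absolute.items()}


# Hypothetical absolute importances (illustrative only)
absolute = {
    "total_opioid_dose": 8.0,
    "aud_sud_diagnosis": 4.0,
    "age": 2.0,
}
for name, value in relative_importance(absolute).items():
    print(f"{name}: {value:.0f}")
```

Because the scaling divides by the maximum, relative importance ranks predictors within one model but is not comparable across models fitted to different data.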
Comparison of Prediction Performance Between Centers for Medicare & Medicaid Services Measures and Deep Neural Network Measures Over a 12-Month Period
| Performance Metric | DNN: Low Risk | DNN: Medium Risk | DNN: High Risk | CMS: Low- or No-Risk Opioid Use | CMS: High-Risk Opioid Use |
|---|---|---|---|---|---|
| Total, No. (%) | 112 548 (67.5) | 38 846 (23.3) | 15 186 (9.1) | 157 299 (94.4) | 9281 (5.5) |
| Predicted score, median (range) | 14.0 (2.1-46.5) | 62.8 (46.5-81.9) | 88.1 (81.9-99.7) | NA | NA |
| No. of actual overdose episodes (% of each subgroup) | 7 (0.006) | 21 (0.05) | 269 (1.77) | 210 (0.13) | 87 (0.93) |
| No. of actual nonoverdose episodes (% of each subgroup) | 112 541 (99.99) | 38 825 (99.94) | 14 917 (98.22) | 157 089 (99.86) | 9194 (99.06) |
| Sensitivity, % | 0 | 100 | 100 | 0 | 100 |
| PPV, % | NA | 0.05 | 1.77 | NA | 0.93 |
| NNE | NA | 2000 | 56 | NA | 108 |
| Specificity, % | 100 | 0 | 0 | 100 | 0 |
| NPV, % | 99.99 | NA | NA | 99.86 | NA |
| Overall No. of misclassified overdose episodes (% of overall cohort) | 7 (0.004) | 38 825 (23.3) | 14 917 (8.95) | 210 (0.12) | 9194 (5.51) |
| % of all overdose episodes captured over 12 mo (n = 297) | 2.35 | 7.07 | 90.57 | 70.7 | 29.29 |
Abbreviations: CMS, Centers for Medicare & Medicaid Services; DNN, deep neural network; NA, not able to be calculated; NNE, number needed to evaluate; NPV, negative predictive value; PPV, positive predictive value.
In contrast to Table 1, the measures were defined according to a 12-month period rather than a 3-month period. The sample size was smaller than in the main analysis because it required people to have at least 12 months of follow-up.
The 2019 CMS opioid safety measures are meant to identify high-risk individuals or utilization behavior.[64] These measures include 3 metrics: (1) high-dose use, defined as higher than 120 morphine milligram equivalent (MME) for 90 or more continuous days, (2) 4 or more opioid prescribers and 4 or more pharmacies, and (3) concurrent opioid and benzodiazepine use for 30 or more days.
When the DNN medium- and high-risk groups are classified as overdose and the low-risk group as nonoverdose, individuals without an actual overdose in these 2 groups are misclassified. When individuals meeting any of the CMS high-risk opioid use measures are classified as overdose and the remaining group as nonoverdose, individuals without an actual overdose in the high-risk group are misclassified. The PPV and NNE cannot be calculated for the low-risk group because this group is considered nonoverdose. Similarly, the NPV cannot be calculated for the medium- and high-risk groups because these groups are considered overdose. Detailed definitions of prediction performance metrics are provided in eFigure 3 in the Supplement.
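The rule-based CMS classification described in the footnotes above, in which a beneficiary meeting any of the 3 metrics is treated as high risk, can be sketched as follows. The `OpioidUse` record and its field names are hypothetical; the sketch assumes that these per-beneficiary summaries have already been derived from claims data, which is the harder step and is not shown.

```python
from dataclasses import dataclass


@dataclass
class OpioidUse:
    """Hypothetical per-beneficiary summary; field names are illustrative."""
    max_continuous_days_over_120_mme: int  # longest run of days above 120 MME
    n_opioid_prescribers: int
    n_pharmacies: int
    concurrent_benzo_days: int  # days of overlapping opioid and benzodiazepine use


def cms_high_risk(p: OpioidUse) -> bool:
    """Return True if any of the 3 CMS opioid safety metrics is met."""
    high_dose = p.max_continuous_days_over_120_mme >= 90
    multiple_providers = p.n_opioid_prescribers >= 4 and p.n_pharmacies >= 4
    concurrent_benzo = p.concurrent_benzo_days >= 30
    return high_dose or multiple_providers or concurrent_benzo


# Meets only the concurrent benzodiazepine metric
print(cms_high_risk(OpioidUse(0, 2, 1, 45)))
# Four prescribers but only 3 pharmacies: metric 2 requires both
print(cms_high_risk(OpioidUse(0, 4, 3, 0)))
```

Unlike the DNN score, this classification is binary, which is why the table reports only 2 CMS groups and no predicted score for them.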