Ralph Ward, Erin Weeda, David J Taber, Robert Neal Axon, Mulugeta Gebregziabher.
Abstract
Veterans suffer disproportionate health impacts from the opioid epidemic, including overdose, suicide, and death. Prediction models based on electronic medical record data can be powerful tools for identifying patients at greatest risk of such outcomes. The Veterans Health Administration implemented the Stratification Tool for Opioid Risk Mitigation (STORM) in 2018. In this study we propose changes to the original STORM model and propose alternative models that improve risk prediction performance. The best of these proposed models uses a multivariate generalized linear mixed modeling (mGLMM) approach to produce separate predictions for overdose and suicide-related events (SRE) rather than a single prediction for combined outcomes. Further improvements include incorporation of additional data sources and new predictor variables in a longitudinal setting. Compared to a modified version of the STORM model with the same outcome, predictor and interaction terms, our proposed model has a significantly better prediction performance in terms of AUC (84% vs. 77%) and sensitivity (71% vs. 66%). The mGLMM performed particularly well in identifying patients at risk for SREs, where 72% of actual events were accurately predicted among patients with the 100,000 highest risk scores compared with 49.7% for the modified STORM model. The mGLMM's strong performance in identifying true cases (sensitivity) among this highest risk group was the most important improvement given the model's primary purpose for accurately identifying patients at most risk for adverse outcomes such that they are prioritized to receive risk mitigation interventions. Some predictors in the proposed model have markedly different associations with overdose and suicide risks, which will allow clinicians to better target interventions to the most relevant risks.
Supplementary Information: The online version contains supplementary material available at 10.1007/s10742-021-00263-7.
© This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2021.
Keywords: Decision support; Opioid epidemic; Opioid safety; Risk prediction model
Year: 2021 PMID: 34744496 PMCID: PMC8561350 DOI: 10.1007/s10742-021-00263-7
Source DB: PubMed Journal: Health Serv Outcomes Res Methodol ISSN: 1387-3741
Population characteristics*
| Characteristic | Subgroup | ≥ 1 overdose N (%) | ≥ 1 suicide-related event N (%) | Overall N (%) |
|---|---|---|---|---|
| Group size | | 165,680 (9.5) | 97,688 (5.6) | 1,744,667 |
| Race-ethnicity | non-Hispanic white | 122,429 (10.2) | 64,073 (5.3) | 1,203,231 (69) |
| | non-Hispanic Black | 31,688 (8.4) | 23,681 (6.3) | 375,726 (21.5) |
| | Hispanic | 6,738 (6.9) | 6,207 (6.4) | 97,052 (5.6) |
| | Other | 4,525 (6.6) | 3,727 (5.4) | 68,658 (3.9) |
| Sex | Female | 12,866 (8) | 11,772 (7.3) | 160,905 (9.2) |
| | Male | 152,514 (9.6) | 85,916 (5.4) | 1,583,762 (90.8) |
| Age | Under 30 | 3,468 (4.5) | 7,806 (10.1) | 76,982 (4.4) |
| | 30–50 | 18,185 (5.3) | 27,567 (8.1) | 341,299 (19.6) |
| | 51–65 | 61,206 (9.4) | 43,441 (6.6) | 653,531 (37.5) |
| | Over 65 | 82,521 (12.3) | 18,874 (2.8) | 672,855 (38.6) |
| Service-related disability | < 50% | 87,999 (9.6) | 42,347 (4.6) | 913,268 (52.3) |
| | ≥ 50% | 77,381 (9.3) | 55,341 (6.7) | 831,399 (47.7) |
| Marital status | Unmarried | 87,299 (10) | 64,118 (7.3) | 877,311 (50.3) |
| | Married | 78,081 (9) | 33,570 (3.9) | 867,356 (49.7) |
| Urban rural location | Rural or highly rural | 60,120 (9.7) | 30,717 (5) | 619,142 (35.5) |
| | Urban | 105,260 (9.4) | 66,971 (6) | 1,125,525 (64.5) |
*Percentages in the overdose (OD) and suicide-related event (SRE) columns represent the proportion of that subgroup having the outcome; percentages in the Overall column represent that subgroup’s proportion of the full population
Prediction performance measures
| Measure | mGLMM | GLMM | Modified STORM |
|---|---|---|---|
| Area under the ROC curve (AUC) and 95% CI* | 0.837 (0.836, 0.838) | 0.801 (0.80, 0.802) | 0.774 (0.772, 0.776) |
| Sensitivity | 0.71 | 0.69 | 0.66 |
| Specificity | 0.81 | 0.76 | 0.76 |
| Precision (PPV) | 0.09 | 0.135 | 0.13 |
| Negative predictive value (NPV) | 0.99 | 0.98 | 0.98 |
| Number needed to evaluate | 10.6 | 7.4 | 7.8 |
| Maximum Youden score (optimized threshold probability) | 0.52 | 0.45 | 0.42 |
Sensitivity, specificity, precision, negative predictive value, and number needed to evaluate were determined at the maximum Youden index, an optimized threshold that occurs where the receiver operating characteristic (ROC) curve reaches its maximum height above the diagonal line that represents 50% prediction performance (random chance). Sensitivity is the ratio of correctly screened cases (true positives) to the total actual cases; specificity is the ratio of correctly screened non-cases (true negatives) to the total non-cases; precision is the ratio of true positive cases to the total number screened positive; negative predictive value is the ratio of true negative cases to the total number screened negative. Number needed to evaluate is the inverse of precision, that is, the number of patients screened positive for each true positive. While AUC provides a measure of overall prediction performance, improvements in sensitivity are emphasized over other measures because sensitivity measures how well a model can identify true OD and SRE cases while minimizing the number of missed cases. Measures that assess how well the model avoids false positive predictions (specificity and positive predictive value) were considered less important in this setting because correctly identifying cases enables the delivery of potentially life-saving interventions, while false positive results add little risk.
*Delong test for difference in AUC: mGLMM vs. GLMM: p < 0.0001; GLMM vs. STORM: p < 0.0001; mGLMM vs. STORM: p < 0.0001
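The threshold-based measures defined in the table footnote can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the `ScreeningMetrics` container, and the toy scores and labels are assumptions made for illustration, and the threshold search simply tries every observed score.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    sensitivity: float
    specificity: float
    ppv: float      # precision
    npv: float
    nne: float      # number needed to evaluate = 1 / precision
    youden: float   # Youden index J = sensitivity + specificity - 1


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a denominator is empty.
    return numerator / denominator if denominator else float("nan")


def screening_metrics(tp, fp, tn, fn):
    """Compute the footnote's measures from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    ppv = _ratio(tp, tp + fp)
    npv = _ratio(tn, tn + fn)
    nne = _ratio(1.0, ppv)
    return ScreeningMetrics(sensitivity, specificity, ppv, npv, nne,
                            sensitivity + specificity - 1.0)


def max_youden_threshold(scores, labels):
    """Try each observed score as a cutoff (screen positive when
    score >= cutoff) and keep the one with the largest Youden index."""
    best_t, best = None, None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        m = screening_metrics(tp, fp, tn, fn)
        if best is None or m.youden > best.youden:
            best_t, best = t, m
    return best_t, best


# Toy data (hypothetical, not from the study).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
threshold, metrics = max_youden_threshold(toy_scores, toy_labels)
print(threshold, metrics)
```

Because number needed to evaluate is the inverse of precision, a model can gain sensitivity at the chosen threshold while its number needed to evaluate rises, which is the pattern the table shows for mGLMM.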
Fig. 1 Area under the ROC curve comparison for mGLMM, GLMM and STORM replication models, with AUC values in parentheses
Fig. 2 We assessed prediction performance with the risk stratification cutpoint set so that the patients with the 100,000 highest predicted risk scores were screened, in order to compare how many overdose or suicide events would be correctly identified at this cutpoint (sensitivity) versus how many would be false negative or false positive predictions. For each model’s predictions in a validation cohort, we show the numbers of true and false positives (TP, FP) and true and false negatives (TN, FN), with corresponding sensitivity (SENS), specificity (SPEC), and positive predictive values (PPV). In the mGLMM model predicting SRE, SENS = 72%, meaning nearly 3 of 4 actual SREs were correctly identified in the validation data, compared with 50% in the modified STORM model. For OD, the mGLMM and STORM models produced very similar results (49.3% and 50%, respectively). Since the mGLMM model produced two predictions per patient (OD and SRE), we combined them using rules that prioritized true positives and false negatives (see footnote). The combined mGLMM OD-SRE predictions included 21,440 TP results, or 17.3% more than from modified STORM (18,240 TP) and 12.1% more than GLMM (19,129 TP). False positive (FP) results were 140,043, 80,872, and 81,715 for mGLMM, GLMM, and STORM, respectively, indicating that more patients would be unnecessarily screened using the mGLMM approach in order to produce the gains in TP cases.
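The fixed-size cutpoint described in the Fig. 2 caption can be sketched as follows. This is a hedged illustration, not the study's procedure: the function name `top_n_confusion`, the tie-breaking by input order, and the toy data are assumptions, and the rules used to combine the mGLMM OD and SRE predictions are not reproduced because they are not given here.

```python
def top_n_confusion(scores, labels, n):
    """Screen the n patients with the highest risk scores and count
    outcomes. labels are 1 for an actual event and 0 otherwise; ties in
    score are broken by input order (an assumption of this sketch)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(order[:n])

    tp = sum(1 for i in flagged if labels[i] == 1)
    fp = len(flagged) - tp
    fn = sum(labels) - tp                      # events that were not screened
    tn = len(scores) - len(flagged) - fn       # non-events that were not screened

    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "SENS": tp / (tp + fn),
        "SPEC": tn / (tn + fp),
        "PPV": tp / (tp + fp),
    }


# Toy validation cohort (hypothetical): screen the 3 highest scores.
toy_scores = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2]
toy_labels = [1, 0, 0, 1, 1, 0]
result = top_n_confusion(toy_scores, toy_labels, n=3)
print(result)
```

Fixing the number screened rather than the probability threshold makes sensitivity and PPV rise or fall together across models, since every model flags the same number of patients.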
Fig. 3 Case examples. These examples demonstrate differences in model responses as each patient’s risk profile increases from one year to the next. For each patient, results are also shown for 4 different age ranges. The left side of each arrow shows the first year’s predicted risk percentile, and the right side shows the model’s estimate from the current year after new risk factors were added. At each point, a patient’s risk score was calculated using each model’s parameter estimates and the associated predictor variable values; the inverse logit function was applied to this linear predictor to produce the risk score estimate. Risk scores were expressed as percentiles to indicate relative risk among all patients in the opioid risk population. STORM model results are shown in blue, mGLMM OD results in orange, and mGLMM SRE results in purple.
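The risk score calculation described in the Fig. 3 caption (linear predictor, inverse logit, percentile) can be sketched as below. The coefficient names and values, the intercept, and the reference population are hypothetical and are not estimates from the study; the sketch also uses fixed effects only and ignores any random effects a mixed model would include.

```python
import math
from bisect import bisect_right


def inverse_logit(x):
    """Numerically stable inverse logit: 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def risk_score(coefficients, predictors, intercept=0.0):
    """Linear predictor (intercept + sum of coefficient * value),
    mapped to a probability with the inverse logit."""
    linear = intercept + sum(coefficients[name] * value
                             for name, value in predictors.items())
    return inverse_logit(linear)


def percentile(score, population_scores):
    """Percent of the reference population with a score <= this one."""
    ranked = sorted(population_scores)
    return 100.0 * bisect_right(ranked, score) / len(ranked)


# Hypothetical parameter estimates, for illustration only.
coefs = {"prior_overdose": 1.2, "high_opioid_dose": 0.5}

year_1 = risk_score(coefs, {"prior_overdose": 0, "high_opioid_dose": 1}, intercept=-3.0)
year_2 = risk_score(coefs, {"prior_overdose": 1, "high_opioid_dose": 1}, intercept=-3.0)

population = [0.01, 0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30]  # hypothetical scores
print(percentile(year_1, population), "->", percentile(year_2, population))
```

Expressing scores as percentiles, as the caption describes, makes models with different score scales comparable, because each patient is ranked against the same population.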