Hale M Thompson, Brihat Sharma, Sameer Bhalla, Randy Boley, Connor McCluskey, Dmitriy Dligach, Matthew M Churpek, Niranjan S Karnik, Majid Afshar.
Abstract
OBJECTIVES: To assess fairness and bias of a previously validated machine learning opioid misuse classifier.
Keywords: bias and fairness; interpretability; machine learning; natural language processing; opioid use disorder; structural racism
Year: 2021 PMID: 34383925 PMCID: PMC8510285 DOI: 10.1093/jamia/ocab148
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 7.942
Test characteristics and 95% confidence intervals across age, sex, and racial/ethnic subgroups of the external validation cohort (N = 53 974)*
| Subgroup | True positive rate | True negative rate | False positive rate | False negative rate | Precision |
|---|---|---|---|---|---|
| Age in years | | | | | |
| 18–44 | 0.837 (0.787–0.880) | 0.991 (0.990–0.993) | 0.009 (0.007–0.011) | 0.163 (0.121–0.213) | 0.646 (0.593–0.697) |
| 45–60 | 0.731 (0.672–0.784) | 0.991 (0.989–0.993) | 0.009 (0.008–0.011) | 0.270 (0.216–0.328) | 0.599 (0.543–0.654) |
| 61–70 | 0.588 (0.483–0.687) | 0.996 (0.995–0.997) | 0.004 (0.003–0.005) | 0.412 (0.313–0.517) | 0.533 (0.434–0.630) |
| ≥ 71 | 0.273 (0.060–0.610) | 1.000 (0.999–1.0) | 0.0003 (—) | 0.727 (0.390–0.940) | 0.429 (0.099–0.816) |
| Sex | | | | | |
| Female | 0.776 (0.717–0.829) | 0.996 (0.995–0.996) | 0.005 (0.004–0.005) | 0.224 (0.171–0.283) | 0.557 (0.500–0.612) |
| Male | 0.728 (0.681–0.770) | 0.993 (0.992–0.994) | 0.007 (0.006–0.008) | 0.273 (0.229–0.319) | 0.647 (0.573–0.695) |
| Race/ethnicity | | | | | |
| Black | | 0.991 (0.989–0.992) | 0.010 (0.008–0.011) | | 0.585 (0.535–0.634) |
| Hispanic/Latinx | 0.833 (0.727–0.911) | 0.997 (0.996–0.998) | 0.003 (0.002–0.004) | 0.167 (0.089–0.273) | 0.698 (0.589–0.792) |
| White | 0.827 (0.766–0.877) | 0.996 (0.995–0.999) | 0.004 (0.003–0.005) | 0.174 (0.123–0.234) | 0.635 (0.573–0.695) |
| Other | 0.667 (0.447–0.844) | 0.995 (0.993–0.997) | 0.005 (0.003–0.008) | 0.333 (0.153–0.553) | 0.471 (0.298–0.649) |
Significant point estimates and confidence intervals for race have been italicized.
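The rates in this table are ratios of confusion-matrix counts, so they can be recomputed wherever counts are available. The sketch below is illustrative and is not the authors' code: it takes the Black subgroup counts reported in the patient-characteristics tables further down (106 false negatives, 230 true positives) and derives the true positive and false negative rates. The interval method behind the table is not stated in this excerpt; the exact (Clopper-Pearson) binomial interval used here is an assumption, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(f, lo: float = 0.0, hi: float = 1.0, iters: int = 60) -> float:
    """Bisection for a function that increases from negative to positive on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided binomial interval for k successes out of n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _root(lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _root(lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper


# Counts taken from the Black subgroup tables in this record.
true_pos, false_neg = 230, 106
positives = true_pos + false_neg

tpr = true_pos / positives   # true positive rate (sensitivity)
fnr = false_neg / positives  # false negative rate = 1 - TPR

print(f"TPR = {tpr:.3f}, 95% CI = {clopper_pearson(true_pos, positives)}")
print(f"FNR = {fnr:.3f}, 95% CI = {clopper_pearson(false_neg, positives)}")
```

With these counts the false negative rate comes out near 0.315, higher than the 0.174 reported for the White subgroup in the table above.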
Figure 1. Plot of bias and fairness point estimates with bootstrapped 95% confidence intervals for the NLP opioid misuse classifier’s predictions for the external validation cohort.
Figure 2. NLP opioid misuse classifier’s top features for positive cases in original development dataset (2007–2017).
Figure 3. NLP opioid misuse classifier’s top features for positive cases in Black subgroup of external validation dataset (2017–2019).
Figure 4. NLP opioid misuse classifier’s top features for positive cases in White subgroup of external validation dataset (2017–2019).
Figure 5. Plot of bias and fairness metrics of the NLP opioid misuse classifier’s prediction for the external validation cohort with cut point adjustment by subgroup.
Figure 6. Plot of bias and fairness metrics of the NLP opioid misuse classifier’s prediction for the external validation cohort after model recalibration by subgroup.
Opioid classifier external validation patient characteristics of false negative predictions for opioid misuse, comparing Black and White subgroups (2017–2019)
| Characteristic | FNR Black subgroup | | FNR White subgroup | | df | P |
|---|---|---|---|---|---|---|
| Age | | | | | | |
| Median (IQR) | | | | | | |
| Sex | n = 106 | Proportion | n = 34 | Proportion | | |
| Female | 30 | 0.28 | 12 | 0.35 | 1 | .44 |
| Male | 76 | 0.72 | 22 | 0.65 | | |
| Insurance | | | | | | |
| Medicaid | 74 | 0.70 | 23 | 0.68 | 2 | .14 |
| Medicare | 21 | 0.20 | 6 | 0.18 | | |
| Private | 11 | 0.10 | 5 | 0.15 | | |
| Readmission score | | | | | | |
| Median (IQR) | | | | | | |
| Mortality score | | | | | | |
| Median (IQR) | | | | | | |
Opioid misuse classifier external validation patient characteristics of Black subgroup comparing false negative to true positive for opioid misuse (2017–2019)
| Black subgroup | False negative (n = 106) | | True positive (n = 230) | | df | P |
|---|---|---|---|---|---|---|
| Age | | | | | | |
| Median (IQR) | | | | | | |
| Sex | | | | | | |
| Female | 30 | 0.28 | 84 | 0.37 | 1 | .14 |
| Male | 76 | 0.72 | 146 | 0.63 | | |
| Insurance | | | | | | |
| Medicaid | 74 | 0.70 | 167 | 0.73 | 2 | .33 |
| Medicare | 21 | 0.20 | 36 | 0.16 | | |
| Private | 11 | 0.10 | 27 | 0.12 | | |
| Readmission score | | | | | | |
| Median (IQR) | 47 | (33–62.5) | 43 | (30.25–57) | | .09 |
| Mortality score | | | | | | |
| Median (IQR) | | | | | | |
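The sex rows in the two comparison tables are 2 × 2 contingency tables with 1 degree of freedom, so the group comparison can be checked by hand. The sketch below assumes a Pearson chi-square test without continuity correction; the test the authors actually used is not named in this excerpt, and the helper name is hypothetical. It covers only the 2 × 2 case, not the three-category insurance rows.

```python
from math import erfc, sqrt


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Rows are groups, columns are (female, male); counts come from the tables above.
stat_bw, p_bw = chi_square_2x2(30, 76, 12, 22)     # false negatives: Black vs White
stat_fn_tp, p_fn_tp = chi_square_2x2(30, 76, 84, 146)  # Black subgroup: FN vs TP

print(f"Black vs White false negatives: chi2 = {stat_bw:.2f}, p = {p_bw:.2f}")
print(f"Black FN vs TP: chi2 = {stat_fn_tp:.2f}, p = {p_fn_tp:.2f}")
```

Under that assumption the two sex comparisons give p ≈ 0.44 and p ≈ 0.14, in line with the values listed in the tables.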