Fuchiang R Tsui, Lingyun Shi, Victor Ruiz, Neal D Ryan, Candice Biernesser, Satish Iyengar, Colin G Walsh, David A Brent.
Abstract
OBJECTIVE: Limited research exists on predicting first-time suicide attempts, which account for two-thirds of suicide decedents. We aimed to predict first-time suicide attempts using a large data-driven approach that applies natural language processing (NLP) and machine learning (ML) to unstructured (narrative) clinical notes and structured electronic health record (EHR) data.
Keywords: electronic health records; machine learning; natural language processing; suicide attempt
Year: 2021 PMID: 33758800 PMCID: PMC7966858 DOI: 10.1093/jamiaopen/ooab011
Source DB: PubMed Journal: JAMIA Open ISSN: 2574-2531
Figure 1. Diagram of the cohort identification process. From all inpatient or emergency department visits between 2007 and 2016, our initial cohort comprised 8588 suicide attempt patients identified by diagnoses and 77 292 randomly selected patients without any suicide attempt diagnoses. After applying the exclusion criteria, we had a final cohort of 5099 case patients and 40 139 control patients. The cohort was further divided into training and test datasets for model building and testing, respectively. Abbreviation: UPMC, University of Pittsburgh Medical Center.
Figure 2. Temporal diagram showing historical electronic health record (EHR) data up to 2 years prior to an index visit at an emergency department or an inpatient facility. A case index visit represents a first-time suicide attempt visit, and a control index visit represents a randomly selected visit from controls with longitudinal EHR data. A first-time suicide attempt visit is defined as the first known suicide attempt visit between 2005 and 2016. The last point of clinical contact is the last clinical encounter prior to the index visit. A prediction window is defined as the time interval between the index visit time and the most recent historical clinical-visit time prior to the index visit.
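The prediction-window definition in the Figure 2 caption can be illustrated with a short sketch. It assumes the four windows reported in the tables (7, 30, 90, and 730 days) are cumulative, so a visit whose gap is 20 days counts toward the 30-, 90-, and 730-day windows. The function names and the example dates are hypothetical and are not taken from the study.

```python
from datetime import date

# Window lengths in days, as reported in the performance tables.
WINDOWS_DAYS = (7, 30, 90, 730)

def prediction_window_days(last_contact: date, index_visit: date) -> int:
    """Days between the most recent prior clinical contact and the index visit."""
    gap = (index_visit - last_contact).days
    if gap < 0:
        raise ValueError("last_contact must not be after index_visit")
    return gap

def eligible_windows(last_contact: date, index_visit: date) -> list:
    """Windows (in days) that contain this gap, assuming cumulative windows."""
    gap = prediction_window_days(last_contact, index_visit)
    return [w for w in WINDOWS_DAYS if gap <= w]

# Hypothetical patient: last seen 1 March 2015, index visit 21 March 2015.
print(eligible_windows(date(2015, 3, 1), date(2015, 3, 21)))
```

A gap longer than 730 days falls outside every window and yields an empty list.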
Figure 3. The process flow of a medical natural language processing (NLP) pipeline, which transforms a narrative sentence in a clinical note into structured outcomes. For example, the sentence has three symptoms (fever, cough, and vomiting), and the vomiting concept is negated. Negated concepts are common in clinical notes.
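To make the idea of negated concepts concrete, here is a toy rule-based sketch. It is not the pipeline described in the paper: the symptom lexicon, the trigger words, the scope rule, and the example sentence are all assumptions chosen to mirror the fever/cough/vomiting example in the caption.

```python
import re

# Hypothetical mini-lexicons; a real pipeline would map text to UMLS concepts.
SYMPTOMS = {"fever", "cough", "vomiting"}
NEGATION_TRIGGERS = {"no", "not", "denies", "without"}
SCOPE_BREAKERS = {"but", "however", "although"}

def extract_concepts(sentence: str) -> list:
    """Return (symptom, is_negated) pairs in order of appearance.

    A negation trigger negates every symptom that follows it until a
    scope-breaking word or the end of the sentence.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    negated = False
    found = []
    for token in tokens:
        if token in NEGATION_TRIGGERS:
            negated = True
        elif token in SCOPE_BREAKERS:
            negated = False
        elif token in SYMPTOMS:
            found.append((token, negated))
    return found

# Hypothetical sentence with three symptoms, one of them negated.
print(extract_concepts("Patient has fever and cough but denies vomiting."))
```

Each pair can then be turned into a structured feature, with negated mentions kept separate from affirmed ones.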
Table 1. Predictive performance across four predictive models using full-feature (structured and unstructured) data with four prediction windows (7, 30, 90, and 730 days)
| Prediction window | Metric | EXGB | LASSO | NB | RF |
|---|---|---|---|---|---|
| ≤ 7 days | Cases/Controls | 273/3980 | | | |
| | AUC (95% CI) | | 0.9042 (0.8841–0.9231) | 0.7580 (0.7340–0.7808) | 0.9055 (0.8882–0.9220) |
| | P value | Ref | <.001 | <.001 | <.001 |
| | 4 sets of sensitivity/specificity (%) | | 90.00/71.66 | 90.00/46.63 | 90.00/72.19 |
| | | | 95.00/58.47 | 95.00/22.02 | 95.00/60.95 |
| | | | 71.43/90.00 | 25.79/90.00 | 69.60/90.00 |
| | | | 59.71/95.00 | 12.90/95.00 | 57.51/95.00 |
| ≤ 30 days | Cases/Controls | | | | |
| | AUC (95% CI) | | 0.9086 (0.8964–0.9205) | 0.7663 (0.7500–0.7825) | 0.9002 (0.8873–0.9123) |
| | P value | Ref | <.001 | <.001 | <.001 |
| | 4 sets of sensitivity/specificity (%) | | 90.00/74.77 | 90.00/45.72 | 90.00/71.73 |
| | | | 95.00/58.59 | 95.00/22.94 | 95.00/59.08 |
| | | | 73.28/90.00 | 27.03/90.00 | 70.40/90.00 |
| | | | 60.32/95.00 | 13.51/95.00 | 54.24/95.00 |
| ≤ 90 days | Cases/Controls | | | | |
| | AUC (95% CI) | | 0.9031 (0.8928–0.9127) | 0.7643 (0.7514–0.7767) | 0.8848 (0.8745–0.8950) |
| | P value | Ref | <.001 | <.001 | <.001 |
| | 4 sets of sensitivity/specificity (%) | | 90.00/73.16 | 90.00/49.69 | 90.00/66.56 |
| | | | 95.00/59.81 | 95.00/24.99 | 95.00/54.00 |
| | | | 71.37/90.00 | 25.88/90.00 | 65.29/90.00 |
| | | | 57.16/95.00 | 12.94/95.00 | 49.43/95.00 |
| ≤ 730 days | Cases/Controls | | | | |
| | AUC (95% CI) | | 0.8926 (0.8844–0.9010) | 0.7554 (0.7448–0.7653) | 0.8645 (0.8550–0.8730) |
| | P value | Ref | <.001 | <.001 | <.001 |
| | 4 sets of sensitivity/specificity (%) | | 90.00/70.15 | 90.00/51.10 | 90.00/62.51 |
| | | | 95.00/57.45 | 95.00/28.16 | 95.00/50.47 |
| | | | 67.56/90.00 | 23.84/90.00 | 58.80/90.00 |
| | | | 51.97/95.00 | 11.92/95.00 | 43.34/95.00 |
Note: EXGB, Least Absolute Shrinkage and Selection Operator (LASSO), and Random Forest (RF) applied further feature engineering frameworks (wrapper and embedded) to get a smaller number of features. The number listed in the parentheses associated with each model represents the final number of features used in the model. A boldfaced number represents the best AUC within each prediction window compared to other models.
EXGB: Ensemble of eXtreme Gradient Boosting; LASSO: Least Absolute Shrinkage and Selection Operator; NB: Naïve Bayes; RF: Random Forest.
All the models were trained on a training dataset and tested on a test (blind or hold-out) dataset. The evaluation metrics include the area under the receiver operating characteristic curve (AUC) with 95% confidence interval, sensitivity (or recall), and specificity. Each P value was tested with respect to the AUC of the Ensemble of eXtreme Gradient Boosting (EXGB) in the same prediction window. For each model, we started with a total of 2126 features (including 215 social features) after applying the feature filter.
All 95% confidence intervals were measured through 2000 stratified bootstrap replicates.
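A stratified bootstrap confidence interval for the AUC can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses the pairwise (Mann-Whitney) form of the AUC and a simple percentile interval, and "stratified" is taken to mean that cases and controls are resampled separately so each replicate keeps the original class sizes. All function names are hypothetical.

```python
import random

def auc(labels, scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for the AUC, resampling cases and controls separately."""
    rng = random.Random(seed)
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    replicates = []
    for _ in range(n_boot):
        boot_pos = rng.choices(pos, k=len(pos))  # sample cases with replacement
        boot_neg = rng.choices(neg, k=len(neg))  # sample controls with replacement
        boot_labels = [1] * len(boot_pos) + [0] * len(boot_neg)
        replicates.append(auc(boot_labels, boot_pos + boot_neg))
    replicates.sort()
    lower = replicates[int((alpha / 2) * n_boot)]
    upper = replicates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Hypothetical toy scores, for illustration only.
y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
print(auc(y, s), stratified_bootstrap_ci(y, s, n_boot=200))
```

The pairwise AUC is quadratic in sample size, so a rank-based formulation would be preferable for cohorts of the size reported here.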
Figure 4. Receiver Operating Characteristic (ROC) curves of four ML models. Plots A and B show ROCs in 30- and 730-day prediction windows, respectively. Abbreviations: EXGB, Ensemble of eXtreme Gradient Boosting; LASSO, Least Absolute Shrinkage and Selection Operator.
Figure 5. Plots of predictive model accuracy, measured by the area under a receiver operating characteristic curve (AUC), among four predictive models. Plot (A) shows model performance in the 30-day prediction window. Plot (B) shows model performance in the 730-day prediction window. Abbreviations: EXGB, Ensemble of eXtreme Gradient Boosting; LASSO, Least Absolute Shrinkage and Selection Operator.
Table 2. Impact of NLP on suicide attempt prediction
| Prediction window | Metric | EXGB (full-feature) | S-EXGB (structured-feature-only) | LASSO (full-feature) | S-LASSO (structured-feature-only, 192 features) |
|---|---|---|---|---|---|
| ≤ 7 days | Cases/Controls | | | | |
| | AUC (95% CI) | | 0.9037 (0.8838–0.9220) | 0.9042 (0.8841–0.9231) | 0.8763 (0.8515–0.8989) |
| | P value | Ref | <.001 | Ref | <.001 |
| | 4 sets of sensitivity/specificity (%) | | 90.00/71.61 | 90.00/71.66 | 90.00/60.00 |
| | | | 95.00/53.72 | 95.00/58.47 | 95.00/36.03 |
| | | | 70.70/90.00 | 71.43/90.00 | 68.50/90.00 |
| | | | 58.24/95.00 | 59.71/95.00 | 55.68/95.00 |
| ≤ 30 days | Cases/Controls | | | | |
| | AUC (95% CI) | | 0.9007 (0.8880–0.9129) | 0.9086 (0.8964–0.9205) | 0.8842 (0.8695–0.8984) |
| | P value | Ref | <.001 | Ref | <.001 |
| | 4 sets of sensitivity/specificity (%) | | 90.00/70.20 | 90.00/74.77 | 90.00/67.51 |
| | | | 95.00/56.15 | 95.00/58.59 | 95.00/46.72 |
| | | | 72.00/90.00 | 73.28/90.00 | 69.28/90.00 |
| | | | 57.76/95.00 | 60.32/95.00 | 55.68/95.00 |
| ≤ 90 days | Cases/Controls | | | | |
| | AUC (95% CI) | | 0.8963 (0.8862–0.9062) | 0.9031 (0.8928–0.9127) | 0.8763 (0.8634–0.8879) |
| | P value | Ref | <.001 | Ref | <.001 |
| | 4 sets of sensitivity/specificity (%) | | 90.00/67.82 | 90.00/73.16 | 90.00/65.58 |
| | | | 95.00/55.53 | 95.00/59.81 | 95.00/45.31 |
| | | | 69.62/90.00 | 71.37/90.00 | 66.53/90.00 |
| | | | 54.99/95.00 | 57.16/95.00 | 51.18/95.00 |
| ≤ 730 days | Cases/Controls | | | | |
| | AUC (95% CI) | | 0.8830 (0.8744–0.8918) | 0.8926 (0.8844–0.9010) | 0.8622 (0.8522–0.8719) |
| | P value | Ref | <.001 | Ref | <.001 |
| | 4 sets of sensitivity/specificity (%) | | 90.00/65.71 | 90.00/70.15 | 90.00/62.11 |
| | | | 95.00/52.31 | 95.00/57.45 | 95.00/43.38 |
| | | | 64.82/90.00 | 67.56/90.00 | 60.27/90.00 |
| | | | 49.23/95.00 | 51.97/95.00 | 44.95/95.00 |
AUC: area under the curve; CI: confidence interval.
Full-feature (including structured and unstructured features) models and structured-feature-only models were compared. Two full-feature models were included: Ensemble of eXtreme Gradient Boosting (EXGB) model and the Least Absolute Shrinkage and Selection Operator (LASSO). Two structured-feature-only models were included: the Structured-EXGB (S-EXGB) model and the structured-LASSO (S-LASSO) model. All the models were trained in a training dataset and tested in a test (blind or hold-out) dataset. Full-feature models performed significantly better than structured-feature-only models. We chose four common sets of metrics (i.e., sensitivity and specificity) based on two sensitivities at 90% and 95% and two specificities at 90% and 95%; given a pre-selected sensitivity or specificity, the corresponding metrics were measured from the test dataset.
All 95% confidence intervals were measured through 2000 stratified bootstrap replicates.
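The four sensitivity/specificity pairs are operating points read off the ROC curve at a pre-selected sensitivity or specificity. The sketch below shows one way to obtain them from test-set scores. It is an assumption-laden illustration: it scans only observed score thresholds without interpolation, so it returns the best achievable value at or above the target instead of the exact 90.00 or 95.00 figures shown in the table. Function names and the toy data are hypothetical.

```python
def operating_points(labels, scores):
    """(sensitivity, specificity) at every observed threshold; positive if score >= t."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    thresholds = sorted(set(scores)) + [float("inf")]  # inf gives the all-negative point
    points = []
    for t in thresholds:
        sensitivity = sum(s >= t for s in pos) / len(pos)
        specificity = sum(s < t for s in neg) / len(neg)
        points.append((sensitivity, specificity))
    return points

def specificity_at_sensitivity(labels, scores, target):
    """Best specificity among thresholds whose sensitivity is at least `target`."""
    return max(sp for se, sp in operating_points(labels, scores) if se >= target)

def sensitivity_at_specificity(labels, scores, target):
    """Best sensitivity among thresholds whose specificity is at least `target`."""
    return max(se for se, sp in operating_points(labels, scores) if sp >= target)

# Hypothetical toy data: four cases and four controls.
y = [1, 1, 1, 1, 0, 0, 0, 0]
s = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
print(specificity_at_sensitivity(y, s, 0.90), sensitivity_at_specificity(y, s, 0.90))
```

With a large test set the observed thresholds are dense enough that the difference from an interpolated ROC curve is small.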
Figure 6. Robustness analysis of the Ensemble of eXtreme Gradient Boosting (EXGB) model across 18 subgroups based on demographics (age, race, gender, insurance), depression diagnosis, and point of historical most recent clinical contact. Plot (A) shows the EXGB performance in the 30-day prediction window. Plot (B) shows the EXGB performance in the 730-day prediction window. Age was measured in years. Abbreviations: T, present; F, absent; LastContact, point of historical most recent clinical contact.
Table 3. Unadjusted and adjusted odds ratios with 95% CI for demographic features
| Feature | Cases, n (%) | Controls, n (%) | Unadjusted Odds Ratio (95% CI) | Adjusted Odds Ratio (95% CI) |
|---|---|---|---|---|
| Demographic: sex | | | | |
| Male | 2133 (41.83) | 15 839 (39.46) | | |
| Female | 2966 (58.17) | 24 300 (60.54) | 1 (ref) | ref |
| Demographic: age | | | | |
| 10–14 | 253 (4.96) | 1481 (3.69) | 7.16 (5.76–8.9) | 7.62 (6.13–9.48) |
| 15–24 | 1368 (26.83) | 4697 (11.70) | | |
| 25–34 | 1103 (21.63) | 5281 (13.16) | 8.74 (7.27–10.51) | 9.6 (7.98–11.54) |
| 35–44 | 901 (17.67) | 5485 (13.67) | 6.87 (5.71–8.28) | 7.4 (6.14–8.92) |
| 45–54 | 903 (17.71) | 8217 (20.47) | 4.6 (3.82–5.54) | 4.81 (4–5.8) |
| 55–64 | 439 (8.61) | 9436 (23.51) | 1.95 (1.6–2.37) | 1.98 (1.62–2.41) |
| 65+ | 132 (2.59) | 5542 (13.81) | 1 (ref) | ref |
| Demographic: race | | | | |
| Black | 885 (17.36) | 8075 (20.12) | 0.84 (0.78–0.91) | 0.68 (0.63–0.73) |
| Not specified | 155 (3.04) | 1300 (3.24) | | |
| Other | 152 (2.98) | 878 (2.19) | 0.91 (0.77–1.08) | 0.78 (0.66–0.93) |
| White | 3907 (76.62) | 29 886 (74.46) | 1 (ref) | ref |
| Demographic: insurance | | | | |
| Medicaid | 2180 (42.75) | 8398 (20.92) | | |
| Medicare | 789 (15.47) | 12 663 (31.55) | 0.68 (0.62–0.75) | 1.35 (1.22–1.49) |
| Others | 530 (10.39) | 3617 (9.01) | 1.61 (1.44–1.79) | 1.62 (1.45–1.81) |
| Self-pay | 314 (6.16) | 1356 (3.38) | 2.54 (2.22–2.91) | 2.1 (1.83–2.42) |
| Commercial | 1286 (25.22) | 14 105 (35.14) | 1 (ref) | ref |
EXGB: Ensemble XGB; XGB: extreme gradient boosting.
The adjusted odds ratios (aORs) were estimated while controlling for sex, age, race, and insurance. The boldfaced numbers represent the highest OR/aOR in the increased risk categories or the lowest OR/aOR in the decreased risk categories.
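An unadjusted odds ratio for one category against its reference can be recomputed from the case and control counts alone. The sketch below does this with a Wald (log-normal) 95% interval, which is an assumption on our part, since the interval method for this table is not stated here. The function name is hypothetical, and the adjusted odds ratios cannot be reproduced this way because they require the patient-level regression.

```python
import math

def odds_ratio_wald(case_cat, control_cat, case_ref, control_ref, z=1.96):
    """Unadjusted odds ratio of a category vs. its reference, with a Wald CI."""
    odds_ratio = (case_cat * control_ref) / (control_cat * case_ref)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / case_cat + 1 / control_cat + 1 / case_ref + 1 / control_ref)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Race: Black (885 cases, 8075 controls) vs. White reference (3907 cases, 29 886 controls).
or_, lo, hi = odds_ratio_wald(885, 8075, 3907, 29886)
print(f"{or_:.2f} ({lo:.2f}-{hi:.2f})")  # 0.84 (0.78-0.91)
```

For this row the result agrees with the unadjusted value listed in the table, 0.84 (0.78–0.91).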
Table 4. Unadjusted and adjusted odds ratios in top 10 increased-risk and decreased-risk features with status present (true)
| Feature | Cases, n (%) | Controls, n (%) | Unadjusted Odds Ratio (95% CI) | Adjusted Odds Ratio (95% CI) |
|---|---|---|---|---|
| Top 10 increased-risk features | | | | |
| One or more emergency department visits in 2 years | 4196 (82.29) | 22 405 (55.82) | 3.68 (3.41–3.96) | 3.06 (2.83–3.3) |
| | 748 (14.67) | 481 (1.20) | | |
| Episodic mood disorders (ICD-9 296) | 1901 (37.28) | 4051 (10.09) | 5.30 (4.96–5.65) | 5.36 (5–5.74) |
| Suicide attempt (UMLS C0038663) | 654 (12.83) | 616 (1.53) | 2.27 (1.8–2.87) | 2.03 (1.58–2.6) |
| Depressive disorder (UMLS C0011581) | 2556 (50.13) | 12 510 (31.17) | 2.5 (2.11–2.94) | 2.59 (2.19–3.07) |
| Anxiety, dissociative and somatoform disorders (ICD-9 300) | 2252 (44.17) | 9460 (23.57) | 2.57 (2.42–2.72) | 2.89 (2.71–3.08) |
| Drug abuse (UMLS C0013146) | 2127 (41.71) | 9635 (24.00) | 2.28 (1.95–2.67) | 2.11 (1.79–2.49) |
| Depressive disorder, not elsewhere classified (ICD-9 311) | 2182 (42.79) | 8617 (21.47) | 2.74 (2.58–2.91) | 3.17 (2.97–3.38) |
| Suicidal (UMLS C0438696) | 1524 (29.89) | 3111 (7.75) | 1.41 (1.23–1.63) | 1.32 (1.14–1.54) |
| Depressed mood (UMLS C0344315) | 1427 (27.99) | 4413 (10.99) | 1.58 (1.22–2.05) | 1.54 (1.18–2.02) |
| Top 10 decreased-risk features from best model in EXGB with all odds ratios and 95% CIs <1 ranked by feature importance | | | | |
| One or more outpatient visits in 2 years | 3619 (70.97) | 35 127 (87.51) | 0.35 (0.33–0.37) | 0.44 (0.41–0.48) |
| Hypertensive disease (UMLS C0020538) | 1707 (33.48) | 20 328 (50.64) | 0.6 (0.49–0.74) | 0.76 (0.61–0.95) |
| Anger (UMLS C0002957) | 656 (12.87) | 1229 (3.06) | 0.55 (0.33–0.92) | 0.51 (0.29–0.88) |
| Hypersensitivity (UMLS C0020517) | 3661 (71.80) | 31 540 (78.58) | 0.61 (0.57–0.66) | 0.71 (0.66–0.77) |
| Neoplasms (UMLS C0027651) | 655 (12.85) | 11 508 (28.67) | 0.63 (0.54–0.75) | 0.72 (0.61–0.85) |
| | 12 (0.24) | 856 (2.13) | | |
| Effusion (UMLS C0013687) | 382 (7.49) | 5843 (14.56) | 0.69 (0.59–0.8) | 0.75 (0.64–0.87) |
| Diabetes mellitus (ICD-9 250) | 547 (10.73) | 9066 (22.59) | 0.41 (0.38–0.45) | 0.66 (0.6–0.73) |
| Encounter for antenatal screening of mother (ICD-9 V28) | 109 (2.14) | 2052 (5.11) | 0.41 (0.34–0.49) | 0.21 (0.17–0.26) |
| Anesthetics (NDF-RT CN200) | 1473 (28.89) | 19 963 (49.73) | 0.41 (0.39–0.44) | 0.53 (0.5–0.57) |
| SDOH from EXGB ranked by the unadjusted odds ratio | | | | |
| Divorced state (UMLS C0086170) | 404 (7.92) | 1809 (4.51) | 1.82 (1.63–2.04) | 2.29 (2.03–2.57) |
| | 280 (5.49) | 1857 (4.63) | | |
| Family support (UMLS C0150232) | 157 (3.08) | 578 (1.44) | 0.34 (0.21–0.56) | 0.29 (0.17–0.48) |
| Rehabilitation therapy (UMLS C0034991) | 1071 (21.00) | 6719 (16.74) | 0.43 (0.26–0.71) | 0.45 (0.26–0.77) |
Note: Selected features were limited to those with a minimum of 1% prevalence in the case or control group and to those whose 95% confidence intervals lay entirely above 1 or entirely below 1. The ranking was based on feature importance from the best-performing extreme gradient boosting (XGB) model within the ensemble XGB (EXGB). The adjusted odds ratios were estimated while controlling for sex, age, race, and insurance (see Table 3). The SDOH were identified from EXGB and ranked by the unadjusted odds ratio. The boldfaced number represents the highest odds ratio (OR) or adjusted OR (aOR) in the increased-risk categories or the lowest OR/aOR in the decreased-risk categories.
ICD-9: International Classification of Diseases, Ninth Revision; UMLS: Unified Medical Language System; NDF-RT: National Drug File—Reference Terminology.
All 95% confidence intervals were measured through 2000 stratified bootstrap replicates.
We used the Firth logistic regression method to calculate adjusted ORs.
Excluding demographic features that were listed in the top portion of the table.
UMLS Concept Unique Identifier (CUI).
NDF-RT code.