Brian E Cade, Syed Moin Hassan, Hassan S Dashti, Melissa Kiernan, Milena K Pavlova, Susan Redline, Elizabeth W Karlson.
Abstract
OBJECTIVE: Sleep apnea is associated with a broad range of pathophysiology. While electronic health record (EHR) information has the potential for revealing relationships between sleep apnea and associated risk factors and outcomes, practical challenges hinder its use. Our objectives were to develop a sleep apnea phenotyping algorithm that improves the precision of EHR case/control information using natural language processing (NLP); identify novel associations between sleep apnea and comorbidities in a large clinical biobank; and investigate the relationship between polysomnography statistics and comorbid disease using NLP phenotyping.
Keywords: electronic health record; electronic medical record; epidemiology; sleep apnea; sleep-disordered breathing
Year: 2022 PMID: 35156000 PMCID: PMC8826997 DOI: 10.1093/jamiaopen/ooab117
Source DB: PubMed Journal: JAMIA Open ISSN: 2574-2531
Characteristics of the samples used in different phases of the study
| | All | Screen positive group | Chart review set | PheCAP cases | Polysomnography sample |
|---|---|---|---|---|---|
| Sample size, n | 100 616 | 15 741 | 300 | 4876 | 4544 |
| Women, n (%) | 56 910 (56.56) | 6784 (43.10) | 137 (45.67) | 1887 (38.70) | 2512 (55.28) |
| Median age (IQR) | 58.20 (25.65) | 57.18 (18.19) | 58.27 (18.94) | 57.59 (16.87) | 56.77 (24.60) |
| Median BMI (IQR) | 27.25 (7.78) | 32.05 (9.66) | 31.16 (10.48) | 33.61 (10.15) | 30.54 (9.73) |
| Race/ethnicity | | | | | |
| Asian, n (%) | 2680 (2.66) | 254 (1.62) | 3 (1.00) | 73 (1.50) | 113 (2.49) |
| Black, n (%) | 4930 (4.90) | 941 (5.98) | 18 (6.00) | 356 (7.30) | 585 (12.87) |
| Hispanic/Latino, n (%) | 3778 (3.75) | 590 (3.75) | 15 (5.00) | 188 (3.86) | 386 (8.49) |
| White, n (%) | 85 495 (84.97) | 13 393 (85.08) | 255 (85.00) | 4084 (83.77) | 2993 (65.87) |
| Other race/ethnicity, n (%) | 3733 (3.71) | 563 (3.58) | 9 (3.00) | 174 (3.57) | 467 (10.28) |
| Language spoken | | | | | |
| English, n (%) | 97 134 (96.54) | 15 186 (96.47) | 285 (95.00) | 4711 (96.64) | 3971 (87.39) |
| Spanish, n (%) | 1456 (1.45) | 220 (1.40) | 4 (1.33) | 76 (1.56) | 442 (9.73) |
| Other languages, n (%) | 2026 (2.01) | 335 (2.13) | 11 (3.67) | 88 (1.80) | 131 (2.88) |
Note: Participants in the “Screen positive group” had one or more PheCode diagnoses for sleep apnea (327.3) or obstructive sleep apnea (327.32). The 300 participants in the “Chart review set” were drawn from the Screen positive group and used to perform PheCAP phenotyping. “PheCAP cases” were classified by the lead PheCAP algorithm (PheCAP SICDNLP and NLP CUIs in Table 2). Age and BMI are presented as medians (interquartile range). All other fields, apart from sample size, are presented as total size (percentage). Age and BMI data were based on the first sleep apnea diagnosis date for PheCode cases, the last available visit date for PheCode controls, and the first available polysomnographic recording for the polysomnography sample.
Abbreviations: BMI: body mass index; IQR: interquartile range.
Chart review performance of alternative sleep apnea phenotyping algorithms
| Method | Training recall (sensitivity) | Training precision (PPV) | Training negative predictive value | Training AUC | Validation recall (sensitivity) | Validation precision | Validation negative predictive value | Validation AUC |
|---|---|---|---|---|---|---|---|---|
| ≥1 PheCode | 1.000 | 0.689 | NA | NA | 1.000 | 0.733 | NA | NA |
| ≥2 PheCodes | 0.823 | 0.836 | 0.621 | NA | 0.795 | 0.805 | 0.455 | NA |
| MAP NLP CUIs | 0.774 | 0.850 | 0.582 | 0.819 | 0.727 | 0.831 | 0.442 | 0.786 |
| PheCAP SICDNLP and NLP CUIs | 0.427 | 0.981 | 0.437 | 0.893 | 0.375 | 0.943 | 0.353 | 0.832 |
| PheCAP SICD | 0.387 | 0.941 | 0.411 | 0.790 | 0.341 | 0.938 | 0.341 | 0.756 |
| PheCAP SNLP | 0.331 | 0.911 | 0.385 | 0.820 | 0.250 | 0.917 | 0.313 | 0.753 |
| PheCAP SICDNLP | 0.331 | 0.911 | 0.385 | 0.822 | 0.250 | 0.917 | 0.313 | 0.754 |
| PheCAP SICDNLP, Demographics, and NLP CUIs | 0.403 | 0.980 | 0.426 | 0.892 | 0.352 | 0.939 | 0.345 | 0.830 |
| PheCAP SICDNLP and NLP CUIs plus AHI and CPAP | 0.411 | 0.981 | 0.430 | 0.904 | 0.443 | 0.951 | 0.380 | 0.858 |
Note: A total of 300 chart reviews were performed for participants with one or more sleep apnea PheCode codings. Therefore, certain PheCode-only rows lack negative predictive values by definition. Of the 300 chart reviews, 180 (60%) were used in the training set and 120 (40%) in the validation set. Results for the best performing PheCAP model are shown as “PheCAP SICDNLP and NLP CUIs,” along with chart review performance for PheCode-only definitions using a minimum of 1 and 2 PheCode instances to define a case and a more basic NLP algorithm using MAP. The performance of PheCAP surrogate-only models is shown next (“PheCAP SICD,” “PheCAP SNLP,” “PheCAP SICDNLP”) and is followed by the predictive performance using demographic parameters exclusively. Reduced performance was observed when including demographics and the lead PheCAP model (“PheCAP SICDNLP, Demographics, and NLP CUIs”). Additional modest performance gains were obtained by forcing case status for participants with separately extracted AHI and/or continuous positive airway pressure (joint CPAP CUI/procedure term) evidence. Full results for all models are presented in Table S5. Recall (sensitivity) = true positives/(true positives + false negatives). Precision (positive predictive value) = true positives/(true positives + false positives). Negative predictive value = true negatives/(true negatives + false negatives).
Abbreviations: AHI: apnea-hypopnea index; AUC: area under the curve; CPAP: continuous positive airway pressure ventilation; CUIs: concept unique identifiers; MAP: multimodal automated phenotyping; NLP: natural language processing.
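The three definitions in the note above can be sketched in a few lines of Python. This is a minimal illustration only: the `ConfusionCounts` class, the function names, and the example counts are hypothetical and are not taken from the study. The second example shows why a rule that labels every reviewed chart as a case has no negative predictive value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """Chart-review outcome counts for one phenotyping rule (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None (reported as NA) when the denominator is empty.
    return numerator / denominator if denominator > 0 else None


def recall(c: ConfusionCounts) -> Optional[float]:
    # Recall (sensitivity) = TP / (TP + FN)
    return _ratio(c.tp, c.tp + c.fn)


def precision(c: ConfusionCounts) -> Optional[float]:
    # Precision (positive predictive value) = TP / (TP + FP)
    return _ratio(c.tp, c.tp + c.fp)


def negative_predictive_value(c: ConfusionCounts) -> Optional[float]:
    # NPV = TN / (TN + FN)
    return _ratio(c.tn, c.tn + c.fn)


# Hypothetical counts for a rule that rejects some reviewed charts.
selective_rule = ConfusionCounts(tp=40, fp=10, tn=30, fn=20)

# Hypothetical counts for a rule that calls every reviewed chart a case:
# nobody is predicted negative, so TN + FN = 0 and NPV is undefined.
all_positive_rule = ConfusionCounts(tp=70, fp=30, tn=0, fn=0)

if __name__ == "__main__":
    for name, counts in [("selective", selective_rule), ("all positive", all_positive_rule)]:
        print(name, recall(counts), precision(counts), negative_predictive_value(counts))
```

Returning `None` instead of raising an error mirrors the NA entries in the table, where a metric has no defined value for a given rule.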
Figure 1. Sleep apnea chart review guidelines. Guidelines for adjudicating participants with ≥1 sleep apnea PheCode diagnoses were based on ICSD-3 criteria. Decision criteria in blue boxes resulted in either a true sleep apnea diagnosis (green boxes) or a noncase sleep apnea diagnosis.
Incident disease enrichment among sleep apnea cases
| PheCode | Translation | Odds ratio | Incidence in SA cases | Incidence in matched controls |
|---|---|---|---|---|
| 327.1 | Hypersomnia | 16.38 (11.55–23.24) | 4.71 | 0.30 |
| 327.71 | Restless legs syndrome | 5.55 (4.34–7.09) | 4.19 | 0.78 |
| 263 | Other nutritional deficiency | 4.26 (3.38–5.35) | 4.09 | 0.99 |
| 428.4 | Heart failure with preserved ejection fraction (Diastolic heart failure) | 3.75 (3.07–4.58) | 5.02 | 1.39 |
| 278.11 | Morbid obesity | 3.72 (3.20–4.32) | 9.60 | 2.78 |
| 401.21 | Hypertensive heart disease | 3.05 (2.52–3.69) | 4.96 | 1.68 |
| 278.4 | Abnormal weight gain | 2.94 (2.44–3.54) | 5.22 | 1.84 |
| 327 | Sleep disorders | 2.91 (2.35–3.59) | 4.03 | 1.42 |
| 470 | Septal deviations/turbinate hypertrophy | 2.87 (1.97–4.19) | 1.21 | 0.42 |
| 472 | Chronic pharyngitis and nasopharyngitis | 2.69 (2.00–3.61) | 1.90 | 0.72 |
| 1002 | Symptoms concerning nutrition, metabolism, and development | 2.64 (2.27–3.07) | 7.54 | 3.00 |
| 415.2 | Chronic pulmonary heart disease | 2.50 (1.97–3.16) | 2.93 | 1.19 |
| 291.8 | Alteration of consciousness | 2.49 (2.02–3.08) | 3.60 | 1.47 |
| 251.1 | Hypoglycemia | 2.47 (1.92–3.19) | 2.45 | 1.01 |
| 313.1 | Attention-deficit hyperactivity disorder | 2.45 (1.87–3.22) | 2.17 | 0.89 |
| 306.9 | Tension headache | 2.40 (1.63–3.52) | 1.06 | 0.44 |
| 276.6 | Fluid overload | 2.34 (1.93–2.83) | 4.26 | 1.87 |
| 300.4 | Dysthymic disorder | 2.33 (1.92–2.83) | 4.38 | 1.93 |
| 428.2 | Heart failure not otherwise specified | 2.32 (1.93–2.78) | 4.90 | 2.17 |
| 798.1 | Chronic fatigue syndrome | 2.32 (1.81–2.97) | 2.50 | 1.09 |
| 296.22 | Major depressive disorder | 2.31 (2.07–2.56) | 17.10 | 8.21 |
| 290.1 | Dementias | 2.27 (1.73–2.98) | 2.09 | 0.93 |
| 1013 | Asphyxia and hypoxemia | 2.26 (1.86–2.74) | 4.26 | 1.93 |
| 539 | Bariatric surgery | 2.25 (1.99–2.53) | 11.52 | 5.47 |
| 278.1 | Obesity | 2.25 (2.00–2.52) | 19.00 | 9.46 |
Note: An incident diagnosis was defined as the first diagnosis for a potential comorbidity occurring at least one year after the first diagnosis date for sleep apnea. Participants with prior diagnoses were excluded, so sample sizes vary by PheCode. In total, 527 PheCodes with ≥1% overall prevalence were tested. Controls were matched for age, sex, BMI, population, and healthcare utilization. After Bonferroni correction, 170 nonredundant PheCodes were significantly associated with sleep apnea. Lead results are shown here. Complete results, including sex-stratified results and sample sizes, can be found in Table S5.
Abbreviations: SA: sleep apnea.
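As a rough illustration of the Bonferroni correction mentioned in the note above, the per-test significance threshold is the family-wise error rate divided by the number of tests. The sketch below assumes a family-wise alpha of 0.05, which the note does not state; the function names are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float, n_tests: int) -> bool:
    """True when a single test survives the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# 527 PheCodes were tested (from the note); alpha = 0.05 is an assumption.
ASSUMED_ALPHA = 0.05
N_PHECODES_TESTED = 527
threshold = bonferroni_threshold(ASSUMED_ALPHA, N_PHECODES_TESTED)

if __name__ == "__main__":
    print(f"per-test threshold: {threshold:.3g}")
```

Under that assumed alpha, a PheCode would need a p-value just below 1 in 10 000 to count as significant.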
Cross-sectional disease enrichment among sleep apnea cases
| PheCode | Translation | Odds ratio | Prevalence in sleep apnea cases | Prevalence in matched controls |
|---|---|---|---|---|
| 327.1 | Hypersomnia | 21.52 (15.95–29.02) | 7.92 | 0.40 |
| 327.71 | Restless legs syndrome | 6.77 (5.55–8.27) | 7.18 | 1.13 |
| 278.11 | Morbid obesity | 5.56 (5.02–6.16) | 23.94 | 5.36 |
| 327 | Sleep disorders | 4.61 (4.01–5.29) | 11.63 | 2.78 |
| 428.4 | Heart failure with preserved ejection fraction (Diastolic heart failure) | 4.45 (3.79–5.23) | 8.41 | 2.02 |
| 415.2 | Chronic pulmonary heart disease | 3.99 (3.35–4.75) | 6.77 | 1.79 |
| 278.1 | Obesity | 3.67 (3.42–3.93) | 52.06 | 22.84 |
| 512.9 | Other dyspnea | 3.64 (3.35–3.94) | 32.81 | 11.84 |
| 263 | Other nutritional deficiency | 3.54 (2.97–4.23) | 6.13 | 1.81 |
| 1013 | Asphyxia and hypoxemia | 3.28 (2.84–3.78) | 9.15 | 2.98 |
| 470 | Septal deviations/turbinate hypertrophy | 3.13 (2.46–3.99) | 3.02 | 0.98 |
| 512.7 | Shortness of breath | 3.08 (2.85–3.33) | 34.54 | 14.61 |
| 401.21 | Hypertensive heart disease | 2.92 (2.50–3.41) | 7.22 | 2.60 |
| 509.1 | Respiratory failure | 2.89 (2.43–3.43) | 5.87 | 2.11 |
| 296.22 | Major depressive disorder | 2.86 (2.65–3.10) | 32.48 | 14.38 |
| 428.1 | Congestive heart failure (CHF) not otherwise specified | 2.80 (2.55–3.09) | 19.26 | 7.84 |
| 276.6 | Fluid overload | 2.79 (2.40–3.25) | 7.34 | 2.76 |
| 291.8 | Alteration of consciousness | 2.74 (2.31–3.24) | 5.97 | 2.27 |
| 278.4 | Abnormal weight gain | 2.70 (2.39–3.05) | 11.80 | 4.72 |
| 539 | Bariatric surgery | 2.61 (2.36–2.88) | 17.31 | 7.43 |
| 290.3 | Other persistent mental disorders due to conditions classified elsewhere | 2.58 (2.13–3.13) | 4.33 | 1.72 |
| 505 | Other pulmonary inflammation or edema | 2.57 (2.15–3.08) | 4.98 | 2.00 |
| 496 | Chronic airway obstruction | 2.53 (2.25–2.86) | 11.38 | 4.82 |
| 313.1 | Attention-deficit hyperactivity disorder | 2.51 (2.08–3.02) | 4.56 | 1.87 |
| 798.1 | Chronic fatigue syndrome | 2.51 (2.05–3.07) | 3.90 | 1.59 |
Note: In total, 527 PheCodes with ≥1% overall prevalence were tested. Controls were matched for age, sex, BMI, population, and healthcare utilization. Of the tested PheCodes, 281 nonredundant PheCodes had significantly different cross-sectional prevalence between PheCAP-defined cases and matched controls in combined-sex and/or sex-stratified analyses. Lead results are shown here. Complete results, including sex-stratified results, can be found in Table S6.
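To see how the odds ratio column relates to the two prevalence columns, a crude (unadjusted) odds ratio can be computed directly from a pair of prevalences. The sketch below assumes the prevalence columns are percentages, and the function name is hypothetical. It will not reproduce the tabulated odds ratios exactly, since those presumably come from the study's own analysis of the matched samples, but the crude values land close to the reported ones.

```python
def crude_odds_ratio(pct_cases: float, pct_controls: float) -> float:
    """Unadjusted odds ratio from two prevalences given in percent (assumed units)."""
    p_cases = pct_cases / 100.0
    p_controls = pct_controls / 100.0
    for p in (p_cases, p_controls):
        if not 0.0 < p < 1.0:
            raise ValueError("prevalences must lie strictly between 0% and 100%")
    odds_cases = p_cases / (1.0 - p_cases)           # odds of disease among cases
    odds_controls = p_controls / (1.0 - p_controls)  # odds of disease among controls
    return odds_cases / odds_controls


# Prevalences copied from the table above.
hypersomnia_or = crude_odds_ratio(7.92, 0.40)   # table reports 21.52 (15.95–29.02)
obesity_or = crude_odds_ratio(52.06, 22.84)     # table reports 3.67 (3.42–3.93)

if __name__ == "__main__":
    print(f"hypersomnia crude OR: {hypersomnia_or:.2f}")
    print(f"obesity crude OR: {obesity_or:.2f}")
```

Both crude estimates fall inside the confidence intervals reported in the table, which is a quick consistency check on the extracted values rather than a re-analysis.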