Kavishwar B Wagholikar1,2, Christina M Fischer3, Alyssa P Goodson4, Christopher D Herrick4, Taylor E Maclean3, Katelyn V Smith3, Liliana Fera3, Thomas A Gaziano3, Jacqueline R Dunning3, Joshua Bosque-Hamilton3, Lina Matta3, Eloy Toscano4, Brent Richter4, Layne Ainsworth4, Michael F Oates4, Samuel Aronson4, Calum A MacRae1,3, Benjamin M Scirica1,3, Akshay S Desai1,3, Shawn N Murphy1,2.
Abstract
BACKGROUND: The conventional approach for clinical studies is to identify a cohort of potentially eligible patients and then screen for enrollment. In an effort to reduce the cost and manual effort involved in the screening process, several studies have leveraged electronic health records (EHR) to refine cohorts to better match the eligibility criteria, which is referred to as phenotyping. We extend this approach to dynamically identify a cohort by repeating phenotyping in alternation with manual screening.
Keywords: Cohort identification; Electronic health records; Intervention; Phenotyping
Year: 2019; PMID: 31143314; PMCID: PMC6522233; DOI: 10.14740/jocmr3830
Source DB: PubMed; Journal: J Clin Med Res; ISSN: 1918-3003
List of Algorithms for Pre-Screening
| Inclusion criteria | Model types | Data modalities | Cycle I | Cycle II | Cycle III | Cycle IV |
|---|---|---|---|---|---|---|
| Heart failure | Logistic regression | Coded data and notes | √ | | | |
| Heart failure | Simple inference | EF algorithm | | √ | √ | √ |
| EF ≤ 40 | Simple regular expression | All notes | √ | | | |
| EF ≤ 40 | Elaborate text-processing | Notes from cardiology service | | √ | √ | √ |
| Primary cardiologist at institution | Rules | Notes | √ | | | |
| Primary cardiologist at institution | Rules | Coded data and notes | | √ | √ | |
| Primary cardiologist at institution | Logistic regression | Coded data and notes | | | | √ |
EF: ejection fraction.
PSR in Each Screening Cycle
| Cycle | I | II | III | IV | Total |
|---|---|---|---|---|---|
| Screened in (A) | 165 | 24 | 14 | 30 | 233 |
| Screened out (B) | 670 | 26 | 34 | 59 | 789 |
| Total screened (C = A + B) | 835 | 50 | 48 | 89 | 1,022 |
| PSR (A/C)% | 20 | 48 | 29 | 34 | 23 |
PSR: positive screening rate.
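The PSR row can be recomputed from the screened-in and screened-out counts. The short sketch below does that arithmetic with the numbers copied from the table; the function and variable names are illustrative and do not come from the study.

```python
# Counts copied from the PSR table: (screened in, screened out) per cycle.
counts = {
    "I": (165, 670),
    "II": (24, 26),
    "III": (14, 34),
    "IV": (30, 59),
}


def psr_percent(screened_in: int, screened_out: int) -> int:
    """Positive screening rate: screened in / total screened, as a whole percent."""
    total_screened = screened_in + screened_out
    return round(100 * screened_in / total_screened)


# Per-cycle PSR, matching the "PSR (A/C)%" row.
psr_by_cycle = {cycle: psr_percent(a, b) for cycle, (a, b) in counts.items()}

# Overall PSR is computed from the summed counts, not by averaging cycle PSRs.
total_in = sum(a for a, _ in counts.values())
total_out = sum(b for _, b in counts.values())
overall_psr = psr_percent(total_in, total_out)
```

Because cycle I contributes 835 of the 1,022 screened patients, the overall PSR stays close to the cycle I value even though later cycles have higher rates.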
Figure 1. DP consists of multiple screening cycles. At the start of each cycle, phenotyping is used to identify eligible patients from the EHR, creating an ordered list in which the most eligible patients are listed first. This list is manually screened, and the results are analyzed to improve phenotyping for the next cycle. DP: dynamic phenotyping; EHR: electronic health record.
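The cycle described in Figure 1 (rank, screen, refine, repeat) can be sketched as a loop. This is a minimal illustration under stated assumptions, not the study's implementation: the patient records, the `scorer`, `screen`, and `refine` callables, and the fixed batch size are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Patient = Dict[str, float]            # hypothetical record: feature name -> value
Scorer = Callable[[Patient], float]   # phenotyping model: higher = more likely eligible
Screened = List[Tuple[Patient, bool]]


def dynamic_phenotyping(
    pool: List[Patient],
    scorer: Scorer,
    screen: Callable[[Patient], bool],          # stands in for manual chart review
    refine: Callable[[Scorer, Screened], Scorer],
    batch_size: int,
    n_cycles: int,
) -> List[float]:
    """Run screening cycles and return the positive screening rate of each."""
    psr_per_cycle = []
    for _ in range(n_cycles):
        if not pool:
            break
        # Phenotyping: order the remaining pool, most likely eligible first.
        ranked = sorted(pool, key=scorer, reverse=True)
        batch, pool = ranked[:batch_size], ranked[batch_size:]
        # Manual screening of the top of the list.
        results = [(patient, screen(patient)) for patient in batch]
        psr_per_cycle.append(sum(ok for _, ok in results) / len(results))
        # Use the screening results to improve phenotyping for the next cycle.
        scorer = refine(scorer, results)
    return psr_per_cycle


# Toy data: eligibility is EF <= 40; the first scorer only sees a diagnosis code.
toy_pool = [
    {"ef": 30, "hf_code": 1}, {"ef": 60, "hf_code": 1}, {"ef": 35, "hf_code": 0},
    {"ef": 55, "hf_code": 0}, {"ef": 25, "hf_code": 0}, {"ef": 65, "hf_code": 1},
]
toy_psr = dynamic_phenotyping(
    toy_pool,
    scorer=lambda p: p["hf_code"],
    screen=lambda p: p["ef"] <= 40,
    refine=lambda old, results: (lambda p: -p["ef"]),  # switch to ranking by low EF
    batch_size=2,
    n_cycles=2,
)
```

In this toy run the refined scorer ranks by EF directly, so the second batch contains more eligible patients than the first. In practice the refinement step is an analysis of screening failures, as described in Figure 2, rather than a fixed function.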
Figure 2. Composition of manual screening performed in each cycle. 1) The "no HF and EF" category accounted for 25% of false positives. HF was detected by machine learning, and EF was extracted from clinical notes using a simple regular expression. 2) Optimizing the EF parser and inferring HF from low EF eliminated many false positives, but failure to exclude patients not primarily managed by the outpatient cardiology clinic at BWH emerged as a challenge, because the cardiologist was inferred from the total number of EHR entries authored for the patient. 3) The cardiologist was inferred from the number of EHR entries limited to the outpatient setting, but this did not significantly reduce false positives. 4) Using machine learning to infer the primary cardiologist significantly improved the exclusion of patients who are not managed at BWH. HF: heart failure; EF: ejection fraction; BWH: Brigham and Women’s Hospital; EHR: electronic health record.
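To make the "simple regular expression" step concrete, the sketch below extracts an EF percentage from free text and applies the EF ≤ 40 criterion. The pattern, function names, and sample notes are assumptions made for illustration; the study's actual expression and its later, more elaborate parser are not given in this record.

```python
import re
from typing import Optional

# Hypothetical pattern: "EF", "LVEF", or "ejection fraction", a short run of
# filler text, then a one- or two-digit percentage or range (e.g. "30-35%").
EF_PATTERN = re.compile(
    r"\b(?:(?:LV)?EF|ejection fraction)\b[^0-9%]{0,20}?"
    r"(\d{1,2})(?:\s*-\s*(\d{1,2}))?\s*%",
    re.IGNORECASE,
)


def extract_ef(note: str) -> Optional[float]:
    """Return the first EF value found in a note, or None if there is none.

    A reported range is reduced to its midpoint (an assumption of this sketch).
    """
    match = EF_PATTERN.search(note)
    if match is None:
        return None
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    return (low + high) / 2


def meets_ef_criterion(note: str, threshold: float = 40.0) -> bool:
    """Apply the EF <= 40 inclusion criterion to a single note."""
    ef = extract_ef(note)
    return ef is not None and ef <= threshold
```

A pattern this simple takes the first mention it finds and ignores negation, dates, and historical values, which is one plausible reason a first-cycle extractor of this kind would produce false positives.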