Emily Beth Devine1, Erik Van Eaton1, Megan E Zadworny1, Rebecca Symons1, Allison Devlin1, David Yanez2, Meliha Yetisgen1, Katelyn R Keyloun1, Daniel Capurro1, Rafael Alfonso-Cristancho1, David R Flum1, Peter Tarczy-Hornoch1.
Abstract
BACKGROUND: The availability of high-fidelity electronic health record (EHR) data is a hallmark of the learning health care system. Washington State's Surgical Care Outcomes and Assessment Program (SCOAP) is a network of hospitals participating in quality improvement (QI) registries wherein data are manually abstracted from EHRs. To create the Comparative Effectiveness Research and Translation Network (CERTAIN), we semi-automated SCOAP data abstraction using a centralized federated data model, created a central data repository (CDR), and assessed whether these data could be used as real-world evidence for QI and research.
Keywords: comparative effectiveness research; electronic health records; quality improvement; validation studies
Year: 2018 PMID: 29881766 PMCID: PMC5983060 DOI: 10.5334/egems.211
Source DB: PubMed Journal: EGEMS (Wash DC) ISSN: 2327-9214
Amount of Data Feasibly Ingested.
| | Abdominal/Oncologic n (%) | Non-cardiac Vascular n (%) | Spine n (%) |
|---|---|---|---|
| Total | N = 740 | N = 586 | N = 882 |
| Not possible to abstract | 466 (63%) | 422 (72%) | 487 (55%) |
| Possible to abstract | 274 (37%) | 164 (28%) | 395 (45%) |
| Structured | 77 (10%) | 36 (6%) | 45 (5%) |
| NLP | 45 (6%) | 34 (6%) | 7 (<1%) |
| Possible with additional resources | 152 (21%) | 94 (16%) | 343 (39%) |
| Possible to abstract | n = 274 | n = 164 | n = 395 |
| Structured | 77 (28%) | 36 (22%) | 45 (11%) |
| NLP | 45 (16%) | 34 (21%) | 7 (2%) |
| Possible with additional resources | 152 (56%) | 94 (57%) | 343 (87%) |
NLP = natural language processing.
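
The table above reports each abstraction route twice: once as a share of the column total (N) and once as a share of the elements judged possible to abstract (n). A minimal sketch of that arithmetic for the Abdominal/Oncologic column follows. The helper name `pct` and the round-half-up convention are assumptions for illustration, not taken from the source.

```python
import math

def pct(count: int, denominator: int) -> int:
    """Return count/denominator as a whole-number percentage (round half up)."""
    return math.floor(100 * count / denominator + 0.5)

# Abdominal/Oncologic column of the table above
total = 740      # N, all elements in the column
possible = 274   # n, elements possible to abstract
structured = 77
nlp = 45

structured_of_total = pct(structured, total)        # share of N
structured_of_possible = pct(structured, possible)  # share of n
nlp_of_total = pct(nlp, total)
nlp_of_possible = pct(nlp, possible)
```

The same counts therefore yield a smaller percentage in the upper half of the table (denominator N) than in the lower half (denominator n).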
Results, Validation 1. Proportion of data elements that loaded correctly from each EHR, by feed.
| Site | ADT | Labs | Dictations | Radiology Reports | Allergies |
|---|---|---|---|---|---|
| | 94% | 98% | 43% | N/A | N/A |
| | 89% | 91% | Not completed | N/A | N/A |
| | 100% | 100% | 100% | 100% | 84% |
| | 99% | 75% | 99% | 93% | 85% |
ADT = admission/discharge/transfer; Labs = laboratory services; N/A = Not available.
Results, Validation 2. Proportions of data elements standardized, by site and SCOAP form.
| Sites | SCOAP Form | Demographic % | Labs % | Medications % | Radiology Reports % | Dictations (NLP) % | Average % |
|---|---|---|---|---|---|---|---|
| | Abdominal/Oncologic | 97% | 98% | 90% | N/A | 89% | 94% |
| | Vascular | 92% | 85% | 77% | N/A | | 86% |
| | Spine | 94% | 100% | 93% | N/A | | 94% |
| | Abdominal/Oncologic | 97% | 95% | 97% | N/A | 89% | 95% |
| | Vascular | 98% | 93% | 97% | N/A | | 94% |
| | Spine | 94% | 99% | 98% | N/A | | 95% |
| | Abdominal/Oncologic | 99% | 89% | N/A | 94% | 89% | 93% |
| | Vascular | 96% | 100% | N/A | N/A | | 95% |
| | Spine | 97% | 90% | N/A | N/A | | 92% |
| | Abdominal/Oncologic | 99% | 73% | N/A | 89% | 91% | 88% |
| | Vascular | 98% | 34% | N/A | N/A | | 74% |
| | Spine | 100% | 89% | N/A | N/A | | 93% |
Labs = laboratory; N/A = Not available; NLP = Natural language processing.
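
The Average % values in the table above appear consistent with an unweighted mean of the available feeds, provided the Dictations (NLP) figure on each site's Abdominal/Oncologic row is taken to apply to all three forms at that site. That reading is an assumption; the source does not state how the average was computed. A short sketch checking it for the first site's three rows (the function name `row_average` is illustrative):

```python
import math

def row_average(values):
    """Unweighted mean of the non-N/A percentages, rounded half up."""
    available = [v for v in values if v is not None]
    return math.floor(sum(available) / len(available) + 0.5)

# Assumption: one Dictations (NLP) value per site, shared by all forms.
site_dictations = 89

# Order: demographic, labs, medications, radiology (N/A), dictations
abdominal = row_average([97, 98, 90, None, site_dictations])
vascular = row_average([92, 85, 77, None, site_dictations])
spine = row_average([94, 100, 93, None, site_dictations])
```

Under this assumption the three results reproduce the reported averages of 94%, 86%, and 94%.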
Results, Validation 3. Among Matched Cases, Number (%) of Discordant/Valid Pairs of Data Elements, Compared to EHR.
| | Total N (%) | Neither Abstraction Method Matches EHR N (%) | Manually Abstracted Method Matches EHR N (%) | Automatically Abstracted Method Matches EHR N (%) | Both Methods Match EHR N (%) |
|---|---|---|---|---|---|
| Overall | 2,056 (100%) | 97 (5%) | 1,185 (58%) | 774 (38%) | 2 (<1%) |
| A | 757 | 30 (4%) | 512 (68%) | 215 (28%) | 0 |
| B | 269 | 28 (10%) | 157 (58%) | 82 (30%) | 2 (<1%) |
| C | 294 | 14 (5%) | 138 (47%) | 142 (48%) | 0 |
| D | 736 | 25 (3%) | 336 (51%) | 335 (46%) | 0 |
| Abdominal/Oncologic | 1,460 | 60 (4%) | 828 (57%) | 572 (39%) | 0 |
| Spine | 350 | 22 (6%) | 201 (57%) | 126 (36%) | 1 (<1%) |
| Vascular | 246 | 15 (6%) | 154 (63%) | 76 (31%) | 1 (<1%) |
| Binary | 924 | 0 | 692 (75%) | 232 (25%) | 0 |
| Categorical | 776 | 60 (8%) | 397 (51%) | 319 (41%) | 0 |
| Continuous | 356 | 37 (10%) | 94 (26%) | 223 (63%) | 2 (1%) |
| Structured | 1,044 | 52 (5%) | 396 (39%) | 594 (57%) | 0 |
| NLP | 1,012 | 45 (4%) | 787 (78%) | 180 (18%) | 0 |
| Demographics | 650 | 19 (3%) | 236 (36%) | 395 (61%) | 0 |
| Risk Factors | 415 | 44 (12%) | 264 (64%) | 106 (25%) | 1 (<1%) |
| Comorbidities | 270 | 0 | 221 (82%) | 49 (18%) | 0 |
| Pre-operative | 45 | 1 (2%) | 35 (78%) | 9 (20%) | 0 |
| Intraoperative | 199 | 21 (11%) | 88 (44%) | 89 (45%) | 1 (<1%) |
| Perioperative | 159 | 12 (8%) | 107 (67%) | 40 (25%) | 0 |
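
Each row of the table above splits its data elements into four mutually exclusive outcomes, according to which abstraction method agrees with the EHR. A minimal sketch of that classification follows. The function name, the outcome labels, the sample records, and the default of exact equality as the matching rule are all assumptions for illustration; the source does not specify how a match was determined.

```python
from collections import Counter

def classify_pair(ehr, manual, automated, matches=lambda a, b: a == b):
    """Assign one data element to one of the four outcome columns."""
    manual_ok = matches(manual, ehr)
    automated_ok = matches(automated, ehr)
    if manual_ok and automated_ok:
        return "both"
    if manual_ok:
        return "manual only"
    if automated_ok:
        return "automated only"
    return "neither"

# Hypothetical records: (EHR value, manual value, automated value)
records = [
    ("yes", "yes", "no"),
    (7.2, 7.0, 7.2),
    ("stage II", "stage I", "stage III"),
]

# Count how many elements fall in each outcome column
tally = Counter(classify_pair(*r) for r in records)
```

Passing a different `matches` function (for example, one that tolerates small differences in continuous values) changes the matching rule without changing the four-way split.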
Results, Information Retrieval Statistics.
| Demographics/Admit/Discharge | Manual Abstraction: Recall | Manual Abstraction: Precision | Manual Abstraction: F-score | Automated Abstraction: Recall | Automated Abstraction: Precision | Automated Abstraction: F-score |
|---|---|---|---|---|---|---|
| # Variables | 30 | 30 | 30 | 30 | 30 | 30 |
| Score Category | ||||||
| <0.5 | 0 | 0 | 0 | 0 | 0 | 0 |
| ≥0.5 & <0.8 | 0 | 0 | 0 | 0 | 0 | 0 |
| ≥0.8 | 100% | 100% | 100% | 100% | 100% | 100% |
| # Variables | 13 | 11 | 13 | 13 | 12 | 13 |
| Score Category | ||||||
| <0.5 | 23% | 0 | 23% | 8% | 0 | 8% |
| ≥0.5 & <0.8 | 46% | 9% | 23% | 31% | 0 | 0 |
| ≥0.8 | 31% | 91% | 54% | 62% | 100% | 92% |
| # Variables | 27 | 26 | 27 | 27 | 26 | 27 |
| Score Category | ||||||
| <0.5 | 7% | 0 | 4% | 4% | 0 | 4% |
| ≥0.5 & <0.8 | 0% | 0 | 4% | 4% | 0 | 4% |
| ≥0.8 | 93% | 100% | 93% | 93% | 100% | 93% |
| # Variables | 2 | 2 | 2 | 2 | 1 | 2 |
| Score Category | ||||||
| <0.5 | 100% | 0 | 100% | 100% | 0 | 100% |
| ≥0.5 & <0.8 | 0 | 0 | 0 | 0 | 0 | 0 |
| ≥0.8 | 0 | 100% | 0 | 0 | 100% | 0 |
| # Variables | 73 | 71 | 73 | 73 | 61 | 73 |
| Score Category | ||||||
| <0.5 | 3% | 0 | 3% | 16% | 0 | 16% |
| ≥0.5 & <0.8 | 10% | 1% | 11% | 8% | 0 | 5% |
| ≥0.8 | 88% | 99% | 86% | 75% | 100% | 78% |
| # Variables | 12 | 12 | 12 | 12 | 11 | 12 |
| Score Category | ||||||
| <0.5 | 8% | 0 | 8% | 17% | 0 | 17% |
| ≥0.5 & <0.8 | 0 | 0 | 0% | 0 | 0 | 0 |
| ≥0.8 | 92% | 100% | 92% | 83% | 100% | 83% |
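
The table above bins each variable's recall, precision, and F-score into three score categories. As a reminder of how such statistics are usually computed, here is a minimal sketch using the standard definitions, with the F-score taken as the balanced F1 (harmonic mean of precision and recall). The choice of F1, the function names, and the example counts are assumptions; the source does not give its formulas or per-variable counts.

```python
def precision_recall_f(tp: int, fp: int, fn: int):
    """Return (precision, recall, F1); a value is None when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else None  # no positive predictions
    recall = tp / (tp + fn) if (tp + fn) else None     # no true positives exist
    if precision is None or recall is None or (precision + recall) == 0:
        return precision, recall, None
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def score_category(score: float) -> str:
    """Map a score to the table's three categories."""
    if score < 0.5:
        return "<0.5"
    if score < 0.8:
        return "≥0.5 & <0.8"
    return "≥0.8"

# Hypothetical counts for one variable: 8 true positives,
# 2 false positives, 4 false negatives
p, r, f = precision_recall_f(tp=8, fp=2, fn=4)
categories = (score_category(p), score_category(r), score_category(f))
```

In this hypothetical case precision is 0.8 and recall is about 0.67, so one variable can land in different score categories for different statistics, as several rows of the table show.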