Wen-Wai Yim, Heather L Evans, Meliha Yetisgen.
Abstract
Microbiology lab culture reports are a frequently used diagnostic tool for clinical providers. However, their incorporation into clinical surveillance applications and evidence-based medicine can be severely hindered by the free-text nature of these reports. In this work, we (1) created a microbiology culture template to structure free-text microbiology reports, (2) generated an annotated microbiology report corpus, and (3) built a microbiology information extraction system. Specifically, we combined rule-based, hybrid, and statistical techniques to extract microbiology entities and fill templates for structuring data. System performance was favorable, with an entity F1-score of 0.889 and a relation F1-score of 0.795. We plan to incorporate these extractions as features for our ongoing ventilator-associated pneumonia surveillance project, though this tool can be used as an upstream process in other applications. Our newly created corpus includes 1442 unique gram stain and culture microbiology reports generated from a cohort of 715 patients at the University of Washington Medical Facilities.
Year: 2015 PMID: 26306288 PMCID: PMC4525274
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Figure 1. Examples of the microbiology gram stain and culture reports annotated with entities and relations
Figure 2. Overall system architecture
Entity extraction performances. TP: True positive, FN: False negative, FP: False positive, P: Precision, R: Recall, F1: F1-score.
| Entity Type | Entity Label | TP | FN | FP | P | R | F1 |
|---|---|---|---|---|---|---|---|
| Rule-based | MIC | 83 | 1 | 0 | 1.000 | 0.988 | 0.994 |
| | No Growth | 26 | 0 | 0 | 1.000 | 1.000 | 1.000 |
| | Rating | 453 | 0 | 3 | 0.993 | 1.000 | 0.997 |
| Hybrid | Drug | 299 | 12 | 7 | 0.977 | 0.961 | 0.969 |
| | Drug resistance | 252 | 11 | 9 | 0.966 | 0.958 | 0.962 |
| | No growth measure | 24 | 2 | 1 | 0.960 | 0.923 | 0.941 |
| | Organism quantity | 637 | 119 | 94 | 0.871 | 0.843 | 0.857 |
| | Reference item | 128 | 6 | 6 | 0.955 | 0.955 | 0.955 |
| | Specimen date | 117 | 10 | 10 | 0.921 | 0.921 | 0.921 |
| Statistical | Organism | 1133 | 271 | 203 | 0.848 | 0.807 | 0.827 |
| | Specimen Description | 102 | 34 | 17 | 0.857 | 0.750 | 0.800 |
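The P, R, and F1 columns follow from the TP, FN, and FP counts using the standard definitions. The following is a minimal sketch of that arithmetic, not the authors' evaluation code; the `Scores` type and the `prf` helper are hypothetical names introduced here for illustration.

```python
from typing import NamedTuple


class Scores(NamedTuple):
    precision: float
    recall: float
    f1: float


def prf(tp: int, fn: int, fp: int) -> Scores:
    """Precision, recall, and F1 from raw counts (0.0 when a denominator is zero)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return Scores(precision, recall, f1)


# Counts copied from the "Organism" row of the entity extraction table.
organism = prf(tp=1133, fn=271, fp=203)
print(f"P={organism.precision:.3f} R={organism.recall:.3f} F1={organism.f1:.3f}")
# prints: P=0.848 R=0.807 F1=0.827
```

The same call with any other row's counts should reproduce that row's reported scores to three decimal places.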
Micro entity and relation extraction performances. (Oracle Entities): Relation extraction results based on gold standard entities, (System Entities): Relation extraction results based on entities identified by the system.
| | P | R | F1 |
|---|---|---|---|
| System Entities | 0.903 | 0.875 | 0.889 |
| Relations (Oracle Entities) | 0.981 | 0.982 | 0.981 |
| Relations (System Entities) | 0.836 | 0.759 | 0.795 |
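The entity row in this table appears to be a micro-average over all entity labels: pooling the TP, FN, and FP counts from the entity extraction table before computing the scores reproduces the reported 0.903, 0.875, and 0.889. The sketch below shows that pooling. It is an illustration under that assumption, since the text here does not spell out the averaging procedure, and the variable names are hypothetical.

```python
# (TP, FN, FP) per entity label, copied from the entity extraction table.
counts = {
    "MIC": (83, 1, 0),
    "No Growth": (26, 0, 0),
    "Rating": (453, 0, 3),
    "Drug": (299, 12, 7),
    "Drug resistance": (252, 11, 9),
    "No growth measure": (24, 2, 1),
    "Organism quantity": (637, 119, 94),
    "Reference item": (128, 6, 6),
    "Specimen date": (117, 10, 10),
    "Organism": (1133, 271, 203),
    "Specimen Description": (102, 34, 17),
}

# Micro-averaging: sum the counts over all labels first, then score once.
tp = sum(c[0] for c in counts.values())
fn = sum(c[1] for c in counts.values())
fp = sum(c[2] for c in counts.values())

micro_p = tp / (tp + fp)
micro_r = tp / (tp + fn)
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

print(f"P={micro_p:.3f} R={micro_r:.3f} F1={micro_f1:.3f}")
# prints: P=0.903 R=0.875 F1=0.889
```

Because the counts are pooled, frequent labels such as Organism and Organism quantity dominate the overall score, which is why it sits below most of the per-label F1 values.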
Template match performances.
| | System | Gold | Match | P | R | F1 |
|---|---|---|---|---|---|---|
| Templates | 1365 | 1196 | 776 | 0.569 | 0.649 | 0.606 |
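The template scores are consistent with precision taken as matched templates over system templates and recall as matched templates over gold templates. The sketch below works through that arithmetic; this reading is an assumption inferred from the numbers, and `template_scores` is a hypothetical helper, not part of the described system.

```python
def template_scores(system: int, gold: int, match: int) -> tuple[float, float, float]:
    """Precision, recall, and F1 for template matching from template counts."""
    precision = match / system  # share of system templates that match gold
    recall = match / gold       # share of gold templates that were recovered
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts copied from the "Templates" row above.
p, r, f1 = template_scores(system=1365, gold=1196, match=776)
print(f"P={p:.4f} R={r:.4f} F1={f1:.4f}")
```

Recall and F1 round to the reported 0.649 and 0.606. Precision evaluates to roughly 0.5685, which sits right at the rounding boundary of the reported 0.569.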