Stephen Wu, Andrew Wen, Yanshan Wang, Sijia Liu, Hongfang Liu.
Abstract
Search techniques for clinical text need to make fine-grained semantic distinctions, since medical terms may be negated, may refer to someone other than the patient, or may apply to a time other than the present. While natural language processing (NLP) approaches address these fine-grained distinctions, a task like patient cohort identification from electronic health records (EHRs) also requires a much more coarse-grained combination of evidence from the text and structured data of each patient's health records. We thus introduce aligned-layer language models, a novel approach to information retrieval (IR) that incorporates the output of other NLP systems. We show that this framework is able to represent standard IR queries, formulate previously impossible multi-layered queries, and customize the desired degree of linguistic granularity.
Keywords: Electronic Health Records; Information Storage and Retrieval; Natural Language Processing
Year: 2017 PMID: 29295172 PMCID: PMC7466869
Source DB: PubMed Journal: Stud Health Technol Inform ISSN: 0926-9630
Figure 1 - cTAKES processing is followed by the indexing of results into various layers (indexes).
Figure 2 - Aligned layers (tokens, part-of-speech tags, named entities, Stanford dependencies, and CoNLL-X dependencies) for a sample query “pregnancy with preterm delivery.” Span, source, and target indices are with respect to the L0 layer’s indices.
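
The alignment described in the Figure 2 caption can be pictured as annotations from several layers that all carry positions in the base (L0) token layer. The Python sketch below is only an illustration under that reading: the `Annotation` class, the `build_layers` and `values_at` helpers, the `NN` tag, and the `ENTITY_A` label are hypothetical names and placeholder values, not the paper's actual index format. It covers span indices only; the source and target indices of the dependency layers are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    layer: str   # layer name, e.g. "token", "pos", "entity" (hypothetical names)
    value: str   # term indexed in that layer
    start: int   # first L0 token index covered (inclusive)
    end: int     # one past the last L0 token index covered (exclusive)


def build_layers(tokens):
    """Create the L0 token layer; other layers are aligned to its indices."""
    return [Annotation("token", tok, i, i + 1) for i, tok in enumerate(tokens)]


def values_at(annotations, position, layer):
    """Return the values of `layer` whose span covers L0 index `position`."""
    return [a.value for a in annotations
            if a.layer == layer and a.start <= position < a.end]


annotations = build_layers(["pregnancy", "with", "preterm", "delivery"])
# Placeholder annotations: the tag and the entity label are illustrative only.
annotations.append(Annotation("pos", "NN", 0, 1))
annotations.append(Annotation("entity", "ENTITY_A", 2, 4))

print(values_at(annotations, 3, "token"))   # token at L0 index 3
print(values_at(annotations, 3, "entity"))  # entity covering L0 indices 2 and 3
```

Because every layer shares the L0 coordinates, a query can ask which terms from different layers occupy the same position without re-tokenizing the text.
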
Eight versions of aligned-layer queries for Topic 125. #p denotes phrase queries, #l denotes list queries, and all others are term queries.
| Layers | Query representation |
|---|---|
| TXT | coinfected hepatitis c hiv |
| CUI | cui:C0019196 cui:C0019158 |
| MIX | coinfected cui:C0019158 hiv |
| PH-LS | coinfected #l(cui:C0019158|hepatitis) hiv |
| TXT-MRF | coinfected^0.85 hepatitis^0.85 c^0.85 hiv^0.85 #p(8|true|coinfected|hepatitis)^0.1 #p(8|false|coinfected|hepatitis)^0.05 #p(8|true|hepatitis|c)^0.1 #p(8|false|hepatitis|c)^0.05 #p(8|true|c|hiv)^0.1 #p(8|false|c|hiv)^0.05 #p(12|true|coinfected|hepatitis|c)^0.1 #p(12|false|coinfected|hepatitis|c)^0.05 #p(12|true|hepatitis|c|hiv)^0.1 #p(12|false|hepatitis|c|hiv)^0.05 #p(16|true|coinfected|hepatitis|c|hiv)^0.1 #p(16|false|coinfected|hepatitis|c|hiv)^0.05 |
| CUI-MRF | cui:C0019196^0.85 cui:C0019158^0.85 #p(8|true|cui:C0019196|cui:C0019158)^0.1 #p(8|false|cui:C0019196|cui:C0019158)^0.05 |
| MIX-MRF | coinfected^0.85 cui:C0019158^0.85 hiv^0.85 #p(8|true|coinfected|cui:C0019158)^0.1 #p(8|false|coinfected|cui:C0019158)^0.05 #p(8|true|cui:C0019158|hiv)^0.1 #p(8|false|cui:C0019158|hiv)^0.05 #p(12|true|coinfected|cui:C0019158|hiv)^0.1 #p(12|false|coinfected|cui:C0019158|hiv)^0.05 |
| PH-LS-MRF | coinfected^0.85 #l(cui:C0019158|hepatitis)^0.85 hiv^0.85 #p(8|true|coinfected|#l(cui:C0019158|hepatitis))^0.1 #p(8|false|coinfected|#l(cui:C0019158|hepatitis))^0.05 #p(8|true|#l(cui:C0019158|hepatitis)|hiv)^0.1 #p(8|false|#l(cui:C0019158|hepatitis)|hiv)^0.05 #p(12|true|coinfected|#l(cui:C0019158|hepatitis)|hiv)^0.1 |
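
The MRF rows in the table above appear to follow a regular pattern. Each term is weighted 0.85, and every contiguous run of two or more terms is added twice: once as `#p(window|true|...)` at 0.1 and once as `#p(window|false|...)` at 0.05, with the window equal to four times the run length. The Python sketch below reproduces that pattern. The function names `list_query` and `mrf_query` are hypothetical, and reading `true`/`false` as ordered versus unordered matching is an assumption drawn from the table, not from a documented query syntax.

```python
def list_query(alternatives):
    """Build a #l(...) list query from interchangeable terms (assumed syntax)."""
    return "#l(" + "|".join(alternatives) + ")"


def mrf_query(terms, term_w=0.85, ordered_w=0.1, unordered_w=0.05,
              window_per_term=4):
    """Expand a term sequence into the weighted form seen in the MRF rows.

    Weights and the window multiplier default to the values in the table;
    whether they were tuned or fixed is not stated in this record.
    """
    # Single-term clauses, each with the unigram weight.
    parts = [f"{t}^{term_w}" for t in terms]
    n = len(terms)
    # Every contiguous run of 2..n terms, shortest runs first.
    for size in range(2, n + 1):
        window = window_per_term * size
        for start in range(n - size + 1):
            span = "|".join(terms[start:start + size])
            parts.append(f"#p({window}|true|{span})^{ordered_w}")
            parts.append(f"#p({window}|false|{span})^{unordered_w}")
    return " ".join(parts)


# CUI-MRF style query from two concept identifiers taken from the table.
cui_mrf = mrf_query(["cui:C0019196", "cui:C0019158"])
print(cui_mrf)

# PH-LS-MRF style query: a list query can stand in for a single term.
ph_ls_mrf = mrf_query(
    ["coinfected", list_query(["cui:C0019158", "hepatitis"]), "hiv"])
print(ph_ls_mrf)
```

Treating a `#l(...)` list as an ordinary term keeps the expansion logic the same for text, concept, and mixed layers, which matches how the MIX-MRF and PH-LS-MRF rows are laid out.
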
Retrieval performance for a range of possible aligned-layer models
| Model | 2011 BOW | 2011 MRF | 2012 BOW | 2012 MRF |
|---|---|---|---|---|
| TXT | 0.2960 | 0.2936 | 0.2152 | 0.2224 |
| CUI | 0.3042 | 0.3119 | 0.3101 | |
| MIX | 0.2807 | 0.2847 | 0.2215 | 0.2167 |
| CUI-LS | 0.3076 | 0.3095 | 0.2161 | 0.2120 |
| PH-LS | 0.3167 | 0.2172 | 0.2126 | |