Stephen B Johnson, Prakash Adekkanattu, Thomas R Campion, James Flory, Jyotishman Pathak, Olga V Patterson, Scott L DuVall, Vincent Major, Yindalon Aphinyanaphongs.
Abstract
Natural Language Processing (NLP) holds potential for patient care and clinical research, but a gap exists between promise and reality. While some studies have demonstrated portability of NLP systems across multiple sites, challenges remain. Strategies to mitigate these challenges can either tackle complex NLP problems using advanced methods (hard-to-reach fruit) or focus on simple NLP problems using practical methods (low-hanging fruit). This paper investigates a practical strategy for NLP portability using extraction of left ventricular ejection fraction (LVEF) as a use case. We used a tool developed at the Department of Veterans Affairs (VA) to extract LVEF values from free-text echocardiograms in the MIMIC-III database. The approach showed an accuracy of 98.4%, a sensitivity of 99.4%, a positive predictive value of 98.7%, and an F-score of 99.0%. This experience, in which a simple NLP solution proved highly portable with excellent performance, illustrates that simple NLP applications may be easier to disseminate and adapt than complex applications, and in the short term may prove more useful.
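As a quick check on how the reported figures relate, the sketch below recomputes the F-score from the sensitivity and positive predictive value given in the abstract. It assumes the F-score is the balanced harmonic mean of the two (F1); the abstract does not state the weighting, and the function name `f_score` is illustrative.

```python
def f_score(sensitivity: float, ppv: float) -> float:
    """Balanced F-score (F1): harmonic mean of sensitivity (recall) and PPV (precision)."""
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# Values reported in the abstract, expressed as fractions.
reported = f_score(sensitivity=0.994, ppv=0.987)
print(f"{reported:.1%}")  # prints 99.0%
```

Under this assumption the result agrees with the reported F-score of 99.0%.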
Year: 2018 PMID: 29888051 PMCID: PMC5961788
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Figure 1. Leo architecture with UIMA-AS as the core component.
Values assigned for concepts of ejection fraction with qualitative modifiers in developing the reference standard.
| Modifier | Assigned LVEF value (%) |
|---|---|
| Severely depressed | 5-29 |
| Moderately depressed | 30-44 |
| Mildly depressed | 45-54 |
| Grossly preserved | 50-55 |
| Normal | 70 |
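The table above could be encoded as a simple lookup, sketched here. The ranges are copied from the table; the dictionary name, the function `qualitative_ef_range`, and the word-boundary matching are assumptions made for illustration and do not describe how the authors applied these values when building the reference standard.

```python
import re

# Ranges copied from the table; "Normal" is a single value, stored as (70, 70).
QUALITATIVE_EF_VALUES = {
    "severely depressed": (5, 29),
    "moderately depressed": (30, 44),
    "mildly depressed": (45, 54),
    "grossly preserved": (50, 55),
    "normal": (70, 70),
}


def qualitative_ef_range(phrase: str):
    """Return the (low, high) LVEF range for the first modifier found, else None."""
    lowered = phrase.lower()
    for modifier, value_range in QUALITATIVE_EF_VALUES.items():
        # Word boundaries keep "normal" from matching inside "abnormal".
        if re.search(r"\b" + re.escape(modifier) + r"\b", lowered):
            return value_range
    return None


print(qualitative_ef_range("LV systolic function is moderately depressed"))  # (30, 44)
```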
Figure 2. Extraction logic for LVEF implemented in the EFEx system.
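To give a sense of what a simple pattern-based approach to LVEF extraction can look like, here is a minimal sketch using a single regular expression. It is not the EFEx logic shown in Figure 2: the pattern, the 40-character gap limit, and the function name `extract_lvef` are all assumptions chosen for illustration.

```python
import re

# Hypothetical pattern: a concept mention, a short gap, then a percentage or range.
EF_PATTERN = re.compile(
    r"\b(?:LVEF|EF|ejection\s+fraction)\b"  # concept mention
    r"[^\d%]{0,40}?"                        # short gap with no digits or percent signs
    r"(\d{1,2})"                            # first (or only) number
    r"(?:\s*(?:-|to)\s*(\d{1,2}))?"         # optional upper bound of a range
    r"\s*%",                                # percent sign is required
    re.IGNORECASE,
)


def extract_lvef(text: str):
    """Return a list of (low, high) LVEF percentages found in the text."""
    results = []
    for match in EF_PATTERN.finditer(text):
        low = int(match.group(1))
        high = int(match.group(2)) if match.group(2) else low
        results.append((low, high))
    return results


print(extract_lvef("Left ventricular ejection fraction is estimated at 55-60%."))  # [(55, 60)]
```

Because the pattern keys on the concept mention itself and not on section headings, it has little dependence on where in the report the value appears.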
NLP portability challenges, with mitigation strategies that require advanced methods (hard-to-reach) or more practical methods (low-hanging).
| Challenge | Strategy: Hard-to-Reach | Strategy: Low-Hanging |
|---|---|---|
| Assemble corpora with heterogeneous document types | Use heuristic linkage methods; develop document classifiers | Exploit metadata; focus on single document type |
| Navigate diverse report structures | Customize document segmentation algorithms; employ active learning | Select patterns with low sensitivity to document location |
| Analyze idiosyncratic linguistic expressions | Use machine learning to tailor complex patterns | Re-use or adapt simple patterns developed previously |
| Integrate multiple NLP modules | Employ large number of modules; adapt to meet architecture standards | Employ small number of modules; re-use or adapt modules previously standardized |
| Lead the dissemination project | Acquire funding to support the innovator site; supply expertise in NLP methods | Draw on existing resources at the adopter site; use conventional software skills |