Tsung-Ting Kuo, Pallavi Rao, Cleo Maehara, Son Doan, Juan D Chaparro, Michele E Day, Claudiu Farcas, Lucila Ohno-Machado, Chun-Nan Hsu.
Abstract
Natural Language Processing (NLP) is essential for concept extraction from narrative text in electronic health records (EHR). To extract numerous and diverse concepts, such as data elements (i.e., important concepts related to a certain medical condition), a plausible solution is to combine various NLP tools into an ensemble to improve extraction performance. However, it is unclear to what extent ensembles of popular NLP tools improve the extraction of numerous and diverse concepts. Therefore, we built an NLP ensemble pipeline to synergize the strength of popular NLP tools using seven ensemble methods, and to quantify the improvement in performance achieved by ensembles in the extraction of data elements for three very different cohorts. Evaluation results show that the pipeline can improve the performance of NLP tools, but there is high variability depending on the cohort.
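The abstract does not name the seven ensemble methods, so the following is only a minimal sketch of the general idea, assuming a simple vote-threshold ensemble over the concept sets returned by several tools. The tool names, the concepts, and the gold standard are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, Set, Tuple


def majority_vote(tool_outputs: Dict[str, Set[str]], min_votes: int) -> Set[str]:
    """Keep a concept if at least `min_votes` tools extracted it."""
    votes = Counter(c for concepts in tool_outputs.values() for c in concepts)
    return {c for c, n in votes.items() if n >= min_votes}


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Score a set of extracted concepts against a gold-standard set."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical outputs of three NLP tools for one clinical note.
outputs = {
    "tool_a": {"fever", "rash", "cough"},
    "tool_b": {"fever", "rash"},
    "tool_c": {"fever", "headache"},
}
gold = {"fever", "rash"}  # hypothetical annotated data elements

ensemble = majority_vote(outputs, min_votes=2)
print(sorted(ensemble), precision_recall_f1(ensemble, gold))
```

Setting `min_votes=1` reduces the ensemble to the union of all tool outputs (favoring recall), while setting it to the number of tools gives the intersection (favoring precision); comparing these scores against each single tool's score is one way to quantify the gain from an ensemble.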
Year: 2017 PMID: 28269947 PMCID: PMC5333200
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076