Sheng Yu, Kanako K Kumamaru, Elizabeth George, Ruth M Dunne, Arash Bedayat, Matey Neykov, Andetta R Hunsaker, Karin E Dill, Tianxi Cai, Frank J Rybicki.
Abstract
In this paper we describe an efficient tool based on natural language processing (NLP) for classifying the detailed state of pulmonary embolism (PE) recorded in CT pulmonary angiography reports. The classification tasks are: PE present vs. absent, acute PE vs. others, central PE vs. others, and subsegmental PE vs. others. Statistical learning algorithms were trained on features extracted with the NLP tool and on gold-standard labels obtained via chart review by two radiologists. The areas under the receiver operating characteristic curve (AUC) for the four tasks were 0.998, 0.945, 0.987, and 0.986, respectively. For comparison, bag-of-words Naive Bayes classifiers, a standard text mining technique, gave AUCs of 0.942, 0.765, 0.766, and 0.712, respectively.
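To make the baseline concrete, the following is a minimal sketch of a bag-of-words Naive Bayes classifier scored by AUC. It is not the authors' implementation. The function names (`tokenize`, `train_nb`, `score`, `auc`), the Laplace smoothing constant, and the four training and two test sentences are all hypothetical; the sentences are invented toy text, not data from the study.

```python
import math
from collections import Counter

def tokenize(text):
    # Bag-of-words: lowercase whitespace tokens, word order discarded.
    return text.lower().split()

def train_nb(docs, labels, alpha=1.0):
    """Fit a two-class multinomial Naive Bayes model with Laplace smoothing."""
    vocab = {w for d in docs for w in tokenize(d)}
    counts = {0: Counter(), 1: Counter()}
    n_docs = Counter(labels)
    for d, y in zip(docs, labels):
        counts[y].update(tokenize(d))
    model = {"vocab": vocab, "prior": {}, "loglik": {}}
    for y in (0, 1):
        total = sum(counts[y].values()) + alpha * len(vocab)
        model["prior"][y] = math.log(n_docs[y] / len(docs))
        model["loglik"][y] = {
            w: math.log((counts[y][w] + alpha) / total) for w in vocab
        }
    return model

def score(model, doc):
    """Log-odds of class 1 vs. class 0; out-of-vocabulary words are ignored."""
    s = model["prior"][1] - model["prior"][0]
    for w in tokenize(doc):
        if w in model["vocab"]:
            s += model["loglik"][1][w] - model["loglik"][0][w]
    return s

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented toy sentences (NOT from the study); label 1 = PE present.
train_docs = [
    "acute pulmonary embolism in right lower lobe",
    "no evidence of pulmonary embolism",
    "filling defect consistent with embolism",
    "no filling defect seen",
]
train_labels = [1, 0, 1, 0]
test_docs = ["no pulmonary embolism", "embolism in left lower lobe"]
test_labels = [0, 1]

model = train_nb(train_docs, train_labels)
test_scores = [score(model, d) for d in test_docs]
print(auc(test_scores, test_labels))
```

Because the model sees only word counts, a negation such as "no" is weighed as one more token and is not tied to the finding it modifies. This is one plausible reason a bag-of-words baseline could lag on the finer-grained tasks, though the abstract itself does not state the cause.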
Keywords: CT pulmonary angiography; NILE; Natural language processing; Nested modification structure; Pulmonary embolism
Year: 2014 PMID: 25117751 PMCID: PMC4261018 DOI: 10.1016/j.jbi.2014.08.001
Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317