| Literature DB >> 26167382 |
Arika E Wieneke, Erin J A Bowles, David Cronkite, Karen J Wernli, Hongyuan Gao, David Carrell, Diana S M Buist.
Abstract
BACKGROUND: Pathology reports typically require manual review to abstract research data. We developed a natural language processing (NLP) system to automatically interpret free-text breast pathology reports with limited assistance from manual abstraction.
Keywords: Breast cancer; natural language processing; pathology; validation
Year: 2015 PMID: 26167382 PMCID: PMC4485196 DOI: 10.4103/2153-3539.159215
Source DB: PubMed Journal: J Pathol Inform
Figure 1: The number of reports used for natural language processing system training and testing, including validation and evaluation of the training set.
Figure 2: An example of an abstracted breast pathology report and associated natural language processing data codes. Relevant procedures and results are color-coded to show how they correspond between the report at the top and the data at the bottom.
Figure 3: How we flagged reports for manual review. The squiggly line is the threshold for positive predictive value (PPV) (reports had to have a PPV above the line), and the diagonal line is the threshold for negative predictive value (NPV) (reports had to have an NPV below the line). Reports that fell into the white areas were assigned a data finding; reports that fell into the grey areas were flagged for manual review.
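The flagging rule described in Figure 3 can be sketched as a simple threshold check. This is only an illustration of the decision logic as the caption states it; the threshold values and function name here are hypothetical, not taken from the paper.

```python
def needs_manual_review(ppv: float, npv: float,
                        ppv_threshold: float = 0.90,
                        npv_threshold: float = 0.95) -> bool:
    """Decide whether a report finding goes to a human abstractor.

    Per the Figure 3 caption: a finding is auto-assigned only when its
    PPV is above the PPV threshold and its NPV is below the NPV
    threshold (the white areas); otherwise it falls in the grey area
    and is flagged for manual review. Thresholds are illustrative.
    """
    auto_assign = ppv > ppv_threshold and npv < npv_threshold
    return not auto_assign

# A finding meeting both conditions is auto-assigned (not flagged).
print(needs_manual_review(0.95, 0.90))  # False
# A finding with low PPV falls in the grey area and is flagged.
print(needs_manual_review(0.80, 0.90))  # True
```

In practice the thresholds would be tuned per result code on the training set, which is why some reports in the record below are flagged by multiple codes.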
Figure 4: An example of a grouping of results used to improve natural language processing (NLP) system performance. "Omit" was at the top, which allowed the NLP system to exclude any nonbreast-related or irrelevant reports. If the reports were not omitted, we next determined whether the results belonged to a large category of invasive or not invasive, and then to individual categories of ductal or lobular among invasive reports.
Figure 5: The results of our evaluation (test) set. Among reports that were not omitted, 49.1% were flagged for manual review, 30.6% were assigned incorrect codes, and 12.7% were coded completely correctly following our all-or-nothing approach.
Result findings that were flagged for manual review by the confidence score metric, including the number of reports that were flagged for review. Note that some reports were flagged by multiple codes.
Performance of several procedure, laterality, and result findings from the evaluation set (N = 324), processed by the final version of the NLP system.