Thiago Santos, Amara Tariq, Judy Wawira Gichoya, Hari Trivedi, Imon Banerjee.
Abstract
Pathology reports primarily consist of unstructured free text, so the clinical information they contain is not trivial to access or query. Multiple natural language processing (NLP) techniques have been proposed to automate the coding of pathology reports via text classification. In this systematic review, we follow the guidelines proposed by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA; Page et al., 2020: BMJ) to identify NLP systems for classifying pathology reports published between 2010 and 2021. Based on our search criteria, a total of 3445 records were retrieved, and 25 articles met the final review criteria. We benchmarked the systems based on methodology, complexity of the prediction task, and core type of NLP model: (i) rule-based and intelligent systems, (ii) statistical machine learning, and (iii) deep learning. While certain tasks are well addressed by these models, many others have limitations and remain open challenges, such as the extraction of many cancer characteristics (size, shape, type of cancer, and others) from pathology reports. We investigated the final set of 25 papers and addressed their potential as well as their limitations. We hope that this systematic review helps researchers prioritize the development of innovative approaches to tackle the current limitations and helps advance cancer research.
Year: 2022 PMID: 35242443 PMCID: PMC8860734 DOI: 10.1016/j.jpi.2022.100003
Source DB: PubMed Journal: J Pathol Inform
Figure 1. Flow diagram of the search and inclusion process in the study. This study was reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.
Figure 2. A temporal illustration of the papers included in this systematic review, arranged by publication date. Each color indicates a different type of system: (i) black indicates a rule-based and intelligent system, (ii) green indicates a statistical machine learning model, and (iii) blue indicates a deep learning model.
Benchmarking of the natural language processing systems for pathology reports: prediction task, data, and evaluation. Listed in descending order of publication year (2021–2010).
| Citation | Year | Methodology | Prediction Task | Data | Evaluation |
|---|---|---|---|---|---|
| Khosravi et al. | 2021 | Deep learning | Classify cancer vs. benign and high vs. low-risk of prostate disease | Local urology center database of 400 prostate cancer MRI images and pathology reports | AUCs of 0.89 and 0.78 for classification of cancer vs. benign and high vs. low risk, respectively |
| Gao et al. | 2020 | Hierarchical deep learning | Six cancer classification tasks: site, subsite, laterality, histology, behavior, and grade | 546,806 cancer (all types) pathology reports obtained from the SEER cancer registry program | F1 Micro of 0.92, 0.64, 0.92, 0.8, 0.98, and 0.82 for site, subsite, laterality, histology, behavior, and grade, respectively |
| Saib et al. | 2020 | Hierarchical deep learning | Classify 9 ICD-O morphology grading | 1813 breast cancer pathology reports obtained from a local center database | F1 Micro of 0.91 and F1 Macro of 0.69 for classification of 9 ICD-O codes |
| Alawad et al. | 2020 | Deep learning | Two cancer classification tasks: subsite with 317 labels and histology with 556 labels | 878,864 cancer (all types) pathology reports obtained from the SEER cancer registry program | F1 Micro of 0.68 for subsite; F1 Micro of 0.79 for histology |
| Glaser et al. | 2019 | Rule-based | Extract stage, grade, and presence of muscularis propria | 3,042 Transurethral Resection of the Bladder Tumor (TURBT) reports obtained from a local database | Accuracy of 82%, 88%, and 100% for extracting stage, specimens, and grade, respectively |
| Soysal et al. | 2019 | Rule-based | Extract cancer-related information in pathology reports (e.g., tumor size, tumor stage, specimen, biomarkers, and others) | 400 cancer (all types) pathology reports obtained from a local center database | F1 average performance ranging from 0.87 to 0.99 for extracting cancer information |
| Yoon et al. | 2019 | Multi-task deep learning | Four cancer classification tasks: subsite, laterality, behavior, and histological grade | 942 unstructured cancer (all types) pathology reports obtained from the SEER cancer registry program | F1 Micro of 0.98, 0.98, 0.99, and 0.97 for site, laterality, behavior, and grade, respectively; F1 average of 0.98 |
| Gao et al. | 2019 | Hierarchical deep learning | Five cancer classification tasks: site, laterality, behavior, histology, and grade. | 374,899 cancer (all types) pathology reports obtained from the SEER cancer registry program | Accuracy of 0.9, 0.89, 0.96, 0.76, 0.71 for site, laterality, behavior, histology, and grade, respectively |
| Alawad et al. | 2019 | Multi-task deep learning | Five cancer classification tasks: site, laterality, behavior, histology, and grade. | 95,231 cancer (all types) pathology reports obtained from the SEER cancer registry program | F1 Micro of 0.94, 0.82, 0.95, 0.65, and 0.76 for site, laterality, grade, and behavior, respectively |
| Lee et al. | 2018 | Rule-based | Extract tissue slide identifier, biomarker names, and test result identifier | 867 bladder tumor (TURBT) pathology reports obtained from a local center database | F1 score of 0.99, 0.97, and 0.96 for extracting tissue slide identifier, biomarker names, and test result, respectively; accuracy of 0.88 |
| Yoon et al. | 2018 | Deep learning | Classify 12 ICD codes | 942 pathology reports (breast and lung) obtained from a local center database | F1 Micro of 0.78 for classification of 12 ICD-O codes |
| Alawad et al. | 2018 | Multi-task deep learning | Three cancer classification tasks: site, laterality, and histology | 942 site, 642 histology, and 815 laterality pathology reports obtained from a local center database | F1 Micro of 0.77, 0.79, and for primary site, grade, and laterality, respectively |
| Qiu et al. | 2017 | Deep learning | Classify 12 ICD codes | 942 pathology reports (breast and lung) obtained from a local center database | F1 Micro of 0.72 for classification of 12 ICD-O codes |
| Gao et al. | 2017 | Hierarchical deep learning | Two cancer classification tasks: primary site, and grade | 942 cancer (all types) pathology reports obtained from the SEER cancer registry program | F1 Micro score of 0.8 and 0.91 for primary site and histological grade, respectively |
| Schroeck et al. | 2017 | Rule-based | Extract histology, invasion, grade, carcinoma, and presence of muscularis propria | 600 bladder pathology reports obtained from a local center database | Accuracy ranged from 0.83 to 0.96 for extracting histology, invasion, grade, carcinoma, and muscularis. |
| Nguyen et al. | 2017 | Rule-based | Identify cancer notifiable patients | 45.3 million pathology HL7 messages | Sensitivity of 0.96 and specificity of 0.96 |
| Breischneider et al. | 2017 | Rule-based | Extraction of size, grading, hormone, and lymph nodes | 8,766 breast cancer reports obtained from a local center database | Accuracy of 0.41, 0.77, 0.86, and 0.78 for extracting size, grading, hormone, and lymph nodes, respectively |
| Oleynik et al. | 2017 | Statistical Machine Learning | Classify ICD codes | 94,000 pathology reports obtained from a local center database | F1 of 0.82 and 0.73 for classifying ICD codes from topography and morphology classes. |
| Yala et al. | 2017 | Statistical Machine Learning | IE of tumor characteristics | 91,000 breast pathology reports | F1 score of 0.92 |
| Napolitano et al. | 2016 | Statistical Machine Learning | Classify pathology reports and chunk recognition | 798 surgical pathology reports obtained from a local center database | Accuracy of 0.994 |
| Wieneke et al. | 2015 | Statistical Machine Learning | Three classification tasks: procedure, laterality, and result | 3234 pathology reports | F1 Micro score of 0.8, 0.92, and 0.5 for procedure, laterality, and result, respectively |
| Kavuluru et al. | 2013 | Statistical Machine Learning | Classify ICD-O-3 codes | 56,000 pathology reports obtained from a local center database | F1 Micro of 0.9 and F1 Macro of 0.71 for classification of ICD-O-3 codes; F1 score of 0.93 |
| Buckley et al. | 2012 | Rule-based | IE of cancer characteristics | 76,333 breast pathology reports obtained from a local center database | Specificity of 0.96 |
| Martinez et al. | 2011 | Statistical Machine Learning | IE of cancer characteristics | 217 clinical records obtained from a local center database | F1 of 0.58 and 0.7 for Tumor Site and Nodes Examined, respectively |
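
Several rows in the table report both F1 Micro and F1 Macro, and the two can differ considerably (for example, 0.91 versus 0.69 for Saib et al.). The sketch below is one way to see why. It is a minimal illustration that assumes single-label multi-class predictions. The function name `f1_scores` and the toy labels are hypothetical and are not taken from any of the reviewed papers. Micro-averaging pools true positives, false positives, and false negatives across all classes, so frequent classes dominate the score. Macro-averaging computes F1 per class and then takes the unweighted mean, so a missed rare class lowers the score sharply.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical toy data: one frequent class and two rare classes.
y_true = ["ductal"] * 8 + ["lobular", "mucinous"]
y_pred = ["ductal"] * 8 + ["ductal", "mucinous"]  # the rare "lobular" case is missed

micro, macro = f1_scores(y_true, y_pred)
print(f"micro F1 = {micro:.2f}, macro F1 = {macro:.2f}")
```

On this toy data a single misclassified rare case leaves the micro score high while the macro score drops, which is the pattern to expect when a report collection is dominated by a few common codes.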