
Obtaining Knowledge in Pathology Reports Through a Natural Language Processing Approach With Classification, Named-Entity Recognition, and Relation-Extraction Heuristics.

Tomasz Oliwa, Steven B Maron, Leah M Chase, Samantha Lomnicki, Daniel V T Catenacci, Brian Furner, Samuel L Volchenboum.

Abstract

PURPOSE: Robust institutional tumor banks depend on continuous sample curation; without it, subsequent biopsy or resection specimens are overlooked after initial enrollment. Curation automation is hindered by semistructured free-text clinical pathology notes, which complicate data abstraction. Our motivation is to develop a natural language processing method that dynamically identifies the pathology specimen elements needed to locate specimens for future use, in a manner that can be re-implemented by other institutions.
PATIENTS AND METHODS: Pathology reports from patients with gastroesophageal cancer enrolled in The University of Chicago GI oncology tumor bank were used to train and validate a novel composite natural language processing-based pipeline with three steps: a supervised machine learning classification step to separate notes into internal (primary review) and external (consultation) reports; a named-entity recognition step to obtain label (accession number), location, date, and sublabels (block identifiers); and a results proofreading step.
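The named-entity recognition step described under PATIENTS AND METHODS can be illustrated with a heavily simplified, rule-based sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the accession format (such as `S18-12345`), the block-identifier format (such as `A1`), the `MM/DD/YYYY` date format, the `Sample` class, and the heuristic of grouping entities that share a line. The authors' pipeline uses trained, class-specific models, which this sketch does not reproduce.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns: real accession, block, and date formats vary by institution.
ACCESSION_RE = re.compile(r"\b[A-Z]{1,3}\d{2}-\d{3,6}\b")
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")
BLOCK_RE = re.compile(
    r"\bblocks?\s+([A-Z]\d{1,2}(?:\s*,\s*[A-Z]\d{1,2})*)", re.IGNORECASE
)


@dataclass
class Sample:
    """One specimen: its label (accession number), dates, and sublabels (blocks)."""
    label: str
    dates: list = field(default_factory=list)
    sublabels: list = field(default_factory=list)


def extract_samples(text):
    """Attach dates and block identifiers to accession numbers on the same line.

    The same-line grouping is an assumed stand-in for a relation-extraction
    heuristic; it is not the rule set used in the paper.
    """
    samples = []
    for line in text.splitlines():
        labels = ACCESSION_RE.findall(line)
        if not labels:
            continue  # no specimen label on this line, nothing to attach to
        dates = DATE_RE.findall(line)
        blocks = []
        for match in BLOCK_RE.finditer(line):
            blocks.extend(b.strip() for b in match.group(1).split(","))
        for label in labels:
            samples.append(Sample(label, list(dates), list(blocks)))
    return samples
```

Calling `extract_samples` on a note line such as "Received slides labeled S18-12345 (blocks A1, B2), collected 03/14/2018." yields one `Sample` whose label, dates, and sublabels are filled from that line. Location extraction is omitted because it would need a vocabulary or a trained model.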
RESULTS: We analyzed 188 pathology reports, including 82 internal reports and 106 external consult reports, and successfully extracted named entities grouped as sample information (label, date, location). Our approach identified up to 24 additional unique samples in external consult notes that could have been overlooked. Our classification model obtained 100% accuracy on the basis of 10-fold cross-validation. Precision, recall, and F1 for class-specific named-entity recognition models show strong performance.
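For readers unfamiliar with the metrics named in RESULTS, the sketch below shows one common way to compute precision, recall, and F1 for extracted entities, using exact matching of (entity type, text) pairs. The function name and the exact-match criterion are assumptions for illustration; the abstract does not state how the authors scored matches.

```python
def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall, and F1 over (entity_type, text) pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets so the function never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1
```

With four gold entities and four predictions of which three match exactly, all three scores are 3/4. Scoring each entity type separately is done by filtering both sets to that type before calling the function.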
CONCLUSION: Through a combination of natural language processing and machine learning, we devised a re-implementable and automated approach that can accurately extract specimen attributes from semistructured pathology notes to dynamically populate a tumor registry.

Year: 2019 | PMID: 31365274 | PMCID: PMC6873953 | DOI: 10.1200/CCI.19.00008

Source DB: PubMed | Journal: JCO Clin Cancer Inform | ISSN: 2473-4276


