Efficient identification of nationally mandated reportable cancer cases using natural language processing and machine learning.

John D Osborne, Matthew Wyatt, Andrew O Westfall, James Willig, Steven Bethard, Geoff Gordon.

Abstract

OBJECTIVE: To help cancer registrars efficiently and accurately identify reportable cancer cases.
MATERIALS AND METHODS: The Cancer Registry Control Panel (CRCP) was developed to detect mentions of reportable cancer cases. Its pipeline is built on the Unstructured Information Management Architecture - Asynchronous Scaleout (UIMA-AS) and contains the National Library of Medicine's UIMA MetaMap annotator together with a variety of rule-based UIMA annotators that primarily filter out concepts referring to nonreportable cancers. CRCP inspects pathology reports nightly to identify records containing relevant cancer concepts and combines these with diagnosis codes from the Clinical Electronic Data Warehouse to identify candidate cancer patients using supervised machine learning. Cancer mentions are highlighted in all candidate clinical notes and then sorted in CRCP's web interface for faster validation by cancer registrars.
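The rule-based filtering and highlighting steps can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: plain Python regular expressions stand in for the MetaMap and UIMA annotators, and the term list, exclusion rules, function names, and sample report are hypothetical rather than CRCP's actual rules.

```python
import re

# Hypothetical stand-in for concept detection (MetaMap in the described pipeline).
CANCER_TERM = re.compile(r"\b(carcinoma|melanoma|lymphoma|sarcoma)\b", re.IGNORECASE)

# Hypothetical exclusion rules; the abstract does not list the real ones.
EXCLUSION_RULES = [
    re.compile(r"\b(no evidence of|negative for)\b", re.IGNORECASE),  # negated finding
    re.compile(r"\bbasal cell carcinoma\b", re.IGNORECASE),           # assumed nonreportable
]


def reportable_mentions(report_text):
    """Return (sentence, term) pairs for cancer terms that survive the exclusion rules."""
    kept = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", report_text.strip()):
        match = CANCER_TERM.search(sentence)
        if match is None:
            continue  # no cancer term in this sentence
        if any(rule.search(sentence) for rule in EXCLUSION_RULES):
            continue  # filtered out as nonreportable or negated
        kept.append((sentence, match.group(0)))
    return kept


def highlight(sentence, term):
    """Mark the matched term so a reviewer can spot it quickly."""
    return sentence.replace(term, f"**{term}**")


report = (
    "Invasive ductal carcinoma of the left breast. "
    "Skin biopsy shows basal cell carcinoma. "
    "Lymph nodes negative for carcinoma."
)

for sentence, term in reportable_mentions(report):
    print(highlight(sentence, term))
```

In this toy report only the first sentence survives the filters. A real system would also need the supervised classifier that combines these mentions with diagnosis codes, which is not sketched here.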
RESULTS: CRCP achieved an accuracy of 0.872 and detected reportable cancer cases with a precision of 0.843 and a recall of 0.848. CRCP increases throughput by 22.6% over a baseline (manual review) pathology report inspection system while achieving a higher precision and recall. Depending on registrar time constraints, CRCP can increase recall to 0.939 at the expense of precision by incorporating a data source information feature.
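For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, and recall follow from confusion counts, and how recall can be raised at the expense of precision. The scores and labels are made-up toy data, not results from the study, and lowering a decision threshold is used here only as a simple stand-in for the trade-off; the abstract attributes CRCP's higher recall to an added data source information feature.

```python
def confusion(scores, labels, threshold):
    """Count true/false positives and negatives at a given decision threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy


# Toy classifier scores and gold labels (1 = reportable case); purely illustrative.
scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

for threshold in (0.5, 0.25):
    p, r, a = metrics(*confusion(scores, labels, threshold))
    print(f"threshold={threshold}: precision={p:.3f} recall={r:.3f} accuracy={a:.3f}")
```

On this toy data the lower threshold recovers the one missed case but admits an extra false positive, which is the kind of trade-off a registrar with more review time might accept.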
CONCLUSION: CRCP demonstrates accurate results when applying natural language processing features to the problem of detecting patients with cases of reportable cancer from clinical notes. We show that implementing only a portion of cancer reporting rules in the form of regular expressions is sufficient to increase the precision, recall, and speed of the detection of reportable cancer cases when combined with off-the-shelf information extraction software and machine learning.
© The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Keywords:  electronic health records; information extraction; machine learning; natural language processing; neoplasms; user-computer interface

Year:  2016        PMID: 27026618      PMCID: PMC5070519          DOI: 10.1093/jamia/ocw006

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497
