Text mining of cancer-related information: review of current status and future directions.

Irena Spasić, Jacqueline Livsey, John A Keane, Goran Nenadić.

Abstract

PURPOSE: This paper reviews the research literature on text mining (TM) with the aim of finding out (1) which cancer domains have been the subject of TM efforts, (2) which knowledge resources can support TM of cancer-related information and (3) to what extent systems that rely on knowledge and computational methods can convert text data into useful clinical information. These questions were used to determine the current state of the art in this particular strand of TM and to suggest future directions in TM development to support cancer research.
METHODS: A review of the research on TM of cancer-related information was carried out. A literature search was conducted on the Medline database as well as IEEE Xplore and ACM digital libraries to address the interdisciplinary nature of such research. The search results were supplemented with the literature identified through Google Scholar.
RESULTS: A range of studies have proven the feasibility of TM for extracting structured information from clinical narratives such as those found in pathology or radiology reports. In this article, we provide a critical overview of the current state of the art for TM related to cancer. The review highlighted a strong bias towards symbolic methods, e.g. named entity recognition (NER) based on dictionary lookup and information extraction (IE) relying on pattern matching. The F-measure of NER ranges between 80% and 90%, while that of IE for simple tasks is in the high 90s. To further improve the performance, TM approaches need to deal effectively with idiosyncrasies of the clinical sublanguage such as non-standard abbreviations as well as a high degree of spelling and grammatical errors. This requires a shift from rule-based methods to machine learning following the success of similar trends in biological applications of TM. Machine learning approaches require large training datasets, but clinical narratives are not readily available for TM research due to privacy and confidentiality concerns. This issue remains the main bottleneck for progress in this area. In addition, there is a need for a comprehensive cancer ontology that would enable semantic representation of textual information found in narrative reports.
Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
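
The symbolic approach named in the abstract (NER by dictionary lookup, scored with the F-measure) can be illustrated with a minimal sketch. It is not the method of any system covered by the review: the lexicon, its labels, and the function names `dictionary_ner` and `f_measure` are hypothetical, and matching is restricted to exact, case-insensitive terms, which is exactly where non-standard abbreviations and spelling errors would cause misses.

```python
import re

# Hypothetical toy lexicon (assumption); a real system would draw on a
# curated terminology rather than a hand-written dictionary.
LEXICON = {
    "adenocarcinoma": "DIAGNOSIS",
    "squamous cell carcinoma": "DIAGNOSIS",
    "prostate": "ANATOMY",
    "lung": "ANATOMY",
}


def dictionary_ner(text, lexicon=LEXICON):
    """Return sorted (start, end, label) spans for lexicon terms found in text."""
    spans = []
    # Try longer terms first so multi-word terms win over shorter embedded ones.
    for term in sorted(lexicon, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Skip a match that overlaps a span already accepted.
            if all(m.end() <= s or m.start() >= e for s, e, _ in spans):
                spans.append((m.start(), m.end(), lexicon[term]))
    return sorted(spans)


def f_measure(predicted, gold):
    """Balanced F-measure (harmonic mean of precision and recall) over exact spans."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    report = "Biopsy shows adenocarcinoma of the prostate."
    for start, end, label in dictionary_ner(report):
        print(label, report[start:end])
```

Because the lookup only finds terms spelled exactly as in the lexicon, a misspelled or abbreviated mention lowers recall and therefore the F-measure, which is the limitation the abstract attributes to the clinical sublanguage.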

Keywords:  Cancer; Data mining; Electronic medical records; Natural language processing

Year:  2014        PMID: 25008281     DOI: 10.1016/j.ijmedinf.2014.06.009

Source DB:  PubMed          Journal:  Int J Med Inform        ISSN: 1386-5056            Impact factor:   4.046


  50 in total

1.  An Automated Feature Engineering for Digital Rectal Examination Documentation using Natural Language Processing.

Authors:  Selen Bozkurt; Jung In Park; Kathleen Mary Kan; Michelle Ferrari; Daniel L Rubin; James D Brooks; Tina Hernandez-Boussard
Journal:  AMIA Annu Symp Proc       Date:  2018-12-05

Review 2.  Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis.

Authors:  S Velupillai; D Mowery; B R South; M Kvist; H Dalianis
Journal:  Yearb Med Inform       Date:  2015-08-13

3.  Automatic mining of symptom severity from psychiatric evaluation notes.

Authors:  George Karystianis; Alejo J Nevado; Chi-Hun Kim; Azad Dehghan; John A Keane; Goran Nenadic
Journal:  Int J Methods Psychiatr Res       Date:  2017-12-22       Impact factor: 4.035

4.  Nuclear shape descriptors by automated morphometry may distinguish aggressive variants of squamous cell carcinoma from relatively benign skin proliferative lesions: a pilot study.

Authors:  Weixi Yang; Rong Tian; Tongqing Xue
Journal:  Tumour Biol       Date:  2015-03-10

5.  Text-mining in cancer research may help identify effective treatments.

Authors:  Yi-Wen Hsiao; Tzu-Pin Lu
Journal:  Transl Lung Cancer Res       Date:  2019-12

6.  Detecting unplanned care from clinician notes in electronic health records.

Authors:  Suzanne Tamang; Manali I Patel; Douglas W Blayney; Julie Kuznetsov; Samuel G Finlayson; Yohan Vetteth; Nigam Shah
Journal:  J Oncol Pract       Date:  2015-05       Impact factor: 3.840

7.  Finding Cervical Cancer Symptoms in Swedish Clinical Text using a Machine Learning Approach and NegEx.

Authors:  Rebecka Weegar; Maria Kvist; Karin Sundström; Søren Brunak; Hercules Dalianis
Journal:  AMIA Annu Symp Proc       Date:  2015-11-05

8.  Efficient identification of nationally mandated reportable cancer cases using natural language processing and machine learning.

Authors:  John D Osborne; Matthew Wyatt; Andrew O Westfall; James Willig; Steven Bethard; Geoff Gordon
Journal:  J Am Med Inform Assoc       Date:  2016-03-28       Impact factor: 4.497

Review 9.  Drug repurposing in oncology: Compounds, pathways, phenotypes and computational approaches for colorectal cancer.

Authors:  Patrycja Nowak-Sliwinska; Leonardo Scapozza; Ariel Ruiz i Altaba
Journal:  Biochim Biophys Acta Rev Cancer       Date:  2019-04-26       Impact factor: 10.680

10.  A decision support system for mammography reports interpretation.

Authors:  Marzieh Esmaeili; Seyed Mohammad Ayyoubzadeh; Nasrin Ahmadinejad; Marjan Ghazisaeedi; Azin Nahvijou; Keivan Maghooli
Journal:  Health Inf Sci Syst       Date:  2020-04-01