Machine learning classification of surgical pathology reports and chunk recognition for information extraction noise reduction.

Giulio Napolitano, Adele Marshall, Peter Hamilton, Anna T Gavin.

Abstract

BACKGROUND AND AIMS: Machine learning techniques for the text mining of cancer-related clinical documents have not been sufficiently explored. Here, two techniques are presented for the pre-processing of free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging.
MATERIALS AND METHODS: The first technique was implemented using the freely available software RapidMiner to classify the reports according to their general layout as either 'semi-structured' or 'unstructured'. The second technique was developed using the open-source language engineering framework GATE and aimed to predict which chunks of the report text contain information on the cancer morphology, the tumour size, the hormone receptor status and the number of positive nodes. The classifiers were trained on 635 and tested on 163 manually classified or annotated reports from the Northern Ireland Cancer Registry.
RESULTS: The best result, 99.4% accuracy with only one semi-structured report predicted as unstructured, was produced by the layout classifier using the k-nearest neighbours algorithm and the binary term occurrence word vector type with stopword filtering and pruning. For chunk recognition, the best results were obtained with the PAUM (Perceptron Algorithm with Uneven Margins) algorithm, using the same parameters in all cases except the prediction of chunks containing cancer morphology. For semi-structured reports, precision ranged from 0.97 to 0.94 and recall from 0.92 to 0.83; for unstructured reports, precision ranged from 0.91 to 0.64 and recall from 0.68 to 0.41. Results were poor when the classifier was trained on semi-structured reports but tested on unstructured ones.
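As a rough illustration of the layout classifier described above, the following Python sketch builds binary term occurrence vectors with a stopword filter and document-frequency pruning, then classifies by k-nearest-neighbour voting. It is not the RapidMiner process used in the study: the stopword list, the pruning threshold `min_df`, the Hamming distance, `k=3` and the toy reports are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stopword list; the study's actual filter is not given in the abstract.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "which", "with"}

def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def build_vocabulary(docs, min_df=2):
    """Pruning (assumed form): keep terms found in at least min_df documents."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    return sorted(term for term, n in doc_freq.items() if n >= min_df)

def binary_vector(text, vocab):
    """Binary term occurrence: 1 if the term appears in the text, else 0."""
    present = set(tokenize(text))
    return [1 if term in present else 0 for term in vocab]

def hamming(u, v):
    """Number of positions where two binary vectors differ (assumed distance)."""
    return sum(a != b for a, b in zip(u, v))

def knn_predict(x, train_vectors, train_labels, k=3):
    """Majority vote among the k training vectors closest to x."""
    ranked = sorted(zip(train_vectors, train_labels), key=lambda p: hamming(x, p[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Invented toy reports, for illustration only.
train_docs = [
    "Tumour size: 22 mm. Grade: 2. Nodes positive: 1. ER status: positive.",
    "Tumour size: 15 mm. Grade: 3. Nodes positive: 0. ER status: negative.",
    "Tumour size: 30 mm. Grade: 1. Nodes positive: 4. ER status: positive.",
    "Sections show an invasive ductal carcinoma which measures 22 mm across.",
    "Sections show infiltrating carcinoma; three lymph nodes contain metastatic deposits.",
    "Microscopy sections show a lobular carcinoma which extends close to the margin.",
]
train_labels = ["semi-structured"] * 3 + ["unstructured"] * 3

vocab = build_vocabulary(train_docs)
train_vectors = [binary_vector(doc, vocab) for doc in train_docs]

def classify_layout(report):
    return knn_predict(binary_vector(report, vocab), train_vectors, train_labels)

print(classify_layout("Tumour size: 18 mm. Grade: 2. Nodes positive: 3. ER status: negative."))
print(classify_layout("Sections show a ductal carcinoma with two positive lymph nodes."))
```

With these toy reports the first call should fall in the semi-structured group and the second in the unstructured group, because field labels such as "size" and "status" only occur in the form-like reports. A real system would need a far larger labelled set, such as the 635 training reports mentioned in the methods.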
CONCLUSIONS: These results show that it is possible and beneficial to predict the layout of reports, and that the accuracy with which the report segments likely to contain certain information can be predicted is sensitive to the report layout and to the type of information sought.
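For readers unfamiliar with the metrics quoted in the results, the sketch below shows how precision, recall and accuracy are computed from prediction counts. The chunk counts are invented for illustration and are not taken from the study. The accuracy check assumes that the layout classifier was evaluated on the 163 test reports and that the one misclassified report was its only error, which the abstract suggests but does not state outright.

```python
def precision_recall(true_positive, false_positive, false_negative):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = true_positive / (true_positive + false_positive)
    recall = true_positive / (true_positive + false_negative)
    return precision, recall

def accuracy(correct, total):
    """Fraction of items classified correctly."""
    return correct / total

# Hypothetical chunk-recognition counts (not from the paper):
# 8 chunks correctly flagged, 2 flagged wrongly, 4 relevant chunks missed.
p, r = precision_recall(true_positive=8, false_positive=2, false_negative=4)
print(f"precision={p:.2f} recall={r:.2f}")

# Assumed reading of the layout result: 1 error among 163 test reports.
layout_accuracy = accuracy(correct=162, total=163)
print(f"layout accuracy={layout_accuracy * 100:.1f}%")  # 99.4%
```

Under that assumption, 162 correct predictions out of 163 gives 99.4%, which matches the reported figure.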

Keywords: Cancer staging; Information extraction; Natural language processing; Supervised machine learning; Surgical pathology report

Year: 2016 PMID: 27431038 DOI: 10.1016/j.artmed.2016.06.001

Source DB: PubMed Journal: Artif Intell Med ISSN: 0933-3657 Impact factor: 5.326

Cited

7 in total

1. Obtaining Knowledge in Pathology Reports Through a Natural Language Processing Approach With Classification, Named-Entity Recognition, and Relation-Extraction Heuristics.

Authors: Tomasz Oliwa; Steven B Maron; Leah M Chase; Samantha Lomnicki; Daniel V T Catenacci; Brian Furner; Samuel L Volchenboum
Journal: JCO Clin Cancer Inform Date: 2019-08

2. Dimension reduction technique using a multilayered descriptor for high-precision classification of ovarian cancer tissue using optical coherence tomography: a feasibility study.

Authors: Catherine St-Pierre; Wendy-Julie Madore; Etienne De Montigny; Dominique Trudel; Caroline Boudoux; Nicolas Godbout; Anne-Marie Mes-Masson; Kurosh Rahimi; Frédéric Leblond
Journal: J Med Imaging (Bellingham) Date: 2017-10-12

3. Next Generation Quality: Assessing the Physician in Clinical History Completeness and Diagnostic Interpretations Using Funnel Plots and Normalized Deviations Plots in 3,854 Prostate Biopsies.

Authors: Michael Bonert; Ihab El-Shinnawy; Michael Carvalho; Phillip Williams; Samih Salama; Damu Tang; Anil Kapoor
Journal: J Pathol Inform Date: 2017-11-23

4. Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review. (Review)

Authors: Seyedmostafa Sheikhalishahi; Riccardo Miotto; Joel T Dudley; Alberto Lavelli; Fabio Rinaldi; Venet Osmani
Journal: JMIR Med Inform Date: 2019-04-27

5. Artificial intelligence (AI) in medicine, current applications and future role with special emphasis on its potential and promise in pathology: present and future impact, obstacles including costs and acceptance among pathologists, practical and philosophical considerations. A comprehensive review. (Review)

Authors: Zubair Ahmad; Shabina Rahim; Maha Zubair; Jamshid Abdul-Ghafar
Journal: Diagn Pathol Date: 2021-03-17 Impact factor: 2.644

6. Automatic Classification of Cancer Pathology Reports: A Systematic Review.

Authors: Thiago Santos; Amara Tariq; Judy Wawira Gichoya; Hari Trivedi; Imon Banerjee
Journal: J Pathol Inform Date: 2022-01-20

7. A Deep Neural Network for Early Detection and Prediction of Chronic Kidney Disease.

Authors: Vijendra Singh; Vijayan K Asari; Rajkumar Rajasekaran
Journal: Diagnostics (Basel) Date: 2022-01-05
