
Automated classification of free-text pathology reports for registration of incident cases of cancer.

V Jouhet, G Defossez, A Burgun, P le Beux, P Levillain, P Ingrand, V Claveau.

Abstract

OBJECTIVE: Our study aimed to construct and evaluate "classifiers", functions produced by supervised machine learning techniques, that automatically categorize pathology reports using solely their content.
METHODS: Patients from the Poitou-Charentes Cancer Registry having at least one pathology report and a single non-metastatic invasive neoplasm were included. A descriptor weighting function accounting for the distribution of terms among targeted classes was developed and compared to classic methods based on inverse document frequencies. The classification was performed with support vector machine (SVM) and Naive Bayes classifiers. Two levels of granularity were tested for both the topographical and the morphological axes of the ICD-O3 code. The ability to correctly attribute a precise ICD-O3 code and the ability to attribute the broad category defined by the International Agency for Research on Cancer (IARC) for the multiple primary cancer registration rules were evaluated using F1-measures.
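The contrast between inverse document frequency and a class-aware descriptor weight can be illustrated with a minimal sketch. The abstract does not give the authors' weighting function, so `class_concentration_weights` below is a hypothetical stand-in (one minus the normalized entropy of a term's distribution over classes), and the tokenized reports and topography labels are toy data invented for illustration.

```python
import math
from collections import Counter, defaultdict


def idf_weights(docs):
    """Classic inverse document frequency: log(N / df(term))."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / count) for term, count in df.items()}


def class_concentration_weights(docs, labels):
    """Hypothetical class-aware weight (NOT the paper's function).

    Returns 1 - H(term) / log(K), where H is the entropy of the term's
    document counts across the K classes. A term confined to one class
    gets 1.0; a term spread evenly over all classes gets 0.0.
    """
    n_classes = len(set(labels))
    if n_classes < 2:
        raise ValueError("need at least two classes")
    per_class = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        for term in set(doc):
            per_class[term][label] += 1
    weights = {}
    for term, counts in per_class.items():
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
        weights[term] = 1.0 - entropy / math.log(n_classes)
    return weights


# Toy tokenized reports with ICD-O topography labels (illustrative only).
docs = [
    ["adenocarcinoma", "colon", "biopsy"],
    ["adenocarcinoma", "colon", "resection"],
    ["melanoma", "skin", "biopsy"],
    ["melanoma", "skin", "excision"],
]
labels = ["C18", "C18", "C44", "C44"]

idf = idf_weights(docs)
cw = class_concentration_weights(docs, labels)
for term in ("colon", "biopsy"):
    print(term, round(idf[term], 3), round(cw[term], 3))
```

In this toy corpus "colon" and "biopsy" occur in the same number of reports, so IDF cannot separate them, whereas the class-aware weight favors "colon" because it appears in only one class.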
RESULTS: A total of 5121 pathology reports produced by 35 pathologists were selected. The best performance was achieved by our class-weighted descriptor combined with an SVM classifier. Using this method, the pathology reports were correctly classified into the IARC categories with F1-measures of 0.967 for both topography and morphology. ICD-O3 code attribution performed less well, with F1-measures of 0.715 for topography and 0.854 for morphology.
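For readers unfamiliar with the metric, the F1-measure is the harmonic mean of precision and recall, which per class reduces to 2TP / (2TP + FP + FN). The sketch below computes per-class, macro-averaged and micro-averaged F1 for a multi-class labeling. The abstract does not state which averaging the authors used, so both are shown, and the labels are toy values, not study data.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (per_class_f1, macro_f1, micro_f1) for single-label predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    per_class = {}
    for c in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class[c] = 2 * tp[c] / denom if denom else 0.0
    macro = sum(per_class.values()) / len(per_class)
    total_tp, total_fp, total_fn = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * total_tp / (2 * total_tp + total_fp + total_fn)
    return per_class, macro, micro


# Toy topography labels (illustrative only).
y_true = ["C18", "C18", "C44", "C44", "C50"]
y_pred = ["C18", "C44", "C44", "C44", "C50"]

per_class, macro_f1, micro_f1 = f1_scores(y_true, y_pred)
print(per_class, round(macro_f1, 3), round(micro_f1, 3))
```

When every report receives exactly one predicted label, micro-averaged F1 equals plain accuracy, while macro-averaged F1 gives rare classes the same weight as frequent ones.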
CONCLUSION: These results suggest that free-text pathology reports could be useful as a data source for automated systems that identify and notify new cases of cancer. Future work is needed to evaluate the improvement in performance obtained from the use of natural language processing, including the handling of reports that describe multiple tumors and the possible incorporation of other medical documents such as surgical reports.

Year:  2011        PMID: 21792466     DOI: 10.3414/ME11-01-0005

Source DB:  PubMed          Journal:  Methods Inf Med        ISSN: 0026-1270            Impact factor:   2.176


  10 in total

1.  Identifying free-text features to improve automated classification of structured histopathology reports for feline small intestinal disease.

Authors:  Abdullah Awaysheh; Jeffrey Wilcke; François Elvinger; Loren Rees; Weiguo Fan; Kurt Zimmerman
Journal:  J Vet Diagn Invest       Date:  2017-11-30       Impact factor: 1.279

2.  Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes.

Authors:  Yujia Bao; Zhengyi Deng; Yan Wang; Heeyoon Kim; Victor Diego Armengol; Francisco Acevedo; Nofal Ouardaoui; Cathy Wang; Giovanni Parmigiani; Regina Barzilay; Danielle Braun; Kevin S Hughes
Journal:  JCO Clin Cancer Inform       Date:  2019-09

3.  Automatic ICD-10 multi-class classification of cause of death from plaintext autopsy reports through expert-driven feature selection.

Authors:  Ghulam Mujtaba; Liyana Shuib; Ram Gopal Raj; Retnagowri Rajandram; Khairunisa Shaikh; Mohammed Ali Al-Garadi
Journal:  PLoS One       Date:  2017-02-06       Impact factor: 3.240

4.  Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review.

Authors:  Seyedmostafa Sheikhalishahi; Riccardo Miotto; Joel T Dudley; Alberto Lavelli; Fabio Rinaldi; Venet Osmani
Journal:  JMIR Med Inform       Date:  2019-04-27

5.  Rule-Based Information Extraction from Free-Text Pathology Reports Reveals Trends in South African Female Breast Cancer Molecular Subtypes and Ki67 Expression.

Authors:  Okechinyere J Achilonu; Elvira Singh; Gideon Nimako; René M J C Eijkemans; Eustasius Musenge
Journal:  Biomed Res Int       Date:  2022-01-20       Impact factor: 3.411

6.  Discovering body site and severity modifiers in clinical texts.

Authors:  Dmitriy Dligach; Steven Bethard; Lee Becker; Timothy Miller; Guergana K Savova
Journal:  J Am Med Inform Assoc       Date:  2013-10-03       Impact factor: 4.497

7.  Temporal representation of care trajectories of cancer patients using data from a regional information system: an application in breast cancer.

Authors:  Gautier Defossez; Alexandre Rollet; Olivier Dameron; Pierre Ingrand
Journal:  BMC Med Inform Decis Mak       Date:  2014-04-02       Impact factor: 2.796

8.  Automatic Extraction of ICD-O-3 Primary Sites from Cancer Pathology Reports.

Authors:  Ramakanth Kavuluru; Isaac Hands; Eric B Durbin; Lisa Witt
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2013-03-18

9.  Web Application for the Automated Extraction of Diagnosis and Site From Pathology Reports for Keratinocyte Cancers.

Authors:  Bridie S Thompson; Sam Hardy; Nirmala Pandeya; Jean Claude Dusingize; Adele C Green; Athon Millane; Daniel Bourke; Ronald Grande; Cameron D Bean; Catherine M Olsen; David C Whiteman
Journal:  JCO Clin Cancer Inform       Date:  2020-08

10.  Hierarchical attention networks for information extraction from cancer pathology reports.

Authors:  Shang Gao; Michael T Young; John X Qiu; Hong-Jun Yoon; James B Christian; Paul A Fearn; Georgia D Tourassi; Arvind Ramanathan
Journal:  J Am Med Inform Assoc       Date:  2018-03-01       Impact factor: 4.497
