David Martinez (1), Graham Pitson (2), Andrew MacKinlay (3), Lawrence Cavedon (4)
1. Department of Computing and Information Systems, The University of Melbourne, Doug McDonell Building, Parkville, 3010 VIC, Australia. Electronic address: david.martinez.iraola@gmail.com.
2. Barwon Health, Geelong Hospital, 1/75 Bellerine Street, Geelong, 3220 VIC, Australia.
3. Department of Computing and Information Systems, The University of Melbourne, Doug McDonell Building, Parkville, 3010 VIC, Australia.
4. School of Computer Science and IT, RMIT University, 124 Latrobe St, Melbourne, 3000 VIC, Australia.
Abstract
OBJECTIVE: We address the task of extracting information from free-text pathology reports, focusing on staging information encoded by the TNM (tumour-node-metastases) and ACPS (Australian clinico-pathological stage) systems. Staging information is critical for diagnosing the extent of cancer in a patient and for planning individualised treatment. Extracting such information into more structured form saves time, improves reporting, and underpins the potential for automated decision support.
METHODS AND MATERIAL: We investigate the portability of a text mining model constructed from records from one health centre, by applying it directly to the extraction task over a set of records from a different health centre, with different reporting narrative characteristics. Other than a simple normalisation step on features associated with target labels, we apply the models from one system directly to the other.
RESULTS: The best F-scores for in-hospital experiments are 81%, 85%, and 94% (for staging T, N, and M respectively), while best cross-hospital F-scores reach 84%, 81%, and 91% for the same respective categories.
CONCLUSIONS: Our performance results compare favourably to the best levels reported in the literature and, most relevant to our aim here, the cross-corpus results demonstrate the portability of the models we developed.
Authors: Prakash Adekkanattu; Guoqian Jiang; Yuan Luo; Paul R Kingsbury; Zhenxing Xu; Luke V Rasmussen; Jennifer A Pacheco; Richard C Kiefer; Daniel J Stone; Pascal S Brandt; Liang Yao; Yizhen Zhong; Yu Deng; Fei Wang; Jessica S Ancker; Thomas R Campion; Jyotishman Pathak Journal: AMIA Annu Symp Proc Date: 2020-03-04
Authors: Stephen B Johnson; Prakash Adekkanattu; Thomas R Campion; James Flory; Jyotishman Pathak; Olga V Patterson; Scott L DuVall; Vincent Major; Yindalon Aphinyanaphongs Journal: AMIA Jt Summits Transl Sci Proc Date: 2018-05-18
Authors: Okechinyere J Achilonu; Elvira Singh; Gideon Nimako; René M J C Eijkemans; Eustasius Musenge Journal: Biomed Res Int Date: 2022-01-20 Impact factor: 3.411