
Cross-registry neural domain adaptation to extract mutational test results from pathology reports.

Anthony Rios, Eric B Durbin, Isaac Hands, Susanne M Arnold, Darshil Shah, Stephen M Schwartz, Bernardo H L Goulart, Ramakanth Kavuluru.

Abstract

OBJECTIVE: We study the performance of machine learning (ML) methods, including neural networks (NNs), in extracting mutational test results from pathology reports collected by cancer registries. Given the lack of hand-labeled datasets for mutational test result extraction, we focus on one use case: extracting Epidermal Growth Factor Receptor (EGFR) mutation results in non-small cell lung cancers. We explore how well NNs generalize across registries, with two goals: (1) to assess how well models trained on one registry's data port to test data from a different registry and (2) to assess whether, and to what extent, such models can be improved using state-of-the-art neural domain adaptation techniques under different assumptions about what is available (labeled vs unlabeled data) at the target registry site.
MATERIALS AND METHODS: We collected data from two registries: the Kentucky Cancer Registry (KCR) and the Fred Hutchinson Cancer Research Center (FH) Cancer Surveillance System. We combine NNs with adversarial domain adaptation to improve cross-registry performance. We compare against other classifiers in three scenarios: standard supervised classification, unsupervised domain adaptation, and supervised domain adaptation.
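One common way to realize adversarial domain adaptation is a gradient reversal step between a shared feature extractor and a domain classifier. The abstract does not say which variant the authors used, so the following is only a minimal sketch under that assumption. The one-parameter "networks", the function name `domain_adversarial_grads`, and the weight `lam` are all hypothetical and chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def domain_adversarial_grads(x, d, w, v, lam=1.0):
    """Gradients of a domain loss with a gradient reversal step.

    Hypothetical toy model (not the paper's architecture):
      feature            h = w * x
      domain classifier  p = sigmoid(v * h), probability the report is from the target registry
      domain loss        binary cross-entropy between p and the true domain d (0 = source, 1 = target)
    """
    h = w * x
    p = sigmoid(v * h)
    grad_v = (p - d) * h        # domain classifier descends on the domain loss
    grad_h = (p - d) * v        # gradient arriving at the feature
    grad_w = -lam * grad_h * x  # reversal: the feature extractor ascends on the domain loss
    return grad_v, grad_w


# One toy report with feature input 1.0 that comes from the target registry (d = 1).
grad_v, grad_w = domain_adversarial_grads(x=1.0, d=1, w=1.0, v=1.0, lam=1.0)
```

Under gradient descent, `grad_v` makes the domain classifier better at telling registries apart, while the sign-flipped `grad_w` pushes the shared features toward being indistinguishable across registries. Only unlabeled target reports are needed for this term, since the domain label is known without any manual annotation.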
RESULTS: The performance of ML methods varied between registries. To extract positive results, the basic convolutional neural network (CNN) had an F1 of 71.5% on the KCR dataset and 95.7% on the FH dataset. For the KCR dataset, the CNN F1 results were low when trained on FH data (Positive F1: 23%). Using our proposed adversarial CNN, without any labeled data from the target registry, we match the F1 of the models trained directly on each target registry's data. The adversarial CNN F1 improved when trained on FH and applied to the KCR dataset (Positive F1: 70.8%). We found similar performance improvements when we trained on KCR and tested on FH reports (Positive F1: 45% to 96%).
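For readers unfamiliar with the metric, the "Positive F1" figures above are the harmonic mean of precision and recall computed for the positive class. A minimal sketch follows. The label strings and the toy gold/predicted lists are hypothetical and are not data from the study.

```python
def positive_f1(gold, predicted, positive_label="positive"):
    """F1 for a single class: harmonic mean of its precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive_label and p == positive_label)
    fp = sum(1 for g, p in pairs if g != positive_label and p == positive_label)
    fn = sum(1 for g, p in pairs if g == positive_label and p != positive_label)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five reports (illustration only).
gold = ["positive", "negative", "positive", "unknown", "positive"]
pred = ["positive", "positive", "negative", "unknown", "positive"]
score = positive_f1(gold, pred)  # 2 true positives, 1 false positive, 1 false negative
```

Because the score ignores correct predictions on the other classes, it stays low when positive results are missed or over-predicted, which is why it is informative when positive reports are comparatively rare.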
CONCLUSION: Adversarial domain adaptation improves the performance of NNs applied to pathology reports. In the unsupervised domain adaptation setting, we match the performance of models trained directly on the target registry's data by using the source registry's labeled data together with unlabeled examples from the target registry.
Copyright © 2019 Elsevier Inc. All rights reserved.

Keywords:  Cancer registry; Domain adaptation; Natural language processing; Neural networks; Text classification; Text mining

Year: 2019; PMID: 31401235; PMCID: PMC6736690; DOI: 10.1016/j.jbi.2019.103267

Source DB: PubMed; Journal: J Biomed Inform; ISSN: 1532-0464; Impact factor: 6.317


