Mohammed Alawad, Shang Gao, John Qiu, Noah Schaefferkoetter, Jacob D Hinkle, Hong-Jun Yoon, J Blair Christian, Xiao-Cheng Wu, Eric B Durbin, Jong Cheol Jeong, Isaac Hands, David Rust, Georgia Tourassi.
Abstract
Automated text information extraction from cancer pathology reports is an active area of research to support national cancer surveillance. A well-known challenge is how to develop information extraction tools with robust performance across cancer registries. In this study we investigated whether transfer learning (TL) with a convolutional neural network (CNN) can facilitate cross-registry knowledge sharing. Specifically, we performed a series of experiments to determine whether a CNN trained with single-registry data is capable of transferring knowledge to another registry or whether developing a cross-registry knowledge database produces a more effective and generalizable model. Using data from two cancer registries and primary tumor site and topography as the information extraction task of interest, our study showed that TL results in 6.90% and 17.22% improvement of classification macro F-score over the baseline single-registry models. Detailed analysis illustrated that the observed improvement is evident in the low prevalence classes.
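The macro F-score reported above averages per-class F1 values with equal weight, which is why gains in low-prevalence classes can move it noticeably. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function names and the toy labels (two ICD-O-3 topography codes, `C50` and `C34`) are illustrative assumptions and do not come from the study data.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[t] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def micro_f1(y_true, y_pred):
    """For single-label multiclass data, micro F1 equals accuracy."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy, hypothetical example: one frequent class and one rare class.
y_true = ["C50"] * 8 + ["C34"] * 2
y_pred = ["C50"] * 8 + ["C50", "C34"]  # one rare-class report is missed

print(per_class_f1(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
print("micro F1:", micro_f1(y_true, y_pred))
```

In this toy case a single error on the rare class lowers the macro score far more than the micro score, which illustrates why a macro F-score is sensitive to performance on low-prevalence classes.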
Keywords: NLP; Transfer learning; convolutional neural network; information extraction; pathology reports
Year: 2019 PMID: 36081613 PMCID: PMC9450101 DOI: 10.1109/bhi.2019.8834586
Source DB: PubMed Journal: IEEE EMBS Int Conf Biomed Health Inform ISSN: 2641-3590