Large Scale Semi-Automated Labeling of Routine Free-Text Clinical Records for Deep Learning.

Hari M Trivedi, Maryam Panahiazar, April Liang, Dmytro Lituiev, Peter Chang, Jae Ho Sohn, Yunn-Yi Chen, Benjamin L Franc, Bonnie Joe, Dexter Hadley.

Abstract

Breast cancer is a leading cause of cancer death among women in the USA. Screening mammography is effective in reducing mortality but has a high rate of unnecessary recalls and biopsies. Deep learning can be applied to mammography, but it requires large-scale labeled datasets, which are difficult to obtain. We aim to remove many of the barriers to dataset development by automatically harvesting data from existing clinical records using a hybrid framework that combines traditional natural language processing (NLP) with IBM Watson. An expert reviewer manually annotated 3521 breast pathology reports with one of four outcomes: left positive, right positive, bilateral positive, or negative. Traditional NLP techniques using seven different machine learning classifiers were compared to IBM Watson's automated natural language classifier (NLC). Techniques were evaluated using precision, recall, and F-measure. Logistic regression outperformed all other traditional machine learning classifiers and was used for subsequent comparisons. Both traditional NLP and Watson's NLC performed well for cases under 1024 characters, with weighted average F-measures above 0.96 across all classes. Performance of traditional NLP was lower for cases over 1024 characters, with an F-measure of 0.83. We demonstrate a hybrid framework using traditional NLP techniques combined with IBM Watson to annotate over 10,000 breast pathology reports for development of a large-scale database to be used for deep learning in mammography. Our work shows that traditional NLP and IBM Watson perform extremely well for cases under 1024 characters and can accelerate the rate of data annotation.
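
The evaluation described in the abstract (precision, recall, and a weighted average F-measure over the four outcome classes, reported separately for reports under and over 1024 characters) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `weighted_prf` and `split_by_length` are hypothetical, the F-measure is assumed to be the usual F1 (harmonic mean of precision and recall), "weighted" is assumed to mean weighted by class support, and a report of exactly 1024 characters is arbitrarily placed in the "long" group because the abstract does not say how that case is handled.

```python
from collections import Counter

# The four outcome labels named in the abstract.
LABELS = ["left positive", "right positive", "bilateral positive", "negative"]
CHAR_LIMIT = 1024  # length threshold quoted in the abstract


def weighted_prf(y_true, y_pred, labels=LABELS):
    """Return support-weighted average (precision, recall, F1).

    Assumption: "weighted average" means each class's score is weighted
    by the fraction of true cases belonging to that class.
    """
    support = Counter(y_true)
    total = len(y_true)
    w_precision = w_recall = w_f1 = 0.0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        weight = support[label] / total
        w_precision += weight * precision
        w_recall += weight * recall
        w_f1 += weight * f1
    return w_precision, w_recall, w_f1


def split_by_length(reports, limit=CHAR_LIMIT):
    """Partition (text, true_label, predicted_label) tuples by text length.

    Assumption: texts shorter than `limit` are "short"; all others "long".
    """
    short = [r for r in reports if len(r[0]) < limit]
    long_ = [r for r in reports if len(r[0]) >= limit]
    return short, long_
```

Calling `weighted_prf` once on the short group and once on the long group gives per-group scores of the kind the abstract compares.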

Keywords:  Artificial intelligence; Deep learning; IBM Watson; Machine learning; Mammography; Natural language processing (NLP); Pathology

Year:  2019        PMID: 30128778      PMCID: PMC6382632          DOI: 10.1007/s10278-018-0105-8

Source DB:  PubMed          Journal:  J Digit Imaging        ISSN: 0897-1889            Impact factor:   4.056


  5 in total

1.  Phenotyping severity of patient-centered outcomes using clinical notes: A prostate cancer use case.

Authors:  Selen Bozkurt; Rohan Paul; Jean Coquet; Ran Sun; Imon Banerjee; James D Brooks; Tina Hernandez-Boussard
Journal:  Learn Health Syst       Date:  2020-07-17

2.  Generalization error analysis for deep convolutional neural network with transfer learning in breast cancer diagnosis.

Authors:  Ravi K Samala; Heang-Ping Chan; Lubomir M Hadjiiski; Mark A Helvie; Caleb D Richter
Journal:  Phys Med Biol       Date:  2020-05-11       Impact factor: 3.609

3.  Neural Network Assisted Pathology Case Identification.

Authors:  Jerome Cheng
Journal:  J Pathol Inform       Date:  2022-01-20

Review 4.  Machine and deep learning approaches for cancer drug repurposing.

Authors:  Naiem T Issa; Vasileios Stathias; Stephan Schürer; Sivanesan Dakshanamurthy
Journal:  Semin Cancer Biol       Date:  2020-01-03       Impact factor: 15.707

Review 5.  Empowering study of breast cancer data with application of artificial intelligence technology: promises, challenges, and use cases.

Authors:  Maryam Panahiazar; Nolan Chen; Dmytro Lituiev; Dexter Hadley
Journal:  Clin Exp Metastasis       Date:  2021-10-26       Impact factor: 5.150
