Pre-training phenotyping classifiers.

Dmitriy Dligach, Majid Afshar, Timothy Miller.

Abstract

Recent transformer-based pre-trained language models have become a de facto standard for many text classification tasks. Nevertheless, their utility in the clinical domain, where classification is often performed at the encounter or patient level, remains uncertain because of the limit on the maximum input length. In this work, we introduce a self-supervised method for pre-training that relies on a masked token objective and is free from the limit on the maximum input length. We compare the proposed method with supervised pre-training that uses billing codes as a source of supervision. We evaluate the proposed method on one publicly available and three in-house datasets using standard evaluation metrics such as the area under the ROC curve and the F1 score. We find that, surprisingly, even though self-supervised pre-training performs slightly worse than supervised pre-training, it still preserves most of the gains from pre-training.
Copyright © 2020 Elsevier Inc. All rights reserved.

Keywords:  Automatic phenotyping; Natural language processing; Pre-training

Year:  2020        PMID: 33259943      PMCID: PMC7856089          DOI: 10.1016/j.jbi.2020.103626

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


