Pre-training phenotyping classifiers.

Dmitriy Dligach, Majid Afshar, Timothy Miller.

Abstract

Recent transformer-based pre-trained language models have become a de facto standard for many text classification tasks. Nevertheless, their utility in the clinical domain, where classification is often performed at the encounter or patient level, is still uncertain because of the limit on maximum input length. In this work, we introduce a self-supervised pre-training method that relies on a masked token objective and is free from this limit. We compare the proposed method with supervised pre-training that uses billing codes as a source of supervision. We evaluate the proposed method on one publicly available and three in-house datasets using standard evaluation metrics such as the area under the ROC curve and F1 score. We find that, surprisingly, even though self-supervised pre-training performs slightly worse than supervised pre-training, it still preserves most of the gains from pre-training.
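
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and only the binary case is shown (the abstract does not say which averaging scheme the paper uses).

```python
def f1_score(y_true, y_pred):
    """F1 for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive example scores higher than a randomly chosen negative one.
    Tied pairs count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical patient-level labels and classifier scores (illustrative only).
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.8]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print("F1:", f1_score(labels, preds))
print("ROC AUC:", roc_auc(labels, scores))
```

F1 depends on the chosen decision threshold, while the area under the ROC curve depends only on how the scores rank positives against negatives, which is one reason the two are often reported together.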

Keywords: Automatic phenotyping; Natural language processing; Pre-training

Year: 2020 PMID: 33259943 PMCID: PMC7856089 DOI: 10.1016/j.jbi.2020.103626

Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317
