Dmitriy Dligach1,2,3, Majid Afshar2,3, Timothy Miller4. 1. Department of Computer Science, Loyola University Chicago, Chicago, Illinois, USA. 2. Department of Public Health Sciences, Stritch School of Medicine, Loyola University, Maywood, Illinois, USA. 3. Center for Health Outcomes and Informatics Research, Loyola University, Maywood, Illinois, USA. 4. Computational Health Informatics Program (CHIP), Boston Children's Hospital and Harvard Medical School, Boston, Massachusetts, USA.
Abstract
OBJECTIVE: Our objective is to develop algorithms for encoding clinical text into representations that can be used for a variety of phenotyping tasks. MATERIALS AND METHODS: Obtaining large datasets to take advantage of highly expressive deep learning methods is difficult in clinical natural language processing (NLP). We address this difficulty by pretraining a clinical text encoder on billing code data, which is typically available in abundance. We explore several neural encoder architectures and deploy the text representations obtained from these encoders in the context of clinical text classification tasks. While our ultimate goal is learning a universal clinical text encoder, we also experiment with training a phenotype-specific encoder. A universal encoder would be more practical, but a phenotype-specific encoder could perform better for a specific task. RESULTS: We successfully train several clinical text encoders, establish a new state of the art on comorbidity data, and observe good performance gains on substance misuse data. DISCUSSION: We find that pretraining using billing codes is a promising research direction. The representations generated by this type of pretraining have universal properties, as they are highly beneficial for many phenotyping tasks. Phenotype-specific pretraining is a viable route for trading the generality of the pretrained encoder for better performance on a specific phenotyping task. CONCLUSIONS: We successfully applied our approach to many phenotyping tasks. We conclude by discussing potential limitations of our approach.
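The two-stage setup described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: all dimensions, data, and the bag-of-words encoder are hypothetical stand-ins. Stage 1 pretrains an encoder on a multi-label billing-code prediction objective; stage 2 freezes the encoder and reuses its note representations as features for a downstream phenotype classifier.

```python
# Hypothetical sketch of billing-code pretraining followed by phenotype
# classification; synthetic data and toy dimensions, not the paper's model.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 100 notes as bag-of-words vectors over a 50-term vocabulary,
# each note tagged with a multi-hot vector over 10 billing codes.
X = rng.random((100, 50))
codes = (rng.random((100, 10)) > 0.7).astype(float)

# Stage 1: pretrain a tanh encoder plus a sigmoid billing-code head with
# gradient descent on binary cross-entropy over the codes.
W_enc = rng.normal(scale=0.1, size=(50, 16))   # encoder weights
W_head = rng.normal(scale=0.1, size=(16, 10))  # billing-code head
lr = 0.1
for _ in range(200):
    H = np.tanh(X @ W_enc)            # note representations
    P = sigmoid(H @ W_head)           # predicted code probabilities
    G = (P - codes) / len(X)          # BCE gradient at the logits
    grad_head = H.T @ G
    grad_enc = X.T @ ((G @ W_head.T) * (1.0 - H**2))  # backprop through tanh
    W_head -= lr * grad_head
    W_enc -= lr * grad_enc

# Stage 2: freeze the encoder; its representations become input features
# for a downstream phenotype classifier (here, logistic regression).
features = np.tanh(X @ W_enc)
phenotype = codes[:, 0]                        # stand-in phenotype label
w = np.zeros(16)
for _ in range(200):
    p = sigmoid(features @ w)
    w -= lr * features.T @ (p - phenotype) / len(X)

acc = np.mean((sigmoid(features @ w) > 0.5) == phenotype)
```

A phenotype-specific variant, as the abstract notes, would instead pretrain the encoder only on billing codes related to the target phenotype, trading generality for task-specific performance.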