Suraj Rajendran, Umit Topaloglu.
Abstract
Half a million people in the United States die every year from smoking-related issues. It is essential to identify individuals who are tobacco-dependent in order to implement preventive measures. In this study, we investigate the effectiveness of deep learning models at extracting the smoking status of patients from clinical progress notes. We built a Natural Language Processing (NLP) pipeline that cleans the progress notes before they are processed by three deep neural networks: a CNN, a unidirectional LSTM, and a bidirectional LSTM. Each of these models was trained with either a pre-trained or a post-trained word embedding layer. Three traditional machine learning models were also employed for comparison against the neural networks. Each model was evaluated on both binary and multi-class label classification. Our results showed that the CNN model with a pre-trained embedding layer performed best for both binary and multi-class label classification. ©2020 AMIA - All rights reserved.
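As a rough illustration of the kind of preprocessing such a pipeline performs before a word embedding layer, the following minimal sketch cleans a note, builds a vocabulary, and encodes the note as a fixed-length sequence of integer ids. The abstract does not specify the cleaning rules, so everything here is an assumption: the function names `clean_note`, `build_vocab`, and `encode`, the choice to drop digits and punctuation, the reserved `<pad>` and `<unk>` ids, and the two example notes are all hypothetical.

```python
import re
from collections import Counter

def clean_note(text):
    """Lowercase, replace non-letter characters with spaces, collapse whitespace.
    These rules are assumed for illustration, not taken from the paper."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def build_vocab(notes, min_count=1):
    """Map each token to an integer id; ids 0 and 1 are reserved."""
    counts = Counter(tok for note in notes for tok in note.split())
    vocab = {"<pad>": 0, "<unk>": 1}
    # Sort by descending frequency, then alphabetically, for stable ids.
    for tok, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab

def encode(note, vocab, max_len):
    """Convert a cleaned note to exactly max_len ids (truncate, then right-pad)."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in note.split()][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))

# Hypothetical example notes, not data from the study.
notes = ["Pt. smokes 1 pack/day.", "Patient denies tobacco use."]
cleaned = [clean_note(n) for n in notes]
vocab = build_vocab(cleaned)
encoded = [encode(n, vocab, max_len=6) for n in cleaned]
```

Integer sequences of this shape are what an embedding layer typically consumes, whether its weights are initialized from pre-trained vectors or learned during training.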
Year: 2020 PMID: 32477672 PMCID: PMC7233082
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc