Xi Yang1, Hanyuan Yang2, Tianchen Lyu1, Shuang Yang1, Yi Guo1, Jiang Bian1, Hua Xu3, Yonghui Wu1.
Abstract
This study presents a natural language processing (NLP) tool that extracts quantitative smoking information (e.g., Pack-Year, Quit Year, Smoking Year, and Pack per Day) from clinical notes and standardizes it into Pack-Year units. We annotated a corpus of 200 clinical notes from patients who had low-dose CT imaging procedures for lung cancer screening and developed an NLP system using a two-layer rule-engine structure. We divided the 200 notes into a training set and a test set and developed the NLP system using only the training set. The experimental results on the test set showed that our NLP system achieved the best F1 scores of 0.963 and 0.946 for lenient and strict evaluation, respectively.
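As a rough illustration of what standardizing into Pack-Year units can look like, the sketch below uses the conventional definition (pack-years = packs per day × years smoked). The regular expressions and the function name `to_pack_years` are hypothetical and deliberately simplistic; they are not the rules or the two-layer rule engine used in the study.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; not the study's rule set.
PACK_YEAR_RE = re.compile(
    r"(\d+(?:\.\d+)?)\s*(?:-\s*)?pack[\s-]*years?", re.IGNORECASE
)
PACK_PER_DAY_RE = re.compile(
    r"(\d+(?:\.\d+)?)\s*(?:packs?\s*(?:per|a|/)\s*day|ppd)", re.IGNORECASE
)
SMOKING_YEARS_RE = re.compile(
    r"\b(?:for|x)\s*(\d+(?:\.\d+)?)\s*(?:years?|yrs?)", re.IGNORECASE
)


def to_pack_years(text: str) -> Optional[float]:
    """Return a pack-year value found in or derived from `text`, else None."""
    # Case 1: the note states pack-years directly.
    direct = PACK_YEAR_RE.search(text)
    if direct:
        return float(direct.group(1))

    # Case 2: derive pack-years from packs per day and years smoked.
    per_day = PACK_PER_DAY_RE.search(text)
    years = SMOKING_YEARS_RE.search(text)
    if per_day and years:
        return float(per_day.group(1)) * float(years.group(1))

    # Not enough quantitative information in this snippet.
    return None
```

For example, `to_pack_years("Smoked 1.5 packs per day for 20 years")` multiplies 1.5 by 20, while a note that already says "30 pack-year smoking history" is passed through unchanged. A production system would need far more patterns (fractions, ranges, quit dates) than this sketch covers.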
Keywords: natural language processing; quantitative smoking information extraction; tobacco use
Year: 2021; PMID: 33786419; PMCID: PMC8006894; DOI: 10.1109/ICHI48887.2020.9374369
Source DB: PubMed; Journal: IEEE Int Conf Healthc Inform; ISSN: 2575-2626