Ellen L Palmer, John Higgins, Saeed Hassanpour, James Sargent, Christina M Robinson, Jennifer A Doherty, Tracy Onega.
Abstract
BACKGROUND: Approximately 20% of deaths in the US each year are attributable to smoking, yet current practices in the recording of this health risk in electronic health records (EHRs) have not led to discernable changes in health outcomes. Several groups have developed algorithms for extracting smoking behaviors from clinical notes, but none of these approaches were assessed with external data to report on anticipated clinical performance.
Keywords: Electronic health records; Informatics pipeline; Natural language processing; Smokers registry
Year: 2019 PMID: 31345210 PMCID: PMC6657182 DOI: 10.1186/s12911-019-0864-2
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 Data Availability and Linkage Process. Information sources utilized to assess the quality of the informatics pipeline for smoking behavior data. (a) Breakdown of information captured by the informatics pipeline; (b) breakdown of information from the semi-structured EHR fields; (c) breakdown of information from the NHCR; and (d) workflow of the information sources utilized and how they relate to each other. Abbreviations: Electronic Health Record (EHR), New Hampshire Colonoscopy Registry (NHCR), Pack years (PY)
Smoking status and lung cancer screening eligibility identification in the EHR using an informatics pipeline
Concordance between NHCR reported status and smokers’ registry status
| | Pipeline: Clinical notes only ( | Pipeline and Semi-structured ( | Ever smoker vs. Never smoker ( |
|---|---|---|---|
| Cohen’s Kappa | 0.56 | 0.62 | 0.59 |
| Concordance | 63.9% | 67.8% | 83.6% |
Summary statistics for smoking status detection using the informatics pipeline on clinical notes only, the pipeline merged with semi-structured data, and the merged pipeline with semi-structured data simplified to ever smoker vs. never smoker
Smoking status classification between the external questionnaire and the joint EHR sources
| External smoking status | EHR: Current | EHR: Former | EHR: Never | EHR: Smoker | EHR: Unknown | Total |
|---|---|---|---|---|---|---|
| Current | **160** | 9 | 9 | 1 | 68 | 247 |
| Former | 55 | **31** | 27 | 1 | 500 | 614 |
| Never | 23 | 6 | **80** | 1 | 533 | 643 |
| Total | 238 | 46 | 116 | 3 | 1,101 | 1,504 |
Cross-tabulation of the smoking statuses individuals reported on an unrelated questionnaire against the smoking statuses identified by our informatics pipeline. Current = current smoker; former = former smoker; never = never smoker; smoker = smoker temporality unknown; unknown = record did not contain enough information relating to smoking behaviors to classify. Bolded numbers are concordant for status between the EHR and external records
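The concordance figures can be checked against this cross-tabulation. The sketch below derives each concordant (diagonal) cell as the row total minus the other cells in that row, confirms the result against the column totals, and then computes two concordance rates. Two choices here are assumptions rather than anything stated in the source: the three-way rate excludes the "smoker" and "unknown" EHR columns, and the ever vs. never rate counts "smoker" as an ever smoker. Under those assumptions the rates come out at roughly 67.8% and 83.6%, in line with the merged-pipeline and ever vs. never columns of the summary table. The function and variable names are illustrative only.

```python
STATUSES = ["current", "former", "never"]

# Row totals and non-concordant cells as printed in the cross-tabulation
# (outer key: external questionnaire status, inner key: EHR status).
row_totals = {"current": 247, "former": 614, "never": 643}
other_cells = {
    "current": {"former": 9, "never": 9, "smoker": 1, "unknown": 68},
    "former": {"current": 55, "never": 27, "smoker": 1, "unknown": 500},
    "never": {"current": 23, "former": 6, "smoker": 1, "unknown": 533},
}
column_totals = {"current": 238, "former": 46, "never": 116,
                 "smoker": 3, "unknown": 1101}

# Concordant cell = row total minus every other cell in the row.
table = {}
for status in STATUSES:
    cells = dict(other_cells[status])
    cells[status] = row_totals[status] - sum(cells.values())
    table[status] = cells

# Consistency check: the derived cells must reproduce the column totals.
for column, total in column_totals.items():
    assert sum(table[row][column] for row in STATUSES) == total


def concordance_three_way(tab):
    """Agreement on current/former/never, ignoring 'smoker' and 'unknown' (assumption)."""
    agree = sum(tab[s][s] for s in STATUSES)
    classified = sum(tab[r][c] for r in STATUSES for c in STATUSES)
    return agree / classified


def concordance_ever_never(tab):
    """Agreement on ever vs. never, counting EHR 'smoker' as ever (assumption)."""
    ever_rows = ("current", "former")
    ever_cols = ("current", "former", "smoker")
    agree = sum(tab[r][c] for r in ever_rows for c in ever_cols)
    agree += tab["never"]["never"]
    classified = sum(tab[r][c] for r in STATUSES for c in ever_cols + ("never",))
    return agree / classified


print(f"three-way concordance: {concordance_three_way(table):.4f}")
print(f"ever vs. never concordance: {concordance_ever_never(table):.4f}")
```

Cohen's kappa is left out of this sketch because the source does not say how the statistic was computed (for example, weighted or unweighted).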
Fig. 2 Plot of Pack Year History Reporting to External Questionnaire vs. EHR. Pack year history reported to NHCR (y-axis) against EHR derived pack year history (x-axis). Blue dashed lines indicate a 30-pack year history, the threshold for lung cancer screening. Gray dotted lines are at every 5 years, up to the 30-year line
Lung cancer screening eligibility classification between the external questionnaire and EHR sources
| External lung cancer screening eligibility | EHR: Eligible | EHR: Not eligible | EHR: Missing data | Total |
|---|---|---|---|---|
| Eligible | 41 | 24 | 264 | 329 |
| Not eligible | 7 | 124 | 948 | 1,079 |
| Missing data | 6 | 13 | 77 | 96 |
| Total | 54 | 161 | 1,289 | 1,504 |
Lung cancer screening eligibility prediction based on external reporting and EHR derived smoking status and pack year history
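As a rough illustration of how pack-year history feeds an eligibility flag, the sketch below uses the standard definition of a pack year (packs smoked per day multiplied by years smoked) and the 30-pack-year threshold marked in Fig. 2. It is not the authors' pipeline: the function names are hypothetical, treating the threshold as inclusive (at least 30) is an assumption, and any other screening criteria, such as age or time since quitting, are deliberately not modeled. A missing pack-year value returns `None`, mirroring the "Missing data" category in the table.

```python
from typing import Optional

# Threshold shown as the dashed line in the pack-year plot.
PACK_YEAR_THRESHOLD = 30.0


def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Standard pack-year definition: packs per day times years smoked."""
    return packs_per_day * years_smoked


def meets_pack_year_threshold(py: Optional[float]) -> Optional[bool]:
    """Return True/False against the threshold, or None when the history is missing.

    Assumption: the threshold is inclusive (>= 30 pack years).
    """
    if py is None:
        return None
    return py >= PACK_YEAR_THRESHOLD


# Example: 1.5 packs per day for 20 years is a 30-pack-year history.
history = pack_years(packs_per_day=1.5, years_smoked=20)
print(history, meets_pack_year_threshold(history))
```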
Lung cancer screening eligibility
| | Pipeline: Clinical notes only ( | Pipeline and Semi-structured ( |
|---|---|---|
| PPV | 75.0% | 85.4% |
| NPV | 77.2% | 83.8% |
| Sensitivity | 20.9% | 63.1% |
| Specificity | 97.5% | 94.7% |
Summary statistics for identifying patients eligible for lung cancer screening using EHR data. These metrics assumed true eligibility was captured by the NHCR questionnaire
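The "Pipeline and Semi-structured" column can be reproduced from the eligibility cross-tabulation. The sketch below takes the four cells where both sources gave a classification (41, 24, 7, 124), treats the NHCR questionnaire as the reference standard as the caption states, and computes the four metrics. Excluding the "Missing data" row and column is an assumption, chosen because it reproduces the reported percentages. The function name is illustrative.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 diagnostic metrics, returned as fractions."""
    return {
        "ppv": tp / (tp + fp),          # eligible by EHR that are truly eligible
        "npv": tn / (tn + fn),          # not eligible by EHR that are truly not eligible
        "sensitivity": tp / (tp + fn),  # truly eligible that the EHR identified
        "specificity": tn / (tn + fp),  # truly not eligible that the EHR identified
    }


# Cells from the cross-tabulation (rows: external/NHCR, columns: EHR),
# with the "Missing data" row and column left out (assumption).
metrics = screening_metrics(tp=41, fp=7, fn=24, tn=124)

for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

These figures describe only the patients for whom both sources could be classified. Most externally eligible patients (264 of 329) fall in the EHR "Missing data" column and are outside this calculation.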