Farhood Farjah1, Scott Halgrim2, Diana S M Buist3, Michael K Gould4, Steven B Zeliadt5, Elizabeth T Loggers6, David S Carrell3.
Abstract
INTRODUCTION: The incidence of incidentally detected lung nodules is rapidly rising, but little is known about their management or associated patient outcomes. One barrier to studying lung nodule care is the inability to efficiently and reliably identify the cohort of interest (i.e., cases). Investigators at Kaiser Permanente Southern California (KPSC) recently developed an automated method to identify individuals with an incidentally discovered lung nodule, but the feasibility of implementing this method across other health systems is unknown.
Keywords: data collection; lung neoplasms; natural language processing; solitary pulmonary nodule
Year: 2016 PMID: 27668266 PMCID: PMC5013935 DOI: 10.13063/2327-9214.1254
Source DB: PubMed Journal: EGEMS (Wash DC) ISSN: 2327-9214
Figure 1. Nodule Size Ascertained by NLP Versus Chart Abstraction Among Individuals with a Confirmed Lung Nodule
Population and Sample Characteristics
| 65 (18–89) | 66 (20–89) | |
| 43% | 43% | |
| | 83% | 83% |
| | 4.6% | 3.8% |
| | 5.8% | 6.4% |
| | 1.5% | |
| | 0.9% | 1.2% |
| | 0.3% | |
| | 4.0% | 4.4% |
| 4.1% | 5.2% | |
| | 86% | 82% |
| | 3.7% | 4.4% |
| | 11% | 14% |
| | 6.0% | 6.8% |
| | 8.0% | 8.6% |
| | 2.9% | 2.2% |
| | 3.2% | 3.2% |
| | 13% | 13% |
| | 27% | 28% |
| | 48% | 50% |
| | 41% | 37% |
| | 11% | 12% |
| | 0.6% | |
| | 31% | 28% |
Notes: Alaskan Native (AN), American Indian (AI), Native Hawaiian (NH), Pacific Islander (PI)
Information obtained from administrative data, tumor registry data, and structured data elements within the electronic health record (e.g., smoking status).
Columns do not necessarily add up to 100 percent. Data for esophageal cancer and melanoma are not shown because there were five or fewer subjects within one of the samples.
Ascertained using structured data elements within the electronic health record. These data were obtained within a median of 18 days (range 1–2,593 days) prior to the date of the CT.
Calculated among former and current smokers.
Accuracy (95% Confidence Interval) of NLP in Identifying Individuals with a Lung Nodule
| 49% (40–56%) | 18% (15–22%) | 31% (27–36%) | 22% (18–26%) | 9.2% (6.8–12%) | |
| 54% (45–64%) | 24% (20–28%) | 38% (33–42%) | 25% (21–28%) | 13% (10–16%) | |
| 96% (88–100%) | 88% (79–94%) | 90% (85–95%) | 86% (79–92%) | 87% (74–95%) | |
| 86% (75–94%) | 90% (87–93%) | 86% (82–89%) | 93% (90–95%) | 94% (92–96%) | |
| 87% (77–94%) | 66% (57–75%) | 75% (68–81%) | 77% (69–84%) | 61% (48–72%) | |
| 96% (87–100%) | 97% (95–99%) | 95% (92–97%) | 96% (94–98%) | 99% (97–99%) |
Notes: Natural language processing (NLP), positive predictive value (PPV), negative predictive value (NPV), confidence interval (CI).
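For readers who want to reproduce this kind of accuracy table from their own validation data, the following is a minimal sketch of how sensitivity, specificity, PPV, and NPV with 95% confidence intervals can be computed from a 2x2 table comparing NLP against chart abstraction. It makes two assumptions: the Wilson score interval is used because the interval method behind the table is not stated here, and the function names and counts are hypothetical, not taken from the study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    Assumption: the source does not state which interval method was used;
    Wilson is chosen here only as a common, well-behaved option.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def accuracy_measures(tp, fp, fn, tn):
    """Return {measure: (estimate, ci_low, ci_high)} from 2x2 counts.

    tp/fp/fn/tn are NLP results judged against chart abstraction
    (treated as the reference standard in this sketch).
    """
    pairs = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "PPV": (tp, tp + fp),
        "NPV": (tn, tn + fn),
    }
    return {
        name: (k / n, *wilson_interval(k, n))
        for name, (k, n) in pairs.items()
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not data from the study.
    for name, (est, lo, hi) in accuracy_measures(tp=90, fp=10, fn=5, tn=95).items():
        print(f"{name}: {est:.0%} ({lo:.0%}-{hi:.0%})")
```

PPV and NPV depend on how common nodules are in the sample being validated, so estimates from one sample do not transfer directly to a population with a different prevalence.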
Provider Documentation of Radiologic Characteristics and Lung Cancer Risk Factors among Individuals with a Lung Nodule
| | 100% (98–100%) |
| | 99% (96–100%) |
| | 99% (96–100%) |
| | 31% (24–39%) |
| | 7.7% (4.0–13%) |
| | 5.1% (2.2–10%) |
| | 95% (90–98%) |
| | 89% (83–94%) |
| | 86% (79–91%) |
| | 78% (67–86%) |
| | 35% (27–43%) |
| | 32% (25–40%) |
| | 10% (6.0–16%) |
| | 5.1% (2.2–9.9%) |
| | 3.2% (1.0–7.3%) |
Notes:
Data for the following variables were suppressed because five or fewer subjects had documentation about the risk factor: radon, silica, cadmium, asbestos, arsenic, beryllium, chromium, diesel fuel, nickel, coal smoke, soot, and dust.
Ascertained by chart abstraction of structured and unstructured data within the electronic health record (EHR) 30 days prior to the date of the CT.
Calculated among former smokers.
Figure 2. Correlation Between Chart Abstracted and NLP Abstracted Nodule Size