Sunghwan Sohn, Yanshan Wang, Chung-Il Wi, Elizabeth A Krusemark, Euijung Ryu, Mir H Ali, Young J Juhn, Hongfang Liu.
Abstract
OBJECTIVE: To assess clinical documentation variations across health care institutions using different electronic medical record systems and investigate how they affect natural language processing (NLP) system portability.
Keywords: asthma; documentation variation; electronic medical records; natural language processing; portability
Year: 2018 PMID: 29202185 PMCID: PMC7378885 DOI: 10.1093/jamia/ocx138
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. Predetermined asthma criteria (PAC).
Corpus statistics of Mayo Clinic and SCH (n = 298 patients each)
| Category | Mayo | SCH |
|---|---|---|
| Total no. of documents | 9604 | 30 589 |
| Total no. of tokens | 2 212 389 | 10 117 963 |
| No. of documents/patient, median (IQR) | 27 (18) | 80 (69.8) |
| No. of tokens/document, median (IQR) | 186 (210) | 103 (331) |
| No. of asthma-related concepts/patient, median (IQR) | 19.5 (32.8) | 65.5 (88) |
| No. of asthma-related concepts/document, median (IQR) | 2 (3) | 1 (2) |
| No. of note types | 16 | 32 |
| No. of sections | 17 | 54 |
Each asthma-related concept consists of a set of keywords. IQR: interquartile range.
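The table reports medians, but the totals also allow a quick check of per-patient and per-document means. The sketch below is only an illustration derived from the totals above and the stated cohort size (n = 298 patients each); the variable names are hypothetical, and the means are not figures reported by the authors.

```python
# Totals taken from the corpus statistics table; n = 298 patients per site.
N_PATIENTS = 298
totals = {
    "Mayo": {"documents": 9604, "tokens": 2_212_389},
    "SCH": {"documents": 30_589, "tokens": 10_117_963},
}

# Derived means (illustrative only; the table itself reports medians).
means = {}
for site, t in totals.items():
    means[site] = {
        "docs_per_patient": t["documents"] / N_PATIENTS,
        "tokens_per_doc": t["tokens"] / t["documents"],
    }

for site, m in means.items():
    print(f"{site}: {m['docs_per_patient']:.1f} documents/patient, "
          f"{m['tokens_per_doc']:.1f} tokens/document")
```

At both sites the mean tokens per document is above the reported median, as expected for right-skewed document lengths.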
Figure 2. Distribution of asthma-related concepts.
Figure 3. Note types that contain asthma-related concepts (y axis is proportion of asthma-related concepts; includes note types with proportion ≥ 0.01).
Figure 4. Sections that contain asthma-related concepts (y axis is proportion of asthma-related concepts; includes sections with proportion ≥ 0.01).
The similarities of Mayo Clinic and SCH corpora
| Data source | Topic | | |
|---|---|---|---|
| Whole corpus | 0.669 | 0.581 | 0.944 |
| Asthma-related concepts | 0.971 | 0.855 | NA |
NA: not applicable.
Figure 5. A heat map of note type similarity based on topics at Mayo (bottom label) and SCH (right label).
NLP-PAC performance for asthma ascertainment
| Metrics | Mayo | SCH Stage 1 (prototype) | SCH Stage 2 (refinement) |
|---|---|---|---|
| Sensitivity | 0.972 | 0.840 | 0.920 |
| Specificity | 0.957 | 0.924 | 0.964 |
| PPV | 0.905 | 0.788 | 0.896 |
| NPV | 0.988 | 0.945 | 0.973 |
| F-score | 0.937 | 0.813 | 0.908 |
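The values in the final row of the table match the harmonic mean of sensitivity and PPV (the F-score, with sensitivity as recall and PPV as precision). The following is a minimal sketch of that check, assuming the standard F1 definition; the function and variable names are illustrative and not taken from the paper.

```python
def f_score(sensitivity: float, ppv: float) -> float:
    """Harmonic mean of sensitivity (recall) and PPV (precision)."""
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# (sensitivity, PPV) pairs copied from the performance table.
reported = {
    "Mayo": (0.972, 0.905),
    "SCH Stage 1 (prototype)": (0.840, 0.788),
    "SCH Stage 2 (refinement)": (0.920, 0.896),
}

f_scores = {site: round(f_score(s, p), 3) for site, (s, p) in reported.items()}

for site, value in f_scores.items():
    print(f"{site}: F-score = {value:.3f}")
```

Rounded to three decimals, the computed values agree with the table's last row (0.937, 0.813, 0.908).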