Sunyang Fu, Maria Vassilaki, Omar A Ibrahim, Ronald C Petersen, Sandeep Pagali, Jennifer St Sauver, Sungrim Moon, Liwei Wang, Jungwei W Fan, Hongfang Liu, Sunghwan Sohn.
Abstract
The secondary use of electronic health records (EHRs) is challenged by varying data quality issues. To address this, we retrospectively assessed the quality of functional status documentation in the EHRs of persons participating in the Mayo Clinic Study of Aging (MCSA). We used a convergent parallel design to collect quantitative and qualitative data and analyzed the findings independently. We discovered a heterogeneous documentation process, in which care practice teams, institutions, and EHR systems all play an important role in how text data are documented and organized. Four prevalent instrument-assisted documentation (iDoc) expressions were identified based on three distinct instruments: Epic smart form, questionnaire, and occupational therapy and physical therapy templates. We found strong differences in the usage, information quality (intrinsic and contextual), and naturality of language among different types of iDoc expressions. These variations can be caused by different source instruments, information providers, practice settings, care events, and institutions. In addition, iDoc expressions are context specific and thus should not be viewed and processed uniformly. We recommend conducting a data quality assessment of unstructured EHR text prior to using the information.
Keywords: aging; electronic health records; functional status (activity levels); information quality; natural language processing
Year: 2022 PMID: 36238199 PMCID: PMC9552292 DOI: 10.3389/fdgth.2022.958539
Source DB: PubMed Journal: Front Digit Health ISSN: 2673-253X
Figure 1. Overview of study design.
Figure 2. Overview of ADL documentation process at institution 1.
Corpus statistics of three sites.
| | Institution 1 | Institution 2 | Institution 3 |
|---|---|---|---|
| No. of patients | 200 | 203 | 270 |
| No. of documents | 1,421 | 4,174 | 981 |
| No. of ADL expressions | 900 | 3,611 | 416 |
| STS similarity, median (IQR)* | 0.29 (0.10) | 0.30 (0.17) | 0.31 (0.08) |
IQR, interquartile range.
Figure 3. Distribution of ADL expressions in EHR documents across different institutions from the sampled annotation cohorts.
Summary of instrument representations and implications for NLP development.
| Instrument Type | EHR system smart form | Questionnaire form | PT/OT template |
|---|---|---|---|
| Instrument Form | [Context indicator] + [Affirmative statement] + [Template elements] | [Question] + [Answer] + [Date] | [Objective] + [Description] + [Summary of assessment] |
| Examples | Follow up (Sec ID) | Do you have serious difficulty walking or climbing stairs? Yes 04/25/2022 | PT Goals: Bed Mobility Goal: Moderate assistance Time Frame to Reach Bed Mobility Goal: 7-day (s) Bed Mobility Goal Status: Transfer Goal: Maximal assistance Transfer Device: Other: Time Frame to Reach Transfer Goal: 7-day (s) Transfer Goal Status: Ambulation Goal: Ambulation |
| Naturality of Language | High | Moderate | Low |
| Impact to Sections Detector | No | No | No |
| Impact to Sentence Detector | No | Yes | Yes |
| Impact to Context Algorithm | No | Yes | No |
| Preprocessing Needed for NLP Development | No | Yes | Yes |
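
The table marks the questionnaire form as needing preprocessing before NLP development. As a minimal sketch of what such a step might look like, the Python below splits a single questionnaire expression into the [Question] + [Answer] + [Date] parts listed under "Instrument Form". The names `QuestionnaireItem` and `parse_questionnaire_item` are hypothetical, and the layout it assumes (a question ending in "?", free-text answer, optional MM/DD/YYYY date) is inferred only from the one example in the table; it is not the authors' pipeline.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional


class QuestionnaireItem(NamedTuple):
    """Hypothetical container for one questionnaire-form expression."""
    question: str
    answer: str
    recorded_on: Optional[date]


# Assumed layout: "<question ending in ?> <answer text> <optional MM/DD/YYYY>"
_ITEM_PATTERN = re.compile(
    r"^\s*(?P<question>.+?\?)\s*(?P<answer>.*?)\s*"
    r"(?P<date>\d{2}/\d{2}/\d{4})?\s*$"
)


def parse_questionnaire_item(text: str) -> Optional[QuestionnaireItem]:
    """Split a questionnaire expression into question, answer, and date.

    Returns None when the text does not look like a questionnaire item
    (for example, a PT/OT template line with no question mark).
    """
    match = _ITEM_PATTERN.match(text)
    if match is None:
        return None

    recorded_on = None
    raw_date = match.group("date")
    if raw_date is not None:
        try:
            recorded_on = datetime.strptime(raw_date, "%m/%d/%Y").date()
        except ValueError:
            # Digits matched the pattern but are not a real calendar date.
            recorded_on = None

    return QuestionnaireItem(
        question=match.group("question"),
        answer=match.group("answer"),
        recorded_on=recorded_on,
    )


if __name__ == "__main__":
    example = (
        "Do you have serious difficulty walking or climbing stairs? "
        "Yes 04/25/2022"
    )
    print(parse_questionnaire_item(example))
```

Separating the answer from the question in this way would let a downstream context algorithm treat "Yes" or "No" as the assertion status of the question, rather than reading the question text itself as an affirmative statement.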