Denis Newman-Griffis, Jonathan Camacho Maldonado, Pei-Shu Ho, Maryanne Sacco, Rafael Jimenez Silva, Julia Porcino, Leighton Chan.
Abstract
Background: Invaluable information on patient functioning and the complex interactions that define it is recorded in free text portions of the Electronic Health Record (EHR). Leveraging this information to improve clinical decision-making and conduct research requires natural language processing (NLP) technologies to identify and organize the information recorded in clinical documentation.
Keywords: ICF; artificial intelligence; clinical coding; disability evaluation; electronic health records; functional status; international classification of functioning disability and health; natural language processing
Year: 2021 PMID: 35694445 PMCID: PMC9180751 DOI: 10.3389/fresc.2021.742702
Source DB: PubMed Journal: Front Rehabil Sci ISSN: 2673-6861
FIGURE 1 | Flowchart illustration of the annotation process. Data sources and document counts are provided for Mobility and Self-Care/Domestic Life annotations separately.
FIGURE 2 | Structure of annotations for functional status information. Free text is annotated to identify activity mentions describing specific observations. Each activity mention may include one or more Action components, which can be mapped to second-level ICF categories.
FIGURE 3 | Conceptual illustration of the ICF coding process. Given an activity mention, an embedding representation of the report is calculated and then compared with other activity mentions, in Classification; or available ICF categories, in Candidate Selection.
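The comparison step described in Figure 3 can be illustrated with a minimal sketch. It assumes cosine similarity as the comparison measure and a nearest-neighbour rule for the Classification case; neither is confirmed by the caption. The vectors and the helper names `cosine` and `rank_by_cosine` are hypothetical and exist only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_by_cosine(query, candidates):
    """Return candidate labels ordered from most to least similar to `query`.

    `candidates` is a list of (label, vector) pairs.
    """
    scored = [(cosine(query, vec), label) for label, vec in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored]

# Toy 3-dimensional embeddings, invented for illustration only.
mention = [0.9, 0.1, 0.2]

# Candidate Selection: compare the mention with one vector per ICF category.
category_vectors = [
    ("d450", [1.0, 0.0, 0.1]),
    ("d410", [0.1, 1.0, 0.0]),
    ("d475", [0.0, 0.2, 1.0]),
]
ranked_categories = rank_by_cosine(mention, category_vectors)

# Classification (assumed nearest-neighbour variant): compare the mention
# with previously labeled activity mentions and take the closest label.
labeled_mentions = [
    ("d410", [0.2, 0.9, 0.1]),
    ("d450", [0.8, 0.2, 0.1]),
    ("d450", [0.7, 0.0, 0.3]),
]
predicted = rank_by_cosine(mention, labeled_mentions)[0]

print(ranked_categories, predicted)
```

In practice the vectors would come from a trained embedding model rather than being written by hand, and a learned classifier could replace the nearest-neighbour rule.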
Free text corpora used to train word embedding models for text representation.
| Training corpus | Number of notes | Number of words (approx.) | Data description |
|---|---|---|---|
| MIMIC | 2,083,180 | 497 million | Critical care admissions |
| NIHCC | 63,605 | 11.8 million | Physical therapy and occupational therapy encounters, used in our prior work on coding Mobility information to the ICF |
| SSA | 65,514 | 664 million | Clinical data associated with disability benefits claims submitted to SSA. New in this study. |
MIMIC-III was used to train both FastText and BERT models; NIHCC and SSA were used for FastText embeddings only.
Datasets of documents annotated for functional status information, drawn from U.S. Social Security Administration disability benefits cases.
| | Mobility | Self-Care/Domestic Life |
|---|---|---|
| Number of documents annotated | 289 | 329 |
| With activity mentions | 251 | 285 |
| Total activity mentions | 2,455 | 3,990 |
| Including at least one Action | 2,323 (94.6%) | 3,866 (96.9%) |
| Total number of Actions | 3,176 | 4,665 |
| Training set size (documents / Actions) | 203/2,361 | 229/3,350 |
| Test set size (documents / Actions) | 45/815 | 56/1,315 |
Separate sets of documents were annotated for Mobility (ICF Activities and Participation Chapter 4) and Self-Care/Domestic Life (ICF Activities and Participation Chapters 5 and 6).
ICF category descriptions and frequencies for Mobility dataset (3,176 samples total).
| Mobility category | Description | Frequency | % of all samples | Training samples | Test samples |
|---|---|---|---|---|---|
| d450 | Walking | 730 | 23.0% | 559 (77%) | 171 (23%) |
| d410 | Changing basic body position | 560 | 17.6% | 419 (75%) | 141 (25%) |
| d415 | Maintaining a body position | 508 | 16.0% | 385 (76%) | 123 (24%) |
| d440 | Fine hand use | 319 | 10.0% | 247 (77%) | 72 (23%) |
| d430 | Lifting and carrying objects | 244 | 7.7% | 167 (68%) | 77 (32%) |
| d475 | Driving | 215 | 6.8% | 165 (77%) | 50 (23%) |
| d445 | Hand and arm use | 163 | 5.1% | 104 (64%) | 59 (36%) |
| d455 | Moving around | 147 | 4.6% | 99 (67%) | 48 (33%) |
| Other | Mobility-related activities for which no specific ICF category could be identified | 123 | 3.9% | 96 (78%) | 27 (22%) |
| d470 | Using transportation | 103 | 3.2% | 80 (78%) | 23 (22%) |
| d460 | Moving around in different locations | 55 | 1.7% | 34 (62%) | 21 (38%) |
| d435 | Moving objects with lower extremities | 5 | 0.2% | 4 (80%) | 1 (20%) |
| d420 | Transferring oneself | 4 | 0.2% | 2 (50%) | 2 (50%) |
Categories are ordered by frequency in the dataset. Sample count and relative distribution between training data (203 documents, 2,361 samples) and test data (45 documents, 815 samples) are given for each category. Descriptions given are the preferred name of each category in the ICF.
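As a worked check of how the percentage columns relate to the counts, the following sketch recomputes the share of all samples and the training/test split for two rows of the Mobility table. The function name `category_shares` is hypothetical; the counts are taken from the table.

```python
def category_shares(total, train, test, dataset_total):
    """Return (% of all samples, % of category in training, % in test)."""
    share = 100 * total / dataset_total
    train_pct = 100 * train / total
    test_pct = 100 * test / total
    return share, train_pct, test_pct

MOBILITY_TOTAL = 3176  # total Actions in the Mobility dataset

# Counts copied from the table: (frequency, training samples, test samples)
rows = {
    "d450": (730, 559, 171),
    "d410": (560, 419, 141),
}

for code, (total, train, test) in rows.items():
    share, train_pct, test_pct = category_shares(total, train, test, MOBILITY_TOTAL)
    print(f"{code}: {share:.1f}% of samples, "
          f"{train_pct:.0f}% train / {test_pct:.0f}% test")
```

The training and test percentages are relative to each category's own frequency, not to the dataset total, which is why they sum to 100% within a row.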
ICF category descriptions and frequencies for Self-Care/Domestic Life dataset (4,665 samples total).
| Self-care/domestic life category | Description | Frequency | % of all samples | Training samples | Test samples |
|---|---|---|---|---|---|
| d570 | Looking after one’s health | 2,032 | 43.6% | 1,496 (74%) | 536 (26%) |
| Manage medication | Ability to manage medication (SNOMED CT code 285033005) | 520 | 11.1% | 359 (69%) | 161 (31%) |
| d540 | Dressing | 353 | 7.6% | 268 (76%) | 85 (24%) |
| d520 | Caring for body parts | 312 | 6.7% | 228 (73%) | 84 (27%) |
| d640 | Doing housework | 297 | 6.4% | 205 (69%) | 92 (31%) |
| d630 | Preparing meals | 222 | 4.8% | 165 (74%) | 57 (26%) |
| Other | Self-Care/Domestic Life activities for which no specific ICF category could be identified | 174 | 3.7% | 127 (73%) | 47 (27%) |
| Therapy | Compliance behavior to therapeutic regimen (SNOMED CT code 709007004) | 143 | 3.1% | 103 (72%) | 40 (28%) |
| d620 | Acquisition of goods and services | 142 | 3.0% | 101 (71%) | 41 (29%) |
| d510 | Washing oneself | 121 | 2.6% | 90 (74%) | 31 (26%) |
| d550 | Eating | 102 | 2.2% | 57 (56%) | 45 (44%) |
| d560 | Drinking | 82 | 1.8% | 60 (73%) | 22 (27%) |
| d660 | Assisting others | 79 | 1.7% | 46 (58%) | 33 (42%) |
| d650 | Caring for household objects | 40 | 0.8% | 24 (60%) | 16 (40%) |
| d530 | Toileting | 29 | 0.6% | 15 (52%) | 14 (48%) |
| d610 | Acquiring a place to live | 17 | 0.3% | 6 (35%) | 11 (65%) |
Categories are ordered by frequency in the dataset. Sample count and relative distribution between training data (229 documents, 3,350 samples) and test data (56 documents, 1,315 samples) are given for each category. Descriptions given are the preferred name of each category in the ICF.
FIGURE 4 | Development experiment results for selecting word embeddings. Development set performance (macro-averaged F-1 with 10-fold cross validation) is shown using each embedding strategy for both Mobility (A,B) and Self-Care/Domestic Life (C,D) data, using both classification (A,C) and candidate selection (B,D) approaches.
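For readers unfamiliar with the metric named in Figure 4, the sketch below computes macro-averaged F-1 on a toy set of gold and predicted labels. It is a generic illustration, not the authors' evaluation code: the labels are invented, the function name `macro_f1` is hypothetical, and the 10-fold cross-validation loop is omitted. Averaging over the union of gold and predicted labels is an assumption of this sketch.

```python
def macro_f1(gold, pred):
    """Macro-averaged F-1: the unweighted mean of per-class F-1 scores.

    Classes are taken as the union of labels seen in `gold` and `pred`.
    A class with no true positives, false positives, or false negatives
    cannot occur in that union, so the denominator is zero only if tp,
    fp, and fn are all zero, which is guarded against for safety.
    """
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy labels, invented for illustration only.
gold = ["d450", "d450", "d410", "d410", "d475"]
pred = ["d450", "d410", "d410", "d410", "d450"]

score = macro_f1(gold, pred)
print(f"macro F-1 = {score:.3f}")
```

Because every class contributes equally to the mean, rare categories affect macro-averaged F-1 as much as frequent ones, which matters for datasets with skewed category frequencies.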
FIGURE 5 | Test set performance on automated ICF coding in Mobility (A) and Self-Care/Domestic Life (B) test sets. Performance is reported for the best classification (Mobility: NIHCC embeddings; Self-Care/Domestic Life: SSA embeddings) and candidate selection (both datasets: clinicalBERT embeddings) models.
FIGURE 6 | Automated coding performance for each distinct category in the Mobility dataset. Classification results are shown in (A), and candidate selection results in (B). Categories are ordered by descending frequency [illustrated in (C)].
FIGURE 7 | Automated coding performance for each distinct category in the Self-Care/Domestic Life dataset. Classification results are shown in (A), and candidate selection results in (B). Categories are ordered by descending frequency [illustrated in (C)].
Examples for the related labels of ICF category d570, Manage Medication, and Therapy.
| Category | Examples | Notes |
|---|---|---|
| d570 | Her sleep varies and she never feels rested | Not annotated; these fall within the Body Functions domain of the ICF. |
| | She | Suicidal actions are annotated as indicating risks to health. |
| | He | Reference to alcohol consumption. |
| | Patient | Indicates the person is taking care of themselves. |
| | Her tendency to | Significant context is needed to clarify the impact on self-care. |
| Manage medication | He is currently prescribed medication by his neurologist to slow down the progression of his symptoms | Not annotated; does not state whether the person is actually taking the medications or not. |
| | Pt is currently | Medications the patient is currently taking; the medications themselves are not annotated. |
| | She | Reason for medication not needed; the specific medication is annotated to clarify what action is being performed. |
| Therapy | He has had no psychiatric care and no history of psychiatric hospitalization | Not annotated; reference to therapeutic care the patient has not used. |
| | She | Therapy for a particular purpose related to health. |
| | He was | Counseling for a particular purpose related to health. |
Brief notes are provided for each example as to why it was or was not annotated as shown. Activity mentions are indicated using yellow highlights and Actions are indicated using underlines.