Jennifer Lee1,2, Samuel Yang1,2, Cynthia Holland-Hall1,2, Emre Sezgin1, Manjot Gill2, Simon Linwood1, Yungui Huang1, Jeffrey Hoffman1,2.
Abstract
BACKGROUND: With the increased sharing of electronic health information as required by the US 21st Century Cures Act, there is an increased risk of breaching patient, parent, or guardian confidentiality. The prevalence of sensitive terms in clinical notes is not known.
Keywords: adolescent; child; eHealth; natural language processing; patient portals; privacy
Year: 2022 PMID: 35687381 PMCID: PMC9233261 DOI: 10.2196/38482
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Combining natural language processing (NLP) and expert definitions to identify sensitive notes in the electronic health record. Overview of the sensitive term identification protocol. Local subject matter experts identified four categories and representative sensitive keywords for each category. An NLP tool trained on the entire cohort of notes at the organization was used for keyword expansion: each sensitive keyword was expanded to 60 potentially related terms. Board-certified pediatricians and adolescent medicine specialists manually annotated each related term as "sensitive" or "not sensitive." Exact string word matching was then used to determine whether a sensitive term was documented in a clinical note. PTSD: posttraumatic stress disorder.
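The final matching step in the protocol above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `terms_in_note`, the short term list, and the choices to match case-insensitively and on whole words only are all assumptions made for the example.

```python
import re

def terms_in_note(note_text, terms):
    """Return the set of terms that occur in note_text as whole words or phrases.

    Assumption: matching is case-insensitive and bounded by non-word
    characters, so "sex" does not match inside "sexual".
    """
    found = set()
    for term in terms:
        # Lookarounds act as word boundaries and also work for multiword phrases.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, note_text, re.IGNORECASE):
            found.add(term)
    return found

# Hypothetical, heavily abbreviated term list for illustration only.
example_terms = ["sex", "sexual", "si", "ptsd"]
example_note = "Sexual history reviewed; denies SI."
matched = terms_in_note(example_note, example_terms)
```

Under these assumptions a note counts as sensitive when the returned set is non-empty. Checking each term independently means that a phrase and its component word (for example, "suicidal ideation" and "suicidal") are both reported when the phrase is present.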
Patient, encounter, and provider characteristics.
| Populations and characteristics | Values, n (%) |
|---|---|
| Patients | |
| Age (years) | |
| Less than 13 | 209,359 (74.84) |
| 13 to 18 | 59,415 (21.23) |
| 18 to 21 | 10,963 (3.92) |
| Sex | |
| Male | 142,539 (50.95) |
| Female | 137,180 (49.04) |
| Unknown | 18 (0.01) |
| Race or ethnicity | |
| White | 151,988 (54.33) |
| Black or African American | 66,995 (23.95) |
| Latino or Hispanic | 19,053 (6.81) |
| Other or unknown | 41,701 (14.91) |
| Encounters | 763,133 |
| Ambulatory care | 536,201 (70.3) |
| Emergency care | 188,204 (24.7) |
| Inpatient care | 38,728 (5.1) |
| Providers | |
| Resident | 888 (37.92) |
| Attending | 828 (35.35) |
| Fellow | 393 (16.78) |
| Advanced practice provider | 233 (9.94) |
| Notes | |
| Notes with sensitive terms | 501,762 (37.49) |
Most frequently used sensitive terms by category.
| Category and term | Term frequency, n | Note frequency, n |
|---|---|---|
| Substance abuse | | |
| tobacco | 190,547 | 119,764 |
| alcohol | 143,945 | 107,871 |
| substance | 101,183 | 78,997 |
| smoker | 51,572 | 50,538 |
| cigarettes | 36,444 | 35,443 |
| substance abuse | 28,970 | 23,131 |
| thc^a | 21,216 | 14,153 |
| marijuana | 16,625 | 10,985 |
| smoked | 14,508 | 14,271 |
| cocaine | 13,747 | 8618 |
| Mental health | | |
| anxiety | 418,766 | 143,968 |
| depression | 270,661 | 150,934 |
| mood | 267,706 | 122,293 |
| suicidal | 224,989 | 72,709 |
| suicidal ideation | 140,918 | 57,057 |
| suicide | 109,123 | 46,463 |
| si^b | 66,977 | 35,713 |
| panic | 52,040 | 32,025 |
| bipolar | 46,729 | 35,539 |
| depressive | 41,511 | 26,025 |
| Reproductive health | | |
| sexual | 238,310 | 84,710 |
| pregnancy | 118,872 | 77,337 |
| hiv | 80,306 | 56,072 |
| partner | 62,456 | 33,155 |
| sexually | 44,491 | 33,902 |
| sexually active | 37,149 | 29,817 |
| sexual abuse | 36,000 | 16,030 |
| sti^c | 33,904 | 22,679 |
| sex | 30,612 | 21,714 |
| partners | 23,406 | 13,461 |
| | | |
| abuse | 156,957 | 70,712 |
| food insecurity | 21,108 | 14,848 |
| bullying | 17,259 | 10,712 |
| conflict | 14,997 | 9657 |
| cps^d | 13,962 | 9081 |
| weapons | 12,016 | 9485 |
| abuse or neglect | 11,671 | 6212 |
| emotional abuse or neglect | 11,195 | 5784 |
| perpetration | 11,017 | 5403 |
| ycsu^e | 10,511 | 6488 |
^a THC: tetrahydrocannabinol.
^b SI: suicidal ideation.
^c STI: sexually transmitted infection.
^d CPS: Child Protective Services.
^e YCSU: Youth Christian Social Union.
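The two count columns in the table above differ in what they tally: term frequency counts every occurrence of a term, while note frequency counts each note at most once per term. The sketch below shows that distinction on toy data. It is an illustration under assumptions (whole-word, case-insensitive matching, and a hypothetical function name `count_frequencies`), not the study's implementation.

```python
import re
from collections import Counter

def count_frequencies(notes, terms):
    """Return (term_freq, note_freq) Counters for the given notes.

    term_freq[t]: total occurrences of t across all notes.
    note_freq[t]: number of notes containing t at least once.
    """
    term_freq = Counter()
    note_freq = Counter()
    for text in notes:
        for term in terms:
            # Whole-word, case-insensitive match (an assumption of this sketch).
            pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
            hits = len(re.findall(pattern, text, re.IGNORECASE))
            if hits:
                term_freq[term] += hits
                note_freq[term] += 1
    return term_freq, note_freq

# Toy notes, invented for illustration.
toy_notes = [
    "Anxiety noted. Anxiety improving.",
    "No anxiety or depression.",
    "Well child visit.",
]
term_freq, note_freq = count_frequencies(toy_notes, ["anxiety", "depression"])
```

Because a note can mention a term several times, term frequency is always at least as large as note frequency, which matches the pattern in every row of the table.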
Figure 2. Percent of notes containing sensitive terms by age of patient and category. Line graph depicting the percent of clinical notes containing at least one sensitive term by patient age. Sensitive terms are found in a portion of clinical notes at all patient ages. All categories show an upward trend during adolescence; in the first year of life, the reproductive health and substance abuse categories are the most frequently documented.
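The quantity plotted in Figure 2, the percent of notes at each age that contain at least one sensitive term, can be computed as in the sketch below. The input layout (pairs of patient age in years and a per-note flag) and the function name `percent_sensitive_by_age` are assumptions for illustration; the same calculation would be repeated per category to obtain one line per category.

```python
from collections import defaultdict

def percent_sensitive_by_age(notes):
    """notes: iterable of (age_years, has_sensitive_term) pairs, one per note.

    Returns {age: percent of notes at that age flagged as sensitive}.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for age, has_term in notes:
        totals[age] += 1
        if has_term:
            flagged[age] += 1
    return {age: 100.0 * flagged[age] / totals[age] for age in sorted(totals)}

# Toy data, invented for illustration.
toy = [(0, True), (0, False), (15, True), (15, True)]
by_age = percent_sensitive_by_age(toy)
```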
Figure 3. Percent of notes containing sensitive terms by age and note type. Heat map of the note types that contain at least one sensitive term of any category. Sensitive terms are found in a portion of every clinical note type examined, in all age groups. All note types show an upward trend in sensitive terms during adolescence, and the history and physical note is the most likely to contain a sensitive term overall. APN: ambulatory progress note; EPN: emergency care and urgent care progress note; H&P: history and physical note; ICN: inpatient consult note; IDS: inpatient discharge summary; IPN: inpatient progress note.