Braja G Patra1, Mohit M Sharma1, Veer Vekaria1, Prakash Adekkanattu2, Olga V Patterson3,4, Benjamin Glicksberg5, Lauren A Lepow5, Euijung Ryu6, Joanna M Biernacka6, Al'ona Furmanchuk7, Thomas J George8, William Hogan9, Yonghui Wu8, Xi Yang8, Jiang Bian8, Myrna Weissman10, Priya Wickramaratne10, J John Mann10, Mark Olfson10, Thomas R Campion1,2, Mark Weiner1, Jyotishman Pathak1.
Abstract
OBJECTIVE: Social determinants of health (SDoH) are nonclinical dispositions that impact patient health risks and clinical outcomes. Leveraging SDoH in clinical decision-making can potentially improve diagnosis, treatment planning, and patient outcomes. Despite increased interest in capturing SDoH in electronic health records (EHRs), such information is typically locked in unstructured clinical notes. Natural language processing (NLP) is the key technology to extract SDoH information from clinical text and expand its utility in patient care and research. This article presents a systematic review of the state-of-the-art NLP approaches and tools that focus on identifying and extracting SDoH data from unstructured clinical text in EHRs.
Keywords: electronic health records; information extraction; machine learning; natural language processing; population health outcomes; social determinants of health
Year: 2021 PMID: 34613399 PMCID: PMC8633615 DOI: 10.1093/jamia/ocab170
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 7.942
Figure 1. The County Health Rankings model of population health.
Figure 2. PRISMA workflow of included articles.
Figure 3. Publication years of included articles.
Figure 4. (a) Frequency of SDoH categories in the collected publications. (b) Heatmap of the SDoH category combinations implemented in publications. (c) Year-wise frequencies of SDoH categories that were extracted using NLP.
Abbreviations: A, alcohol abuse/use; C, cigarettes/smoking status; CA, child abuse/adverse childhood experiences; CC, clinical care (access to care/quality of care); CS, community safety; DE, diet & exercise; DS, drug/substance abuse; E, employment; ED, education; EN, environment (water/air quality); F, financial/income issues; FI, food insecurity; H, housing issues; SA, sexual activity/abuse; SF, social connection/isolation or family problem; T, Transportation.
Figure 5. (a) Frequencies of rule-based and semiautomated methods for SDoH lexicon creation. (b) Frequencies of existing tools and systems (rule-based, supervised, and unsupervised) for SDoH identification/extraction. (c) Year-wise frequencies of different NLP methods that were used to extract different SDoH categories.
Abbreviations: A, alcohol abuse/use; C, cigarettes/smoking status; CA, child abuse/adverse childhood experiences; CC, clinical care (access to care/quality of care); CS, community safety; DE, diet & exercise; DS, drug/substance abuse; E, employment; ED, education; EN, environment (water/air quality); F, financial/income issues; FI, food insecurity; H, housing issues; SA, sexual activity/abuse; SF, social connection/isolation or family problem; T, Transportation.
Tools used for SDoH identification and the corresponding citations
| NLP systems, terminologies, and infrastructure | Task (citations) |
|---|---|
| cTAKES | alcohol use status, tobacco cessation, diet and exercise |
| Moonstone NLP | housing and social issues |
| ARC | homelessness |
| V3NLP | homelessness |
| MediClass | opioid-related overdose |
| I2E | social isolation |
| UMLS | lifestyle modification |
| HITEx | smoking status |
| MTERMS | homelessness, social support, and drug abuse |
| MedTagger and MedTime | smoking status |
| VINCI | MST |
| VISA | adverse childhood experiences |
| CRIS-IE | smoking |
| TextHunter | cannabis use |
Abbreviations: ARC, automated retrieval console; CRIS, clinical record interactive search; cTAKES, clinical Text Analysis and Knowledge Extraction System; HITEx, health information text extraction; MST, military sexual trauma; VISA, veterans indexed search for analytics; VINCI, VA Informatics and Computing Infrastructure.
EHR data sources used for SDoH experiments. Here, * marks different databases from the same source.
| Datasets | Citations |
|---|---|
| 100 synthetic data sets using Monte Carlo methods | |
| Academic Health Center Information Exchange (AHC-IE), Academic Health Center (AHC) | |
| Brigham and Women’s Hospital or Massachusetts General Hospital | |
| Centre Clinical Record Interactive Search (CRIS) | |
| Cerner Corporation, Kansas, MO | |
| Child Health Department Netherlands | |
| Columbia University Irving Medical Center (CUIMC), Columbia University Medical Center (CUMC) | |
| Epic EHR systems* | |
| Fairview Health System | |
| four HMOs | |
| Group Health, Washington State | |
| Informatics for Integrating Biology and the Bedside (i2b2) smoking database | |
| Kaiser Permanente* | |
| Level I Trauma Center | |
| Loyola University Medical Center | |
| Marshfield Clinic’s Enterprise Data Warehouse (MC-EDW)* | |
| Mayo Clinic | |
| Medical University of South Carolina (MUSC) Research Data Warehouse | |
| Midwestern academic medical center | |
| MIMIC-II | |
| MIMIC-III | |
| Minnesota Disability Determination Services | |
| MTSamples | |
| Multilevel academic health care system | |
| National Homeless Registry | |
| Loyola University Medical Center | |
| Partners Healthcare System | |
| SLaM Case Register | |
| South London and Maudsley (SLaM) Biomedical Research Centre (BRC) Case Register | |
| State child welfare agencies | |
| UK Clinical Record Interactive Search (UK-CRIS) | |
| University of Pittsburgh Medical Center (UPMC) | |
| University of Minnesota* | |
| University of Vermont Medical Center (UVMMC) | |
| University Hospital, University of Utah, Salt Lake City | |
| University of Massachusetts Memorial Health Care | |
| University of Utah Health Sciences Center | |
| University of Washington (UW) and Harborview Medical Centers | |
| Urban tertiary academic center 18 | |
| US academic health system | |
| Vanderbilt EHR, Vanderbilt Synthetic Derivative, Vanderbilt University Medical Center (VUMC) | |
| Veterans Health Administration, VA’s Corporate Data Warehouse (CDW)* | |
SDoH classes and the corresponding healthcare outcomes
| SDoH categories | Outcome |
|---|---|
| Transportation | Multiple social and behavioral factors |
| Housing issues | 30-day readmission |
| Environment (water/air quality) | Food and drug allergies |
| Employment | Suicide |
| Education | Mental and behavioral disorders and multiple SDoH |
| Financial issues/income | Multiple social and behavioral factors |
| Social connection/isolation or family problem | Multiple social and behavioral factors |
| Community safety | Mental illness |
| Food insecurity | Multiple social and behavioral factors |
| Child abuse/adverse childhood experiences | Childhood abuse |
| Sexual activity/abuse | Suicide |
| Drug/substance abuse | Chronic opioid therapy |
| Alcohol abuse/use | Multiple social and behavioral factors |
| Cigarettes/smoking status | Multiple social and behavioral factors |
| Diet & exercise | Dementia |
| Clinical care (access to care/quality of care) | Mental and behavioral disorders and multiple SDoH |