MIMIC-SBDH: A Dataset for Social and Behavioral Determinants of Health.

Hiba Ahsan, Emmie Ohnuki, Avijit Mitra, Hong Yu.

Abstract

Social and Behavioral Determinants of Health (SBDHs) are environmental and behavioral factors that have a profound impact on health and related outcomes. Given their importance, physicians document their patients' SBDHs in Electronic Health Records (EHRs). However, SBDHs are mostly documented in unstructured EHR notes, so determining their status requires manually reviewing the notes, which can be tedious. There is therefore a need to automate the identification of patients' SBDH status in EHR notes. In this work, we created MIMIC-SBDH, the first publicly available dataset of EHR notes annotated for patients' SBDH status. Specifically, we annotated 7,025 discharge summary notes for the status of 7 SBDHs and marked SBDH-related keywords. Using this annotated data for training and evaluation, we evaluated the performance of three machine learning models (Random Forest, XGBoost, and Bio-ClinicalBERT) on the task of identifying SBDH status in EHR notes. Performance ranged from an F1 score of 0.69 for Drug Use, the lowest, to 0.96 for Community-Present, the highest. In addition to standard evaluation metrics such as the F1 score, we used the CheckList tool (Ribeiro et al., 2020) to evaluate four capabilities that a model must possess to perform well on the task. The results revealed several shortcomings of the models and highlighted the need to perform more capability-centric evaluations in addition to standard metric comparisons.
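As a rough illustration of the kind of metric reported above, the following minimal sketch computes per-class and macro-averaged F1 for the status predictions of a single SBDH. The status labels (`present`, `past`, `never`, `unsure`), the helper names, and the choice of macro averaging are assumptions made for illustration only; the abstract does not state the label set or which F1 averaging was used.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Return {label: F1} for every label that appears in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in set(gold) | set(pred):
        # F1 = 2*TP / (2*TP + FP + FN)
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Hypothetical gold and predicted status labels for one SBDH (not real data).
gold = ["present", "past", "never", "present", "unsure", "never"]
pred = ["present", "never", "never", "present", "unsure", "past"]

print(per_class_f1(gold, pred))
print(macro_f1(gold, pred))
```

Macro averaging weights every status class equally, which makes weak performance on rare classes visible; a micro or weighted average would be an equally plausible choice.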

Year: 2021. PMID: 35005628. PMCID: PMC8734043.

Source DB: PubMed. Journal: Proc Mach Learn Res.



