Regular expression-based learning to extract bodyweight values from clinical notes.

Maureen A Murtaugh, Bryan Smith Gibson, Doug Redd, Qing Zeng-Treitler.

Abstract

BACKGROUND: Bodyweight-related measures (weight, height, BMI, abdominal circumference) are extremely important for clinical care, research and quality improvement. These and other vital signs data are frequently missing from the structured tables of electronic health records. However, they are often recorded as text within clinical notes. In this project we sought to develop and validate a learning algorithm that would extract bodyweight-related measures from clinical notes in the Veterans Administration (VA) Electronic Health Record to complement the structured data used in clinical research.
METHODS: We developed the Regular Expression Discovery Extractor (REDEx), a supervised learning algorithm that generates regular expressions from a training set. The regular expressions generated by REDEx were then used to extract the numerical values of interest. To train the algorithm, we created a corpus of 268 outpatient primary care notes that were annotated by two annotators. This annotation served to develop the annotation process and to identify terms associated with bodyweight-related measures for training the supervised learning algorithm. Snippets from an additional 300 outpatient primary care notes were subsequently annotated independently by two reviewers to complete the training set. Inter-annotator agreement was calculated. REDEx was applied to a separate test set of 3561 notes to generate a dataset of weights extracted from text. We estimated the number of unique individuals who would otherwise not have bodyweight-related measures recorded in the VA Corporate Data Warehouse (CDW) and the number of additional bodyweight-related measures that would be captured.
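The extraction step in METHODS, applying regular expressions to note text to pull out numerical values, can be illustrated with a minimal Python sketch. This is not REDEx: REDEx learns its expressions from annotated snippets, whereas the pattern below is hand-written for illustration. The names `WEIGHT_PATTERN` and `extract_weights` and the sample note are hypothetical and do not come from the paper.

```python
import re

# Hypothetical hand-written pattern (an assumption for illustration only).
# REDEx generates its regular expressions from a training set instead.
WEIGHT_PATTERN = re.compile(
    r"\b(?:weight|wt)\b\.?\s*[:=]?\s*(\d{2,3}(?:\.\d+)?)\s*(lbs?|pounds|kg)\b",
    re.IGNORECASE,
)


def extract_weights(note):
    """Return (value, unit) pairs for each weight mention found in the note text."""
    return [
        (float(match.group(1)), match.group(2).lower())
        for match in WEIGHT_PATTERN.finditer(note)
    ]


# Invented sample text, not taken from any clinical record.
note = "Vitals: BP 120/80. Weight: 185.5 lbs. Pt reports wt 84 kg last month."
print(extract_weights(note))  # [(185.5, 'lbs'), (84.0, 'kg')]
```

A single hand-written pattern like this misses many phrasings found in real notes, which is the gap a learned set of expressions is meant to close.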
RESULTS: REDEx's performance was: accuracy=98.3%, precision=98.8%, recall=98.3%, F=98.5%. In the dataset of weights from 3561 notes, 7.7% of notes contained bodyweight-related measures that were not available as structured data. In addition, 2 more bodyweight-related measures were identified per individual per year.
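The reported F value is consistent with the reported precision and recall if "F" is taken to be the balanced F-measure (the harmonic mean of precision and recall). The abstract does not state which F variant was used, so that reading is an assumption. A short check, with the hypothetical helper name `f_measure`:

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in RESULTS, expressed as fractions.
f = f_measure(0.988, 0.983)
print(round(100 * f, 1))  # 98.5
```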
CONCLUSION: Bodyweight-related measures are frequently stored as text in clinical notes. A supervised learning algorithm can be used to extract these data. Implications for clinical care, epidemiology, and quality improvement efforts are discussed.

Keywords: Bodyweight; Natural language processing; Text classification

Year: 2015 PMID: 25746391 DOI: 10.1016/j.jbi.2015.02.009

Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317

Cited

9 in total

1. Extracting Information from Electronic Medical Records to Identify the Obesity Status of a Patient Based on Comorbidities and Bodyweight Measures.

Authors: Rosa L Figueroa; Christopher A Flores
Journal: J Med Syst Date: 2016-07-11 Impact factor: 4.460

2. A method to advance adolescent sexual health research: Automated algorithm finds sexual history documentation.

Authors: Caryn Robertson; Gargi Mukherjee; Holly Gooding; Swaminathan Kandaswamy; Evan Orenstein
Journal: Front Digit Health Date: 2022-07-22

Review 3. Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review.

Authors: Kory Kreimeyer; Matthew Foster; Abhishek Pandey; Nina Arya; Gwendolyn Halford; Sandra F Jones; Richard Forshee; Mark Walderhaug; Taxiarchis Botsis
Journal: J Biomed Inform Date: 2017-07-17 Impact factor: 6.317

4. Accurate Identification of Fatty Liver Disease in Data Warehouse Utilizing Natural Language Processing.

Authors: Joseph S Redman; Yamini Natarajan; Jason K Hou; Jingqi Wang; Muzammil Hanif; Hua Feng; Jennifer R Kramer; Roxanne Desiderio; Hua Xu; Hashem B El-Serag; Fasiha Kanwal
Journal: Dig Dis Sci Date: 2017-08-31 Impact factor: 3.199

Review 5. Overview of the First Natural Language Processing Challenge for Extracting Medication, Indication, and Adverse Drug Events from Electronic Health Record Notes (MADE 1.0).

Authors: Abhyuday Jagannatha; Feifan Liu; Weisong Liu; Hong Yu
Journal: Drug Saf Date: 2019-01 Impact factor: 5.606

6. Regular Expression-Based Learning for METs Value Extraction.

Authors: Douglas Redd; Jinqiu Kuang; April Mohanty; Bruce E Bray; Qing Zeng-Treitler
Journal: AMIA Jt Summits Transl Sci Proc Date: 2016-07-20

Review 7. Clinical concept extraction: A methodology review.

Authors: Sunyang Fu; David Chen; Huan He; Sijia Liu; Sungrim Moon; Kevin J Peterson; Feichen Shen; Liwei Wang; Yanshan Wang; Andrew Wen; Yiqing Zhao; Sunghwan Sohn; Hongfang Liu
Journal: J Biomed Inform Date: 2020-08-06 Impact factor: 6.317

Review 8. Factors influencing the development of primary care data collection projects from electronic health records: a systematic review of the literature.

Authors: Marie-Line Gentil; Marc Cuggia; Laure Fiquet; Camille Hagenbourger; Thomas Le Berre; Agnès Banâtre; Eric Renault; Guillaume Bouzille; Anthony Chapron
Journal: BMC Med Inform Decis Mak Date: 2017-09-25 Impact factor: 2.796

9. The internal validation of weight and weight change coding using weight measurement data within the UK primary care Electronic Health Record.

Authors: Brian D Nicholson; Paul Aveyard; Willie Hamilton; Clare R Bankhead; Constantinos Koshiaris; Sarah Stevens; Frederick Dr Hobbs; Rafael Perera
Journal: Clin Epidemiol Date: 2019-01-25 Impact factor: 4.790
