Robert Goodloe, Eric Farber-Eger, Jonathan Boston, Dana C Crawford, William S Bush.
Abstract
Body mass index (BMI) is an important outcome and covariate adjustment for many clinical association studies. Accurate assessment of BMI, therefore, is a critical part of many study designs. Electronic health records (EHRs) are a growing source of clinical data for research purposes, and have proven useful for identifying and replicating genetic associations. EHR-based data collected for clinical and billing purposes have several unique properties, including a high degree of heterogeneity or "clinical noise." In this work, we propose a new method for reducing the problems of transcription and recording error for height and weight and apply these methods to a subset of the Vanderbilt University Medical Center biorepository known as EAGLE BioVU (n=15,863). After processing, we show that the distribution of BMI from EAGLE BioVU closely matches population-based estimates from the National Health and Nutrition Examination Surveys (NHANES), and that our approach retains far more data points than traditional outlier detection methods.
Year: 2017 PMID: 28815116 PMCID: PMC5543370
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Figure 1. The null (diagonal) line represents near-identical index and observed weight (A) and height (B) values. Deviating lines represent original values that were recorded in pounds (lb), kilograms (kg), double-converted kilograms (kgx2), meters (m), feet (ft), inches (in), and centimeters (cm). Lone circles likely represent transcription errors. Squares mark pediatric measurements, which may reflect true changes. Data points are color coded by age range. The plots were truncated to keep the graph interpretable.
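The deviating lines in Figure 1 suggest that many discrepant values differ from the index value by a fixed unit-conversion ratio. The following is a minimal Python sketch of that idea for weight only, not the authors' published algorithm. The function names, the 5% tolerance, and the reading of "kgx2" as a kilogram value multiplied once more by the pound-to-kilogram factor are all assumptions made for illustration.

```python
# Hypothetical sketch: label a recorded weight by the unit mix-up that best
# explains its ratio to the patient's index weight (assumed to be in kg).
KG_PER_LB = 0.45359237  # exact definition of the international pound

# Expected observed/index ratio for each recording scenario (assumed).
WEIGHT_RATIOS = {
    "kg": 1.0,              # value already in kilograms
    "lb": 1.0 / KG_PER_LB,  # pounds entered into a kilogram field
    "kgx2": KG_PER_LB,      # kilograms converted to kilograms a second time
}


def classify_weight(index_kg, observed, tol=0.05):
    """Return the scenario whose ratio is closest to observed/index.

    Returns "unexplained" when no scenario matches within the relative
    tolerance `tol` (an arbitrary illustrative threshold).
    """
    if index_kg <= 0 or observed <= 0:
        return "unexplained"
    ratio = observed / index_kg
    best_label, best_err = "unexplained", tol
    for label, expected in WEIGHT_RATIOS.items():
        err = abs(ratio - expected) / expected
        if err <= best_err:
            best_label, best_err = label, err
    return best_label


def to_kg(observed, label):
    """Undo the unit mix-up named by `label` and return kilograms."""
    return observed / WEIGHT_RATIOS[label]
```

For an index weight of 80 kg, a recorded value of 176 would be labeled `"lb"` and converted back to roughly 79.8 kg, while a value of 800 would stay `"unexplained"` and could be treated as a possible transcription error. The same ratio-matching idea could be extended to height with factors for meters, feet, inches, and centimeters.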
Figure 2. The distribution of weights (in kilograms) recorded in the electronic health record for a single patient over the course of seven years. The x-axis represents independent clinic visits in order of visit, and the y-axis represents the corresponding recorded weights, assumed to be in kilograms.
Figure 3. Distribution of raw body mass index (BMI) values from EAGLE BioVU.
Figure 4. Comparison of median body mass index (BMI) values from EAGLE BioVU (A) to BMI values from the National Health and Nutrition Examination Surveys (NHANES) (B).
Frequencies of all observations within EAGLE BioVU by processing method
| Variable | Methods | | | Raw Data Total |
|---|---|---|---|---|
| Weight | 155,781 (66%) | 226,685 (96%) | 230,701 (98%) | 235,624 |
| Height | 57,707 (51%) | 106,424 (94%) | 111,536 (99%) | 112,862 |
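
Each percentage in the table appears to be the number of observations retained by a method divided by the raw data total for that variable. A short Python check of that arithmetic, using the counts copied from the table, is below. The three method columns are kept in table order because their names are not given here, and the variable and function names are illustrative.

```python
# Counts copied from the table; method columns are listed in table order.
RAW_TOTAL = {"Weight": 235_624, "Height": 112_862}
RETAINED = {
    "Weight": [155_781, 226_685, 230_701],
    "Height": [57_707, 106_424, 111_536],
}


def retention_percent(retained, raw_total):
    """Share of raw observations retained, rounded to a whole percent."""
    return round(100 * retained / raw_total)


percentages = {
    variable: [retention_percent(n, RAW_TOTAL[variable]) for n in counts]
    for variable, counts in RETAINED.items()
}

for variable, values in percentages.items():
    print(variable, values)
```

The computed values agree with the percentages reported in the table (66%, 96%, 98% for weight and 51%, 94%, 99% for height).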