Ryung S Kim, Viswanathan Shankar.
Abstract
BACKGROUND: Electronic health records (EHR) are increasingly used as a tool to monitor population health. However, subject-level errors in the records can yield biased estimates of health indicators. There is an urgent need for methods that estimate the prevalence of health indicators from large, real-time EHR while correcting for this potential bias.
Keywords: Big data; Electronic health records; Measurement error; Multiple imputations; Population health surveillance; Selection bias
Year: 2020 PMID: 32252642 PMCID: PMC7137316 DOI: 10.1186/s12874-020-00956-6
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Fig. 1 Data elements in the 2013–14 NYC HANES, limited to the in-care population and stratified by whether the participant was in the chart review study, and 2013 NYC Macroscope
Simulation studies: prevalence estimate by four methods
| True prevalence, health survey measure (p1) | True prevalence, EHR measure (p2) | Health Survey | Post-stratified EHR | Mosteller estimator | Subject-level imputation estimator |
|---|---|---|---|---|---|
| 0.3 | 0.30 | 0.300 | 0.299 | 0.300 | 0.303 |
| 0.3 | 0.31 | 0.300 | 0.309 | 0.303 | 0.302 |
| 0.3 | 0.32 | 0.299 | 0.319 | 0.305 | 0.302 |
| 0.3 | 0.33 | 0.298 | 0.329 | 0.305 | 0.303 |
| 0.3 | 0.35 | 0.300 | 0.349 | 0.308 | 0.304 |
The health survey sample size (n1) and the number of subjects linked between the two sources (n12) are both 500
Simulation studies: square root of MSE of four methods
| True prevalence, health survey measure (p1) | True prevalence, EHR measure (p2) | Health Survey | Post-stratified EHR | Mosteller estimator | Subject-level imputation model |
|---|---|---|---|---|---|
| 0.3 | 0.30 | 0.021 | 0.015 | 0.019 | |
| 0.3 | 0.31 | 0.021 | 0.017 | 0.019 | |
| 0.3 | 0.32 | 0.022 | 0.019 | 0.021 | |
| 0.3 | 0.33 | 0.021 | 0.029 | 0.021 | |
| 0.3 | 0.35 | 0.049 | 0.023 | ||
Square root of MSE for estimating p1 is shown. The health survey sample size (n1) and the number of subjects linked between the two sources (n12) are both 500. The best performing method in each row is highlighted in bold
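As a rough check on the scale of these values, root MSE can be decomposed into bias and variance. The sketch below assumes simple random sampling and a plain sample proportion, which may not match the simulation design used in the study; the function names are illustrative, and the EHR size of 100,000 is borrowed from the sample-size simulations.

```python
import math


def rmse(bias: float, variance: float) -> float:
    """Root mean squared error: sqrt(bias^2 + variance)."""
    return math.sqrt(bias ** 2 + variance)


def srs_proportion_variance(p: float, n: int) -> float:
    """Variance of a sample proportion under simple random sampling (assumed)."""
    return p * (1.0 - p) / n


# Health survey: unbiased for p1, sample size n1 = 500 (from the table footnote).
p1, n1 = 0.3, 500
survey_rmse = rmse(0.0, srs_proportion_variance(p1, n1))

# An uncorrected EHR estimate that targets p2 instead of p1 has bias p2 - p1.
# n2 = 100,000 is an assumption taken from the sample-size simulations.
p2, n2 = 0.35, 100_000
ehr_rmse = rmse(p2 - p1, srs_proportion_variance(p2, n2))

print(f"health survey root MSE: {survey_rmse:.4f}")
print(f"uncorrected EHR root MSE: {ehr_rmse:.4f}")
```

Under these assumptions the survey value comes out near 0.020, in line with the health survey column, while the uncorrected EHR value is driven almost entirely by the bias term rather than by sampling variance.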
Simulation studies: square root of MSE by different sample sizes
| Size of health survey (n1) | Size of subjects linked between two sources (n12) | Health Survey | Post-stratified EHR | Mosteller estimator | Subject-level imputation model |
|---|---|---|---|---|---|
| 250 | 50 | 0.033 | 0.026 | 0.049 | |
| 250 | 125 | 0.031 | 0.024 | 0.046 | |
| 250 | 250 | 0.030 | 0.023 | 0.040 | |
| 500 | 100 | 0.022 | 0.019 | 0.032 | |
| 500 | 250 | 0.023 | 0.019 | 0.031 | |
| 500 | 500 | 0.022 | 0.019 | 0.021 | |
| 1000 | 200 | 0.016 | 0.019 | 0.027 | |
| 1000 | 500 | 0.015 | 0.019 | 0.022 | |
| 1000 | 1000 | 0.016 | 0.019 | 0.015 | |
Prevalence (p1) measured in health survey (Y1) is fixed at 0.3 and the prevalence (p2) measured in EHR (Y2) is fixed at 0.32. The size of EHR (n2) is fixed at 100,000. Square root of MSE for estimating p1 is shown. The best performing method in each row is highlighted in bold
Prevalence estimate and 95% confidence/credibility intervals of select health outcomes among adults in care in New York City (NYC), obtained from the NYC Macroscope 2013 and NYC HANES 2013–14
| Outcomes | NYC HANES | Crude NYC Macroscope | Post-stratified | Subject-level imputation model | Mosteller estimator |
|---|---|---|---|---|---|
| Hypertension Diagnosis | 34.3 (31.3, 37.4) | 33.7 (33.6, 33.8) | 34.7 (34.6, 34.8) | 35.6 (30.4, 41.1) | 34.7 (34.0, 35.4) |
| Diabetes Diagnosis | 13.3 (11.3, 15.6) | 14.8 (15.8, 16.0) | 14.9 (14.9, 15.0) | 13.8 (10.6, 17.7) | 13.9 (11.5, 16.5) |
| Smoking | 17.3 (15.1, 19.9) | 15.9 (15.8, 16.0) | 15.0 (14.9, 15.1) | 19.0 (16.0, 22.5) | 16.9 (14.4, 19.7) |
| Obesity | 31.7 (28.7, 34.8) | 29.1 (29.0, 29.2) | 28.0 (27.9, 28.1) | 30.9 (26.5, 35.7) | 31.1 (27.9, 34.6) |
| Depression | 19.0 (16.6, 21.6) | 8.6 (8.5, 8.6) | 8.3 (8.3, 8.4) | 20.3 (17.2, 23.9) | 18.9 (16.5, 21.5) |
| Influenza Vaccination | 48.6 (45.4, 51.8) | 21.2 (21.1, 21.3) | 21.7 (21.6, 21.8) | 48.2 (43.8, 52.5) | 48.5 (45.3, 51.7) |
All values are percentages; each cell shows the prevalence estimate followed by its 95% interval
Relative weights used in Mosteller estimator
| Outcomes | NYC HANES:Macroscope |
|---|---|
| Hypertension Diagnosis | 0.075:0.925 |
| Diabetes Diagnosis | 0.665:0.335 |
| Smoking | 0.812:0.188 |
| Obesity | 0.855:0.145 |
| Depression | 0.993:0.007 |
| Influenza Vaccination | 0.997:0.003 |
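One way to read these relative weights is as a convex combination of the two source estimates. The sketch below is an illustration, not the authors' implementation: it assumes the Macroscope input is the post-stratified estimate (the tables do not state this explicitly) and uses only the rounded point estimates reported above. The function name `weighted_combination` is hypothetical.

```python
# Relative weights (NYC HANES, NYC Macroscope), copied from the weights table.
weights = {
    "Hypertension Diagnosis": (0.075, 0.925),
    "Diabetes Diagnosis": (0.665, 0.335),
    "Smoking": (0.812, 0.188),
    "Obesity": (0.855, 0.145),
    "Depression": (0.993, 0.007),
    "Influenza Vaccination": (0.997, 0.003),
}

# Point estimates in percent: (NYC HANES, post-stratified NYC Macroscope).
# Using the post-stratified column as the EHR input is an assumption.
estimates = {
    "Hypertension Diagnosis": (34.3, 34.7),
    "Diabetes Diagnosis": (13.3, 14.9),
    "Smoking": (17.3, 15.0),
    "Obesity": (31.7, 28.0),
    "Depression": (19.0, 8.3),
    "Influenza Vaccination": (48.6, 21.7),
}


def weighted_combination(w_survey, w_ehr, est_survey, est_ehr):
    """Convex combination of two estimates; weights are normalized to sum to 1."""
    total = w_survey + w_ehr
    return (w_survey * est_survey + w_ehr * est_ehr) / total


combined = {
    outcome: weighted_combination(*weights[outcome], *estimates[outcome])
    for outcome in weights
}

for outcome, value in combined.items():
    print(f"{outcome}: {value:.2f}")
```

Under this assumption each combined value falls within 0.1 percentage points of the reported Mosteller estimate, with the small differences attributable to rounding of the inputs. The weights also show why the combined estimates for depression and influenza vaccination stay close to NYC HANES: the EHR source receives almost no weight for those outcomes.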