Ellie Paige, Jessica Barrett, David Stevens, Ruth H Keogh, Michael J Sweeting, Irwin Nazareth, Irene Petersen, Angela M Wood.
Abstract
The benefits of using electronic health records (EHRs) for disease risk screening and personalized health-care decisions are being increasingly recognized. Here we present a computationally feasible statistical approach with which to address the methodological challenges involved in utilizing historical repeat measures of multiple risk factors recorded in EHRs to systematically identify patients at high risk of future disease. The approach is principally based on a 2-stage dynamic landmark model. The first stage estimates current risk factor values from all available historical repeat risk factor measurements via landmark-age-specific multivariate linear mixed-effects models with correlated random intercepts, which account for sporadically recorded repeat measures, unobserved data, and measurement errors. The second stage predicts future disease risk from a sex-stratified Cox proportional hazards model, with estimated current risk factor values from the first stage. We exemplify these methods by developing and validating a dynamic 10-year cardiovascular disease risk prediction model using primary-care EHRs for age, diabetes status, hypertension treatment, smoking status, systolic blood pressure, total cholesterol, and high-density lipoprotein cholesterol in 41,373 persons from 10 primary-care practices in England and Wales contributing to The Health Improvement Network (1997–2016). Using cross-validation, the model was well-calibrated (Brier score = 0.041, 95% confidence interval: 0.039, 0.042) and had good discrimination (C-index = 0.768, 95% confidence interval: 0.759, 0.777).
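The two stages described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: stage 1 here is a univariate random-intercept model with variance components assumed known (the paper uses multivariate mixed-effects models fitted per landmark age), and stage 2 simply evaluates a Cox-type risk formula. The function names, coefficients, means, and baseline survival below are all hypothetical values chosen for illustration.

```python
import math
from statistics import mean


def blup_current_value(measurements, pop_mean, var_between, var_within):
    """Stage 1 (simplified): shrinkage estimate of a person's underlying level.

    Assumes y_ij = pop_mean + u_i + e_ij with u_i ~ N(0, var_between) and
    e_ij ~ N(0, var_within); the variance components are taken as known.
    """
    n = len(measurements)
    if n == 0:
        # No recorded history: fall back to the population mean.
        return pop_mean
    # More measurements (or less measurement error) means less shrinkage.
    shrinkage = n * var_between / (n * var_between + var_within)
    return pop_mean + shrinkage * (mean(measurements) - pop_mean)


def ten_year_risk(baseline_survival, coefs, values, means):
    """Stage 2 (simplified): risk = 1 - S0(10) ** exp(linear predictor)."""
    lp = sum(b * (x - m) for b, x, m in zip(coefs, values, means))
    return 1.0 - baseline_survival ** math.exp(lp)


# Hypothetical inputs: population mean SBP 135 mm Hg, both variances 225.
one_reading = blup_current_value([165.0], 135.0, 225.0, 225.0)
four_readings = blup_current_value([165.0] * 4, 135.0, 225.0, 225.0)
print(one_reading, four_readings)  # a single high reading is shrunk more

# Hypothetical log hazard ratio per mm Hg and baseline 10-year survival.
print(ten_year_risk(0.95, [0.02], [four_readings], [135.0]))
```

The shrinkage factor is what lets a model of this kind cope with sporadic recording: a patient with one noisy reading is pulled strongly toward the population mean, whereas a patient with many readings keeps an estimate close to their own average.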
Year: 2018 PMID: 29584812 PMCID: PMC6030927 DOI: 10.1093/aje/kwy018
Source DB: PubMed Journal: Am J Epidemiol ISSN: 0002-9262 Impact factor: 4.897
Figure 1. Schematic showing the landmark age approach. The dashed lines indicate historical repeat measures of smoking status, systolic blood pressure, total cholesterol, and high-density lipoprotein cholesterol, modeled by means of landmark-age–specific multivariate linear mixed-effects models. The diamonds show the landmark age (time of risk prediction). The arrows indicate the 10-year follow-up to the point of a cardiovascular disease event or censoring, modeled via a landmark Cox model.
Characteristics of Participants in the Study Sample, The Health Improvement Network, United Kingdom, 1997–2016
| Characteristic | Study Sample: No. of Persons | Study Sample: % | Study Sample: Mean (SD) | Restricted Sample<sup>a</sup>: No. of Persons | Restricted Sample<sup>a</sup>: % | Restricted Sample<sup>a</sup>: Mean (SD) | Measurements per Year, Mean (SD): Study Sample | Measurements per Year, Mean (SD): Restricted Sample<sup>a</sup> |
|---|---|---|---|---|---|---|---|---|
| Age at study entry, years | | | 47.9 (13.6) | | | 47.5 (12.3) | | |
| Male sex | 17,592 | 54 | | 6,819 | 55 | | | |
| History of diabetes<sup>b</sup> | 3,743 | 12 | | 2,175 | 18 | | | |
| Prescription for blood-pressure–lowering medication<sup>b</sup> | 9,935 | 31 | | 4,685 | 38 | | | |
| Prescription for statins<sup>b</sup> | 5,617 | 17 | | 2,003 | 16 | | | |
| Current smoker<sup>b</sup> | 9,453 | 29 | | 3,358 | 27 | | 0.6 (0.4) | 0.6 (0.4) |
| Systolic blood pressure, mm Hg<sup>c</sup> | | | 134.8 (21.0) | | | 135.3 (21.1) | 1.4 (1.4) | 1.6 (1.4) |
| Total cholesterol level, mmol/L<sup>c</sup> | | | 5.5 (1.1) | | | 5.4 (1.0) | 0.4 (0.4) | 0.5 (0.4) |
| HDL-C level, mmol/L<sup>c</sup> | | | 1.4 (0.4) | | | 1.4 (0.4) | 0.3 (0.3) | 0.4 (0.3) |
Abbreviations: HDL-C, high-density lipoprotein cholesterol; SD, standard deviation.
a The restricted sample contained only patients with at least 1 measurement for each variable (smoking status, systolic blood pressure, total cholesterol, and HDL-C).
b Number and percentage were calculated across the follow-up period (e.g., a diagnosis of diabetes at any point during follow-up was counted as a history of diabetes for that individual).
c Based on the first measurement taken after study entry.
Crude Cardiovascular Disease Incidence Rate per 1,000 Person-Years According to Age at Study Entry, Sex, and Calendar Year of Statin Prescription, The Health Improvement Network, United Kingdom, 1997–2016
| Factor | No. of Incident CVD Cases | Total No. of PY | Crude IR per 1,000 PY |
|---|---|---|---|
| Age at study entry, years | |||
| 40–44 | 167 | 57,754 | 2.9 |
| 45–49 | 239 | 53,056 | 4.5 |
| 50–54 | 307 | 49,903 | 6.2 |
| 55–59 | 356 | 37,132 | 9.6 |
| 60–64 | 382 | 29,552 | 12.9 |
| 65–69 | 396 | 22,417 | 17.7 |
| 70–74 | 386 | 15,626 | 24.7 |
| 75–79 | 299 | 10,575 | 28.3 |
| 80–84 | 187 | 5,317 | 35.2 |
| Sex | |||
| Male | 1,520 | 198,797 | 7.6 |
| Female | 1,341 | 232,166 | 5.8 |
| Calendar year of statin initiation<sup>a</sup> | |||
| 1997–2001 | 225 | 4,828 | 46.6 |
| 2002–2006 | 968 | 38,857 | 24.9 |
| 2007–2011 | 687 | 46,662 | 14.7 |
| 2012–2016 | 365 | 27,543 | 13.3 |
Abbreviations: CVD, cardiovascular disease; IR, incidence rate; PY, person-years.
a Calendar year of the prescribing date of the index statin prescription.
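The crude incidence rates in the table follow directly from the case counts and person-years: rate = 1,000 × cases / person-years. A minimal sketch that recomputes a few rows (the function name is an illustrative choice, not taken from the source):

```python
def crude_incidence_rate(cases, person_years, per=1000):
    """Crude incidence rate per `per` person-years."""
    return per * cases / person_years


# (cases, person-years) copied from the table above.
rows = {
    "40–44": (167, 57754),
    "80–84": (187, 5317),
    "Male": (1520, 198797),
    "Female": (1341, 232166),
}

for label, (cases, py) in rows.items():
    print(f"{label}: {crude_incidence_rate(cases, py):.1f} per 1,000 PY")
```

Rounded to one decimal place, these reproduce the tabulated values of 2.9, 35.2, 7.6, and 5.8.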
Figure 2. Calibration and risk discrimination statistics for 3 models of cardiovascular disease risk prediction (n = 32,328), The Health Improvement Network, United Kingdom, 1997–2016. A) Calibration statistics for each risk prediction model. The graph shows the Brier score (▪) and 95% confidence interval (CI; bars) for each model. A lower Brier score is interpreted as better calibration. B) Risk discrimination statistics for each risk prediction model. The graph shows the C-index (▪) and 95% CI (bars) for each model. A higher C-index value is interpreted as better discrimination. C) Change in risk discrimination for each risk prediction model. The graph shows the change in C-index (▪) and its 95% CI (bars) for each risk prediction model in relation to the basic model (referent). The basic model included age and sex plus the last observed measures for diabetes status and hypertension treatment. The model with estimated current values of the risk factors included all factors in the basic model plus predicted current values for smoking status, systolic blood pressure, total cholesterol, and high-density lipoprotein cholesterol. The model with age interactions included all factors in the basic model plus predicted current values for smoking status, systolic blood pressure, total cholesterol, and high-density lipoprotein cholesterol, plus interactions of age with all risk factors.
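To make the two statistics in Figure 2 concrete, the sketch below computes a Brier score and Harrell's C-index on toy data. It rests on simplifying assumptions: the Brier score here treats outcomes as fully observed 0/1 values and ignores censoring, which the published analysis would have had to handle, and the source does not state which C-index estimator was used. All data values are invented for illustration.

```python
def brier_score(predicted_risks, outcomes):
    """Mean squared difference between predicted risk and observed 0/1 outcome."""
    pairs = zip(predicted_risks, outcomes)
    return sum((p - y) ** 2 for p, y in pairs) / len(outcomes)


def harrell_c_index(risks, times, events):
    """Share of comparable pairs ranked correctly by predicted risk.

    A pair (i, j) is comparable when person i has an observed event
    strictly before person j's event or censoring time.
    """
    concordant = 0.0
    comparable = 0
    n = len(risks)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count half
    return concordant / comparable


# Toy data (hypothetical): predicted risks, follow-up years, event flags.
risks = [0.30, 0.25, 0.20, 0.05]
times = [2, 8, 5, 10]
events = [1, 0, 1, 0]

print(brier_score(risks, events))
print(harrell_c_index(risks, times, events))
```

In the toy data, five pairs are comparable and four are ranked correctly, so the C-index is 0.8; a value of 0.5 would correspond to ranking no better than chance.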
Figure 3. Overall and age-adjusted values for C-index, The Health Improvement Network, United Kingdom, 1997–2016. Dashed lines, 95% confidence intervals.