Re-CHARGE-AF: Recalibration of the CHARGE-AF Model for Atrial Fibrillation Risk Prediction in Patients With Acute Stroke.

Jeffrey M Ashburner^1,2, Xin Wang³, Xinye Li³, Shaan Khurshid^3,4, Darae Ko⁵, Ana Trisini Lipsanopoulos³, Priscilla R Lee³, Taylor Carmichael³, Ashby C Turner⁶, Corban Jackson⁷, Patrick T Ellinor^3,8, Emelia J Benjamin^9,10, Steven J Atlas^1,2, Daniel E Singer^1,2, Ludovic Trinquart^9,11, Steven A Lubitz^3,8, Christopher D Anderson¹².

Abstract

Background: Performance of existing atrial fibrillation (AF) risk prediction models in poststroke populations is unclear. We evaluated the predictive utility of an AF risk model in patients with acute stroke and assessed the performance of a fully refitted model.

Methods and Results: Within an academic hospital, we included patients aged 46 to 94 years discharged for acute ischemic stroke between 2003 and 2018. We estimated 5-year predicted probabilities of AF in 3 ways: using the Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation (CHARGE-AF) model, by recalibrating CHARGE-AF to the baseline risk of the sample, and by fully refitting a Cox proportional hazards model to the stroke sample (the Re-CHARGE-AF model). We compared discrimination and calibration between models and used 200 bootstrap samples for optimism-adjusted measures. Among 551 patients with acute stroke, there were 70 incident AF events over 5 years (cumulative incidence, 15.2%; 95% CI, 10.6%-19.5%). Median predicted 5-year risk from CHARGE-AF was 4.8% (quartile 1-quartile 3, 2.0%-12.6%) and from Re-CHARGE-AF was 16.1% (quartile 1-quartile 3, 8.0%-26.2%). For CHARGE-AF, discrimination was moderate (C statistic, 0.64; 95% CI, 0.57-0.70) and calibration was poor, underestimating AF risk (Greenwood-Nam-D'Agostino chi-square, P<0.001). Calibration with recalibrated baseline risk was also poor (Greenwood-Nam-D'Agostino chi-square, P<0.001). Re-CHARGE-AF improved discrimination (P=0.001) compared with CHARGE-AF (C statistic, 0.74 [95% CI, 0.68-0.79]; optimism-adjusted, 0.70 [95% CI, 0.65-0.75]) and was well calibrated (Greenwood-Nam-D'Agostino chi-square, P=0.97).

Conclusions: Covariates from an established AF risk model enable accurate estimation of AF risk in a poststroke population after recalibration. A fully refitted model was required to account for varying baseline AF hazard and strength of associations between covariates and incident AF.

Keywords: atrial fibrillation; ischemic stroke; predicted risk

Year: 2021; PMID: 34666503; PMCID: PMC8751842; DOI: 10.1161/JAHA.121.022363

Source DB: PubMed; Journal: J Am Heart Assoc; ISSN: 2047-9980; Impact factor: 5.501

Nonstandard abbreviations: CHARGE‐AF, Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation; Re‐CHARGE‐AF, fully refitted CHARGE‐AF.

Clinical Perspective

What Is New?

We evaluated the predictive utility of an established atrial fibrillation (AF) risk model (Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation [CHARGE‐AF]) in an acute stroke population, a setting where the risk of AF is higher than in the population samples from which the model was derived.

Both CHARGE‐AF and CHARGE‐AF recalibrated to the baseline risk of the poststroke sample were poorly calibrated and substantially underestimated the risk of AF.

A fully refitted CHARGE‐AF (Re‐CHARGE‐AF) model was required to account for varying baseline AF hazard and strength of associations between covariates and incident AF.

What Are the Clinical Implications?

As more risk estimates from prognostic models are incorporated into clinical tools, our results highlight the importance of evaluating model performance to ensure that accurate and useful information is being provided in the context of the population being treated.

Atrial fibrillation (AF) is a common cardiac arrhythmia associated with a 5‐fold increased risk of stroke. AF‐related strokes have a high rate of recurrence and are associated with substantial morbidity, long‐term disability, and mortality. Oral anticoagulants are effective for preventing strokes caused by AF. Identifying patients with stroke who are at high risk for AF is challenging but important for preventing recurrent strokes. AF may be asymptomatic even at the time of stroke, and detection may require extended cardiac rhythm monitoring. Clinical guidelines support prolonged rhythm monitoring (≈30 days) for AF within 6 months in patients who have experienced an acute ischemic stroke with no other apparent cause (class IIa, level of evidence C), and insertion of an implantable loop recorder to optimize detection of AF in patients with cryptogenic stroke in whom external ambulatory monitoring is inconclusive (class IIa, level of evidence B‐R). Detection of AF with cardiac rhythm monitoring may occur in up to 20% of patients, but varies greatly by the timing, duration, and type of monitor used. However, implantable cardiac rhythm monitoring is costly and invasive. Assessing individual patient risk for AF may enable more efficient use of cardiac rhythm monitoring in individuals most likely to have had an AF‐related stroke. Although developed and validated in multiple community cohorts, the Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation (CHARGE‐AF) risk prediction model has demonstrated poor calibration in healthcare‐related data sets and has not been evaluated in an acute stroke population.

We sought to assess the performance of the CHARGE‐AF model to predict 5‐year incident AF after acute stroke, where the risk of AF is higher than in the population samples from which the model was derived. Based on our results, we performed full refitting of the CHARGE‐AF (Re‐CHARGE‐AF) model and assessed whether the updated model achieved favorable performance for prediction of AF following ischemic stroke.

Methods

Study Sample

Eligible patients were aged 46 to 94 years and had been discharged from Massachusetts General Hospital following hospitalization for acute ischemic stroke between January 1, 2003, and December 31, 2018. To maximize follow‐up information, we included only those patients who were connected to a Massachusetts General Hospital primary care physician, defined by having at least 1 primary care visit during the 3 years before the stroke event. Patients were excluded if they had a prevalent diagnosis of AF at the time of stroke, were diagnosed with AF within 7 days of the stroke event, or did not visit their Massachusetts General Hospital primary care physician following discharge. This medical records–based study was approved with a waiver of informed consent by the local Mass General Brigham institutional review board. Mass General Brigham data contain protected health information and cannot be publicly shared. The data processing scripts used to perform analyses will be made available to interested researchers upon reasonable request to the corresponding author.

Ascertainment of Clinical Factors

Patient characteristics, comorbidities, and medication lists were obtained from a central data repository at Mass General Brigham. Age, sex, and race or ethnicity were ascertained at the time of stroke. Height, weight, and systolic and diastolic blood pressure (BP) recorded closest to the date of stroke were obtained. If a value was not documented on the date of the stroke event, we accepted the most proximal weight or BP documented in the electronic health record (EHR) within 5 years before the stroke event (lookback period for weight [median, 0.27 years; quartile 1–quartile 3, 0.08–1.02] and BP [median, 0.22 years; quartile 1–quartile 3, 0.05–0.76]). Sensitivity analyses limiting weight or BP to within 3 years before the stroke event excluded 93 patients from the sample but produced similar results in analyses of discrimination and calibration. The distribution of BP values documented on the date of stroke was similar to those documented before the stroke date, so we included BP values from the stroke date. We used the height documented closest to the stroke date without any time restrictions. Antihypertensive medication use was assessed based on any medications listed before the stroke date. Smoking status (current versus not) was assigned based on smoking status reported within a structured field in the health monitoring section of the EHR within 2 years; a problem list entry or International Classification of Diseases, Ninth Revision (ICD‐9) or International Classification of Diseases, Tenth Revision (ICD‐10), billing code within the prior year; or based on free text within the prior year using a natural language processing algorithm if smoking status using structured fields was unavailable. Patients were considered to have diagnosed diabetes at the time of stroke (type 1 or type 2) using a previously validated algorithm. 
Patients with congestive heart failure were identified using an internally validated algorithm that required 1 inpatient primary discharge or 2 outpatient visits with problem list terms or ICD‐9 or ICD‐10 billing codes for congestive heart failure (Table S1). Patients with previous myocardial infarction were identified using ICD‐9 or ICD‐10 billing codes. Death was ascertained based on EHR data linked to the Social Security Death Index.

Outcomes

The primary outcome was incident AF within 5 years of acute ischemic stroke. Incident AF status was ascertained using a previously validated EHR algorithm, which utilized problem list entries and inpatient or outpatient ICD‐9 or ICD‐10 codes.

Estimated AF Risk

We utilized CHARGE‐AF because it is a widely validated tool specifically designed to estimate risk of AF and has previously been compared with other metrics for estimating AF risk. We estimated AF risk based on clinical risk factors in 3 ways. First, we calculated the CHARGE‐AF score using published components and weights. We converted the CHARGE‐AF score into 5‐year predicted probability of AF using the formula $1 - S_0^{\exp(\sum \beta X - \overline{\sum \beta X})}$, where $\sum \beta X$ is an individual’s CHARGE‐AF score, and $S_0$ and $\overline{\sum \beta X}$ are the published baseline 5‐year AF‐free survival and mean score. Second, we implemented CHARGE‐AF while recalibrating it to the baseline risk of the sample. We generated an updated baseline risk by calculating the average 5‐year AF‐free survival and calculated the mean CHARGE‐AF score in the study sample. The CHARGE‐AF score was converted into 5‐year predicted probability using the same formula with updated constants: $S_0$ set to the average 5‐year AF‐free survival and $\overline{\sum \beta X}$ set to the mean CHARGE‐AF score in the study sample. Third, because adjusting the baseline risk alone did not result in a well‐calibrated model, we fully refitted a Cox proportional hazards model with AF incidence within 5 years as the outcome and included covariate terms for each component of the CHARGE‐AF model to create the Re‐CHARGE‐AF model. Censoring occurred at the time of death, last primary care visit during follow‐up if no visit history after 5 years, or after 5 years of follow‐up. We calculated predicted probabilities of 5‐year AF for the Re‐CHARGE‐AF model using the formula $1 - S_0^{\exp(\sum \beta X - \overline{\sum \beta X})}$, where the baseline risk $S_0$ represents the 5‐year AF‐free survival at the mean values of the risk factors in the sample, $\sum \beta X$ is an individual’s Re‐CHARGE‐AF score calculated using the regression coefficients from the updated Cox model (β) and the level for each risk factor (X), and the remaining constant $\overline{\sum \beta X}$ is the Re‐CHARGE‐AF score at the mean values of the risk factors in the sample.
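The three approaches described above share one functional form and differ only in the constants and coefficients supplied. As a minimal sketch (not the authors' analysis code), the Python below computes a CHARGE‐AF score from the published coefficients listed in Table 2 and converts a score into a 5‐year predicted probability. The function names, covariate keys, example patient, and the baseline survival and mean score passed in the example are illustrative assumptions; the published and sample‐specific constants are not reproduced here.

```python
import math

# Published CHARGE-AF coefficients as listed in Table 2 of this article,
# each expressed per the unit shown there (e.g., age per 5 years).
CHARGE_AF_BETA = {
    "age_per_5y": 0.508,
    "race_white": 0.465,
    "height_per_10cm": 0.248,
    "weight_per_15kg": 0.115,
    "sbp_per_20mmhg": 0.197,
    "dbp_per_10mmhg": -0.101,
    "current_smoker": 0.359,
    "antihypertensive_use": 0.349,
    "diabetes": 0.237,
    "heart_failure": 0.701,
    "myocardial_infarction": 0.496,
}


def linear_predictor(covariates, betas):
    """Return the score, i.e. the sum of beta * X over all model terms."""
    return sum(betas[name] * covariates[name] for name in betas)


def five_year_risk(score, baseline_survival, mean_score):
    """Convert a score to risk: 1 - S0 ** exp(score - mean score)."""
    return 1.0 - baseline_survival ** math.exp(score - mean_score)


# Hypothetical patient: 70 y, White, 170 cm, 75 kg, BP 140/80, no comorbidities.
patient = {
    "age_per_5y": 70 / 5,
    "race_white": 1,
    "height_per_10cm": 170 / 10,
    "weight_per_15kg": 75 / 15,
    "sbp_per_20mmhg": 140 / 20,
    "dbp_per_10mmhg": 80 / 10,
    "current_smoker": 0,
    "antihypertensive_use": 0,
    "diabetes": 0,
    "heart_failure": 0,
    "myocardial_infarction": 0,
}

score = linear_predictor(patient, CHARGE_AF_BETA)
# Placeholder constants (assumptions), NOT the published or study values.
risk = five_year_risk(score, baseline_survival=0.85, mean_score=12.5)
print(f"score={score:.3f}, illustrative 5-year risk={risk:.3f}")
```

Recalibrating the baseline risk corresponds to changing only `baseline_survival` and `mean_score`, whereas full refitting also replaces the coefficient dictionary. A patient whose score equals the mean score receives a predicted risk of exactly one minus the baseline survival.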

Statistical Analysis

For descriptive data we calculated means and SDs or numbers and percentages. We plotted the distribution of the estimated 5‐year predicted probability of the CHARGE‐AF model and the Re‐CHARGE‐AF model. To assess model performance, we compared discrimination and calibration between the CHARGE‐AF model and the Re‐CHARGE‐AF model. We assessed discrimination by comparing hazard ratios (HRs) among groups defined by both the CHARGE‐AF model and the Re‐CHARGE‐AF model. For each model, we created groups of predicted risk based on tertiles of the linear predictor values and then based on the 16th, 50th, and 84th percentiles of the linear predictor values. We also assessed discrimination by calculating Harrell C statistic and Royston‐Sauerbrei R² and by plotting cumulative incidence curves for risk groups. We compared Harrell C statistic between CHARGE‐AF and Re‐CHARGE‐AF with bias correction using 200 bootstrap samples. We assessed calibration by visually comparing the predicted and observed 5‐year AF risks from the CHARGE‐AF model, the CHARGE‐AF model with recalibrated baseline risk, and the Re‐CHARGE‐AF model with patients divided into risk groups based on quintiles, and also tested calibration using the Greenwood‐Nam‐D’Agostino test (where a significant P value suggested the presence of miscalibration). To assess internal validity of discrimination and calibration estimates for the Re‐CHARGE‐AF model, we constructed 200 bootstrap samples and calculated the estimate of optimism and the optimism‐adjusted C statistic, and generated an optimism‐corrected calibration plot. We considered a 2‐sided P value <0.05 to indicate statistical significance.
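To make the discrimination measure concrete, the following is a minimal sketch of Harrell C statistic for right‐censored data. It is an illustration rather than the software used in the study: the function name and the toy data are assumptions, and pairs with tied follow‐up times are simply skipped, which is a simplification.

```python
def harrell_c(times, events, scores):
    """Harrell C statistic for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event and
    subject j was followed for longer. The pair is concordant when the
    subject with the earlier event has the higher predicted score; tied
    scores count as half. Pairs with tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): follow-up in years, event flag, linear predictor.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [1, 1, 0, 1]
toy_scores = [4.0, 3.0, 2.0, 1.0]
print(harrell_c(toy_times, toy_events, toy_scores))
```

A value of 0.5 corresponds to chance‐level ranking and 1.0 to perfect ranking of event times by predicted score.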

Results

Among 1110 patients discharged alive following an admission for acute ischemic stroke and connected to a Massachusetts General Hospital primary care physician, 228 (20.5%) had a prevalent diagnosis of AF, 132 (11.9%) had no follow‐up primary care visits after discharge, 68 (6.1%) did not meet age eligibility, 81 (7.3%) had missing data preventing AF risk estimation, and 50 (4.5%) had AF diagnosed within 7 days of the stroke, resulting in 551 patients for analysis (Figure 1). The mean age of patients was 68.0 years (SD, 11.8 years), 45.7% were women, and 80.6% were non‐Hispanic White. Additional baseline characteristics included in AF risk estimation are shown in Table 1. Characteristics of 559 patients with acute ischemic stroke linked with a primary care physician excluded from analyses (Table S2) and of patients from the original CHARGE‐AF derivation sample (Table S3) compared with the 551 patients included in these analyses are shown in the supplementary material.
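The exclusion counts above can be checked with simple arithmetic. The short sketch below reproduces the analytic sample size and the reported percentages from the counts given in the text; the variable names are illustrative.

```python
# Counts taken from the paragraph above.
screened = 1110
exclusions = {
    "prevalent AF": 228,
    "no follow-up primary care visit": 132,
    "outside age range": 68,
    "missing data for risk estimation": 81,
    "AF within 7 days of stroke": 50,
}

analytic_sample = screened - sum(exclusions.values())
percentages = {
    reason: round(100 * count / screened, 1)
    for reason, count in exclusions.items()
}

print(analytic_sample)
for reason, pct in percentages.items():
    print(f"{reason}: {pct}%")
```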

Figure 1

Patient flow diagram.

Table 1

Baseline Patient Characteristics

	N=551
Age, y	68.0±11.8
Female sex	252 (45.7)
Race or ethnicity
Non‐Hispanic White	444 (80.6)
Black	48 (8.7)
Asian	17 (3.1)
Hispanic	19 (3.5)
Other/unknown*	23 (4.2)
Height, cm	167.9±10.9
Weight, kg	81.7±18.6
Systolic BP, mm Hg	141±24
Diastolic BP, mm Hg	77±12
Smoking (current)	114 (20.7)
Antihypertensive medication use	309 (56.1)
Diabetes	160 (29.0)
Heart failure	45 (8.2)
Myocardial infarction	34 (6.2)

Values are mean±SD or number (percentage). BP indicates blood pressure.

*Other refers to 1 patient with race listed as "American Indian/Native Alaskan." The other 22 patients have unknown race.

Patient flow diagram.

There were a total of 1110 patients discharged alive with acute ischemic strokes who were connected to a Massachusetts General Hospital primary care physician between 2003 and 2018. After applying the specified exclusion criteria, the analytic sample included 551 patients. AF indicates atrial fibrillation.

Over 5 years of follow‐up, there were 70 incident AF diagnoses (Kaplan‐Meier cumulative incidence, 15.2%; 95% CI, 10.6%–19.5%) and 32 death events that occurred before an AF diagnosis or the end of follow‐up (5.8%). The median duration of follow‐up among the entire patient sample was 1.92 years and among censored patients was 2.25 years. An estimate of potential follow‐up using the reverse Kaplan‐Meier method was 2.54 years (quartile 1–quartile 3: 0.99–5.00 years).

The estimated β coefficients and HRs from the original CHARGE‐AF model and the Re‐CHARGE‐AF model are shown in Table 2. Heart failure, age, and myocardial infarction remained strong predictors of AF incidence in both models, although there were large differences with wide CIs in the β coefficients and corresponding HRs for other variables (Figure S1). This includes some variables indicating decreased AF risk in our sample in contrast to the original CHARGE‐AF model results.

The distributions of the estimated 5‐year predicted probability of AF for the CHARGE‐AF model and the Re‐CHARGE‐AF model are depicted in Figure 2. The distribution of AF risk for Re‐CHARGE‐AF is shifted to the right towards higher estimated AF risk. The median predicted 5‐year AF risk from the CHARGE‐AF model was 4.8% (quartile 1–quartile 3, 2.0%–12.6%). The median predicted 5‐year AF risk from the refitted model was 16.1% (quartile 1–quartile 3, 8.0%–26.2%).
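The Kaplan‐Meier cumulative incidence reported above accounts for patients censored before 5 years. As a hedged illustration of how such an estimate is formed, the sketch below implements the product‐limit estimator on toy data; the function names and data are hypothetical, and the estimator treats death as ordinary censoring, matching the censoring rule described in the Methods rather than a competing‐risk analysis.

```python
def km_survival(times, events, horizon):
    """Kaplan-Meier (product-limit) event-free survival at `horizon`.

    times  : follow-up time for each patient
    events : 1 if the event (here, AF) was observed, 0 if censored
    """
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        n_events = sum(1 for x, e in zip(times, events) if e and x == t)
        survival *= 1.0 - n_events / at_risk
    return survival


def km_cumulative_incidence(times, events, horizon):
    """Cumulative incidence as one minus Kaplan-Meier survival."""
    return 1.0 - km_survival(times, events, horizon)


# Toy data (hypothetical): years of follow-up and AF indicator.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 0, 1]
print(km_cumulative_incidence(toy_times, toy_events, horizon=4))
```

The reverse Kaplan‐Meier estimate of potential follow‐up uses the same estimator with the event indicator flipped, so that censoring is treated as the event of interest.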

Table 2

Estimated β Coefficients From the CHARGE‐AF and the Re‐CHARGE‐AF Models

	CHARGE‐AF Estimated β (SE) ¹²	CHARGE‐AF HR (95% CI) ¹²	Re‐CHARGE‐AF Estimated β (SE)	Re‐CHARGE‐AF HR (95% CI)
Age (5 y)	0.508 (0.022)	1.66 (1.59–1.74)	0.286 (0.065)	1.33 (1.17–1.51)
Race (White)*	0.465 (0.093)	1.59 (1.33–1.91)	−0.686 (0.309)	0.50 (0.27–0.92)
Height (10 cm)	0.248 (0.036)	1.28 (1.19–1.38)	−0.133 (0.128)	0.88 (0.68–1.13)
Weight (15 kg)	0.115 (0.033)	1.12 (1.05–1.20)	0.421 (0.117)	1.52 (1.21–1.92)
Systolic BP (20 mm Hg)	0.197 (0.033)	1.22 (1.14–1.30)	0.023 (0.114)	1.02 (0.82–1.28)
Diastolic BP (10 mm Hg)	−0.101 (0.032)	0.90 (0.85–0.96)	−0.116 (0.121)	0.89 (0.70–1.13)
Smoking (current)	0.359 (0.091)	1.43 (1.20–1.71)	−0.517 (0.387)	0.60 (0.28–1.27)
Antihypertensive medication use	0.349 (0.063)	1.42 (1.25–1.60)	0.004 (0.273)	1.00 (0.59–1.72)
Diabetes (yes)	0.237 (0.073)	1.27 (1.10–1.46)	−0.488 (0.291)	0.61 (0.35–1.09)
Heart failure (yes)	0.701 (0.106)	2.02 (1.64–2.48)	0.627 (0.379)	1.87 (0.89–3.93)
Myocardial infarction (yes)	0.496 (0.089)	1.64 (1.38–1.96)	0.282 (0.396)	1.33 (0.61–2.88)

BP indicates blood pressure; and HR, hazard ratio.

*For the Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation (CHARGE‐AF) model, the race coefficient corresponds to White persons compared with Black persons. For the fully refitted CHARGE‐AF (Re‐CHARGE‐AF) model, the race coefficient corresponds to non‐Hispanic White persons compared with persons from all other race/ethnic groups.
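Each hazard ratio in Table 2 follows from its coefficient as HR = exp(β), with an approximate 95% CI of exp(β ± 1.96 × SE). The sketch below, with an illustrative function name, reproduces two Re‐CHARGE‐AF rows from their reported β and SE values, assuming the intervals are standard Wald intervals.

```python
import math


def hazard_ratio_with_ci(beta, se, z=1.96):
    """Return (HR, lower, upper) for a Cox coefficient and its standard error."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Re-CHARGE-AF rows from Table 2: age (per 5 y) and weight (per 15 kg).
for label, beta, se in [("Age (5 y)", 0.286, 0.065), ("Weight (15 kg)", 0.421, 0.117)]:
    hr, lower, upper = hazard_ratio_with_ci(beta, se)
    print(f"{label}: HR {hr:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

A negative coefficient gives a hazard ratio below 1, which is how the refitted coefficients for race, smoking, and diabetes point toward lower AF risk in this sample, although their CIs are wide.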

Figure 2

Density plot of the predicted 5‐year probabilities of atrial fibrillation (AF).

The plot depicts the distribution of predicted 5‐year probability from the Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation (CHARGE‐AF) model (pink) and the fully refitted CHARGE‐AF (Re‐CHARGE‐AF) model (blue). Overlap in the distributions is depicted in gray.

For the CHARGE‐AF model, the HR for AF was 4.93 (95% CI, 2.38–10.21) for the highest risk tertile group and 2.81 (95% CI, 1.30–6.07) for the middle risk tertile, both compared with the lowest risk tertile. By comparison, for the Re‐CHARGE‐AF model, the HR for AF was 6.69 (95% CI, 3.15–14.23) for its highest risk tertile and 2.33 (95% CI, 1.02–5.37) for the middle tertile. The C statistic for the CHARGE‐AF model was 0.64 (95% CI, 0.57–0.70) and the Royston‐Sauerbrei D statistic was 0.83 (95% CI, 0.46–1.20). For the Re‐CHARGE‐AF model, the C statistic was 0.74 (95% CI, 0.68–0.79) and the Royston‐Sauerbrei D statistic was 1.30 (95% CI, 1.10–1.50). The optimism‐adjusted C statistic for Re‐CHARGE‐AF was 0.70 (95% CI, 0.65–0.75) and was significantly greater than the C statistic for the CHARGE‐AF model (P=0.001).

Cumulative incidence plots stratified by tertile groups of predicted risk based on the CHARGE‐AF model and the Re‐CHARGE‐AF model are shown in Figure 3A and 3B. The evaluation of discrimination with 4 risk groups based on the 16th, 50th, and 84th percentiles is shown in Table S4 and Figure S2. There is separation between the cumulative incidence curves for both CHARGE‐AF and Re‐CHARGE‐AF when stratified into 3 or 4 risk groups. In both instances, the highest risk group in the Re‐CHARGE‐AF model demonstrated greater separation from the next highest risk group (Kaplan‐Meier estimate for highest tertile: 33.4%; middle tertile: 16.1%; >84th percentile: 41.7%; 50th–84th percentile: 24.3%) compared with CHARGE‐AF (Kaplan‐Meier estimate for highest tertile: 31.7%; middle tertile: 19.6%; >84th percentile: 32.1%; 50th–84th percentile: 26.1%).
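The gap between the apparent and optimism‐adjusted C statistics for Re‐CHARGE‐AF (0.74 versus 0.70) implies an estimated optimism of about 0.04. The sketch below shows one common way such an adjustment is computed, the bootstrap optimism correction, written generically. It is an assumption that this matches the study's procedure in detail; `fit` and `metric` are hypothetical callables supplied by the user, and the toy usage only demonstrates the mechanics.

```python
import random


def optimism_corrected(data, fit, metric, n_boot=200, seed=0):
    """Bootstrap optimism correction for a performance metric.

    data   : sequence of observations
    fit    : callable, fit(sample) -> model
    metric : callable, metric(model, sample) -> float (higher is better)

    For each bootstrap sample, the model is refitted and evaluated on both
    the bootstrap sample and the original data; the mean difference is the
    optimism, which is subtracted from the apparent performance.
    """
    rng = random.Random(seed)
    apparent = metric(fit(data), data)
    total_optimism = 0.0
    for _ in range(n_boot):
        boot = [rng.choice(data) for _ in data]
        model = fit(boot)
        total_optimism += metric(model, boot) - metric(model, data)
    optimism = total_optimism / n_boot
    return apparent - optimism, optimism


# Toy usage (hypothetical): the "model" is a sample mean and the metric is
# the negative mean squared error of that mean on a given sample.
values = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
fit_mean = lambda sample: sum(sample) / len(sample)
neg_mse = lambda m, sample: -sum((x - m) ** 2 for x in sample) / len(sample)
corrected, optimism = optimism_corrected(values, fit_mean, neg_mse)
print(corrected, optimism)
```

In a survival setting, `fit` would refit the Cox model on each bootstrap sample and `metric` would return the C statistic of that model on the sample it is given.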

Figure 3

Cumulative risk of atrial fibrillation (AF) stratified by tertile groups of predicted AF risk.

A, Depicts the cumulative risk of AF by tertile groups (green: lowest tertile [0.21%–2.68%]; blue: middle tertile [2.72%–9.21%]; red: highest tertile [9.23%–74.14%]) of predicted AF risk for the Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation (CHARGE‐AF) model. B, Depicts the cumulative risk of AF by tertile groups (green: lowest tertile [1.09%–10.98%]; blue: middle tertile [11.05%–21.84%]; red: highest tertile [21.93%–81.47%]) of predicted AF risk for the fully refitted CHARGE‐AF (Re‐CHARGE‐AF) model.

Calibration of the CHARGE‐AF model was poor, with the plot of observed 5‐year AF risk versus predicted 5‐year AF risk demonstrating marked underestimation of AF risk (Figure 4A) and the Greenwood‐Nam‐D’Agostino test indicating miscalibration (chi‐square: 24.8, P<0.001). Calibration of the CHARGE‐AF model with recalibrated baseline risk was also poor (Greenwood‐Nam‐D’Agostino chi‐square: 37.6, P<0.001; Figure S3). In contrast, the Re‐CHARGE‐AF model appeared well calibrated both with and without optimism adjustment (Figure 4B), as well as by the Greenwood‐Nam‐D’Agostino test (chi‐square: 0.53, P=0.97).

Figure 4

Calibration plots of observed 5‐year atrial fibrillation (AF) risk vs predicted 5‐year AF risk in quintile groups.

A, Depicts the plot of observed 5‐year AF risk (y‐axis) vs predicted 5‐year AF risk (x‐axis) for the Cohorts for Heart and Aging Research in Genomic Epidemiology for Atrial Fibrillation (CHARGE‐AF) model in blue, while the optimal calibration is shown in gray. B, Depicts the plot of observed 5‐year AF risk (y‐axis) vs predicted 5‐year AF risk (x‐axis) for the fully refitted CHARGE‐AF (Re‐CHARGE‐AF) model in blue and the optimism‐corrected calibration plot in orange, while the optimal calibration is shown in gray.

Discussion

Among over 500 primary care patients discharged from a regional stroke referral center after acute ischemic stroke, we observed that the CHARGE‐AF risk model achieved moderate discrimination for incident AF but was poorly calibrated and substantially underestimated the risk of AF. Recalibration of CHARGE‐AF to the baseline hazard of our poststroke sample was insufficient to achieve accurate absolute AF risk estimates. In contrast, a fully refitted model, Re‐CHARGE‐AF, demonstrated substantially greater discrimination of AF and achieved good calibration between predicted and observed AF incidence. Our findings suggest that accurate AF risk estimation in the poststroke setting can be achieved using covariates from an established AF risk model, but only after adjustment to account both for varying baseline risk and relative influence of covariates. Given that AF is an important predictor of recurrent stroke, our findings may enable accurate estimation of AF risk and ultimately aid in clinical management decisions in patients with stroke.

The CHARGE‐AF risk model was developed in community‐based cohorts to predict incident AF. Although it has been externally validated in multiple cohorts, its discriminatory performance and calibration in an acute stroke population, whose risk of AF is elevated, have not previously been assessed. Our results demonstrate that calculating 5‐year predicted probability of incident AF in an acute stroke population using the published CHARGE‐AF model components and weights achieves moderate discrimination but poor calibration. A full model refitting comprising the CHARGE‐AF score components was required to achieve a more discriminative risk score that is well calibrated in the poststroke population. Prior research has shown that cardiac monitoring following acute stroke may be underutilized and is not associated with predicted risk of AF. Future research is needed to evaluate whether accurate assessments of AF risk in acute stroke survivors increase appropriate poststroke cardiac monitor utilization and identification of AF.

Our findings support the need to evaluate both discrimination and calibration of prognostic models before implementation in clinical practice. Even if a model demonstrates good discrimination, poor calibration can make predictions based on the model misleading. We observed that the CHARGE‐AF model underestimated AF risk in a poststroke population, which may impact clinical decisions by physicians and may misrepresent risk to patients. For example, if predicted AF risk is used to determine whether extended or ambulatory cardiac monitoring is appropriate following stroke, poor calibration may lead to inappropriately lower utilization via underestimation of AF risk. Despite the importance of both discrimination and calibration in making a model clinically useful, systematic reviews have shown that calibration is assessed less often. The C2HEST (coronary artery disease or chronic obstructive pulmonary disease [1 point each]; hypertension [1 point]; elderly [age ≥75 years, 2 points]; systolic heart failure [2 points]; thyroid disease [hyperthyroidism, 1 point]) score, a score originally developed in a general population of Asian patients to predict incident AF, was recently evaluated in a poststroke population in France. While the C2HEST score showed adequate discrimination in this population, calibration of the model was not assessed. As our study demonstrates, recalibration and even refitting may be necessary in order to accurately present risk.

Calibration of prognostic models may be affected by several factors. The underlying risk of disease incidence and other patient characteristics may differ between where an algorithm is developed and where it is implemented. In addition, calibration may be impacted by secular trends in disease incidence.
We applied the CHARGE‐AF model to an acute stroke sample from an academic medical hospital, which corresponds to a target population with high underlying risk of developing AF, and ascertained predictors using EHR data. Poor calibration may be expected since CHARGE‐AF was developed in community‐based cohorts with lower incidence of AF and routine follow‐up data collection. Additionally, the derivation data for CHARGE‐AF included only White and Black persons, while our sample included additional racial and ethnic groups. We used the coefficient for race from CHARGE‐AF, although it may not be applicable to other groups.

This study has several potential limitations. We utilized the CHARGE‐AF model since it is the most widely used and externally validated. Other AF risk scores may have performed differently in a poststroke sample. Ascertainment of clinical predictors and incidence of AF was based on retrospective assessment of EHR documentation, which may be associated with misclassification. Our sample was limited to only those patients connected to our primary care network, a study design choice to increase the probability of having adequate follow‐up for outcome assessment. Despite this, we do not have the ability to fully ascertain incident diagnoses of AF for those patients who left the network and were censored during the follow‐up period. However, the proportion of patients diagnosed with AF following discharge in our sample compares favorably with a meta‐analysis of ambulatory AF diagnoses poststroke. Limited sample size led to a modest number of incident AF events and imprecise estimates. Simulation studies suggest a minimum of 100 events for external validation of logistic regression models to detect differences in model performance for calibration and discrimination. However, we did detect evidence of decreased calibration when applying CHARGE‐AF in our stroke sample despite the relatively low number of incident AF events, which motivated the recalibration. Our study was conducted within a single‐center tertiary academic hospital with patients who were largely of European ancestry, so generalizability may be limited. We did not perform external validation of our refitted model as our goal was not to propose a new standard model but rather to evaluate and correct the calibration of an existing and widely used model. However, we did observe some decrement in prediction accuracy in internal validation. It is possible that improved prediction of AF incidence could be achieved by building a new model or adding new risk factors that may be predictive in a poststroke population; however, that was not the objective of the current study. Unmeasured confounding may have impacted estimates of model coefficients; thus, the coefficients, while predictive of AF risk in our representative Massachusetts General Hospital ischemic stroke population, are not intended to represent biologically informative markers of disease risk. We do not propose external use of our derived coefficients. Rather, our findings suggest that, when possible, recalibration or refitting of existing models within populations in whom deployment is intended may facilitate substantially more accurate absolute risk estimates.

In conclusion, in a sample of patients with acute stroke connected to primary care, we found that the CHARGE‐AF risk model exhibited moderate discrimination of incident AF; however, it was poorly calibrated and underestimated true AF risk. A fully refitted model in our stroke sample substantially improved discrimination and was well calibrated. As we move towards incorporating risk estimates from prognostic models into clinical tools to improve decision making, it is critical to evaluate model performance, including both calibration and discrimination, to ensure that we are providing the most useful information in the context of the population being treated.

Sources of Funding

Drs Lubitz, Anderson, Trinquart, and Ashburner are supported by the American Heart Association (AHA) 18SFRN34250007 and 18SFRN34150007. Dr Ashburner is supported by National Institutes of Health (NIH) grant K01HL148506. Dr Lubitz is supported by NIH grant 1R01HL139731. Drs Ellinor and Benjamin are supported by NIH grant 1R01HL092577 and the AHA (18SFRN34110082). Dr Ellinor is supported by grants from the Fondation Leducq (14CVD01) and the NIH (K24HL105780). Dr Anderson is supported by NIH grants R01NS103924 and U01NS069763, AHA‐Bugher Foundation Centers for Excellence in Hemorrhagic Stroke, the Massachusetts General Hospital Center for Neuroscience, and the Henry and Allison McCance Center for Brain Health. Dr Khurshid is supported by NIH T32HL007208. Dr Ko is supported by the American College of Cardiology Foundation/Merck Research Fellowship in Cardiovascular Diseases and Cardiometabolic Disorders. Dr Benjamin is also supported by R01HL141434, R01AG066010, 1R01AG066914, and 2U54HL120163.

Disclosures

Dr Lubitz receives sponsored research support from Bristol‐Myers Squibb/Pfizer, Bayer AG, Boehringer Ingelheim, and Fitbit; has consulted for Bristol‐Myers Squibb/Pfizer and Bayer AG; and participates in a research collaboration with IBM. Dr Ellinor is supported by a grant from Bayer AG to the Broad Institute focused on the genetics and therapeutics of cardiovascular diseases. Dr Ellinor has also served on advisory boards or consulted for Bayer AG, MyoKardia, Quest Diagnostics, and Novartis. Dr Anderson receives sponsored research support from Bayer AG and has consulted for ApoPharma, Inc. Dr Singer receives research support from Bristol‐Myers Squibb and has consulted for Boehringer Ingelheim, Bristol‐Myers Squibb, Fitbit, Johnson and Johnson, Merck, and Pfizer. Dr Atlas receives sponsored research support from Bristol‐Myers Squibb/Pfizer and has consulted for Bristol‐Myers Squibb/Pfizer and Fitbit. The remaining authors have no disclosures to report.

Supplementary material: Tables S1–S4 and Figures S1–S3.
