
Prediction models for atrial fibrillation applicable in the community: a systematic review and meta-analysis.

Jelle C L Himmelreich¹, Lieke Veelers¹, Wim A M Lucassen¹, Renate B Schnabel², Michiel Rienstra³, Henk C P M van Weert¹, Ralf E Harskamp¹.

Abstract

AIMS: Atrial fibrillation (AF) is a common arrhythmia associated with an increased stroke risk. The use of multivariable prediction models could result in more efficient primary AF screening by selecting at-risk individuals. We aimed to determine which model may be most suitable for increasing the efficiency of future primary AF screening efforts.
METHODS AND RESULTS: We performed a systematic review of multivariable models derived, validated, and/or augmented for AF prediction in community cohorts using PubMed, Embase, and CINAHL (Cumulative Index to Nursing and Allied Health Literature) through 1 August 2019. We performed meta-analysis of model discrimination with the summary C-statistic as the primary expression of associations using a random effects model. In case of high heterogeneity, we calculated a 95% prediction interval. We used the CHARMS (Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies) checklist for risk of bias assessment. We included 27 studies with a total of 2 978 659 unique participants among 20 cohorts with mean age ranging from 42 to 76 years. We identified 21 risk models used for incident AF risk in community cohorts. Three models showed significant summary discrimination despite high heterogeneity: CHARGE-AF (Cohorts for Heart and Aging Research in Genomic Epidemiology) [summary C-statistic 0.71; 95% confidence interval (95% CI) 0.66-0.76], FHS-AF (Framingham Heart Study risk score for AF) (summary C-statistic 0.70; 95% CI 0.64-0.76), and CHA2DS2-VASc (summary C-statistic 0.69; 95% CI 0.64-0.74). Of these, CHARGE-AF and FHS-AF had originally been derived for AF incidence prediction. Only CHARGE-AF, which comprises easily obtainable measurements and medical history elements, showed significant summary discrimination among cohorts that had applied a uniform (5-year) risk prediction window.
CONCLUSION: CHARGE-AF appeared most suitable for primary screening purposes in terms of performance and applicability in older community cohorts of predominantly European descent. Published on behalf of the European Society of Cardiology. All rights reserved.


Keywords: Atrial fibrillation; Community; Meta-analysis; Risk model; Screening; Systematic review


Year: 2020 PMID: 32011689 PMCID: PMC7526764 DOI: 10.1093/europace/euaa005

Source DB: PubMed Journal: Europace ISSN: 1099-5129 Impact factor: 5.214

This is the first systematic review and meta-analysis designed to capture and evaluate a broad range of prognostic models used for incident atrial fibrillation (AF) risk prediction, and the first to focus specifically on models that are applicable in and have been derived, validated, and/or augmented in community cohorts. This work was open to any model used for incident AF prediction in the community, which also enabled inclusion of models that had not been developed for incident AF but that may have merits for this aim. We identified 21 models used for incident AF prediction in community cohorts. This work suggests that the CHARGE-AF model is likely the most robust for incident AF prediction in terms of performance as well as applicability in the community.

Introduction

Atrial fibrillation (AF) is a common cardiac arrhythmia affecting over 33 million people worldwide. Its incidence increases with age, with a lifetime risk of over 30%. Due to ageing populations, the number of AF cases in Europe is expected to double to >17 million by 2060. Atrial fibrillation is associated with a five-fold increased risk of ischaemic stroke, which can be largely prevented by antithrombotic prophylaxis in at-risk patients. Screening for AF in the community has been proposed as an approach to optimize early AF detection and to prevent AF-associated sequelae. Prior research has shown that AF screening is cost-effective when selecting patients at older age, with thresholds for screening eligibility varying from 65 to 75 years. The screening regimes in these primary care studies often involved single-point screening, while multiple-point screening could result in a higher yield of new AF cases. Multiple-point or prolonged rhythm monitoring schemes are, however, likely to be more costly for society and more burdensome to patients. Multivariable prediction models for incident AF could contribute to AF screening by determining a risk category for each patient. The more intensive regimes could be assigned to those with the highest risk, while those in lower-risk strata could be assigned to less stringent follow-up, or none at all. It remains, however, insufficiently clear from consensus documents whether other parameters beyond age could be used to differentiate between degrees of AF risk within the community. We therefore set out to perform a systematic review and meta-analysis with two aims. First, we wished to provide an overview of AF risk models that are applicable to and have been validated in community cohorts. Such models should consist of variables that can be quickly assessed and/or are commonly available from patient records and should not require advanced diagnostic testing. Second, by synthesizing the discriminatory abilities of each included risk model, we aimed to determine which of these may be most suitable for increasing the efficiency of future primary AF screening efforts.

Methods

We reported this systematic review and meta-analysis in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.

Data searches

We searched the PubMed, Embase, and Cumulative Index to Nursing and Allied Health Literature (CINAHL) databases from inception through 1 August 2019. We used the keywords ‘AF or atrial flutter (AFl)’ and ‘risk model’ as well as related terms. We filtered for studies conducted on humans and written in English. The full search is shown in the Supplementary material online. We checked the reference lists of included studies for additional relevant references.

Study selection

To be eligible for inclusion, studies had: (i) to be original studies in adults (≥18 years of age); (ii) to derive, validate, and/or augment a tool for predicting risk of incident AF/AFl based on multivariable analysis; (iii) to include only patients without a diagnosis of AF/AFl at baseline; and (iv) to incorporate into their risk prediction tool only variables that are applicable and/or commonly available in primary care settings. We included studies with AFl as co-outcome, since AF and AFl have similar clinical relevance. In light of inclusion criterion iv, we included only studies that used medical history, physical examination, simple laboratory findings, or electrocardiogram (ECG) parameters as variables in the prediction model. We excluded studies that required advanced diagnostic testing [e.g. echocardiography, genetic markers, or specialized (laboratory) tests] for their simple (non-augmented) model. We only included studies written in English. We included studies that diagnosed AF/AFl through medical records, hospitalizations, death certificates, and/or ECG during follow-up examinations. We excluded studies that selected patients for a common disease or risk factor, as such studies would not be generalizable to the community. Moreover, we excluded studies with a mean follow-up duration under 3 months since with shorter follow-up durations there would be an increasing risk of measuring prevalent AF missed at baseline recording, rather than actual incident AF. We uploaded references to a systematic review web application (Covidence, Veritas Health Innovation Ltd, Melbourne, Australia). Three investigators (J.C.L.H., L.V., and R.E.H.) assessed studies for eligibility by screening studies on title and abstract, followed by a full-text screening. Disagreements were resolved by panel discussion (J.C.L.H., L.V., R.E.H., and W.A.M.L.).

Data extraction and quality assessment

Two investigators (J.C.L.H. and L.V.) extracted data from the included studies regarding study methods, population characteristics, risk prediction model(s), and model performance. For the latter, we extracted the C-statistic and corresponding 95% confidence interval (95% CI) for discrimination, and the P-value of a goodness-of-fit test and the ratio of observed and expected AF/AFl cases (O:E ratio) for calibration. When authors did not report an O:E ratio, we derived it by analysing calibration plots. When authors augmented pre-existing models by adding variables with the aim of enhancing the predictive value of models, we retrieved the net reclassification improvement (NRI) index of the augmented model compared with the original ‘simple’ model, as well as the augmented model’s performance in terms of discrimination and calibration. We included augmentation data only when the augmentation variables were applicable to primary care settings as outlined previously. Two investigators (J.C.L.H. and L.V.) used the Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) checklist to assess the risk of bias and the applicability for our research aims. Our interpretation of each CHARMS domain can be found in the Supplementary material online. We assessed risk of bias at the cohort level for each of the included studies. We scored each domain as either low, unclear, or high risk of bias. We defined overall risk of bias as: low, when all domains of a cohort within one study were scored as low risk of bias; unclear, when one or more domains of a cohort within one study were scored as unclear risk of bias; and high, when one or more domains of a cohort within one study were scored as high risk of bias. We resolved disagreements by discussion.
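The overall risk-of-bias rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function name `overall_risk_of_bias` is hypothetical, and the sketch assumes that a single high-risk domain takes precedence over unclear domains when both occur in the same cohort, which the text implies but does not state outright.

```python
from typing import Iterable


def overall_risk_of_bias(domain_ratings: Iterable[str]) -> str:
    """Combine per-domain CHARMS ratings of one cohort into an overall rating.

    Assumption (not stated explicitly in the text): "high" outranks "unclear"
    when a cohort has both kinds of domain rating.
    """
    ratings = [r.strip().lower() for r in domain_ratings]
    allowed = {"low", "unclear", "high"}
    if not ratings or any(r not in allowed for r in ratings):
        raise ValueError("each domain must be rated low, unclear, or high")
    if "high" in ratings:
        return "high"      # one or more domains at high risk of bias
    if "unclear" in ratings:
        return "unclear"   # no high-risk domain, but at least one unclear
    return "low"           # all domains at low risk of bias


# Hypothetical cohort with one unclear domain and otherwise low ratings.
print(overall_risk_of_bias(["low", "unclear", "low"]))
```

Under this rule, only cohorts rated "low" or "unclear" overall would remain eligible for meta-analysis, in line with the eligibility rule given in the next section.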

Data synthesis and statistical analysis

We reported continuous variables as means ± standard deviations, and categorical variables as percentages. We evaluated statistical significance in all analyses at the 0.05 level. In individual studies, we assessed the C-statistic of a model, where a 95% CI containing 0.5 indicated insufficient discrimination. Calibration of a model was deemed sufficient when authors reported a P-value of a goodness-of-fit test >0.05 and/or an O:E ratio ranging between 0.95 and 1.05. In assessing augmentation, we defined significant improvement as a positive NRI index with a reported 95% CI that did not contain 0. When a study reported on multiple cohorts, and presented separate data for each cohort, we assessed model performance separately for each cohort within that study. We performed meta-analysis to assess overall discrimination of included models. The primary expression of associations in meta-analysis was the summary C-statistic and corresponding 95% CI using a random effects inverse variance model with restricted maximum likelihood estimation and Hartung–Knapp corrections. We conducted the meta-analyses in R using the meta and metafor packages (R Foundation for Statistical Computing, version 3.5.1). We performed meta-analysis only when C-statistic data for a prediction model were available for ≥3 cohorts. When studies presented a C-statistic without 95% CI, we calculated the 95% CI using methods described previously. In each meta-analysis, we calculated the mean as the summary effect measure, its 95% CI, and the I² statistic as an expression of the heterogeneity between studies. When heterogeneity in meta-analysis of C-statistics was high (I² > 30%), we derived a 95% prediction interval (95% PI) using methods described previously. We assessed overall discrimination of models by the summary C-statistic. 
When the 95% CI (or, in case of high heterogeneity, the 95% PI) of the summary C-statistic included 0.5, we concluded that there was insufficient evidence that the prediction model has significant discriminatory ability for incident AF in such populations as included in the meta-analysis. We assessed eligibility for inclusion into meta-analysis at the cohort level. Cohorts with low or unclear overall risk of bias were eligible for inclusion into meta-analysis. When studies reported C-statistic data based on the aggregation of multiple cohorts, and one of these cohorts was assessed as having high overall risk of bias, we did not include the aggregate C-statistic data into meta-analysis. When multiple studies reported C-statistic data on the same cohort, we included the first published study into the primary analysis. In the primary meta-analysis of each model, we included cohorts with any follow-up duration. In our primary analysis, we assessed overall discrimination of all models that had ≥3 eligible cohorts with C-statistic data. In the secondary analysis, we performed meta-analysis for each risk model that had ≥3 eligible cohorts reporting C-statistic data while applying a uniform prediction window, and grouped cohorts according to the applied risk prediction window (e.g. 5 or 10 years), since this is an important methodological consideration when translating summary risk model performance to clinical settings. We performed a sensitivity analysis in which we restricted the primary and secondary analyses to only those cohorts that had demonstrated sufficient calibration in order to assess overall discrimination among populations where the prediction model had also shown the ability to correctly classify absolute incident AF risk. 
Finally, we performed a sensitivity analysis in which we replaced primary and secondary meta-analysis data from ‘double’ cohorts (cohorts from the primary meta-analysis that had also been reported on in later studies) with data on that same cohort from any later study to assess whether later, possibly ‘more complete’ datasets could influence our conclusions from the primary and secondary analyses.
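The pooling procedure described in this section can be illustrated with a short, standard-library Python sketch. It is not the authors' analysis, which was run in R with the meta and metafor packages. The sketch assumes that C-statistics are pooled on their original (untransformed) scale, that I² is derived from Cochran's Q, that the REML estimate of the between-study variance is found by a simple fixed-point iteration, and that the prediction interval uses a t distribution with k − 2 degrees of freedom around the Hartung–Knapp standard error. All function names and the example numbers are hypothetical.

```python
import math
from typing import NamedTuple, Sequence, Tuple


def _t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def _t_cdf(x: float, df: int) -> float:
    """CDF of Student's t for x >= 0, using Simpson's rule on [0, x]."""
    n = 2 * max(100, int(x * 100))  # even interval count, step of about 0.005
    h = x / n
    total = _t_pdf(0.0, df) + _t_pdf(x, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p: float, df: int) -> float:
    """Quantile of Student's t for 0.5 < p < 1, found by bisection."""
    lo, hi = 0.0, 1.0
    while _t_cdf(hi, df) < p:  # grow the bracket until it contains the quantile
        lo, hi = hi, hi * 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if _t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


class Pooled(NamedTuple):
    estimate: float
    ci: Tuple[float, float]
    pi: Tuple[float, float]
    tau2: float
    i2: float


def pool_c_statistics(c: Sequence[float], se: Sequence[float]) -> Pooled:
    """Random-effects summary of C-statistics (REML tau^2, Hartung-Knapp CI)."""
    k = len(c)
    if k < 3 or len(se) != k:
        raise ValueError("need >= 3 cohorts, each with a standard error")
    v = [s * s for s in se]

    # Cochran's Q and I^2 from fixed-effect inverse-variance weights.
    w_fe = [1 / vi for vi in v]
    mu_fe = sum(w * y for w, y in zip(w_fe, c)) / sum(w_fe)
    q = sum(w * (y - mu_fe) ** 2 for w, y in zip(w_fe, c))
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    # REML estimate of tau^2 by fixed-point iteration, truncated at zero.
    tau2 = 0.0
    for _ in range(1000):
        w = [1 / (vi + tau2) for vi in v]
        mu = sum(wi * yi for wi, yi in zip(w, c)) / sum(w)
        num = sum(wi * wi * ((yi - mu) ** 2 - vi) for wi, yi, vi in zip(w, c, v))
        new_tau2 = max(0.0, num / sum(wi * wi for wi in w) + 1 / sum(w))
        converged = abs(new_tau2 - tau2) < 1e-12
        tau2 = new_tau2
        if converged:
            break

    # Random-effects summary with the Hartung-Knapp standard error.
    w = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w, c)) / sum(w)
    se_hk = math.sqrt(sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, c))
                      / ((k - 1) * sum(w)))
    half_ci = t_quantile(0.975, k - 1) * se_hk
    half_pi = t_quantile(0.975, k - 2) * math.sqrt(tau2 + se_hk ** 2)
    return Pooled(mu, (mu - half_ci, mu + half_ci),
                  (mu - half_pi, mu + half_pi), tau2, i2)


def has_significant_discrimination(pooled: Pooled, i2_threshold: float = 0.30) -> bool:
    """Judge on the 95% PI when I^2 exceeds the threshold, else on the 95% CI."""
    lower = pooled.pi[0] if pooled.i2 > i2_threshold else pooled.ci[0]
    return lower > 0.5


# Made-up C-statistics and standard errors for three hypothetical cohorts.
example = pool_c_statistics([0.60, 0.70, 0.80], [0.02, 0.02, 0.02])
print(example, has_significant_discrimination(example))
```

With only three widely spread cohorts, the prediction interval in this toy example becomes very wide and even leaves the 0 to 1 range, which illustrates why high heterogeneity makes a significant 95% PI hard to achieve. Dedicated packages may use different conventions, for example for I² or for the degrees of freedom of the prediction interval, so results from this sketch need not match their output exactly.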

Results

We found a total of 3873 unique references, 102 of which we subjected to full-text screening. From these, we included 27 studies for our final analysis (see Figure).

Figure: Literature flow diagram. AF, atrial fibrillation; AFl, atrial flutter; CINAHL, Cumulative Index to Nursing and Allied Health Literature; NA, not applicable.

Characteristics of included studies

The 27 included studies were based on 20 different cohorts set in Europe (n = 8), East Asia (n = 5), North America (n = 5), and the Middle East (n = 2) (see Supplementary material online for characteristics of the included studies and cohorts). Cohort size ranged from 646 to 1 062 073 patients, with a total of 2 978 659 unique participants. Mean age varied from 42 to 76 years, and the percentage of female participants ranged from 0% to 100%. Mean follow-up of the included cohorts varied from 3 to 20 years, with AF incidence during follow-up ranging from 0.2% to 24.5%. Ten cohorts used AF/AFl as the outcome, and the other 10 cohorts described only AF. Thirteen of the 20 cohorts followed a prospective design, of which 6 cohorts applied prescheduled follow-up examinations to systematically identify AF.

Characteristics of included risk models

The included studies represented data on 21 multivariable prediction models. Ten models had specifically been derived for predicting incident AF (see Table). Of these, nine had been derived in community cohorts and one had been derived in a cohort of outpatients. Five of the models derived for incident AF had also been externally validated. The intended risk prediction window of models derived for incident AF varied between 5 and 11 years. The FHS-AF (Framingham Heart Study risk score for AF) model had originally been derived for predicting 10-year incident AF risk, but had later been recalibrated and subsequently externally validated for 5-year risk of incident AF.
We identified seven risk models that had originally been derived for predicting outcomes other than incident AF but had been validated for this outcome in community cohorts, and a further four models that were incidentally employed to predict incident AF but that had not specifically been derived as a prediction model for that outcome (Supplementary material online). The number of variables incorporated into each of the included models varied between 5 and 18, with a total of 27 distinguishable variables/variable categories among all included risk models. Age was the only variable used in all models. Other common variables were hypertension history or treatment, heart failure history, sex, and blood pressure, incorporated into 16, 16, 14, and 14 of the 21 included models, respectively.

Table: Characteristics of included risk models developed for incident AF. AF, atrial fibrillation; ARIC-AF, Atherosclerosis Risk In Communities score for Atrial Fibrillation; BMI, body mass index; CHADS2, Congestive heart failure, Hypertension, Age >75, Diabetes mellitus, prior Stroke or transient ischaemic attack (2 points); CHA2DS2-VASc, Congestive HF, Hypertension, Age >75 (2 points), Stroke/transient ischaemic attack/thromboembolism (2 points), Vascular disease, Age 65–74, Sex category; CHARGE-AF, Cohorts for Heart and Aging Research in Genomic Epidemiology; CHD, coronary heart disease; C2HEST, Coronary artery disease/Chronic obstructive pulmonary disease (2 points), Hypertension, Elderly, Systolic heart failure, Thyroid disease; COPD, chronic obstructive pulmonary disease; ECG, electrocardiogram; FHS-AF, Framingham Heart Study score for Atrial Fibrillation; MHS, Maccabi Healthcare Services; MI, myocardial infarction; NS, not specified; PREVEND, Prevention of Renal and Vascular End-stage Disease; WHS, Women’s Health Study. Depicted here are the variables in the simple (non-augmented) models.

Risk model performance among included cohorts

The Supplementary material online shows the results on AF incidence, discrimination, and calibration of the included simple models among the cohorts in our search. All studies used the C-statistic to assess model discrimination for incident AF within their cohorts. Nine studies assessed calibration by providing both a P-value for a goodness-of-fit test and a calibration plot for at least one of the risk models that they reported on, seven studies assessed calibration only by a P-value for a goodness-of-fit test, two studies assessed calibration only by a calibration plot, and nine studies reported neither of these calibration parameters. Reported C-statistics for incident AF ranged from 0.58 (95% CI 0.55–0.61) to 0.842 (95% CI 0.826–0.858). The highest C-statistic accompanied by sufficient calibration was reported in the FHS (Framingham Heart Study) cohort for the incidentally used FHS-Lubitz model, with a C-statistic of 0.78 (95% CI 0.76–0.80) and a P-value of the goodness-of-fit test of 0.11.
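The per-cohort criteria from the Methods, applied to the values reported here, can be written as two small checks. This is a minimal sketch under stated assumptions: the function names are hypothetical, the "and/or" in the calibration criterion is read as "either measure suffices", and a cohort that reports neither calibration measure is returned as `None` (not assessable) instead of being labelled insufficient.

```python
from typing import Optional


def discrimination_sufficient(ci_low: float, ci_high: float) -> bool:
    """A 95% CI of the C-statistic that contains 0.5 indicates insufficient discrimination."""
    return not (ci_low <= 0.5 <= ci_high)


def calibration_sufficient(gof_p_value: Optional[float] = None,
                           oe_ratio: Optional[float] = None) -> Optional[bool]:
    """Sufficient when the goodness-of-fit P-value is > 0.05 and/or O:E is within 0.95-1.05.

    Returns None when neither measure was reported (assumption of this sketch).
    """
    if gof_p_value is None and oe_ratio is None:
        return None
    p_ok = gof_p_value is not None and gof_p_value > 0.05
    oe_ok = oe_ratio is not None and 0.95 <= oe_ratio <= 1.05
    return p_ok or oe_ok


# Lowest reported C-statistic: 0.58 (95% CI 0.55-0.61).
print(discrimination_sufficient(0.55, 0.61))
# FHS-Lubitz model in the FHS cohort: goodness-of-fit P-value 0.11.
print(calibration_sufficient(gof_p_value=0.11))
```

A cohort that fails both calibration checks, such as one with a goodness-of-fit P-value below 0.05 and an O:E ratio well outside 0.95 to 1.05, would be excluded from the calibration-restricted sensitivity analysis under this reading.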

Augmentation of included risk models

We identified augmentation data applicable to primary care settings for five of the included AF risk models (see Supplementary material online). Significant improvement was demonstrated in CHARGE-AF (Cohorts for Heart and Aging Research in Genomic Epidemiology) with addition of the P-wave axis and brain natriuretic peptide (BNP) and/or C-reactive protein (CRP), in the FHS-AF 10-year model with addition of BNP and CRP, and in the Seirei model with addition of ECG parameters to the model.

Risk of bias assessment

The Supplementary material online shows the results of the risk of bias assessment for each cohort in the included studies. We assessed the risk of bias as either low or unclear for all domains except the participants domain. For this domain, we assessed eight cohorts employed by 10 studies as having a high risk of bias because they excluded patients for reasons with a known association with risk of future AF.

Meta-analyses

Five models were eligible for the primary meta-analysis (see Figure). Of these, only CHARGE-AF and the FHS-AF 10-year model had originally been derived for incident AF. All primary meta-analyses resulted in high heterogeneity, for which we calculated a 95% PI. Three models resulted in a summary C-statistic with a significant 95% PI in our primary meta-analysis: CHARGE-AF (summary C-statistic 0.71; 95% CI 0.66–0.76; I² 87%; 95% PI 0.554–0.865; n = 8 studies; n = 58 137 patients), the FHS-AF 10-year model (summary C-statistic 0.70; 95% CI 0.64–0.76; I² 94%; 95% PI 0.535–0.869; n = 5 studies; n = 33 846 patients), and CHA2DS2-VASc (summary C-statistic 0.69; 95% CI 0.64–0.74; I² 100%; 95% PI 0.540–0.838; n = 5 studies; n = 2 005 813 patients) (see Figure for a comparison of these three models).

Figure: Primary analysis: meta-analysis of C-statistics. 95% CI, 95% confidence interval; AF, atrial fibrillation; AGES, Age, Gene and Environment-Reykjavik Study; ARIC, Atherosclerosis Risk In Communities; CHADS2, Congestive heart failure, Hypertension, Age >75, Diabetes mellitus, prior Stroke or transient ischaemic attack (2 points); CHA2DS2-VASc, Congestive HF, Hypertension, Age >75 (2 points), Stroke/transient ischaemic attack/thromboembolism (2 points), Vascular disease, Age 65–74, Sex category; CHARGE-AF, Cohorts for Heart and Aging Research in Genomic Epidemiology; CI, confidence interval; EPIC, European Prospective Investigation into Cancer and Nutrition; FHS, Framingham Heart Study; FHS-AF, Framingham Heart Study score for Atrial Fibrillation; HATCH, Hypertension, Age, stroke or Transient ischaemic attack, Chronic obstructive pulmonary disease, Heart failure; IV, inverse variance; MPP-RES, Malmö Preventive Project Re-examination Study; NHIS, National Health Insurance Service; NHIRD, National Health Insurance Research Database; PIVUS, Prospective Investigation of the Vasculature in Uppsala Seniors; RS, Rotterdam Study; YMID, Yunnan Medical Insurance Database. a: Derivation cohort.

Figure: Comparison of the three models that resulted in significant 95% prediction intervals in the primary meta-analysis. 95% CI, 95% confidence interval; 95% PI, 95% prediction interval; AF, atrial fibrillation; AGES, Age, Gene and Environment-Reykjavik Study; ARIC, Atherosclerosis Risk In Communities; BMI, body mass index; CHA2DS2-VASc, Congestive HF, Hypertension, Age >75 (2 points), Stroke/transient ischaemic attack/thromboembolism (2 points), Vascular disease, Age 65–74, Sex category; CHARGE-AF, Cohorts for Heart and Aging Research in Genomic Epidemiology; C-stat., C-statistic; ECG, electrocardiogram; EPIC, European Prospective Investigation into Cancer and Nutrition; FHS, Framingham Heart Study; FHS-AF, Framingham Heart Study score for Atrial Fibrillation; MPP-RES, Malmö Preventive Project Re-examination Study; NA, not applicable; NHIS, National Health Insurance Service; PIVUS, Prospective Investigation of the Vasculature in Uppsala Seniors; RS, Rotterdam Study; YMID, Yunnan Medical Insurance Database. The meta-analyses of C-statistics for the outcome incident AF are grouped by cohort from which C-statistics were reported, allowing for a comparison of multiple models’ performance within one cohort insofar as data are available.

For our secondary analysis, we were able to meta-analyse CHARGE-AF and the FHS-AF 10-year model, each for a 5- and 10-year prediction window (see Figure). Only the meta-analysis of CHARGE-AF with a 5-year prediction window resulted in significant overall discrimination (summary C-statistic 0.72; 95% CI 0.66–0.78; I² 85%; 95% PI 0.567–0.881; n = 6 studies; n = 50 328 patients).

Figure: Secondary analysis: meta-analysis of C-statistics grouped according to application of a uniform prediction window within a model. AGES, Age, Gene and Environment-Reykjavik Study; ARIC, Atherosclerosis Risk In Communities; CHARGE-AF, Cohorts for Heart and Aging Research in Genomic Epidemiology; CI, confidence interval; EPIC, European Prospective Investigation into Cancer and Nutrition; FHS, Framingham Heart Study; FHS-AF, Framingham Heart Study score for Atrial Fibrillation; IV, inverse variance; MPP-RES, Malmö Preventive Project Re-examination Study; PIVUS, Prospective Investigation of the Vasculature in Uppsala Seniors; RS, Rotterdam Study; 95% CI, 95% confidence interval. a: Derivation cohort.

In our sensitivity analysis restricting the primary and secondary analyses to cohorts with sufficient calibration, we found no model with significant overall discrimination due to high heterogeneity (Supplementary material online, Figures S1 and S2). Our sensitivity analysis on double cohorts in the primary and secondary analyses did not lead to different conclusions on overall discriminatory ability of meta-analysed models in all but one comparison (see Supplementary material online).

Discussion

In this systematic review and meta-analysis, we provided an overview of prediction models for incident AF risk that are applicable in and had been derived, validated, and/or augmented in community cohorts. We identified 21 risk models that met these criteria, of which 10 had specifically been derived for predicting AF incidence in the community. In meta-analysis of C-statistics, three models showed significant overall discrimination for AF incidence at any follow-up duration and with any calibration despite high heterogeneity. Two of those models were derived specifically for incident AF risk prediction: CHARGE-AF and the FHS-AF 10-year model. Only CHARGE-AF showed significant overall discrimination among cohorts with a uniform prediction window (the model’s originally intended 5-year window).

Clinical relevance

The outcomes of this systematic review and meta-analysis are highly relevant for the field of primary AF screening. Previous AF screening programmes showed only moderate efficiency in selecting at-risk patients from the community, with an estimated number needed to screen of 111 among 23 studies that had screened community cohorts for incident AF by various methods. Patients were often selected for screening based only on age. The age criterion in selecting patients for AF screening has its clinical merits since oral anticoagulation in AF patients is indicated in all women ≥65 and all men ≥75 years of age and should be considered in men aged ≥65 years in the absence of other risk factors. Age as a criterion, however, should not be considered absolute in selecting patients for primary AF screening. Half of all AF cases detected in the Belgian Heart Rhythm Week were younger than 65 years of age. Moreover, there is evidence that CHARGE-AF has higher discrimination among younger patients, although calibration here was lower due to lower absolute AF risk in this younger subgroup. Finally, the two studies within our search that compared multivariable models with age alone as the predictor both found that the multivariable models had significantly higher C-statistics for incident AF. We conclude, therefore, that the use of multivariable risk models in selecting patients for community AF screening is likely to result in more efficient screening than selecting based on age alone. Given that there is adequate stroke prevention therapy available once AF is detected, it is likely that the use of such models in AF screening will result in more efficient stroke prevention. More work on the implementation of multivariable risk models in AF screening as well as on long-term follow-up of screening-detected AF cases, however, is necessary to test these hypotheses. 
Whether an immediate start of anticoagulation therapy is warranted when AF is detected in younger patients with risk factors other than high age will subsequently depend on the number and nature of these other risk factors. However, as shown in the Table, most AF risk prediction models include many of the variables in the CHA2DS2-VASc score used to assess anticoagulant treatment indication. In a younger patient selected for AF screening based on a multivariable prediction model due to the presence of risk factors other than high age, an AF diagnosis is therefore likely to still be relevant in terms of the need for anticoagulant therapy, if not for prevention of other pathology associated with AF such as heart failure. One of the aims of this work was to determine which model may be most suitable for increasing the efficiency of future primary AF screening efforts. Our work showed that there are ample AF risk models to choose from, although one model currently stands out among the others: CHARGE-AF. Despite heterogeneity in included cohorts, CHARGE-AF showed significant summary discrimination over a relatively short (5-year) risk prediction window. The model contains variables that are generally easy to extract from health records, and requires only body measurements that are easily obtainable (height, weight, and blood pressure). The FHS-AF model, in contrast, though performing nearly as well in overall discrimination, requires the variable ‘significant murmur’ and ECG variables, which are less easy to acquire or interpret for many care professionals. In conclusion, CHARGE-AF currently seems the most suitable prediction model for incident AF, and likely has merits as a low-cost triage test for future primary AF screening efforts.

Derivation, validation, and augmentation

In risk models derived for incident AF in community cohorts, there was a trend that the derivation cohort had the highest C-statistic compared with external validation cohorts. The only exception was CHARGE-AF, where Pfister et al. reported a C-statistic of 0.808. Calibration of CHARGE-AF in their cohort, however, was insufficient (P-value for goodness-of-fit test <0.001 and O:E ratio 0.47) due to a systematic overestimation of 5-year AF risk in all risk deciles. One explanation lies in the differences in demographics, as Pfister’s cohort was younger and had lower baseline prevalence of diabetes mellitus than the CHARGE-AF derivation cohorts. Depending on whether one’s aim is to distinguish high from low-risk patients, or to predict absolute 5-year incident AF risk, a researcher may use this knowledge to decide whether or not to recalibrate a model for their own target population. In augmentation studies, we saw that addition of BNP and CRP to a model seemed most promising in terms of improving risk classification. We note, however, that the significance of an added value of BNP and CRP to CHARGE-AF was not consistent, and that the augmentation studies provided no information on the added costs of augmentation parameters relative to those of acquiring the simple model risk score.

Previous work

Previous systematic reviews have focused on individual predictors for AF, on AF as a risk factor for other outcomes, or on risk models for adverse outcomes in AF patients. However, to our knowledge, this is the first systematic review and meta-analysis on performance of incident AF risk prediction models, and the first with a focus on such risk models validated in and applicable to the community.

Future work

Future studies could focus on finding optimal cut-offs for the more promising AF prediction models, and on finding the most cost-effective use of multivariable models within various screening schemes. Researchers may opt, for example, either for a dichotomization into patients with higher and lower risk or for assigning patients to one of multiple risk strata. Patients at higher risk could be offered a more intensive, sensitive screening scheme (e.g. multiple-point screening or Holter monitoring) when compared with patients at low risk (single-point or no screening). Further research could also assess whether implementation of multivariable models in AF screening could be aided by software that automatically extracts patient data from health records, informs the physician of a patient’s current risk category, and suggests parameters that should be updated for a more accurate current AF risk stratification.
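The stratification idea outlined above, mapping a model's predicted risk to one of several screening schemes, might look as follows in software. This is a purely illustrative sketch: the function name `assign_screening_scheme` is hypothetical, and the cut-offs in the example are arbitrary placeholders, since optimal cut-offs have not been established and are exactly what future studies would need to determine.

```python
from bisect import bisect_right
from typing import Sequence


def assign_screening_scheme(predicted_risk: float,
                            cutoffs: Sequence[float],
                            schemes: Sequence[str]) -> str:
    """Map a predicted AF risk (0-1) to a screening scheme via ascending cut-offs.

    len(schemes) must equal len(cutoffs) + 1; a risk exactly equal to a
    cut-off is assigned to the higher stratum (a choice made for this sketch).
    """
    if len(schemes) != len(cutoffs) + 1:
        raise ValueError("need exactly one more scheme than cut-offs")
    if list(cutoffs) != sorted(cutoffs):
        raise ValueError("cut-offs must be in ascending order")
    if not 0.0 <= predicted_risk <= 1.0:
        raise ValueError("predicted risk must lie between 0 and 1")
    return schemes[bisect_right(cutoffs, predicted_risk)]


# Placeholder cut-offs for illustration only; they are not clinically validated.
placeholder_cutoffs = [0.02, 0.05]
schemes = ["no screening",
           "single-point screening",
           "multiple-point screening or Holter monitoring"]
print(assign_screening_scheme(0.08, placeholder_cutoffs, schemes))
```

Passing a single cut-off with two schemes gives the dichotomization into higher and lower risk, while longer lists give multiple risk strata, so the same function covers both options mentioned in the text.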

Strengths and limitations

This study has a number of strengths. First, we included only studies performed in community cohorts, which contributes to the value of our results for primary care AF screening. Second, we included any risk model that was used to predict risk of incident AF. This enabled us to expand our scope to models that had not originally been intended for predicting incident AF, but that may have merits in predicting this outcome. Third, we attributed high risk of bias to studies that excluded or over-represented patients based on factors that are likely to be associated with risk of incident AF, further contributing to the generalizability of our results to the community. Fourth, we included only the C-statistic of raw, non-bootstrapped data in the meta-analysis in order not to bias it with potentially overly narrow confidence intervals. Finally, we refrained from meta-regression or subgroup meta-analysis based on, e.g., a subdivision of cohorts' mean age, AF incidence, or region to explain the heterogeneity in our results. Such analyses of aggregate data are known to carry a high risk of bias, ecological bias in particular, and are inferior to subgroup results derived from individual participant data (IPD). An IPD meta-analysis, however, was beyond the scope of the current study.

The primary limitation of our study is the high heterogeneity of included studies. We attempted to cope with this limitation by performing sensitivity analyses and by calculating a 95% PI in our meta-analyses with high heterogeneity. The outcomes of our meta-analyses with a significant 95% PI can be considered generalizable to populations such as those included in these meta-analyses, despite high heterogeneity. As a second limitation, we did not provide a meta-analysis on model calibration, since such analyses are often challenging due to a lack of calibration measures reported among studies. Indeed, we found that meta-analysable data on calibration were poorly reported among included studies (Supplementary material online). Moreover, summarizing O:E data would have automatically excluded those models that were not originally intended for incident AF, since expected incident AF rates would never have been defined for such models. We addressed calibration by performing a sensitivity analysis among cohorts in which the applied risk model(s) had demonstrated sufficient calibration. A third limitation is that we included both prospective and retrospective cohort studies. This may have introduced bias, as AF is not always symptomatic and asymptomatic patients are less likely to undergo rhythm evaluation when left to their physicians' discretion than when ECG is performed in the context of a prescheduled follow-up. Finally, we restricted our search to studies written in English for quality-related as well as practical reasons; such a restriction has been found not to lead to significant bias.
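For readers unfamiliar with the 95% PI referred to above, a commonly used formulation for a random-effects meta-analysis of k studies is given below. The notation is introduced here for illustration and is an assumption, as is the use of this particular formula; the scale on which the C-statistics were pooled (e.g. raw or logit) is not restated here.

```latex
\hat{\mu} \;\pm\; t_{k-2,\,0.975}\,
\sqrt{\hat{\tau}^{2} + \widehat{\mathrm{SE}}(\hat{\mu})^{2}}
```

Here \hat{\mu} is the summary estimate, \hat{\tau}^{2} the estimated between-study variance, \widehat{\mathrm{SE}}(\hat{\mu}) the standard error of the summary estimate, and t_{k-2, 0.975} the 97.5th percentile of the t-distribution with k-2 degrees of freedom. Unlike the confidence interval, the PI widens with between-study variance, so it describes the range of performance to be expected in a new, comparable cohort. For a C-statistic, a PI described as significant presumably means one that lies entirely above 0.5.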

Conclusion

We provided an overview of prediction models for incident AF risk that are applicable in and have been derived, validated, and/or augmented in community cohorts. We identified 21 risk models that met these criteria. Of these, CHARGE-AF seemed the most robust in terms of performance as well as applicability in the community.

Funding

This work was supported by the Netherlands Organisation for Health Research and Development (ZonMw) [80-83910-98-13046]. The authors had full autonomy in design, conduct, and reporting of the manuscript. This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 648131) and German Center for Cardiovascular Research (DZHK e.V.) (81Z1710103). Salary support for R.E.H. was provided by a Rubicon Fellowship grant by the Netherlands Organisation for Scientific Research (NWO) (grant No. 452173116). Conflict of interest: none declared.

Table 1

Characteristics of included risk models developed for incident AF

Model	ARIC-AF	CHARGE-AF	C₂HEST	FHS-AF	Mayo	MHS	PREVEND	Seirei	Suita	WHS
Model type	Point-based	Cox regression	Point-based	Cox regression	Point-based	Point-based	Latent class analysis	Point-based	Point-based	Cox regression
Intended prediction window for incident AF (years)	10	5	11	5, 10	NS	10	10	7	10	10
Model variables^a
Age	X	X	X	X	X	X	X	X	X	X
Sex				X	X	X	X	X	X
Race	X	X					X
Body measurements (height, weight, and BMI)	X	X		X		X	X	X	X	X
Blood pressure (systolic, diastolic)	X	X		X		X	X	X		X
Heart rate							X	X
Heart failure history	X	X	X	X	X	X	X		X
Hypertension treatment or history	X	X	X	X	X	X	X		X
Diabetes mellitus history	X	X			X		X
Stroke history							X
CHD or MI history	X	X	X				X
Vascular disease history							X
Alcohol use							X	X	X	X
Smoking	X	X					X		X	X
ECG parameters	X			X			X
COPD			X			X
Autoimmune or inflammatory disease history						X
Significant murmur	X			X	X			X	X
Serum lipids							X		X
Glomerular filtration rate							X
Urine albumin secretion							X
Thyroid disease			X

AF, atrial fibrillation; ARIC-AF, Atherosclerosis Risk In Communities score for Atrial Fibrillation; BMI, body mass index; CHADS2, Congestive heart failure, Hypertension, Age >75, Diabetes mellitus, prior Stroke or transient ischaemic attack (2 points); CHA2DS2-VASc, Congestive HF, Hypertension, Age >75 (2 points), Stroke/transient ischaemic attack/thromboembolism (2 points), Vascular disease, Age 65–74, Sex category; CHARGE-AF, Cohorts for Heart and Aging Research in Genomic Epidemiology; CHD, coronary heart disease; C2HEST, Coronary artery disease/Chronic obstructive pulmonary disease (2 points), Hypertension, Elderly, Systolic heart failure, Thyroid disease; COPD, chronic obstructive pulmonary disease; ECG, electrocardiogram; FHS-AF, Framingham Heart Study score for Atrial Fibrillation; MHS, Maccabi Healthcare Services; MI, myocardial infarction; NS, not specified; PREVEND, Prevention of Renal and Vascular End-stage Disease; WHS, Women’s Health Study.

^a Depicted here are the variables in the simple (non-augmented) models.
