Shaan Khurshid1, Wanyi Chen2, Daniel E Singer3,4, Steven J Atlas3,4, Jeffrey M Ashburner3,4, Jin G Choi5, Chin Hur6,7, Patrick T Ellinor1, David D McManus8, Jagpreet Chhatwal2, Steven A Lubitz1. 1. Cardiovascular Research Center and Cardiac Arrhythmia Service Division of Cardiology Massachusetts General Hospital Boston MA. 2. Institute for Technology Assessment Massachusetts General Hospital Boston MA. 3. Division of General Internal Medicine Massachusetts General Hospital MA. 4. Department of Medicine Harvard Medical School Boston MA. 5. University of Chicago Pritzker School of Medicine Chicago IL. 6. Department of Medicine Columbia University New York NY. 7. Department of Epidemiology Mailman School of Public Health Columbia University New York NY. 8. Department of Medicine University of Massachusetts Medical School Worcester MA.
Abstract
Background
Atrial fibrillation (AF) screening is endorsed by certain guidelines for individuals aged ≥65 years. Yet many AF screening strategies exist, including the use of wrist-worn wearable devices, and their comparative effectiveness is not well understood.
Methods and Results
We developed a decision-analytic model simulating 50 million individuals with an age, sex, and comorbidity profile matching the United States population aged ≥65 years (ie, with a guideline-based AF screening indication). We modeled no screening, in addition to 45 distinct AF screening strategies (comprising different modalities and screening intervals), each initiated at a clinical encounter. The primary effectiveness measure was quality-adjusted life-years, with incident stroke and major bleeding as secondary measures. We defined continuous or nearly continuous modalities as those capable of monitoring beyond a single time-point (eg, patch monitor), and discrete modalities as those capable of only instantaneous AF detection (eg, 12-lead ECG). In total, 10 AF screening strategies were effective compared with no screening (300-1500 quality-adjusted life-years gained/100 000 individuals screened). Nine (90%) effective strategies involved use of a continuous or nearly continuous modality such as patch monitor or wrist-worn wearable device, whereas 1 (10%) relied on discrete modalities alone. Effective strategies reduced stroke incidence (number needed to screen to prevent a stroke: 3087-4445) but increased major bleeding (number needed to screen to cause a major bleed: 1815-4049) and intracranial hemorrhage (number needed to screen to cause intracranial hemorrhage: 7693-16 950). Test specificity was a highly influential model parameter for screening effectiveness.
Conclusions
When modeled from a clinician-directed perspective, the comparative effectiveness of population-based AF screening varies substantially depending on the specific strategy used. Future screening interventions and guidelines should consider the relative effectiveness of specific AF screening strategies.
Using a comprehensive simulation model including 50 million individuals aged ≥65 years, we compared the clinical effectiveness of no screening versus 45 distinct atrial fibrillation screening strategies, including strategies using wearable devices.
Strategies using a sensitive modality upfront (eg, single‐lead ECG, wrist‐worn wearable photoplethysmography), followed by a highly specific test to minimize false‐positive diagnoses, were most effective.
In our simulation, the majority of effective strategies included use of a wearable device in the screening pathway.
What Are the Clinical Implications?
Minimizing false positives is critical for effective population‐based atrial fibrillation screening.
Wearable devices are likely to be important for clinician‐directed atrial fibrillation screening.
Undetected atrial fibrillation (AF) may lead to increased stroke risk. Since oral anticoagulation (OAC) can reduce risk of AF‐related stroke, AF screening may enable early diagnosis of AF and initiation of OAC to prevent strokes. However, concerns exist about the potential downstream complications of screening, such as OAC‐related bleeding. Multiple studies have demonstrated that AF screening is feasible and leads to increased AF diagnosis, yet none have reported on whether screening prevents strokes or increases bleeding.
Recent technological advances have enabled a myriad of AF screening approaches, which have not been comprehensively compared. In addition to pulse palpation, 12‐lead ECG, and patch monitoring, screening can now be conducted using handheld 1‐lead ECG and wrist‐worn wearable devices including smart watches or bands. Wrist‐worn wearables, in particular, can be used to ascertain cardiac rhythm in a frequent or nearly continuous manner using photoplethysmography or 1‐lead ECG, thus offering the potential to detect episodes of paroxysmal AF otherwise eluding identification. Yet longer or more frequent screening may increase false positives or detect infrequent episodes of paroxysmal AF for which the degree of increased stroke risk is unclear.
Studies testing whether AF screening reduces stroke are challenging to conduct because of sample size requirements and high costs. It therefore remains uncertain whether population‐level AF screening is clinically effective. Consequently, consensus guidelines offer conflicting endorsements of population‐based AF screening, with cardiology societies from Europe and Australia/New Zealand providing a class I recommendation for AF screening in individuals aged ≥65 years, and the United States Preventive Services Task Force concluding there is insufficient evidence for or against AF screening with electrocardiography.
Given the prohibitive nature of conducting trials for each of the many potential AF screening methods, we used a comprehensive decision‐analytic model to assess the long‐term benefits and harms of clinician‐directed AF screening using traditional and novel screening modalities incorporated into a wide range of potential screening strategies.
Methods
Data Availability
The code underlying the simulation model described in the current study will be made available upon reasonable request to the corresponding author. Given that all data used in this study stem from previously published reports, and no new patient data were generated or used, the study did not require Institutional Review Board approval.
Model Structure
We constructed a microsimulation model recapitulating the clinical course of AF using an individual‐level state‐transition approach. The model was built using C++. The model simulated a 50‐million person cohort with age and comorbidity distribution matching the 2019 US population aged ≥65 years, the age at which AF screening is guideline‐recommended. We assumed that only individuals at sufficient stroke risk to merit OAC based on the CHA2DS2‐VASc score in the presence of an AF diagnosis would be screened. The natural history (no screening) as well as 45 unique screening approaches were simulated. Health states in our model were characterized by the historical profile of clinical events that occurred in each simulated individual's life course up to that time (ie, acuteness, number of and types of events), the patient's demographics (eg, age and sex), AF status (including underlying presence, burden, whether symptomatic, and whether diagnosed), presence of CHA2DS2‐VASc risk factors, and use of antithrombotic treatment, each of which governed transition probabilities into future states. The time between state transitions was 1 month. Given greater risk for recurrent events and mortality observed after more recent clinical events, we modeled recency of clinical events into 3 categories (ie, acute [0–30 days], subacute [31 days to 1 year], and remote [>1 year]). Figure 1 provides an overview of model structure. To fully encompass the potential long‐term consequences of screening, we adopted a lifetime horizon, ending simulation at death or age 100. Further details of model structure and an overview of the sample size determination for the simulation are presented in Data S1.
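The monthly cycle, the three recency categories, and the lifetime horizon described above can be illustrated with a minimal sketch. This is not the study's C++ model: the function names, the constant-hazard conversion from an annual rate to a monthly probability, and the 5% annual death rate are all assumptions made for illustration only.

```python
import math
import random


def recency_category(days_since_event: int) -> str:
    """Map days since a stroke or bleed to the recency categories named in the text."""
    if days_since_event < 0:
        raise ValueError("days_since_event must be non-negative")
    if days_since_event <= 30:
        return "acute"      # 0-30 days
    if days_since_event <= 365:
        return "subacute"   # 31 days to 1 year
    return "remote"         # >1 year


def monthly_probability(annual_rate: float) -> float:
    """Convert an annual event rate to a 1-month transition probability.

    Assumes a constant hazard within the year (an assumption of this sketch).
    """
    return 1.0 - math.exp(-annual_rate / 12.0)


def simulate_months_lived(start_age: float, annual_death_rate: float,
                          rng: random.Random) -> int:
    """Step one individual through monthly cycles until death or age 100."""
    p_death = monthly_probability(annual_death_rate)
    months = 0
    while start_age + months / 12.0 < 100.0:
        if rng.random() < p_death:
            break  # death ends the simulation for this individual
        months += 1
    return months


rng = random.Random(0)  # fixed seed so the toy run is reproducible
print(recency_category(45))                     # a 45-day-old event is "subacute"
print(simulate_months_lived(65.0, 0.05, rng))   # hypothetical 5% annual death rate
```

In a full model, each monthly cycle would also evaluate stroke, bleeding, AF onset, diagnosis, and treatment changes, with probabilities that depend on the individual's current state and history.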
Figure 1
Model structure.
A state transition diagram is depicted summarizing the range of possible states occupied by simulated individuals. For stroke and bleeding events, post‐event states (acute, subacute, remote) are used to capture the increased risk for morbidity and recurrent events observed in the period following these events. Medical comorbidities other than stroke or bleeding (eg, CHA2DS2‐VASc risk factors) could accrue across any transition period. In addition to the current state, other clinical factors (listed in the state determinants box) influenced transition probabilities to future states.
AF Risk
Model input parameters were derived from published literature (Table S1), with studies selected systematically (Data S1). We modeled AF incidence using previously reported age‐ and sex‐stratified estimates. AF could be detected in the context of routine care, screening, or a 2‐week patch monitor (PM) deployed after every stroke (mirroring contemporary practice).
Although the prevalence of undiagnosed AF is unknown, studies demonstrate that the proportion of additional cases detected using intermittent or short‐term screening represents ≈24% of the underlying AF prevalence. Therefore, we assumed that 24% of prevalent AF in the simulated population is undiagnosed. Individuals with undiagnosed AF could become diagnosed through AF screening, according to the test characteristics of the strategy applied. Since the underlying prevalence of undiagnosed AF is inherently uncertain, we varied the proportion of undiagnosed AF widely in sensitivity analyses.
Stroke Risk
We modeled the prevalence and incidence of stroke among individuals without AF using population‐based estimates. Among individuals with AF, we varied stroke incidence in accordance with CHA2DS2‐VASc, a widely used score for predicting stroke in AF. CHA2DS2‐VASc scores varied over time according to the incidence of component comorbidities. Since individuals with incident stroke are known to be at higher risk for recurrent events in a time‐dependent manner, we also applied recurrence‐specific stroke rates (Data S1).
Stroke severity was simulated by applying an expected distribution of the modified Rankin scale. In accordance with published evidence, stroke mortality varied according to severity. We assumed that paroxysmal and persistent AF conferred similar stroke risk in the base case, but performed sensitivity analyses in which we assumed that the stroke risk associated with paroxysmal AF was 75% of that associated with persistent AF. We modeled the effect of antithrombotic therapy on stroke incidence and severity as a function of the presence or absence of AF.
Bleeding Risk
We modeled 2 classes of bleeding in accordance with International Society of Thrombosis and Hemostasis guidelines: major bleeding and clinically relevant non‐major bleeding. Accordingly, we treated intracranial hemorrhage (ICH) as a subset of major bleeding. We modeled the effect of antithrombotic therapy on bleeding incidence and mortality.
Antithrombotic Therapies
We modeled 3 forms of antithrombotic therapy: aspirin, warfarin, and direct‐acting oral anticoagulant. We modeled aspirin use based on the presence of vascular disease, concurrent OAC administration, and contemporary primary prevention use patterns (Data S1). We assumed complete OAC usage at AF diagnosis but modeled real‐world estimates of OAC discontinuation over time. Frequency of warfarin versus direct‐acting oral anticoagulant was based on contemporary use patterns. We assumed permanent OAC discontinuation following major bleeds.
Screening Strategies
In addition to no screening, we evaluated 6 distinct AF screening modalities: pulse palpation, 1‐lead handheld ECG, 12‐lead ECG, PM, and wrist‐worn wearables (smart watch/band photoplethysmography and smart watch/band ECG). We arranged screening modalities in pragmatic combinations including those evaluated in AF screening trials. To facilitate comparison across strategies, we assumed a clinician‐directed screening approach, in which screening would be initiated at a clinical encounter and could be continued following the encounter depending upon the strategy used (eg, wrist‐worn wearable). We defined continuous or nearly continuous modalities as those capable of monitoring beyond a single time‐point (eg, PM, wrist‐worn wearable photoplethysmography), and discrete modalities as those capable of only instantaneous AF detection (eg, pulse palpation, 12‐lead ECG). We assumed that a continuous or nearly continuous modality would only be prescribed after a negative discrete modality. We defined confirmatory tests as those performed conditionally following an abnormal result on a preceding test (eg, confirmatory PM following abnormal wrist‐worn wearable photoplethysmography). Mirroring previous interventions using confirmatory tests (eg, Screening for Atrial Fibrillation in the Elderly [SAFE], the VITAL‐AF trial), we assumed that confirmatory tests would be deployed immediately following the abnormal preceding test. Positive results on a confirmatory test, on the final test in the screening pathway, or on 12‐lead ECG (even when not utilized as a confirmatory test) resulted in AF diagnosis and termination of screening. For each modality, we applied published test characteristics to determine the diagnostic result (eg, true positive, false positive). For strategies not using wrist‐worn wearables, we also varied the screening interval (once, annually, every 5 years) to assess the effect of repeated screening. For strategies using wrist‐worn wearables, to fully assess the potential effects of prolonged screening, we compared a 12‐month versus lifetime screening duration. Although lifetime wearable use is an idealized scenario, we assumed this is plausible given the increasing availability of wrist‐worn wearable devices. To mirror current technology, wrist‐worn wearables with both photoplethysmography and ECG capability operated using photoplethysmography as the default function, with ECG triggered only after detection of abnormal photoplethysmography signals. Ultimately, our model compared 45 unique AF screening strategies (Figure 2). To facilitate reporting of results, strategies are reported by effectiveness rank, which corresponds to decreasing order of effectiveness (eg, Strategy 1 is the most effective strategy and Strategy 45 is the least effective strategy).
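The effect of a confirmatory test on false-positive diagnoses can be illustrated with a short calculation. The sketch below assumes that errors of the two tests are conditionally independent given true AF status, and that a diagnosis requires both tests to be positive. The sensitivity, specificity, and prevalence values are hypothetical placeholders, not the published test characteristics used in the model (Table S1), and the names `Modality`, `serial_pathway`, and `expected_diagnoses` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modality:
    name: str
    sensitivity: float
    specificity: float


def serial_pathway(screen: Modality, confirm: Modality) -> Modality:
    """Net test characteristics when `confirm` runs only after a positive `screen`.

    Assumes conditional independence of the two tests given true AF status.
    """
    net_sensitivity = screen.sensitivity * confirm.sensitivity
    net_false_positive = (1 - screen.specificity) * (1 - confirm.specificity)
    return Modality(f"{screen.name} -> {confirm.name}",
                    net_sensitivity, 1 - net_false_positive)


def expected_diagnoses(test: Modality, n_screened: int, prevalence: float):
    """Expected (true-positive, false-positive) diagnoses among n_screened people."""
    true_pos = n_screened * prevalence * test.sensitivity
    false_pos = n_screened * (1 - prevalence) * (1 - test.specificity)
    return true_pos, false_pos


# Hypothetical values chosen only to show the direction of the effect.
ppg = Modality("wearable PPG", sensitivity=0.95, specificity=0.90)
patch = Modality("patch monitor", sensitivity=0.90, specificity=0.99)
pathway = serial_pathway(ppg, patch)

for test in (ppg, pathway):
    tp, fp = expected_diagnoses(test, n_screened=100_000, prevalence=0.02)
    print(f"{test.name}: {tp:.0f} true positives, {fp:.0f} false positives")
```

Under these assumptions the confirmatory step costs some sensitivity but multiplies the false-positive fraction by the confirmatory test's own false-positive fraction, which is why the sequence can sharply reduce false-positive diagnoses in a low-prevalence population.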
Figure 2
Screening strategies for population‐based atrial fibrillation screening.
Diagrams depict selected population‐based atrial fibrillation (AF) screening strategies evaluated. Strategies are labeled by rank order of decreasing effectiveness (see text). Diagrams correspond to multiple strategies since the same screening sequences were evaluated across varying durations and frequencies. Strategies 11, 12, and 22 (top left) utilize pulse palpation followed by confirmatory 12‐lead ECG if pulse palpation shows irregularity (analogous to the “opportunistic screening” strategy in the SAFE trial). Strategies 10, 13, and 14 (top right) utilize single‐lead ECG followed by confirmatory 12‐lead ECG if single‐lead ECG suggests possible AF (analogous to SEARCH‐AF and VITAL‐AF). Strategies 31, 36, and 44 (middle left) utilize 12‐lead ECG followed by patch monitor if 12‐lead ECG does not show AF (analogous to the mSToPS trial). Strategies 5 and 25 (middle right) use a novel strategy in which 12‐lead ECG is followed by wrist‐worn wearable‐based photoplethysmography if 12‐lead ECG is negative, then by wrist‐worn wearable‐based ECG if photoplethysmography is positive, then by confirmatory patch monitor if wrist‐worn wearable‐based ECG is positive. Strategies 35, 38, and 45 (bottom left) utilize 1‐lead ECG alone to diagnose AF. For all strategies, AF could be diagnosed by a positive result on a confirmatory test (gray box), on 12‐lead ECG (even when not utilized as a confirmatory test) or on the final test in the screening pathway. An AF diagnosis could be made in an individual who truly has AF (true positive), or an individual who does not truly have AF (false positive). In either case, an AF diagnosis leads to initiation of anticoagulation (in the absence of major bleeding history, see text), which provides greater protection against stroke among individuals with AF versus those without and increases bleeding risk among all individuals. AF indicates atrial fibrillation.
Utilities
We obtained long‐term disutility values for chronic conditions (eg, AF, history of severe stroke) and exposures (eg, OAC use) from the previous literature. We also applied 1‐time disutility penalties for short‐lived adverse events (eg, major bleeding).
Outcomes
The primary outcome was quality‐adjusted life‐years (QALYs). Secondary outcomes included ischemic stroke, major bleeding, intracranial hemorrhage, and AF true‐positive and false‐positive rates. We calculated incidence rates by dividing incident events observed by the person‐time accrued either before the event (events) or until death (non‐events). We estimated the number needed to screen (NNS) to prevent a stroke or cause a bleed as the inverse of the absolute difference in the event rate compared with no screening.
We did not discount future QALYs. Based on our simulation size determinations (Data S1), QALY differences within 100 QALYs per 100 000 individuals may be attributable to simulation noise. Therefore, we defined effective strategies as only those providing an increase in QALYs of ≥200 per 100 000 individuals as compared with no screening.
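The two definitions above, the NNS as the inverse of the absolute difference in event rates and the 200‐QALY threshold for effectiveness, can be written as a short sketch. The function names are illustrative, the example rates are hypothetical, and the labels for strategies that lose ≥200 QALYs or fall within the threshold follow the categories described in the caption of Figure 4.

```python
import math


def number_needed_to_screen(rate_screening: float, rate_no_screening: float) -> float:
    """Inverse of the absolute difference in event rates (events per person-year)."""
    difference = abs(rate_screening - rate_no_screening)
    if difference == 0:
        return math.inf  # no difference in rates: NNS is unbounded
    return 1.0 / difference


def classify_strategy(qaly_gain_per_100k: float, threshold: float = 200.0) -> str:
    """Classify a strategy by QALYs gained per 100 000 screened versus no screening."""
    if qaly_gain_per_100k >= threshold:
        return "effective"
    if qaly_gain_per_100k <= -threshold:
        return "ineffective"
    return "equivalent to no screening"


# Hypothetical stroke rates: 20.00 versus 19.75 events per 1000 person-years.
nns = number_needed_to_screen(rate_screening=0.01975, rate_no_screening=0.02000)
print(round(nns))
print(classify_strategy(1500))    # gain reported for Strategy 1
print(classify_strategy(-17600))  # loss reported for Strategy 45
```

As a rough consistency check, the stroke reductions of 0.23 to 0.32 per 1000 person‐years reported in the Results correspond to about 3100 to 4300 under this formula, close to the reported NNS range of 3087–4445; the remaining gap presumably reflects rounding of the published rates.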
Sensitivity Analyses
Although certain guidelines recommend AF screening in individuals aged ≥65 years, some studies have investigated screening older populations. To assess the effect of varying age thresholds on screening effectiveness, we applied the base case model in simulated populations mirroring the US population aged ≥70 years, and aged ≥75 years.
We performed sensitivity analyses to assess the effect of parameter uncertainty. In probabilistic sensitivity analyses, we varied distribution parameters across plausible evidence‐based ranges (Table S2). In 1‐way sensitivity analyses, we assessed the effect of varying specific parameters chosen based on clinical importance or influence in previous models (Table S2).
Results
Base Case Results
Results of the base case analysis are depicted in Table S3 and summarized in Figures 3 and 4. With no screening, the average individual accrued 9.027 QALYs. Of the 45 strategies tested, only Strategies 1–10 (22%) resulted in QALY gain (range 300–1500 QALYs gained/100 000 people screened) and were therefore considered effective. Among Strategies 1–10, reduction in stroke ranged 0.23 to 0.32/1000 person‐years (NNS to prevent stroke: 3087–4445), increase in major bleeding ranged 0.25 to 0.55/1000 person‐years (NNS to cause major bleed: 1815–4049), and increase in ICH ranged 0.059 to 0.13/1000 person‐years (NNS to cause ICH: 7693–16 950, Figure 3, Table S3).
Figure 3
Clinical events by screening strategy.
Depicted are clinical effectiveness end points of interest according to atrial fibrillation screening strategy. The left panel depicts the incidence rates of ischemic stroke (blue), major bleeding (dark red), and intracranial hemorrhage (a subset of major bleeding, orange) according to atrial fibrillation screening strategy. The right panel depicts the overall atrial fibrillation true‐positive rate (green) and false‐positive rate (red). The strategies corresponding to each point are depicted by the table to the left of both graphs, corresponding to the icons above the table applied in sequence from left to right. The bars colored in darker shade depict relevant event rates with no screening. Strategies are numbered and sorted in rank order of decreasing effectiveness (ie, decreasing quality‐adjusted life‐years), starting with the most effective strategies at the top. Effective screening strategies (ie, providing an increase in quality‐adjusted life‐years of ≥200 per 100 000 individuals as compared with no screening), are depicted in black while all others are depicted in gray. A small false positive rate in the no screening condition (0.4%) is attributable to the application of a patch monitor following all stroke events (see text). 12L indicates 12‐lead ECG; 12m, 12 months; 1L, 1‐lead ECG; AF, atrial fibrillation; ICH, intracranial hemorrhage; PM, patch monitor; PP, pulse palpation; PPG, photoplethysmography; and q5y, every 5 years.
Figure 4
Screening effectiveness.
Depicted are the overall effectiveness results in the base case analysis assessing 45 unique strategies for atrial fibrillation screening. The strategies corresponding to each point are depicted by the table to the left of the graph, corresponding to the icons above the table. The vertical dashed line represents the expected quality‐adjusted life‐years lived without atrial fibrillation screening. Effective screening strategies (ie, providing an increase in quality‐adjusted life‐years of ≥200 per 100 000 individuals as compared with no screening) are depicted in green, ineffective screening strategies (ie, providing a decrease in quality‐adjusted life‐years of ≥200 per 100 000 individuals as compared with no screening), are depicted in red, while all others are considered equivalent to no screening and depicted in yellow. Strategies are numbered and sorted in rank order of decreasing effectiveness (ie, decreasing quality‐adjusted life‐years), starting with the most effective strategies at the top. 12L indicates 12‐lead ECG; 12m, 12 months; 1L, 1‐lead ECG; PM, patch monitor; PP, pulse palpation; PPG, photoplethysmography; and q5y, every 5 years.
Among Strategies 1–10, 9 (Strategies 1–9; 90%) involved use of a continuous or nearly continuous modality such as PM or wrist‐worn wearable device, whereas 1 (Strategy 10; 10%) relied on discrete modalities alone. Wrist‐worn wearables were included in 24 of 45 (53.3%) strategies modeled, but in 9 of 10 (90%) strategies identified as effective (Strategies 1–9). The most effective strategy comprised pulse palpation, confirmatory 12‐lead ECG, and if necessary, wrist‐worn wearable with photoplethysmography and single‐lead ECG for lifetime duration, with confirmatory PM (Strategy 1: 1500 QALYs gained/100 000 people screened; NNS to prevent stroke: 4133; NNS to cause major bleed: 3847; NNS to cause ICH: 16 130).
The only effective strategy not using a wrist‐worn wearable was 1‐lead ECG and confirmatory 12‐lead ECG repeated every 5 years (Strategy 10: 300 QALYs gained/100 000 people screened; NNS to prevent stroke: 3862; NNS to cause major bleed: 2802; NNS to cause ICH: 11 112).
Compared with ineffective strategies, effective strategies generally demonstrated low AF false‐positive rates with comparable AF true‐positive rates (Figure 3, Table S3). Accordingly, although all screening strategies prevented strokes, effective strategies prevented strokes without inducing large increases in bleeding related to false positives. For example, Strategy 9 (12‐lead ECG, and if necessary, wrist‐worn wearable photoplethysmography with confirmatory PM) detected roughly as many true AF cases as Strategy 28 (pulse palpation with confirmatory 12‐lead ECG and PM), but exhibited a lower bleeding rate (6.09 versus 6.26 per 1000 person‐years) attributable to fewer false‐positive AF diagnoses (Table S4).
Of 25 strategies demonstrating harm, 7 (Strategies 23, 29–30, 32, 39–41; 28%) included use of a wrist‐worn wearable without a confirmatory PM, and another 11 (Strategies 22, 26, 33–34, 36–38, 42–45; 44%) used discrete modalities repeated annually or every 5 years. Ineffective strategies tended to exhibit high false‐positive rates, resulting in large increases in bleeding with small incremental reductions in stroke (Table S3, Figure 3). For example, the least effective strategy overall was annual 1‐lead ECG (Strategy 45: 17 600 QALYs lost/100 000 people screened; NNS to prevent stroke: 1678; NNS to cause major bleed: 124; NNS to cause ICH: 470).
When compared with Strategy 1 (pulse palpation, confirmatory 12‐lead ECG, and if necessary, wrist‐worn wearable with photoplethysmography and single‐lead ECG with confirmatory PM for lifetime duration), Strategy 45 (annual 1‐lead ECG) increased appropriate AF detection, yet AF was falsely diagnosed within an even greater number of individuals, leading to a substantially higher bleeding rate (13.61 versus 5.80 per 1000 person‐years). In general, more frequent screening using discrete modalities reduced effectiveness because of accrual of false positives (Figure 5). Even strategies composed of highly specific modalities could exhibit considerable false‐positive rates when performed annually (eg, false‐positive rate increased from 5.5% for Strategy 31 [12L and PM performed once] to 41.8% for Strategy 44 [12L and PM performed annually]). In contrast, increasing the screening duration of wrist‐worn wearables from 12 months to the lifespan generally resulted in greater benefit, as long as abnormal photoplethysmography signals were followed‐up with either PM or wrist‐worn single‐lead ECG (Figure 5).
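The accrual of false positives under repeated screening follows from simple probability. The sketch below assumes that each screening round is independent, that the per-round false-positive probability is constant, and that the individual remains alive and undiagnosed between rounds; the 2% per-round value and the function name are hypothetical, and the calculation is not intended to reproduce the model's reported false-positive rates.

```python
def cumulative_false_positive(per_round_fp: float, rounds: int) -> float:
    """Probability of at least one false-positive diagnosis over repeated rounds.

    Assumes independent rounds with a constant per-round false-positive probability.
    """
    if not 0.0 <= per_round_fp <= 1.0:
        raise ValueError("per_round_fp must be a probability")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return 1.0 - (1.0 - per_round_fp) ** rounds


# Hypothetical 2% false-positive probability per screening round.
for label, rounds in (("once", 1), ("every 5 years over 20 years", 4),
                      ("annually over 20 years", 20)):
    print(f"{label}: {cumulative_false_positive(0.02, rounds):.3f}")
```

Because the cumulative probability grows with every additional round, even a highly specific pathway can generate many false-positive diagnoses when repeated annually, which matches the direction of the effect reported above.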
Figure 5
Screening effectiveness stratified by screening duration and interval.
Depicted are the results of analyses assessing the temporal effect of screening using (A) wrist‐worn wearables, and (B) traditional screening modalities. The strategies corresponding to each point are depicted by the table to the left of the graph, corresponding to the icons above the table. The vertical dashed line represents the expected quality‐adjusted life‐years lived without atrial fibrillation screening. For strategies including wrist‐worn wearables, temporal assessments compared use of the wearable for the lifespan (red) versus 12 months (green). For strategies not including wrist‐worn wearables, temporal assessments compared screening once (green), every 5 years (yellow), and annually (red). 12L indicates 12‐lead ECG; 12m, 12 months; 1L, 1‐lead ECG; PM, patch monitor; PP, pulse palpation; PPG, photoplethysmography; and q5y, every 5 years.
Models simulating older populations demonstrated similar patterns of effectiveness with higher absolute event rates, both among individuals aged ≥70 years (QALYs gained: 200–1600/100 000 people screened; NNS to prevent stroke: 1985–6061; NNS to cause major bleed: 1171–5320; Table S5, Figure S1) and among individuals aged ≥75 years (QALYs gained: 300–1000/100 000 people screened; NNS to prevent stroke: 1561–4525; NNS to cause major bleed: 965–3497; Table S6, Figure S2). In 1‐way sensitivity analyses, test specificity consistently emerged as highly influential (Figure S3). Other influential parameters were the treatment effect of OAC and aspirin on AF‐related stroke and the reduction in quality‐of‐life attributable to AF. Assuming a low estimate for the proportion of AF that is paroxysmal resulted in loss of effectiveness of 1 (10%) strategy, whereas assuming that paroxysmal AF is associated with a lower risk of stroke than persistent AF did not result in loss of effectiveness of any strategy (Figure S3). In probabilistic sensitivity analyses, all effective strategies had a probability of effectiveness ≥50% (Figure S4).
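Because QALYs are reported per 100 000 people screened whereas strokes and bleeds are reported as NNS, it can help to place both on the same scale. The Python sketch below converts the NNS values reported in the Results for Strategy 10 and Strategy 45 into events per 100 000 people screened, using only the identity that events per N screened equal N divided by NNS. The function and dictionary names are hypothetical, and the sketch does not reproduce any part of the simulation model.

```python
def events_per_100k(nns: float) -> float:
    """Convert a number needed to screen into events per 100 000 people screened."""
    if nns <= 0:
        raise ValueError("NNS must be positive")
    return 100_000 / nns


# NNS values copied from the Results text (versus no screening).
REPORTED_NNS = {
    "Strategy 10 (1L with confirmatory 12L, every 5 years)": {
        "strokes prevented": 3862,
        "major bleeds caused": 2802,
        "ICH caused": 11112,
    },
    "Strategy 45 (annual 1L)": {
        "strokes prevented": 1678,
        "major bleeds caused": 124,
        "ICH caused": 470,
    },
}

if __name__ == "__main__":
    for strategy, outcomes in REPORTED_NNS.items():
        print(strategy)
        for outcome, nns in outcomes.items():
            print(f"  {outcome}: {events_per_100k(nns):.1f} per 100 000 screened")
```

On this scale, Strategy 45 prevents about 60 strokes per 100 000 people screened but causes about 806 major bleeds, whereas Strategy 10 prevents about 26 strokes and causes about 36 major bleeds. The sketch makes no assumption about whether ICH events are counted within major bleeds, so the two columns should not be summed.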
Discussion
Using a decision‐analytic model to quantify the comparative effectiveness of 45 distinct AF screening strategies in individuals aged ≥65 years, we found that population‐based AF screening can be effective within a clinician‐directed context. Importantly, the comparative effectiveness of population‐based AF screening varied substantially depending on the specific strategy used, with less than one quarter of modeled strategies resulting in net benefit. Effective strategies were typically multimodal, and commonly included devices capable of prolonged continuous or nearly continuous cardiac rhythm assessment such as wrist‐worn wearables. The most effective strategies resulted in 1500 QALYs gained/100 000 individuals screened, prevented 1 stroke for every 3087 to 4445 people screened, and caused 1 major bleed for every 1815 to 4049 people screened. Whereas all screening strategies reduced strokes, effective strategies characteristically had low false‐positive rates and thereby resulted in only modest increases in bleeding.
In a previous model by Aronsson et al, single timepoint screening among individuals aged 75 to 76 years using 1‐lead ECG, confirmed by cardiologist overread or short‐term rhythm monitor, was clinically effective. We also found that 1‐lead ECG with confirmatory 12‐lead ECG performed once or every 5 years was effective, although we observed a more modest QALY gain. Other studies have suggested both clinical and cost‐effectiveness of AF screening using traditional modalities such as pulse palpation, 12‐lead ECG, and patch monitoring. In our analysis, we found that pulse palpation followed conditionally by 12‐lead ECG resulted in QALY estimates similar to those with no screening, and screening using 12‐lead ECG alone generally resulted in reduced QALYs. Given the multitude of possible AF screening approaches and the generally low stroke rates among individuals with detected AF, conducting well‐powered randomized trials comparing each strategy is infeasible. Therefore, our simulation of 45 distinct strategies deployed within a unified screening context provides important comparative effectiveness data to guide future screening efforts and guidelines.
Our results demonstrate that application of screening strategies with low specificity or without inclusion of a confirmatory test may be ineffective and even harmful. All modeled strategies reduced stroke rates by a similar margin, but increases in bleeding varied substantially. Effective strategies consistently demonstrated low false‐positive rates, whereas strategies utilizing repeated or prolonged application of less specific tests (eg, single‐lead ECG, wrist‐worn photoplethysmography) had higher false‐positive rates leading to excess bleeding. Therefore, our results indicate that abnormal findings using highly sensitive modalities with low specificity should be confirmed (eg, with PM) before taking clinical action. For example, single‐lead ECG or wrist‐worn wearable‐based photoplethysmography alone currently appears insufficient to reliably establish an AF diagnosis in the context of population‐based AF screening. We also observed that strategies comprising discrete modalities repeated annually were universally ineffective. Repeating screening tests best equipped to detect persistent AF within a population in which undiagnosed AF is increasingly paroxysmal (as most individuals with persistent AF will have been diagnosed on previous screens) likely does not substantially increase AF yield yet increases exposure to potential false positives.
Whether false‐positive results could be minimized by targeting AF screening toward individuals at highest AF risk or by using emerging artificial intelligence‐based methods for interpreting ECG waveforms merits further study.
Our findings suggest that the high sensitivity afforded by wrist‐worn wearables coupled with high‐specificity confirmatory testing may provide a favorable balance between detecting AF and avoiding erroneous diagnoses. Wrist‐worn wearables are increasingly common, and wearable‐based AF screening is feasible. Our model demonstrates that clinician‐guided deployment of wrist‐worn wearables for AF screening is effective, particularly when coupled with confirmatory testing. The best performing wrist‐worn wearable strategy saved an additional 1200 QALYs/100 000 people screened when compared with the best strategy not using wrist‐worn wearables. It is likely that a longer screen duration increases yield of low‐burden paroxysmal AF that would otherwise go undetected even with repeated screening using discrete modalities, while confirmatory testing reduces false‐positive diagnoses potentially introduced by extended duration screening. Our results indicate that reflexive single‐lead ECGs following abnormal photoplethysmography signals may likewise offset false positives. Given the rapidly evolving field of consumer wearable technology, prospective study is warranted to confirm our findings.
Our results also identify key factors influencing AF screening effectiveness. Consistent with the importance of false positives, test specificity consistently emerged as highly influential. Furthermore, the treatment effect of OAC and aspirin on AF‐related stroke, as well as the reduction in quality‐of‐life attributable to AF, also influenced QALY estimates. Future anticoagulants offering a more favorable balance of stroke protection versus bleeding risk may improve AF screening effectiveness.
Notably, even the most effective strategy modeled needed to be deployed in >4000 individuals to prevent 1 stroke. As a result, targeting additional potentially modifiable AF‐related outcomes (eg, heart failure hospitalization) may increase the effectiveness of future screening interventions. Similarly, given the observed influence of AF‐related quality‐of‐life, integration of strategies known to improve AF symptoms (eg, weight loss, alcohol cessation, blood pressure management, sleep apnea treatment) with the screening intervention may increase net benefit.
Our study should be interpreted in the context of its design. First, evidence to support certain model inputs was limited. For example, AF disutility was based on relatively dated surveys and varies substantially across studies. Since AF‐related quality‐of‐life may have changed with contemporary therapies, future studies are needed to better quantify the disutility associated with AF and related outcomes. Second, we assumed that the stroke risk of screen‐detected AF was similar to that of clinically detected AF. Screen‐detected AF likely reflects a lower burden of disease, and AF burden may be associated with stroke risk. We observed that assuming lower stroke risk in the setting of paroxysmal AF had little impact on screening effectiveness, though further data are needed. Third, although we used a systematic approach to selecting studies to inform model inputs, we did not incorporate study heterogeneity when combining estimates across multiple studies. Fourth, we estimated AF‐related stroke risk using the CHA2DS2‐VASc score given its widespread use in clinical practice and endorsement by consensus guidelines. The score has limited predictive utility, does not consider differences in risk associated with specific combinations of risk factors, and does not incorporate additional variables likely exerting some influence on risk of stroke in AF (eg, smoking, obesity). Fifth, our models focused on stroke (and bleeding) since stroke is a major irreversible hazard that may present as the initial manifestation of AF. Future models are warranted to assess the effectiveness of AF screening for additional end points including heart failure and cognitive decline. Sixth, we did not simulate screening using implantable loop recorders, or the impacts of pacemakers or defibrillators, given our focus on population‐based screening. Seventh, to estimate the maximum plausible effectiveness of contemporary AF screening, we modeled complete OAC use. Future analyses are warranted to examine the impact of initial OAC use on screening effectiveness.
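The reasoning in this Discussion that a sensitive first test should be followed by a highly specific confirmatory test can be illustrated with the standard serial‐testing formulas. The Python sketch below is an illustration only: it assumes the two tests err independently given true AF status, and the sensitivity, specificity, and prevalence values are hypothetical placeholders rather than inputs from our model. The names `Modality`, `confirm_in_series`, and `false_positives_per_100k` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Modality:
    sensitivity: float  # P(test positive | AF present)
    specificity: float  # P(test negative | AF absent)


def confirm_in_series(screen: Modality, confirm: Modality) -> Modality:
    """Combine a screening test with a confirmatory test.

    AF is diagnosed only if both tests are positive. Assumes the tests are
    conditionally independent given true AF status (illustrative assumption).
    """
    return Modality(
        sensitivity=screen.sensitivity * confirm.sensitivity,
        specificity=1.0 - (1.0 - screen.specificity) * (1.0 - confirm.specificity),
    )


def false_positives_per_100k(test: Modality, prevalence: float) -> float:
    """Expected false-positive diagnoses per 100 000 people screened once."""
    return 100_000 * (1.0 - prevalence) * (1.0 - test.specificity)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the mechanics.
    sensitive_screen = Modality(sensitivity=0.95, specificity=0.90)
    specific_confirm = Modality(sensitivity=0.90, specificity=0.99)
    prevalence = 0.02  # hypothetical prevalence of undiagnosed AF

    combined = confirm_in_series(sensitive_screen, specific_confirm)
    print(f"combined sensitivity: {combined.sensitivity:.3f}")
    print(f"combined specificity: {combined.specificity:.3f}")
    print("false positives per 100 000, screen alone:",
          round(false_positives_per_100k(sensitive_screen, prevalence)))
    print("false positives per 100 000, screen + confirm:",
          round(false_positives_per_100k(combined, prevalence)))
```

With these placeholder values, confirmation lowers sensitivity modestly while cutting false positives by two orders of magnitude, which is the trade‐off that favored multimodal strategies. Real modalities may not err independently, so the combined specificity here should be read as an upper bound under that assumption.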
Conclusions
Using a decision‐analytic model comparing 45 contemporary strategies deployed within a clinician‐directed context to perform population‐based AF screening, we found that roughly one quarter were clinically effective. Strategies using a sensitive modality upfront (eg, single‐lead ECG, wrist‐worn wearable photoplethysmography), followed by a highly specific test to minimize false‐positive diagnoses, tended to be most effective. Future screening interventions and clinical guidelines should consider the relative effectiveness of specific screening approaches.
Sources of Funding
Dr Lubitz is supported by National Institutes of Health (NIH) 1R01HL139731 and American Heart Association 18SFRN34250007. Dr Khurshid is supported by NIH T32HL007208. Dr McManus is supported by NIH R01HL126911, R01HL137734, R01HL137794, R01HL135219, R01HL136660, U54HL143541, and U01HL146382 from the National Heart, Lung, and Blood Institute. Dr Ashburner is supported by NIH K01HL148506 and American Heart Association 18SFRN34250007. Dr Ellinor is supported by NIH 1R01HL092577, R01HL128914, K24HL105780, the American Heart Association 18SFRN34110082, and by the Foundation Leducq 14CVD01. Dr Singer is supported by NIH R01AG063381 and R01HL142834.
Disclosures
Dr Lubitz receives sponsored research support from Bristol Myers Squibb/Pfizer, Bayer AG, Boehringer Ingelheim, and Fitbit; has consulted for Bristol Myers Squibb/Pfizer and Bayer AG; and participates in a research collaboration with IBM. Dr Lubitz is a co‐principal investigator of the VITAL‐AF trial examining screening for AF in ambulatory primary care patients, and is the principal investigator of the Fitbit Heart Study, which is examining the validity of software for wearable trackers and smartwatches for identifying AF. Dr Chhatwal has consulted for Novo Nordisk and is a partner at Value Analytics Labs. Dr McManus receives research support from Bristol Myers Squibb/Pfizer, Boehringer Ingelheim, Philips Healthcare, Flexcon, Samsung, Apple Computer, and Fitbit, and has consulted for Bristol Myers Squibb/Pfizer, Samsung, Philips, Flexcon, Boston Biomedical Associates, and Rose Consulting. Dr Atlas receives sponsored research support from Bristol Myers Squibb/Pfizer and has consulted for Bristol Myers Squibb/Pfizer and Fitbit. Dr Ellinor is supported by a grant from Bayer AG to the Broad Institute focused on the genetics and therapeutics of cardiovascular disease. Dr Ellinor has consulted for Bayer AG, Novartis, MyoKardia, and Quest Diagnostics. Dr Hur is a co‐founder of Cambridge Biomedical and Economics Consulting Group. Dr Singer receives sponsored research support from Bristol Myers Squibb/Pfizer and has consulted for Boehringer Ingelheim, Bristol Myers Squibb/Pfizer, Fitbit, and Johnson & Johnson.
Supplemental material: Data S1, Tables S1–S6, Figures S1–S4, References 51–111.