The (mis)use of overlap of confidence intervals to assess effect modification.

Mirjam J Knol, Wiebe R Pestman, Diederick E Grobbee.

Year: 2011 PMID: 21424218 PMCID: PMC3088813 DOI: 10.1007/s10654-011-9563-8

Source DB: PubMed Journal: Eur J Epidemiol ISSN: 0393-2990 Impact factor: 8.082

In randomized controlled trials as well as in observational studies, researchers are often interested in the effects of treatment or exposure in different subgroups, i.e. effect modification [1, 2]. There are several methods to assess effect modification, and the debate on which method is best is still ongoing [2-5]. In this article we focus on an invalid method to assess effect modification that is often used in articles in health sciences journals [6], namely concluding that there is no effect modification if the confidence intervals of the subgroups overlap [7-9].

When effect modification is assessed by looking at overlap of the 95% confidence intervals in subgroups, a type 1 error probability of 0.05 is often mistakenly assumed. In other words, if the confidence intervals overlap, the difference in effect estimates between the two subgroups is judged not to be statistically significant. By mathematical derivation, we calculated that the chance of finding non-overlapping 95% confidence intervals under the null hypothesis is 0.0056 if the variances of both effect estimates are equal and the effect estimates are independent (see Supplementary material for the derivation of this probability). If the variances of the effect estimates are not equal, the chance of finding non-overlapping 95% confidence intervals can be calculated by taking into account ρ, i.e. the ratio between the standard deviations in the subgroups, σ2/σ1 (Supplementary material, formula (3)). Figure 1 shows the relation between ρ and the type 1 error probability if the effect estimates are independent. If the effect estimates are not independent, the correlation coefficient between the effect estimates can also be accounted for (Supplementary material, formula (3)).
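The probability discussed above can be retraced numerically. The following is a minimal sketch and not the formula (3) of the Supplementary material, which is not reproduced here: it assumes normally distributed estimates and uses the fact that two intervals fail to overlap exactly when the estimates differ by more than z(σ1 + σ2). The function name `prob_nonoverlap` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_nonoverlap(rho, level=0.95, corr=0.0):
    """Chance that two confidence intervals do not overlap under the null.

    rho   : ratio of the standard deviations, sigma2 / sigma1 (must be > 0)
    level : confidence level of each interval (0.95 for 95% intervals)
    corr  : correlation between the two effect estimates (0 = independent)

    Assumption (not taken from the supplement): both estimates are normal
    with the same true effect. The case rho == 1 with corr == 1 is
    degenerate (the difference has zero variance) and is not handled.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    # Non-overlap means |est1 - est2| > z * (sigma1 + sigma2). Under the
    # null, est1 - est2 is normal with variance
    # sigma1^2 + sigma2^2 - 2 * corr * sigma1 * sigma2; sigma1 cancels out.
    threshold = z * (1 + rho) / sqrt(1 + rho**2 - 2 * corr * rho)
    return 2 * (1 - std_normal.cdf(threshold))


if __name__ == "__main__":
    for rho in (1.0, 2.0, 5.0, 20.0):
        print(rho, round(prob_nonoverlap(rho), 4))
```

Under these assumptions the result for ρ = 1 agrees with the 0.0056 quoted in the text, and the probability approaches 0.05 only when one standard deviation is much larger than the other.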

Fig. 1

Relation between ρ, which is the ratio of σ2 and σ1, and the probability of non-overlapping confidence intervals under the null hypothesis (type 1 error)

To arrive at a type 1 error probability of 0.05, 83.4% confidence intervals should be calculated around the effect estimates in subgroups if the variance is equal and the effect estimates are independent (see Supplementary material for the derivation of this percentage). If the variance is not equal, ρ should be taken into account (Supplementary material, formula (11)). Figure 2 shows the relation between ρ and the level of the confidence interval. If the effect estimates are not independent, the correlation coefficient should be taken into account (Supplementary material, formula (11)).

Adapting the level of the confidence interval can be especially useful for graphical presentations, for example in meta-analyses [10]. However, it is necessary to state explicitly which percentage confidence interval is calculated, and its meaning should be thoroughly explained to the reader. Many readers will still interpret this ‘new’ confidence interval as if it were a 95% confidence interval, because this percentage is so commonly used. To prevent such confusion, other methods to assess effect modification could be used, such as calculating a 95% confidence interval around the difference in effect estimates [8].
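The adjusted confidence level can be sketched in the same way. This is an illustration under the normality assumption, not the formula (11) of the Supplementary material: it solves for the interval multiplier at which non-overlap occurs with probability α under the null. The name `adjusted_level` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def adjusted_level(rho, alpha=0.05, corr=0.0):
    """Confidence level at which 'no overlap' has type 1 error alpha.

    rho   : ratio of the standard deviations, sigma2 / sigma1 (must be > 0)
    alpha : desired type 1 error probability for the overlap check
    corr  : correlation between the two effect estimates (0 = independent)

    Assumption (not taken from the supplement): both estimates are normal.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    # Choose the multiplier z_adj so that
    # z_adj * (sigma1 + sigma2) == z_alpha * sd(est1 - est2).
    z_adj = z_alpha * sqrt(1 + rho**2 - 2 * corr * rho) / (1 + rho)
    return 2 * std_normal.cdf(z_adj) - 1


if __name__ == "__main__":
    for rho in (1.0, 2.0, 5.0, 20.0):
        print(rho, round(100 * adjusted_level(rho), 1))
```

For ρ = 1 and independent estimates this gives roughly 83.4%, in line with the text; the level rises towards 95% as the two standard deviations become more unequal.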

Fig. 2

Relation between ρ, which is the ratio of σ2 and σ1, and the percentage confidence intervals to be calculated to arrive at a type 1 error probability of 0.05

The assumption used in the formulas presented in the appendices is that the effect estimators in the subgroups are normally distributed. Assuming that epidemiologic effect measures, such as the odds ratio, risk ratio, hazard ratio and risk difference, follow a normal distribution, the methods presented can also be used for these epidemiologic measures. Note that the assumption of normality is generally unreasonable in small samples, but is a satisfactory approximation in large samples.

Example

As an example, imagine a large randomized controlled trial that investigates the effect of some intervention on mortality and that includes 10,000 men and 5,000 women. Besides the main effect of treatment, the researchers are interested in assessing whether the treatment effect is different for men and women. Suppose that the risk ratio in men is 0.67 (95% CI: 0.59–0.75) and in women is 0.83 (95% CI: 0.71–0.98). The confidence intervals partly overlap, which the researchers may wrongly interpret as no effect modification by sex.

Filling in formula (3) (Supplementary material) results in a probability of non-overlapping 95% confidence intervals under the null hypothesis of 0.006. A confidence level of 83.8% could have been used to arrive at a type 1 error probability of 0.05, resulting in a confidence interval of 0.61–0.73 for men and 0.74–0.93 for women. Now the confidence intervals do not overlap, so the p-value is smaller than 0.05, indicating statistically significant effect modification. Comparing the two risk ratios directly gives a ratio of risk ratios of 0.80 with a 95% confidence interval of 0.66–0.98, corresponding to a p-value of 0.028. This confirms our earlier observation of statistically significant effect modification.

Electronic supplementary material: Supplementary material 1 (DOC 162 kb)
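The arithmetic of the example can be retraced approximately from the published numbers alone. The sketch below makes several assumptions that are not stated in the text: the risk ratios are analysed on the log scale, the standard errors are recovered from the rounded 95% confidence limits, and the estimates for men and women are independent. Because it starts from rounded values, its results can differ somewhat from those reported above, in particular for the p-value of the ratio of risk ratios. All names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist()
Z95 = N.inv_cdf(0.975)  # about 1.96


def se_from_ci(lower, upper, z=Z95):
    """Standard error of log(RR), recovered from the limits of a ratio CI."""
    return (log(upper) - log(lower)) / (2 * z)


def ratio_ci(rr, se, z):
    """Confidence interval for a ratio, built on the log scale."""
    return exp(log(rr) - z * se), exp(log(rr) + z * se)


# Rounded values as given in the example.
rr_men, se_men = 0.67, se_from_ci(0.59, 0.75)
rr_women, se_women = 0.83, se_from_ci(0.71, 0.98)

# rho = sigma2 / sigma1, here on the log scale (an assumption).
rho = se_women / se_men
spread = (1 + rho) / sqrt(1 + rho**2)

# Chance of non-overlapping 95% intervals under the null hypothesis.
p_nonoverlap = 2 * (1 - N.cdf(Z95 * spread))

# Confidence level for which non-overlap has a type 1 error of 0.05.
z_adj = Z95 / spread
level = 2 * N.cdf(z_adj) - 1
ci_men = ratio_ci(rr_men, se_men, z_adj)
ci_women = ratio_ci(rr_women, se_women, z_adj)

# Direct comparison: ratio of risk ratios with a 95% interval.
log_rrr = log(rr_men) - log(rr_women)
se_rrr = sqrt(se_men**2 + se_women**2)
rrr = exp(log_rrr)
ci_rrr = ratio_ci(rrr, se_rrr, Z95)
p_value = 2 * (1 - N.cdf(abs(log_rrr) / se_rrr))

if __name__ == "__main__":
    print("rho:", round(rho, 2))
    print("P(non-overlap | null):", round(p_nonoverlap, 4))
    print("adjusted level (%):", round(100 * level, 1))
    print("men:", [round(x, 2) for x in ci_men])
    print("women:", [round(x, 2) for x in ci_women])
    print("RRR:", round(rrr, 2), [round(x, 2) for x in ci_rrr])
    print("p-value:", round(p_value, 3))
```

Working from the rounded inputs, the adjusted intervals for men and women no longer overlap, and the interval for the ratio of risk ratios excludes 1, which matches the conclusion of the example.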

7 in total

1. A brief note on overlapping confidence intervals.

Authors: Peter C Austin; Janet E Hux
Journal: J Vasc Surg Date: 2002-07 Impact factor: 4.268

2. Interaction revisited: the difference between two estimates.

Authors: Douglas G Altman; J Martin Bland
Journal: BMJ Date: 2003-01-25

3. Issues in the reporting of epidemiological studies: a survey of recent practice.

Authors: Stuart J Pocock; Timothy J Collier; Kimberley J Dandreo; Bianca L de Stavola; Marlene B Goldman; Leslie A Kalish; Linda E Kasten; Valerie A McCormack
Journal: BMJ Date: 2004-10-06

4. Statistics in medicine--reporting of subgroup analyses in clinical trials.

Authors: Rui Wang; Stephen W Lagakos; James H Ware; David J Hunter; Jeffrey M Drazen
Journal: N Engl J Med Date: 2007-11-22 Impact factor: 91.245

5. When one depends on the other: reporting of interaction in case-control and cohort studies.

Authors: Mirjam J Knol; Matthias Egger; Pippa Scott; Mirjam I Geerlings; Jan P Vandenbroucke
Journal: Epidemiology Date: 2009-03 Impact factor: 4.822

6. Interactions in epidemiology: relevance, identification, and estimation.

Authors: Sander Greenland
Journal: Epidemiology Date: 2009-01 Impact factor: 4.822

7. Subgroup analysis and other (mis)uses of baseline data in clinical trials.

Authors: S F Assmann; S J Pocock; L E Enos; L E Kasten
Journal: Lancet Date: 2000-03-25 Impact factor: 79.321
