
A method to compute multiplicity corrected confidence intervals for odds ratios and other relative effect estimates.

Jimmy Thomas Efird¹, Susan Searles Nielsen.

Abstract

Epidemiological studies commonly test multiple null hypotheses. In some situations it may be appropriate to account for multiplicity using statistical methodology rather than simply interpreting results with greater caution as the number of comparisons increases. Given the one-to-one relationship that exists between confidence intervals and hypothesis tests, we derive a method based upon the Hochberg step-up procedure to obtain multiplicity corrected confidence intervals (CI) for odds ratios (OR) and by analogy for other relative effect estimates. In contrast to previously published methods that explicitly assume knowledge of P values, this method only requires that relative effect estimates and corresponding CI be known for each comparison to obtain multiplicity corrected CI.


Year: 2008 PMID: 19151434 PMCID: PMC3699999 DOI: 10.3390/ijerph5050394

Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390

Introduction

Testing the statistical significance of multiple null hypotheses is a routine practice in epidemiologic and other types of biomedical research. By chance alone, the probability of wrongly rejecting one or more null hypotheses increases with the number of comparisons tested [1]. This is referred to as “multiplicity bias.” Various methods have been presented in the literature for controlling the type I error in the context of multiple hypothesis testing. The classic Bonferroni inequality [2] provides a simple distribution-free method for multiplicity P value correction. Letting $\alpha_i$ denote the probability that hypothesis $S_i$ is incorrect, the Bonferroni probability for the joint null hypothesis may be written as:

$$\Pr\left(\bigcup_{i=1}^{n} \{S_i \text{ is incorrect}\}\right) \le \sum_{i=1}^{n} \alpha_i \qquad (1)$$

The Bonferroni method rejects the joint set of n null hypotheses if $p_i \le \alpha_i$ for at least one i, where $p_i$ denotes the P value corresponding to the ith null hypothesis. In the simple case, α is apportioned evenly among the tests, i.e., $\alpha_i = \alpha/n$. Although the family-wise (FWER) and per family (PFER) error rates are preserved at the α level of significance, the Bonferroni procedure is known to be conservative, especially for highly correlated test statistics (i.e., the type I error probability is less than the nominal level of α). For example, in a study of multiple genetic polymorphisms, the implicit assumption that all variants being tested have equal probability of being truly associated with the outcome of interest leads to overcorrection [3]. The first order Bonferroni inequality may be improved upon given knowledge of the joint bivariate probabilities [2, 4, 5] or when the absolute value of the correlation coefficient is greater than 0.5 [2, 6]. However, these improvements have seen limited use in applied practice due to their restrictive nature. Several multiple testing procedures [7-9] based upon the “closure method” [10] and “Simes equality” [11] have been introduced and shown to be more powerful than the Bonferroni method for testing the intersection hypothesis [12-13].
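The growth of the family-wise error rate, and the effect of an evenly apportioned Bonferroni correction, can be illustrated with a short sketch. It assumes n independent tests with every null hypothesis true, which is a simplifying assumption made here and not a condition stated in the text; the function names are hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false rejection among n independent tests
    when every null hypothesis is true (illustrative assumption)."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level when alpha is apportioned evenly."""
    return alpha / n_tests


if __name__ == "__main__":
    for n in (1, 3, 10, 20):
        unadjusted = familywise_error_rate(0.05, n)
        adjusted = familywise_error_rate(bonferroni_threshold(0.05, n), n)
        print(f"n={n:2d}  unadjusted FWER={unadjusted:.3f}  Bonferroni FWER={adjusted:.3f}")
```

Under these assumptions the unadjusted rate rises quickly with n, while the Bonferroni rate stays at or slightly below the nominal α, which is the conservativeness noted above.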
Of the closure method based options, the Hochberg step-up multiple comparisons procedure [7] has gained popularity as being “easier to apply” than the more powerful procedures of Hommel [9] and Rom. The procedure also is uniformly more powerful than the Bonferroni-based, sequentially-rejective method of Holm [14] in many applied situations, e.g., when test statistics are uncorrelated, follow a multivariate normal or T2 distribution, or are model independent [15-17]. Given an ordered set of P values, i.e., $p_{(1)} \le p_{(2)} \le \dots \le p_{(n)}$, the Hochberg procedure rejects all hypotheses $H_{(i)}$ with $i \le j$ if $p_{(j)} < \alpha/(n-j+1)$ for any j=1, … , n. P values are incrementally corrected in order from smallest to largest by multiplying $p_{(j)}$ by (n−j+1), wherein the multiplicative factor for the largest P value is unity, so that this P value remains the same after multiplicity correction.

Many researchers and journal editors increasingly recognize confidence intervals (CI) as the preferred measure for conveying statistical uncertainty of effect size estimates such as odds ratios (OR), relative risks (RR), and hazard ratios (HR), as P values have been commonly misunderstood and misinterpreted in the literature [18-22]. Similar to hypothesis testing by way of P values, CI also may be corrected for multiplicity to minimize the risk of making false-positive inferences. Several authors have provided techniques to correct CI for multiple hypothesis testing [23-26]. However, most of the methods are computationally intensive or mathematically complex, and more importantly, none provide a way to correct CI when corresponding P values are not provided for the individual hypothesis tests.

Below, we present a method to compute multiplicity corrected CI for OR, and by analogy for other measures of relative risk, when no P values have been explicitly provided. This computationally simple method, based upon the Hochberg step-up procedure, only requires knowledge of the individual test OR and CI, and the number of comparisons being tested.
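The step-up rule and the simple rank-based correction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, the strict inequality follows the wording of the text, and ties are ordered arbitrarily by the sort.

```python
def hochberg_reject(p_values, alpha=0.05):
    """Return booleans (input order): True where the null hypothesis is rejected."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices, smallest P first
    # Step up from the largest P value and find the largest rank j
    # satisfying p_(j) < alpha / (n - j + 1).
    cutoff_rank = 0
    for j in range(n, 0, -1):
        if p_values[order[j - 1]] < alpha / (n - j + 1):
            cutoff_rank = j
            break
    reject = [False] * n
    for rank in range(1, cutoff_rank + 1):  # reject H_(1), ..., H_(cutoff_rank)
        reject[order[rank - 1]] = True
    return reject


def hochberg_simple_correction(p_values):
    """Multiply the P value of rank j by (n - j + 1), capped at 1."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    corrected = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        corrected[idx] = min(1.0, (n - rank + 1) * p_values[idx])
    return corrected


if __name__ == "__main__":
    p = [0.370, 0.895, 0.0076]  # hypothetical P values for three comparisons
    print(hochberg_reject(p))
    print(hochberg_simple_correction(p))
```

In this sketch the largest P value is always multiplied by 1, which matches the statement that it is unchanged by the correction.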

Methodology

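The derivation of multiplicity corrected confidence intervals for a set of n OR involves expressing the standard error (SE) for the logarithm of $\mathrm{OR}_i$ (i=1 to n) in terms of the lower confidence interval (LCI) for $\mathrm{OR}_i$. Letting:

$$\mathrm{LCI}_i = \exp\left\{\log(\mathrm{OR}_i) - z_{1-\alpha/2}\,\mathrm{SE}[\log(\mathrm{OR}_i)]\right\} \qquad (2)$$

where $z_{1-\alpha/2}$ is the 100% × (1−α/2) percentile of a standard normal distribution, and solving for $\mathrm{SE}[\log(\mathrm{OR}_i)]$ we see that:

$$\mathrm{SE}[\log(\mathrm{OR}_i)] = \frac{\log(\mathrm{OR}_i) - \log(\mathrm{LCI}_i)}{z_{1-\alpha/2}} \qquad (3)$$

Substituting the right hand side of (3) into the equation for the 2-tailed z test statistic gives:

$$z_i = \frac{|\log(\mathrm{OR}_i)|}{\mathrm{SE}[\log(\mathrm{OR}_i)]} = \frac{z_{1-\alpha/2}\,|\log(\mathrm{OR}_i)|}{\log(\mathrm{OR}_i) - \log(\mathrm{LCI}_i)} \qquad (4)$$

The corresponding P value is computed as:

$$p_i = 2\left[1 - \Phi(z_i)\right] \qquad (5)$$

where:

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt \qquad (6)$$

Ordering the P values from the lowest to highest values, i.e., $p_{(1)} \le p_{(2)} \le \dots \le p_{(n)}$ (with arbitrary ordering in the case of ties), the Hochberg multiplicity corrected P values, denoted by “*”, are computed as:

$$p^*_{(j)} = (n - j + 1)\,p_{(j)} \qquad (7)$$

where j ranges from 1 to n in a 1:1 identity mapping with the i values and $p^*_{(j)}$ is bounded by unity. Rearranging (5) and solving for $\mathrm{SE}^*[\log(\mathrm{OR}_{(j)})]$ in the equation:

$$p^*_{(j)} = 2\left[1 - \Phi\left(\frac{|\log(\mathrm{OR}_{(j)})|}{\mathrm{SE}^*[\log(\mathrm{OR}_{(j)})]}\right)\right] \qquad (8)$$

gives the Hochberg corrected standard error for the logarithm of $\mathrm{OR}_{(j)}$, i.e.:

$$\mathrm{SE}^*[\log(\mathrm{OR}_{(j)})] = \frac{|\log(\mathrm{OR}_{(j)})|}{\Phi^{-1}\left(1 - p^*_{(j)}/2\right)} \qquad (9)$$

The multiplicity corrected (1−α) × 100% CI for $\mathrm{OR}_{(j)}$ based upon the Hochberg step-up procedure can then be computed by substituting the above standard error from eq. 9 into the following basic equation:

$$\mathrm{CI}^*(\mathrm{OR}_{(j)}) = \exp\left\{\log(\mathrm{OR}_{(j)}) \pm z_{1-\alpha/2}\,\mathrm{SE}^*[\log(\mathrm{OR}_{(j)})]\right\} \qquad (10)$$

By analogy, replacing OR in the above equations with other relative effect estimates such as RR or HR gives the corresponding multiplicity corrected CI for these measures. When P values are directly available for the individual hypothesis tests, the Hochberg multiplicity corrected CI may be computed directly beginning with eq. 7. Furthermore, if the hypothesis test is 1-sided, then α must be multiplied by 2 in the above equations.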
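The steps above can be sketched as a single routine that takes only the reported OR and lower confidence limits. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name and return layout are hypothetical, the intervals are assumed to be 2-sided and symmetric on the log scale, and P values so small that 1 − p*/2 rounds to 1 in floating point are not handled.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def hochberg_corrected_ci(odds_ratios, lower_limits, alpha=0.05):
    """Return one dict per comparison (input order) with p, p_star, se_star, ci_star."""
    n = len(odds_ratios)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)

    # eqs. 3-5: recover SE, z statistic and 2-sided P value from OR and LCI
    p_values = []
    for odds_ratio, lci in zip(odds_ratios, lower_limits):
        se = (log(odds_ratio) - log(lci)) / z_crit
        z = abs(log(odds_ratio)) / se
        p_values.append(2 * (1 - STD_NORMAL.cdf(z)))

    # eq. 7: rank the P values (1 = smallest), multiply by (n - j + 1), cap at 1
    order = sorted(range(n), key=lambda i: p_values[i])
    results = [None] * n
    for rank, idx in enumerate(order, start=1):
        p_star = min(1.0, (n - rank + 1) * p_values[idx])
        log_or = log(odds_ratios[idx])
        # eq. 9: corrected SE; the quantile is 0 when p_star == 1,
        # in which case the corrected interval is unbounded
        z_star = STD_NORMAL.inv_cdf(1 - p_star / 2)
        se_star = abs(log_or) / z_star if z_star > 0 else float("inf")
        # eq. 10: corrected confidence limits
        half_width = z_crit * se_star
        results[idx] = {
            "p": p_values[idx],
            "p_star": p_star,
            "se_star": se_star,
            "ci_star": (exp(log_or - half_width), exp(log_or + half_width)),
        }
    return results
```

Called with the three OR and lower limits reported in Table 1 (alpha = 0.05), this sketch should return intervals close to the corrected column of that table; small differences, most visibly for Factor 3, are to be expected because the published inputs are rounded.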

Example

Table 1 below presents OR from a case-control study for a hypothetical disease (D) and exposure to 3 dichotomously coded environmental risk factors. The OR and 95% CI for (D) uncorrected for multiplicity (n=3 factors) are shown in Columns 3 and 4: Factor 1 (OR=1.652, 95% CI=0.551–4.953); Factor 2 (OR=1.151, 95% CI=0.142–9.324); and Factor 3 (OR=6.509, 95% CI=1.646–25.743). Applying equations 7, 9 and 10 gives the corresponding multiplicity corrected P values (0.740, 0.895, 0.024; not shown in table), standard errors for the logarithm of the OR (1.513, 1.068, 0.830), and 95% CI (0.09–32, 0.14–9.3, 1.3–33). The multiplicity corrected CI for Factor 1 and Factor 3 are considerably wider than the corresponding uncorrected intervals, thus indicating a greater degree of variability for the estimated OR. In the case of Factor 2, the uncorrected and corrected CI are the same because Factor 2 had the highest P value of the 3 comparisons when applying the Hochberg algorithm.
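As a check on the arithmetic, the Factor 1 figures can be retraced from eqs. 3, 4, 5, 7, 9 and 10 using only the reported OR and lower limit. Factor 1 has the second smallest of the three uncorrected P values (j=2, n=3). The intermediate values below were recomputed for this illustration and are rounded, so the last digit may differ slightly from the published values.

```latex
\begin{align*}
\mathrm{SE}[\log(\mathrm{OR})] &= \frac{\log(1.652) - \log(0.551)}{1.960}
  \approx \frac{0.502 + 0.596}{1.960} \approx 0.560 \\
z &= \frac{0.502}{0.560} \approx 0.896, \qquad
p = 2\,[1 - \Phi(0.896)] \approx 0.370 \\
p^{*}_{(2)} &= (3 - 2 + 1) \times 0.370 = 0.740 \\
\mathrm{SE}^{*}[\log(\mathrm{OR})] &= \frac{0.502}{\Phi^{-1}(1 - 0.740/2)}
  = \frac{0.502}{\Phi^{-1}(0.630)} \approx \frac{0.502}{0.332} \approx 1.51 \\
\mathrm{CI}^{*} &= \exp(0.502 \pm 1.960 \times 1.51) \approx (0.09,\ 32)
\end{align*}
```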

Table 1: Odds ratios (OR) and 95% confidence intervals (CI) for a hypothetical disease (D) and exposure to 3 dichotomously coded environmental risk factors, uncorrected and corrected for multiplicity

Variable	Cases/Controls	OR^a	Uncorrected 95% CI (OR)	Corrected^b SE^* [log(OR)]	Corrected^b 95% CI^* (OR)
Factor 1
Non-Exposed	587 / 2143	1.0	Referent		Referent
Exposed	5 / 10	1.652	[0.551–4.953]	1.513	[0.09–32]
Factor 2
Non-Exposed	246 / 2143	1.0	Referent		Referent
Exposed	1 / 10	1.151	[0.142–9.324]	1.068	[0.14–9.3]^c
Factor 3
Non-Exposed	141 / 2143	1.0	Referent		Referent
Exposed	3 / 10	6.509	[1.646–25.743]	0.830	[1.3–33]

^a Adjusted for age and sex.

^b Using Hochberg step-up procedure.

^c Note: The multiplicity adjusted and unadjusted 95% CI will be equal in this case since the corresponding unadjusted P value for the Factor 2 comparison was the highest of the 3 comparisons and thus the multiplicative factor for p(j) in equation (7) will be equal to 1.

^* Multiplicity adjusted estimates.

In this example, the conclusions regarding the association (or lack thereof) of (D) and the exposure do not substantively change after correction for multiplicity, thus lending weight to what otherwise might be only a cautious interpretation referencing the possibility of a chance observation due to multiple comparisons. However, in other situations where an uncorrected CI is close to containing unity, a null hypothesis might no longer be rejected, at least in strict statistical terms, after correction for multiplicity.

Discussion

Confidence intervals for OR, RR and other relative effect estimates are commonly reported in epidemiologic and public health literature without correction for multiple hypothesis testing. The failure to account for multiplicity may lead to inflation of type I error and overinterpretation of any apparently “positive” findings. In the current paper, we show how CI for relative effect size estimates such as OR may be corrected for multiplicity by use of the Hochberg step-up procedure, a “closed-testing” method for protecting against making excessive false-positive inferences due to multiple comparisons.

Our method has several strengths. The corrected CI are simple to compute in standard statistical software packages that have function routines for determining percentiles and areas under a curve for a normal distribution. Since P values are not required for the original hypothesis tests, multiplicity corrected CI may be computed post hoc (when estimates are reported with sufficient precision) from publications that only report values for effect size estimates and corresponding CI. When the test statistics are uncorrelated, control of the family-wise type I error probability is theoretically guaranteed by the Hochberg step-up procedure. Simulation results also show that the Hochberg step-up procedure remains valid for many commonly encountered dependent test statistics [27].

Several limitations must be observed when applying our procedure for computing CI. The technique is not applicable when “exact sampling distribution” methods have been used to make statistical inferences. The Hochberg multiplicity correction also will inflate P values and related CI when one or more of the hypothesis tests involve a multi-level, logically related categorical variable (e.g., current smoker, former smoker, never smoker). In this case, it is unnecessary to correct CI for multiplicity for a logically related variable in multivariate space.
The computed multiplicity corrected CI will be an approximate solution when the decimal accuracy is limited for the original OR and CI values. Accordingly, it is generally recommended that at least 2 or 3 significant digits of accuracy be available for published estimates when using this method in a post hoc manner to compute multiplicity corrected confidence intervals. Additionally, the rule for computing p*(j) (eq. 7) in rare cases may lead to an anomaly wherein p*(j) but not p*(j−1) will achieve statistical significance. In this situation, one might apply the de facto variation of multiplying p(j) and lesser ranked P values by j to obtain the corresponding Hochberg corrected P values [28]. And finally, the method should not be used if the logarithm of the effect estimate does not follow a normal distribution, or if the underlying observations are not independent and identically distributed.

It also is important to note that correction for multiplicity may not be necessary or even desirable in some situations [29-33]. For example, correction for multiplicity may be unnecessary when an a priori biologic mechanism of action exists for an independent variable that manifests a linear dose response in relationship to the outcome variable. Similarly, multiplicity correction may not be desirable when attempting to control type II errors, as the latter will be inflated by virtue of decreasing type I errors [31]. Furthermore, multiplicity correction based upon the “universal null hypothesis,” which tests that two groups are identical for all comparisons between variables, fails to take into account which and how many variables differ if the joint hypothesis is rejected [31]. Methods to correct for multiplicity also do not account for the inclusion of hypotheses that are biologically improbable or otherwise indefensible, which unnecessarily inflate the probability of incorrectly rejecting the joint null hypotheses [18, 29].
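The anomaly noted above for eq. 7 can be seen with a small numerical sketch. The remedy shown here is the widely used monotone form of the Hochberg adjustment (a running minimum taken from the largest P value downward, as in R's `p.adjust(method = "hochberg")`); it is offered as a common alternative and is not claimed to be the variation cited in [28]. The function names and P values are hypothetical.

```python
def simple_rank_correction(p_values):
    """Rule of eq. 7: multiply the P value of rank j by (n - j + 1), capped at 1."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    corrected = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        corrected[idx] = min(1.0, (n - rank + 1) * p_values[idx])
    return corrected


def monotone_hochberg(p_values):
    """Hochberg-adjusted P values with monotonicity enforced by a running minimum."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)  # largest first
    adjusted = [0.0] * n
    running_min = 1.0
    for k, idx in enumerate(order):
        # k = 0 is the largest P value (rank n), so the multiplier n - j + 1 is k + 1
        running_min = min(running_min, (k + 1) * p_values[idx])
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    p = [0.03, 0.04, 0.045]  # hypothetical P values, all below 0.05
    print(simple_rank_correction(p))  # smaller P values end up above the largest
    print(monotone_hochberg(p))       # ordering of the P values is preserved
```

With these hypothetical inputs the simple rule leaves only the largest P value below 0.05, whereas the step-up procedure itself would reject all three hypotheses; the monotone form agrees with the step-up decisions.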
Philosophically, some researchers believe that the “primary” purpose of CI is to indicate a range of parameter values consistent with the data rather than to serve as de facto hypothesis tests based on whether or not they include 1.0. Another salient concern regarding the appropriateness of multiplicity correction techniques is “how does one choose the universe for the number of comparisons.” Clearly, multiplicity adjustment remains a debated topic with diverse opinions presented in the literature [34-35].

In the early days of the development of stepwise and closed tests for the control of type I error in multiple hypothesis testing, epidemiologists and statisticians commonly believed that joint CI could not be constructed for these procedures. However, it has since been shown that standard methods for constructing CI also readily apply to common stepwise multiplicity procedures [23-24]. Here, we have expanded on the seminal work of these researchers to develop a simple method for computing multiplicity corrected CI for standard estimates of effect size. Although our derivation has focused on the case of binary predictor variables, it is possible that similar principles might be developed and applied to obtain joint confidence sets in the more complex case of multilevel categorical variables.

Conclusions

Although the most effective strategy to minimize type I error related to multiple comparisons is to simply reduce the number of comparisons, this in effect penalizes the researcher for conducting a more informative multivariable study [32]. Statistical correction for multiple comparisons is not a substitute for the parsimonious and epidemiologically prudent selection of hypotheses to test during the design phase of a study. Nor should it be used in lieu of careful and informed interpretation of the results, taking into account biological plausibility (or lack thereof) and the results of prior studies. However, when statistical correction for multiple comparisons is appropriate, as is the case in many but not all situations, the method we present may have application as a supportive measure. A key advantage of this method is its correspondence with CI, which are typically more informative, and potentially more readily available, than P values.

13 in total

1. Multiple inferences using confidence intervals.

Authors: J Ludbrook
Journal: Clin Exp Pharmacol Physiol Date: 2000-03 Impact factor: 2.557

Review 2. Interpretation of genetic association studies in complex disease.

Authors: H Campbell; I Rudan
Journal: Pharmacogenomics J Date: 2002 Impact factor: 3.550

Review 3. Reporting research results: recommendations for improving communication.

Authors: Richelle J Cooper; Robert L Wears; David L Schriger
Journal: Ann Emerg Med Date: 2003-04 Impact factor: 5.721

4. The value of a p-valueless paper.

Authors: Jason T Connor
Journal: Am J Gastroenterol Date: 2004-09 Impact factor: 10.864

5. Why we need confidence intervals.

Authors: Douglas G Altman
Journal: World J Surg Date: 2005-05 Impact factor: 3.352

6. Other method for adjustment of multiple testing exists.

Authors: M Aickin
Journal: BMJ Date: 1999-01-09

Review 7. What's wrong with Bonferroni adjustments.

Authors: T V Perneger
Journal: BMJ Date: 1998-04-18

8. Multiple-comparison procedures: a dissenting view.

Authors: J N Perry
Journal: J Econ Entomol Date: 1986-10 Impact factor: 2.381

9. Reporting the results of epidemiologic studies.

Authors: A M Walker
Journal: Am J Public Health Date: 1986-05 Impact factor: 9.308

10. Multiple comparisons and related issues in the interpretation of epidemiologic data.

Authors: D A Savitz; A F Olshan
Journal: Am J Epidemiol Date: 1995-11-01 Impact factor: 4.897
