The Editor,

Barkan[1] reviewed some basic statistical concepts used in research and rightly called for increased statistical literacy among clinicians. The article is largely correct but contains several statements that may foster more confusion than clarity, particularly on the topic of multiple comparisons.

For example, it was claimed that if each of 20 comparisons is tested at the 0.05 level, then “there's a virtual certainty that at least one will yield a false positive result.” On the contrary, as noted in Bland and Altman,[2] which was apparently misinterpreted in Barkan,[1] the maximum probability of at least one false positive in the given scenario is approximately 0.64, which is not “virtual certainty” by any reasonable definition. Moreover, the probability is even lower if the test statistics are positively correlated and/or if any null hypotheses are false. Barkan may have confused the family-wise Type I error rate (FWER; the probability of obtaining at least one false positive in a given set of comparisons) with the per-family Type I error rate (PFER; the expected number of false positives per set, as a theoretical long-term average). Indeed, the maximum PFER in the given scenario is 1, but that by no means indicates a 100% chance of error.

It was also claimed that although the Bonferroni procedure is considered conservative, “Equally rigorous but less stringent techniques such as the false detection (sic) rate are now in use.” However, the false discovery rate (FDR) is not a “technique” in itself, and techniques for controlling the FDR are considerably less rigorous than the Bonferroni procedure (which controls the PFER). In general, one adjustment cannot simultaneously be both “equally rigorous” and “less stringent” relative to another: the more stringent the thresholds for significance, the more rigorous the control of Type I errors (false positives).
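The distinction between the FWER and the PFER in the 20-comparison scenario can be checked with a few lines of arithmetic; a minimal sketch in Python, assuming 20 independent tests each at the 0.05 level with all null hypotheses true:

```python
# Sketch: the two error rates for m = 20 independent tests at alpha = 0.05,
# assuming all null hypotheses are true (the worst case for false positives).

alpha, m = 0.05, 20

# Family-wise error rate (FWER): the probability of AT LEAST ONE false
# positive among the m comparisons.
fwer = 1 - (1 - alpha) ** m   # ≈ 0.64, not "virtual certainty"

# Per-family error rate (PFER): the EXPECTED NUMBER of false positives
# per set of m comparisons -- a long-run average count, not a probability.
pfer = m * alpha              # = 1.0

print(f"FWER = {fwer:.4f}")   # prints FWER = 0.6415
print(f"PFER = {pfer:.2f}")   # prints PFER = 1.00
```

A maximum PFER of 1 therefore means one false positive is expected on average per set of 20 comparisons, while the chance of at least one false positive tops out near 64%.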
PFER control is more rigorous/stringent than FWER control, and FWER control is more rigorous/stringent than FDR control. Thus, controlling the PFER always controls the FWER, and controlling the FWER always controls the FDR, but not vice versa.

It was further claimed that all adjustments for multiple comparisons “adjust individual comparison thresholds, so the final statistical significance for all comparisons combined is P < 0.05.” However, although the Bonferroni procedure controls the sum of the probabilities of Type I error “for all comparisons combined,” FDR control does not provide that protection.[3] Note that different methods of adjustment address different error rates, and which error rate is relevant depends on the strength of inference that is required.[4] For instance, procedures that control only the FDR and not the FWER are generally not adequate for confirmatory clinical trials.[5]

It is also notable that although the advantages of using confidence intervals rather than P values alone were correctly noted in Barkan,[1] simultaneous confidence intervals for multiple comparisons were not mentioned. Contrary to a common misconception, the issue of multiple comparisons is just as relevant to confidence intervals as it is to P values.[4]

Owing to space constraints, additional errors and omissions in Barkan[1] are not addressed in this letter. Suffice it to say, improving statistical literacy (e.g., on the topic of multiple comparisons) is certainly a worthy goal. Instructive publications that are clear, simple, and correct are essential to that goal.