Patrick Schober, Sebastiaan M Bossers, Lothar A Schwarte.
Abstract
Effect size measures are used to quantify treatment effects or associations between variables. Such measures, of which >70 have been described in the literature, include unstandardized and standardized differences in means, risk differences, risk ratios, odds ratios, or correlations. While null hypothesis significance testing is the predominant approach to statistical inference on effect sizes, results of such tests are often misinterpreted, provide no information on the magnitude of the estimate, and tell us nothing about the clinical importance of an effect. Hence, researchers should not merely focus on statistical significance but should also report the observed effect size. However, all samples are to some degree affected by randomness, such that there is some uncertainty about how well the observed effect size represents the actual magnitude and direction of the effect in the population. Therefore, point estimates of effect sizes should be accompanied by the entire range of plausible values to quantify this uncertainty. This facilitates assessment of how large or small the observed effect could actually be in the population of interest, and hence how clinically important it could be. This tutorial reviews different effect size measures and describes how confidence intervals can be used to address not only the statistical significance but also the clinical significance of the observed effect or association. Moreover, we discuss what P values actually represent, and how they provide supplemental information about the significant versus nonsignificant dichotomy. This tutorial intentionally focuses on an intuitive explanation of concepts and interpretation of results, rather than on the underlying mathematical theory or concepts.
Year: 2018 PMID: 29337724 PMCID: PMC5811238 DOI: 10.1213/ANE.0000000000002798
Source DB: PubMed Journal: Anesth Analg ISSN: 0003-2999 Impact factor: 5.108
Figure 1. Ninety-five percent confidence intervals (vertical lines) and means (dots) calculated from a simulation of 25 samples (sample size of 30 each) drawn from a normally distributed population with a mean of 30 and a standard deviation of 10. One could think of this as a population of patients with a mean age of 30 y and a standard deviation of 10 y, from which we sample n = 30 patients to estimate the mean age in the population, and we repeat this experiment 25 times. Note that 23 of the confidence intervals (23/25 = 92%) cover the “true” population mean of 30. If we (infinitely) keep repeating this simulation, we would expect that 95% of the confidence intervals contain the true population parameter value.
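The simulation described in the Figure 1 caption can be approximated with a short script. This is a minimal sketch and not the authors' code: it assumes t-based intervals for the mean, hardcodes the two-sided 95% critical value for 29 degrees of freedom (about 2.045), and uses the illustrative names `mean_ci` and `coverage`, none of which appear in the original. With only 25 samples the observed coverage varies from run to run; with many repetitions it should settle close to 95%.

```python
import random
import statistics

POP_MEAN = 30.0  # population mean, from the Figure 1 caption
POP_SD = 10.0    # population standard deviation, from the caption
N = 30           # sample size per simulated study, from the caption
T_CRIT = 2.045   # assumption: two-sided 95% critical value of Student's t, 29 df


def mean_ci(sample):
    """Return a t-based 95% confidence interval (lower, upper) for the mean.

    Valid as written only for samples of size 30, because T_CRIT is fixed.
    """
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5  # standard error of the mean
    return m - T_CRIT * se, m + T_CRIT * se


def coverage(n_samples, seed=0):
    """Fraction of simulated intervals that contain the true population mean."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
        lower, upper = mean_ci(sample)
        if lower <= POP_MEAN <= upper:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    print("coverage with 25 samples:", coverage(25))
    print("coverage with 20000 samples:", coverage(20000))
```

Changing `seed` gives a different set of 25 intervals, which illustrates why a single run can show 92% coverage, as in the figure, without contradicting the nominal 95% level.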
Figure 2. Five examples (A–E) of 95% CIs (vertical lines) of the difference between mean SBP from 5 hypothetical studies in which SBP is compared between 2 groups. Note that in examples A–C, the 95% CI does not include 0 (horizontal solid line), indicating a significant difference at a 0.05 significance level, whereas D–E correspond to a nonsignificant result. For the sake of the example, we consider a difference of >5 mm Hg between the groups as clinically relevant (horizontal dashed lines at +5 and −5 mm Hg). Example A: the entire 95% CI covers a clinically relevant range, suggesting that the observed difference between the groups is not only statistically significant but also clinically important. Example B: although the result is statistically significant, the clinical relevance remains unclear. The result is compatible with a clinically nonimportant difference of only 2 mm Hg, but it is also compatible with a true difference as high as 15 mm Hg. Example C: the entire 95% CI lies within a range that is clinically unimportant. Although there is a statistically significant effect, the clinical relevance of the finding appears to be limited. Example D: although the result is nonsignificant (95% CI spans 0), the data are also compatible with a clinically relevant difference in either direction. Hence, the result is inconclusive and should not be interpreted as demonstrating no effect. Example E: the narrow 95% CI around 0 does not include any clinically important values. In this specific example, there appears to be no clinically relevant effect. CI indicates confidence interval; SBP, systolic blood pressure.
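The reasoning behind examples A–E can be written as a small decision rule. The sketch below is an illustration under stated assumptions, not part of the original tutorial: the function name `interpret_ci` and the returned labels are hypothetical, the 5 mm Hg threshold is the one chosen in the caption, and all interval limits in the usage section are invented for illustration, except for example B, whose limits of 2 and 15 mm Hg are taken from the caption.

```python
def interpret_ci(lower, upper, threshold=5.0):
    """Classify a 95% CI for a difference in means (hypothetical helper).

    lower, upper: limits of the confidence interval (e.g., in mm Hg).
    threshold: smallest absolute difference regarded as clinically relevant;
               differences larger than this count as relevant.
    """
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")

    # Significant at the 0.05 level if the interval excludes 0.
    significant = lower > 0 or upper < 0
    # Every plausible value exceeds the threshold in one direction.
    all_relevant = lower > threshold or upper < -threshold
    # Every plausible value lies within the clinically unimportant band.
    none_relevant = lower >= -threshold and upper <= threshold

    if significant and all_relevant:
        return "significant and clinically important"        # example A
    if significant and none_relevant:
        return "significant but clinically unimportant"      # example C
    if significant:
        return "significant, clinical relevance unclear"     # example B
    if none_relevant:
        return "no clinically relevant effect"               # example E
    return "inconclusive"                                     # example D


if __name__ == "__main__":
    # Illustrative limits only; just example B uses values from the caption.
    examples = {
        "A": (8.0, 18.0),
        "B": (2.0, 15.0),
        "C": (1.0, 4.0),
        "D": (-8.0, 12.0),
        "E": (-2.0, 2.0),
    }
    for name, (lo, hi) in examples.items():
        print(name, interpret_ci(lo, hi))
```

The rule deliberately inspects both limits of the interval instead of the point estimate, which mirrors the caption's point that statistical significance and clinical relevance are separate questions.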