Aidan G O'Keeffe, Gareth Ambler, Julie A Barber.
Abstract
BACKGROUND: In healthcare research, outcomes with skewed probability distributions are common. Sample size calculations for such outcomes are typically based on estimates on a transformed scale (e.g. log), which may sometimes be difficult to obtain. In contrast, estimates of the median and variance on the untransformed scale are generally easier to pre-specify. The aim of this paper is to describe how to calculate a sample size for a two-group comparison of interest based on median and untransformed variance estimates for log-normal outcome data.
Keywords: Hypothesis test; Log-transformation; Median; Sample size; Skewness
Year: 2017 PMID: 29197347 PMCID: PMC5712177 DOI: 10.1186/s12874-017-0426-1
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Simulation study results
| m₁ | m₂ | ϕ₁ | ϕ₂ | n | Estimated power: log t-test | Estimated power: M-W test | Estimated power: t-test |
|---|---|---|---|---|---|---|---|
| Significance level = 0.05, Power = 0.8 | | | | | | | |
| 1 | 1.5 | 0.5 | 0.5 | 14 | 0.781 | 0.755 | 0.690 |
| 1 | 1.25 | 0.5 | 0.5 | 51 | 0.797 | 0.776 | 0.664 |
| 1 | 1.1 | 0.5 | 0.5 | 303 | 0.801 | 0.781 | 0.639 |
| 1 | 0.5 | 0.4 | 0.4 | 9 | 0.788 | 0.724 | 0.687 |
| 1 | 0.7 | 0.4 | 0.4 | 23 | 0.794 | 0.772 | 0.662 |
| 1 | 0.9 | 0.4 | 0.4 | 204 | 0.800 | 0.781 | 0.669 |
| 1 | 0.6 | 0.3 | 0.3 | 9 | 0.791 | 0.728 | 0.728 |
| 1 | 0.7 | 0.3 | 0.3 | 15 | 0.800 | 0.762 | 0.723 |
| 1 | 0.8 | 0.3 | 0.3 | 32 | 0.797 | 0.773 | 0.713 |
| 1 | 0.75 | 0.25 | 0.25 | 15 | 0.784 | 0.747 | 0.729 |
| 1 | 0.88 | 0.25 | 0.25 | 63 | 0.797 | 0.776 | 0.737 |
| 1 | 0.94 | 0.25 | 0.25 | 250 | 0.800 | 0.781 | 0.741 |
| Significance level = 0.05, Power = 0.9 | | | | | | | |
| 1 | 1.5 | 0.5 | 0.7 | 23 | 0.888 | 0.873 | 0.847 |
| 1 | 1.25 | 0.5 | 0.7 | 87 | 0.898 | 0.883 | 0.910 |
| 1 | 1.1 | 0.5 | 0.7 | 530 | 0.900 | 0.886 | 0.992 |
| 1 | 0.5 | 0.6 | 0.4 | 14 | 0.890 | 0.872 | 0.800 |
| 1 | 0.7 | 0.6 | 0.4 | 40 | 0.897 | 0.882 | 0.875 |
| 1 | 0.9 | 0.6 | 0.4 | 383 | 0.900 | 0.886 | 0.993 |
| 1 | 0.6 | 0.5 | 0.3 | 16 | 0.896 | 0.874 | 0.867 |
| 1 | 0.7 | 0.5 | 0.3 | 28 | 0.894 | 0.877 | 0.897 |
| 1 | 0.8 | 0.5 | 0.3 | 65 | 0.896 | 0.880 | 0.946 |
| 1 | 0.75 | 0.4 | 0.25 | 29 | 0.892 | 0.875 | 0.906 |
| 1 | 0.88 | 0.4 | 0.25 | 131 | 0.898 | 0.882 | 0.970 |
| 1 | 0.94 | 0.4 | 0.25 | 537 | 0.900 | 0.883 | 0.998 |
Here, m₁ and m₂ are the pre-specified untransformed median values for groups 1 and 2, and ϕ₁ and ϕ₂ are the corresponding untransformed standard deviations. The column ‘n’ denotes the analytically derived sample size calculated using Eq. (4). Estimated powers are shown for a two-sample t-test of log-transformed outcomes (‘log t-test’), a Mann-Whitney U test of untransformed outcomes (‘M-W test’) and a two-sample t-test of untransformed outcomes (‘t-test’).
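The values in the ‘n’ column can be approximated from the four inputs alone. Eq. (4) is not reproduced in this record, so the following is only a minimal sketch. It assumes the usual two-sample normal-approximation form on the log scale, n = (z_{1-α/2} + z_{1-β})² (σ₁² + σ₂²) / (log m₁ − log m₂)², with σ_j² = log[(1 + √(1 + 4ϕ_j²/m_j²)) / 2] obtained from the log-normal median and variance, and it interprets n as the number per group. The function names and arguments are hypothetical.

```python
import math
from statistics import NormalDist


def log_variance_from_median_sd(median, sd):
    """Log-scale variance of a log-normal with this untransformed median and SD.

    Solves x * (x - 1) * median**2 = sd**2 for x = exp(sigma^2).
    """
    ratio_sq = (sd / median) ** 2
    return math.log((1.0 + math.sqrt(1.0 + 4.0 * ratio_sq)) / 2.0)


def n_per_group_lognormal(m1, m2, sd1, sd2, alpha=0.05, power=0.8):
    """Sample size per group for a two-sided test on the log scale.

    Assumed form of Eq. (4); not taken verbatim from the paper.
    """
    z_total = NormalDist().inv_cdf(1.0 - alpha / 2.0) + NormalDist().inv_cdf(power)
    var_sum = (log_variance_from_median_sd(m1, sd1)
               + log_variance_from_median_sd(m2, sd2))
    effect = math.log(m1) - math.log(m2)  # difference in log-medians
    return math.ceil(z_total ** 2 * var_sum / effect ** 2)


print(n_per_group_lognormal(1, 1.5, 0.5, 0.5, power=0.8))
print(n_per_group_lognormal(1, 1.5, 0.5, 0.7, power=0.9))
```

Under these assumptions the sketch gives 14 for the first row of the Power = 0.8 panel and 23 for the first row of the Power = 0.9 panel, in agreement with the tabulated values.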
Fig. 1 Plots showing the probability density function of an Exp(2) random variable (left-hand plot) and the natural logarithm of an Exp(2) random variable (right-hand plot)
Results from the simulation study where outcome data have Exponential distributions
| Scenario | m₁ | m₂ | ϕ₁ | ϕ₂ | n | Estimated power: log t-test | Estimated power: M-W test | Estimated power: t-test |
|---|---|---|---|---|---|---|---|---|
| Significance level = 0.05, Power = 0.9 | | | | | | | | |
| 1 | 0.1 | 0.3 | | | 13 | 0.576 | 0.600 | 0.661 |
| 2 | 1 | 1.5 | | | 91 | 0.567 | 0.650 | 0.769 |
| 3 | 10 | 7 | | | 117 | 0.564 | 0.649 | 0.768 |
| 4 | 20 | 15 | | | 180 | 0.565 | 0.654 | 0.772 |
| 5 | 60 | 48 | | | 299 | 0.564 | 0.653 | 0.775 |
| 6 | 80 | 70 | | | 833 | 0.565 | 0.654 | 0.778 |
Here, m_j and ϕ_j denote the median and standard deviation of the outcome data for group j (j = 1, 2). Estimated powers are shown for a two-sample t-test of log-transformed outcomes (‘log t-test’), a Mann-Whitney U test of untransformed outcomes (‘M-W test’) and a two-sample t-test of untransformed outcomes (‘t-test’).
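One way to see why the log t-test falls well short of the 0.9 target in this table is to compare two quantities: the log-scale variance that a log-normal model would infer from a median and standard deviation, and the actual variance of a log-transformed Exponential variable. The sketch below assumes that n in this table came from the log-normal formula (Eq. 4) and that each standard deviation equals the median divided by log 2, as holds for any Exponential distribution. It also uses the standard result Var(log X) = π²/6 for Exponential X. The helper names are hypothetical.

```python
import math


def exponential_sd_from_median(median):
    """SD of an Exponential distribution with this median (SD = mean = median / ln 2)."""
    return median / math.log(2.0)


def implied_log_variance(median, sd):
    """Log-scale variance a log-normal would need to match this median and SD."""
    ratio_sq = (sd / median) ** 2
    return math.log((1.0 + math.sqrt(1.0 + 4.0 * ratio_sq)) / 2.0)


# Variance of log(X) when X is Exponential; it does not depend on the rate.
TRUE_LOG_EXP_VARIANCE = math.pi ** 2 / 6.0

for median in (0.1, 1.0, 10.0, 80.0):
    sd = exponential_sd_from_median(median)
    assumed = implied_log_variance(median, sd)
    # Print the median, the log-normal-implied variance, and the shortfall ratio.
    print(median, round(assumed, 3), round(TRUE_LOG_EXP_VARIANCE / assumed, 2))
```

The implied variance is about 0.71 whatever the median, against a true value of about 1.64, so a log-normal-based calculation would understate the required sample size by a factor of roughly 2.3. That is in line with the sample sizes in the Eq. 5 table being a little over twice those shown here (for example, 29 versus 13 in scenario 1).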
Fig. 2 Plots showing the probability density functions of the Exponential distribution (left-hand column) and log-Exponential distribution (right-hand column) for each simulation scenario of Table 2. Black lines indicate densities for group 1 and red lines those for group 2
Results from the simulation study where outcome data have Exponential distributions, with analytic sample sizes (n) calculated using the formula given in Eq. 5
| Scenario | m₁ | m₂ | ϕ₁ | ϕ₂ | n | Estimated power: log t-test | Estimated power: M-W test | Estimated power: t-test |
|---|---|---|---|---|---|---|---|---|
| Significance level = 0.05, Power = 0.9 | | | | | | | | |
| 1 | 0.1 | 0.3 | | | 29 | 0.890 | 0.933 | 0.975 |
| 2 | 1 | 1.5 | | | 211 | 0.900 | 0.948 | 0.985 |
| 3 | 10 | 7 | | | 272 | 0.898 | 0.947 | 0.985 |
| 4 | 20 | 15 | | | 418 | 0.900 | 0.950 | 0.986 |
| 5 | 60 | 48 | | | 695 | 0.899 | 0.949 | 0.985 |
| 6 | 80 | 70 | | | 1939 | 0.898 | 0.949 | 0.985 |
Here, m_j and ϕ_j denote the median and standard deviation of the outcome data for group j (j = 1, 2). Estimated powers are shown for a two-sample t-test of log-transformed outcomes (‘log t-test’), a Mann-Whitney U test of untransformed outcomes (‘M-W test’) and a two-sample t-test of untransformed outcomes (‘t-test’).
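Eq. 5 is not shown in this record. A minimal sketch follows, assuming that Eq. 5 replaces the log-normal variance terms with the exact variance of a log-transformed Exponential variable (π²/6 in each group, irrespective of the median) and that n is the number per group. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_log_exponential(m1, m2, alpha=0.05, power=0.9):
    """Sample size per group for a log-scale t-test when outcomes are Exponential.

    Assumed form of Eq. 5: the variance of log(X) is pi^2 / 6 in both groups.
    """
    z_total = NormalDist().inv_cdf(1.0 - alpha / 2.0) + NormalDist().inv_cdf(power)
    var_sum = 2.0 * math.pi ** 2 / 6.0    # sum of the two log-scale variances
    effect = math.log(m1) - math.log(m2)  # difference in log-medians
    return math.ceil(z_total ** 2 * var_sum / effect ** 2)


# Medians for scenarios 1 to 6 as listed in the table.
scenarios = [(0.1, 0.3), (1, 1.5), (10, 7), (20, 15), (60, 48), (80, 70)]
for m1, m2 in scenarios:
    print(m1, m2, n_per_group_log_exponential(m1, m2))
```

With a significance level of 0.05 and power 0.9, this assumed form returns 29 for medians 0.1 and 0.3 and 1939 for medians 80 and 70, matching scenarios 1 and 6.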