Conditional and Unconditional Tests (and Sample Size) Based on Multiple Comparisons for Stratified 2 × 2 Tables.

A Martín Andrés¹, I Herranz Tejedor², M Álvarez Hernández³.

Abstract

The Mantel-Haenszel test is the most frequently used asymptotic test for analyzing stratified 2 × 2 tables. Its exact alternative is the test of Birch, which has recently been reconsidered by Jung. Both tests have a conditional origin: Pearson's chi-squared test and Fisher's exact test, respectively. Both also share the drawback that the result of the global test (the stratified test) may not be compatible with the results of the individual tests (the test for each stratum). In this paper, we propose to carry out the global test using a multiple comparisons method (MC method), which does not have this disadvantage. By refining the method (MCB method), an alternative to the Mantel-Haenszel and Birch tests may be obtained. The new MC and MCB methods have the advantage that they may be applied from an unconditional view, a methodology which until now has not been applied to this problem. We also propose some sample size calculation methods.

Year: 2015; PMID: 26075012; PMCID: PMC4446475; DOI: 10.1155/2015/147038

Source DB: PubMed; Journal: Comput Math Methods Med; ISSN: 1748-670X; Impact factor: 2.238

1. Introduction

In statistics it is very common to have to verify whether an association exists between two dichotomous qualities. This is especially frequent in medicine, for example, where the aim is to assess whether the presence or absence of a risk factor conditions the presence or absence of a disease, or to compare two treatments whose responses are success or failure, and so forth. In all these cases the problem produces data whose frequencies are presented in a 2 × 2 table: the two levels of one of the qualities are set out in the rows, the two levels of the other quality in the columns, and the observed frequencies are set out inside the table. The exact and the asymptotic analyses of a 2 × 2 table have their roots in the origins of statistics, and hundreds of papers have been devoted to the problem [1]. It is traditional to carry out the exact independence test using the Fisher exact test, which is a conditional test (because it assumes that the marginals of the rows and columns are previously fixed). More than thirty years have passed since the situation changed, and it is well known that the unconditional exact test tends to be less conservative and more powerful than the conditional test [2-4], because the loss of information as a result of conditioning may be as high as 26% [5]. The unconditional tests treat as fixed only those values that really were fixed beforehand: the marginal of the rows, the marginal of the columns, or the total of the data in the table. This gives rise to two types of unconditional test: that of the double binomial model (the first two cases) and that of the multinomial model (the third case). The same can be said of the asymptotic tests, generally based on Pearson's chi-squared statistic with different corrections for continuity (cc). However, the unconditional exact tests have the great disadvantage of being very laborious to compute. An overall view of the problem can be seen in Martín Andrés [1, 6]. 
Frequently the individuals who take part in the study are stratified in groups based on a covariate such as sex or age, which gives rise to several 2 × 2 tables. In this case the aim is to test the independence of the two original dichotomous qualities, bearing in mind the heterogeneity of the populations defined by the strata. To this end, the most frequent approach is to test the Mantel-Haenszel null hypothesis, under which the odds ratio (or the risk ratio) is equal to unity in all the strata. For this purpose the most frequent asymptotic tests are those of Cochran [7] and Mantel and Haenszel [8], which are very similar; the exact version of the test is due to Birch [9] (and has recently been reconsidered by Jung [10]). In all these cases the proposed tests are conditional and, when there is only one stratum, they reduce to the test for a single 2 × 2 table (Fisher's exact test or Pearson's chi-squared test). Moreover, Jung [10] and Jung et al. [11] propose a sample size calculation method, exact in the first and asymptotic in the second. The procedures indicated have the drawback of almost all tests for a global null hypothesis like the one in question: the result of the global (stratified) test may not be compatible with that of the individual tests (the test for each stratum). In this paper, we propose a global test (MC test) which does not have this disadvantage because it is based on a multiple comparisons method: the global test is significant if and only if at least one of the individual tests is significant. In return, the MC test will have the drawback of being less powerful, given that it must control both the alpha error of the global test and the alpha errors of the individual tests. 
Because of this, another procedure is proposed (MCB test) which controls only the alpha error of the global test (just as in the classic stratified tests), although the alpha error in the individual tests will exceed the nominal value only on a few occasions (and generally by very little). The two procedures are applicable from both the conditional and the unconditional point of view, and with either an asymptotic test or an exact test. The advantage of applying them as unconditional tests is that the loss of power mentioned above is reduced with regard to the classic global tests. In addition, this paper shows that the asymptotic tests perform well, even for small samples, if they are carried out with the appropriate continuity correction. Finally, the sample size is determined for almost all the cases studied (exact or asymptotic tests, conditional or unconditional tests).

2. Hypothesis Test

2.1. Notation, Models, and Example

In the following (without loss of generality) it will be assumed that each 2 × 2 table refers to the successes or failures of two treatments which are applied to $m_j$ and $n_j$ individuals, respectively. Let $J$ be the number of strata, $N_j = m_j + n_j$ ($j = 1, \ldots, J$) the total number of individuals in stratum $j$, $N = \sum_j N_j$ the total sample size, $x_j$ and $y_j$ the numbers of successes and $\bar{x}_j = m_j - x_j$ and $\bar{y}_j = n_j - y_j$ the numbers of failures with treatments 1 and 2, respectively, and $z_j = x_j + y_j$ and $\bar{z}_j = \bar{x}_j + \bar{y}_j$ the total numbers of successes and failures in stratum $j$, respectively. These data may be summarized as shown in Table 1. Once the experiment has been performed, the values obtained will be written with an extra subindex “0,” that is, $x_{j0}$, $y_{j0}$, $m_{j0}$, $N_{j0}$, ….

Table 1

Frequency data of 2 × 2 table for stratum j.

Treatment	Response: Yes	Response: No	Total
1	$x_j$	$\bar{x}_j$	$m_j$
2	$y_j$	$\bar{y}_j$	$n_j$
Total	$z_j$	$\bar{z}_j$	$N_j$

Let $p_j$ and $q_j$ ($\bar{p}_j = 1 - p_j$ and $\bar{q}_j = 1 - q_j$) be the probabilities of success (failure) with treatments 1 and 2 in stratum $j$, respectively. The odds ratio for each stratum is $\theta_j = p_j \bar{q}_j / (\bar{p}_j q_j)$, and the aim is to test the null hypothesis $H$: $\theta_1 = \cdots = \theta_J = 1$ against a one-tailed alternative ($K$: $\theta_j > 1$ for some $j$) or a two-tailed alternative ($K$: $\theta_j \ne 1$ for some $j$). This paper addresses only the one-sided test; for the two-tailed test the procedure is similar. In the previous description it was assumed that the data $(x_j, y_j)$ of each stratum $j$ proceed from a double binomial distribution of sizes $m_j$ and $n_j$ and probabilities $p_j$ and $q_j$ in groups 1 and 2, respectively. Because in each stratum $j$ there are two previously fixed values ($m_j$ and $n_j$), this model will be referred to as Model 2; it is very frequently used in practice, so it will serve here as a basis for defining and illustrating the procedures MC and MCB. If in each stratum there is conditioning on the observed value $z_j = x_j + y_j$, then one has Model 3; now the three values $m_j$, $z_j$, and $N_j$ are previously fixed in each stratum $j$ and the only variable, $x_j$, arises from a hypergeometric distribution. If only the values of $N_j$ are fixed in each stratum $j$, one gets Model 1, in which the four cell counts $(x_j, \bar{x}_j, y_j, \bar{y}_j)$ proceed from a multinomial distribution. Finally, if only the global sample size $N$ is fixed (so that now even the values of $N_j$ are obtained at random), one has Model 0. By conditioning on the appropriate marginal, Model X leads to Model (X + 1). Therefore, whatever the initial model (i.e., whatever the sampling method for the data obtained), by conditioning on all the nonfixed marginals one always obtains Model 3 (which is the one covered by Birch and by Mantel and Haenszel). Each model produces a different sample space, formed by the set of all possible values of the variables involved in it. For example, the sample space of stratum $j$ under Model 2 consists of $(m_j + 1) \times (n_j + 1)$ possible values of $(x_j, y_j)$. 
Each transition from Model X to Model (X + 1) constitutes a loss of information, because the number of points of the new sample space is very much smaller than that of the previous one. Probably the most dramatic transition is that from Model 2 to Model 3, in which the loss of information may reach 26% for $J = 1$ [5]. In addition, each transition implies using a conditional rather than an unconditional method of eliminating nuisance parameters, something which is generally not advisable [13]. The data in Table 2, which are given by Li et al. [12], are taken from the preliminary analysis of an experiment with three groups to evaluate whether thymosin (treatment 1), compared to a placebo (treatment 2), has any effect on the treatment of bronchogenic carcinoma patients receiving radiotherapy. The one-sided p values are $P_{Birch} = 0.1563$ by the global conditional stratified exact test and $P_1 = 0.80073$, $P_2 = 0.57143$, and $P_3 = 0.14706$ by Fisher's individual conditional exact test in each stratum. If the global test is carried out at an error $\alpha$ no smaller than $P_{Birch}$ (say $\alpha = 0.1563$), we conclude $K$, so that $\theta_j > 1$ in at least one stratum. However, no individual test is significant if the individual tests are carried out at an alpha error that respects the former global error; for example, using Bonferroni's method, the smallest of the three p values is $P_3 = 0.14706 > 0.1563/3$. The same thing occurs if asymptotic tests are used. Our aim is to define procedures in which these incompatibilities will not occur.

Table 2

Response to thymosin in cancer patients (yes = success, no = failure).

	Stratum 1			Stratum 2			Stratum 3		
	Yes	No	Total	Yes	No	Total	Yes	No	Total
Thymosin	10	1	11	9	0	9	8	0	8
Placebo	12	1	13	11	1	12	7	3	10
Total	22	2	24	20	1	21	15	3	18
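The incompatibility described above can be checked directly from Table 2. The following is a minimal sketch, assuming the one-sided Fisher p value is the upper hypergeometric tail; the names `STRATA`, `fisher_one_sided`, and `bonferroni_level` are illustrative and not taken from the paper, and the value 0.1563 is simply the global p value quoted in the text.

```python
from math import comb

# Table 2, one tuple per stratum: (x, m, y, n) = thymosin successes and size,
# then placebo successes and size.
STRATA = [(10, 11, 12, 13), (9, 9, 11, 12), (8, 8, 7, 10)]

def fisher_one_sided(x, m, y, n):
    """Upper-tail probability Pr(X >= x) with m, n and z = x + y fixed (Model 3)."""
    z, total = x + y, m + n
    upper = min(m, z)
    tail = sum(comb(m, i) * comb(n, z - i) for i in range(x, upper + 1))
    return tail / comb(total, z)

p_values = [fisher_one_sided(*s) for s in STRATA]

# Global exact p value quoted in the text (not recomputed here).
p_birch = 0.1563
bonferroni_level = p_birch / len(STRATA)

# True when no stratum is significant at the Bonferroni level,
# i.e. when the global and individual conclusions disagree.
incompatible = min(p_values) > bonferroni_level
print([round(p, 5) for p in p_values], incompatible)
```

The sketch only illustrates the comparison made in the text; it does not implement any of the procedures proposed later.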

2.2. Conditional Tests Obtained by Using Classic Methods (Model 3)

The p value of the exact test is $P_{Birch} = 0.1563$. Table 3 shows this value and the remaining p values in this paper. This result is based on determining the probability of all the configurations $(x_j \mid N_j, m_j, z_j)$, $j = 1, 2, \ldots, J$, such that $S = \sum x_j \ge S_0 = \sum x_{j0} = 27$. Here $S$ is a test statistic determining the order in which the points of the sample space $(x_1, x_2, x_3)$ enter the region $R$, a region whose probability under $H$ yields the value of $P_{Birch}$. Note that as the sample spaces in each stratum are $9 \le x_1 \le 11$, $8 \le x_2 \le 9$, and $5 \le x_3 \le 8$, the possible values of $(x_1, x_2, x_3)$ number 3 × 2 × 4 = 24, which is the total number of points in the global sample space; of these, four belong to $R$ (three with $S = 27$ and one with $S = 28$), that is, a proportion 4/24 = 0.1667 of the points. Moreover note that, under the original Model 2, the numbers of points in the sample spaces of strata 1, 2, and 3 are $(m_j + 1) \times (n_j + 1)$ = (11 + 1) × (13 + 1), (9 + 1) × (12 + 1), and (8 + 1) × (10 + 1), respectively. The total number of points in the global sample space will be 168 × 130 × 99: more than two million, compared to only 24 in Model 3. Various programs have been developed to determine the value of $P_{Birch}$ (see references in [14]); an easy way to get it is through http://www.openepi.com/Menu/OE_Menu.htm (option “Two by Two Table”).
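The enumeration just described is small enough to reproduce by brute force. The sketch below assumes independent hypergeometric laws in each stratum under $H$ and orders points by the sum of treated successes; `hypergeom_pmf`, `region`, and the other names are illustrative only.

```python
from itertools import product
from math import comb, prod

# Table 2, one tuple per stratum: (x, m, y, n).
STRATA = [(10, 11, 12, 13), (9, 9, 11, 12), (8, 8, 7, 10)]

def hypergeom_pmf(m, n, z):
    """Map each feasible x to Pr(x | m, n, z) under H (Model 3)."""
    total = m + n
    lo, hi = max(0, z - n), min(m, z)
    return {x: comb(m, x) * comb(n, z - x) / comb(total, z) for x in range(lo, hi + 1)}

pmfs = [hypergeom_pmf(m, n, x + y) for x, m, y, n in STRATA]
s_obs = sum(x for x, _, _, _ in STRATA)          # observed S_0 = 27

points = list(product(*pmfs))                    # global sample space of (x1, x2, x3)
region = [pt for pt in points if sum(pt) >= s_obs]

# Probability of the region under H: product of the stratum probabilities.
p_birch = sum(prod(pmf[v] for pmf, v in zip(pmfs, pt)) for pt in region)
print(len(points), len(region), round(p_birch, 4))
```

Note that the proportion of points in the region (4 out of 24) is not the p value; the p value weights each point by its hypergeometric probability.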

Table 3

p values obtained by various methods for the data in the example of Li et al. [12]. Each asymptotic method is placed directly below the exact method from which it proceeds.

Model	Test	Procedure	Statistic used	p value
3	Exact	Birch	Sum of successes (treated group)	0.1563
3	Asymptotic	MH	$\chi_{MH}$ of Mantel-Haenszel (without cc)	0.0760
3	Asymptotic	MH	$\chi_{MHc}$ of Mantel-Haenszel (with cc)	0.1573
3	Exact	MC	p value Fisher	0.3795
3	Asymptotic	MC	$\chi_3$ of Yates	0.3887
3	Exact	MCB	p value Fisher	0.1471
3	Asymptotic	MCB	$\chi_3$ of Yates	0.1513
2	Exact	MC	p value Barnard	0.1602
2	Asymptotic	MC	$\chi_2$ of Martín et al.	0.1614
2	Exact	MCB	p value Barnard	0.1533
2	Asymptotic	MCB	$\chi_2$ of Martín et al.	0.1588
1	Exact	MC	p value Barnard	0.1282
1	Asymptotic	MC	$\chi_1$ of Pirie and Hamdan	0.1512

Note: MH = Mantel-Haenszel test; MC = multiple comparisons method; MCB = method based on the multiple comparisons.

The asymptotic test of Mantel-Haenszel is based on the fact that $\sum x_j$ is asymptotically normal with mean $\sum E_j = \sum m_j z_j / N_j$ and variance $\sum V_j = \sum m_j n_j z_j \bar{z}_j / \{N_j^2 (N_j - 1)\}$. Therefore the test statistic is $\chi_{MH} = (\sum x_j - \sum E_j)/(\sum V_j)^{0.5}$, whose p value $P_{MH} = 0.0760$ patently does not agree with the exact value $P_{Birch} = 0.1563$ (the value also given by Jung's method). However, because the variable $S$ is discrete, it is convenient to carry out a continuity correction [15]. As $S$ jumps one unit at a time, the cc should be 0.5 and so the statistic with cc will be $\chi_{MHc} = (\sum x_j - \sum E_j - 0.5)/(\sum V_j)^{0.5}$ [8]. The new p value $P_{MHc} = 0.1573$ is now compatible with the exact value.
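As a numerical check on the two statistics above, here is a minimal sketch applied to Table 2. It assumes the usual hypergeometric variance $V_j = m_j n_j z_j \bar{z}_j / \{N_j^2 (N_j - 1)\}$ and a one-sided normal p value; the function name `mantel_haenszel` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Table 2, one tuple per stratum: (x, m, y, n).
STRATA = [(10, 11, 12, 13), (9, 9, 11, 12), (8, 8, 7, 10)]

def mantel_haenszel(strata, cc=0.0):
    """Return (statistic, one-sided p value) with continuity correction cc."""
    sum_x = sum_e = sum_v = 0.0
    for x, m, y, n in strata:
        z, total = x + y, m + n
        sum_x += x
        sum_e += m * z / total                                   # E_j
        sum_v += m * n * z * (total - z) / (total**2 * (total - 1))  # V_j
    chi = (sum_x - sum_e - cc) / sqrt(sum_v)
    return chi, 1.0 - NormalDist().cdf(chi)

chi_mh, p_mh = mantel_haenszel(STRATA)             # without cc
chi_mhc, p_mhc = mantel_haenszel(STRATA, cc=0.5)   # with cc
print(round(p_mh, 4), round(p_mhc, 4))
```

The two p values should land close to the 0.0760 and 0.1573 reported in Table 3.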

2.3. MC and MCB Tests Based on the Criterion of the Multiple Comparisons: General Observations

Let us suppose that in each stratum the hypotheses $H_j$: $\theta_j = 1$ versus $K_j$: $\theta_j > 1$ are tested at error $\alpha_j$. Thereby $H = \cap H_j$ and $K = \cup K_j$. If the global null hypothesis $H$ is rejected when there exists at least one $j$ in which the individual test rejects $H_j$, then the alpha error of the global test ($H$ versus $K$) will be [16]

$$\alpha = 1 - \prod_{j=1}^{J} (1 - \alpha_j). \qquad (1)$$

In particular, if $\alpha_j = \alpha'$ ($\forall j$) method MC is obtained (the “method of the multiple comparisons”), and its global alpha error will be

$$\alpha = 1 - (1 - \alpha')^{J}. \qquad (2)$$

Method MC guarantees the compatibility of the results of the global test and of the individual tests, because the global test is significant if and only if at least one of the individual tests is so. When $J = 1$, the global test is the same as the individual test.

On the basis of the above, in general the test can be defined as follows. In each stratum $j$ an order statistic $S_j$ will have been defined which allows the p value for each one of its points to be determined. The points from all strata are pooled, ordered from the lowest to the highest p value, and introduced one by one into the global critical region $R$ until a given condition (stopping rule) has been verified; then $R = \cup R_j$, with $R_j$ the critical region formed by the points of stratum $j$ which belong to $R$. Let $\alpha_j$ be the largest of the p values of the points in $R_j$. The real global alpha error of the test constructed thus will be given by expression (1).

When the stopping rule is “stop introducing points into $R$ when the maximum of the $\alpha_j$ is as close as possible to $\alpha'$ (but less than or equal to $\alpha'$),” with $\alpha'$ given by

$$\alpha' = 1 - (1 - \alpha)^{1/J}, \qquad (3)$$

then method MC is obtained, and this method simultaneously controls the global error $\alpha$ and the individual error $\alpha'$. Now the critical region $R_j = R_{jMC}$ of each stratum consists of all the points whose p value is smaller than or equal to $\alpha'$, $\alpha_j = \alpha_{jMC} \le \alpha'$, $R = R_{MC} = \cup R_{jMC}$, and the real global error will be $\alpha_{MC} = 1 - \prod (1 - \alpha_{jMC}) \le \alpha$. It is a simpler process to obtain the p value $P_{MC}$ of some observed data. Let $P_j$ be the p value of the individual test in stratum $j$. The first individual alpha error for which $K$ is concluded will be $\alpha' = P_0 = \min_j P_j$, so that by expression (2) the p value of the global test will be

$$P_{MC} = 1 - (1 - P_0)^{J}. \qquad (4)$$

When the stopping rule is “stop introducing points into $R$ when $1 - \prod (1 - \alpha_j)$ is the closest possible to $\alpha$ (but smaller than or equal to $\alpha$),” method MCB is obtained (the method “based on the multiple comparisons”). Because now only the global error is controlled, its goal is similar to that of Jung's method [10]. Under method MCB, $R_j = R_{jMCB}$, $\alpha_j = \alpha_{jMCB}$, $R = R_{MCB} = \cup R_{jMCB}$, and the real global error is $\alpha_{MCB} = 1 - \prod (1 - \alpha_{jMCB}) \le \alpha$. Note that $R_{MC} \subseteq R_{MCB}$, since $\alpha_{MC} \le \alpha_{MCB} \le \alpha$, something to be expected given that method MC controls two errors and method MCB controls only one of these.

Let us see how to obtain the p value $P_{MCB}$ of some observed data in which, for example, $P_0 = P_1$. The region $R_{MCB}$ which yields the first significance of the global test is obtained when the observed point in stratum 1 is the last one introduced into $R_{MCB}$, that is, when $\alpha_{1MCB} = P_0$; in the other strata it should be $\alpha_{jMCB} \le P_0$, but as close as possible to $P_0$. Thus the p value will be $P_{MCB} = 1 - \prod (1 - \alpha_{jMCB})$. It can now be seen that $\alpha_{jMCB} = \alpha_{jMC}$, where $\alpha_{jMC}$ are the values of the MC test when this is carried out at the error $\alpha' = P_0$. Therefore $P_{MCB} \le P_{MC}$ and, for the purpose of calculating the p value $P_{MCB}$, the p values $\alpha_{jMCB} = \alpha_{jMC}$ and the regions $R_{jMCB} = R_{jMC}$ will be written just as $\alpha_j^*$ and $R_j^*$, respectively. Thus, if $\alpha_j^*$ is the largest p value in stratum $j$ which is smaller than or equal to $P_0$,

$$P_{MCB} = 1 - \prod_{j=1}^{J} (1 - \alpha_j^*). \qquad (5)$$

Methods MC and MCB may be applied with exact methods or with asymptotic methods and to any of the three models, as illustrated in the following sections.
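The two p values defined in this section reduce to a few lines once the per-stratum p values are available. The sketch below assumes that each stratum supplies the list of p values it can attain; the functions `p_mc` and `p_mcb` are hypothetical names, and the numbers used are the one-sided Fisher tail probabilities computed from Table 2 under Model 3.

```python
from math import prod

def p_mc(observed):
    """Global MC p value: 1 - (1 - P0)^J with P0 the smallest observed p value."""
    p0 = min(observed)
    return 1.0 - (1.0 - p0) ** len(observed)

def p_mcb(observed, attainable):
    """Global MCB p value: 1 - prod(1 - alpha_j*), where alpha_j* is the largest
    attainable p value in stratum j not exceeding P0 (0 if there is none)."""
    p0 = min(observed)
    alpha_star = [max((p for p in ps if p <= p0), default=0.0) for ps in attainable]
    return 1.0 - prod(1.0 - a for a in alpha_star)

# Observed Fisher p values and, per stratum, every attainable tail probability
# (rounded values derived from Table 2; illustration only).
observed = [0.80073, 0.57143, 0.14706]
attainable = [
    [1.0, 0.80073, 0.28261],
    [1.0, 0.57143],
    [1.0, 0.93137, 0.58824, 0.14706],
]
print(round(p_mc(observed), 4), round(p_mcb(observed, attainable), 4))
```

Because strata 1 and 2 have no attainable p value at or below $P_0$, their contribution to the MCB product is 1, which is why the MCB value collapses to $P_0$ in this example.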

2.4. MC and MCB Tests under Model 3

The p values of the Fisher exact test in each stratum are $P_1 = 0.80073$, $P_2 = 0.57143$, and $P_3 = 0.14706$. So $P_0 = P_3 = 0.14706$ and $P_{MC} = 0.3795$ by expression (4). In order to apply method MCB, the critical regions $R_j^*$ ($j$ = 1 and 2) must be determined for the target error $\alpha' = 0.14706 = P_0 = \alpha_3^*$. For $j = 1$, $9 \le x_1 \le 11$ with $\Pr\{x_1 = 11 \mid H_1\} = 0.2826 > P_0$; thus $R_1^* = \emptyset$ and $\alpha_1^* = 0$. The same occurs for $j = 2$ ($\alpha_2^* = 0$). By expression (5), $P_{MCB} = 0.1471$ (smaller than $P_{Birch}$). Generally speaking, the critical region of Birch [9] and Jung [10] has the form $S = \sum x_j \ge S_0 = \sum x_{j0}$, while that of method MCB has the form $\cup \{x_j \ge x_j^*\}$, with $x_j^* \ge x_{j0}$. It can be proved that this generally implies that the Birch method will yield a p value smaller than or equal to that of method MCB when the p values $P_j$ are similar or when the observed values $x_{j0}$ are the highest possible.

Let us now apply an asymptotic test. In general, whatever the model is, the appropriate statistic is the chi-squared statistic [6], used here through its signed root because the test is one-sided:

$$\chi_j = \left( x_j \bar{y}_j - \bar{x}_j y_j - c_j \right) \sqrt{\frac{N_j - 1}{m_j n_j z_j \bar{z}_j}}. \qquad (6)$$

The appropriate value for the continuity correction $c_j$ depends on the assumed model, and that value is what causes the results of the three models to be different. When $c_j = 0$ ($\forall j$), Pearson's classic chi-squared statistic is obtained. In the case here of Model 3, by making $c_j = N_j/2$ the classic statistic $\chi_{3j}$ (the Yates chi-squared statistic) is obtained. Its maximum value is reached in stratum 3 ($\chi_{33} = 1.0308$), which yields the p values $P_0 = 0.15132$ and $P_{MC} = 0.3887$. In order to apply method MCB, one must obtain in the other two strata the first value $\chi_{3j}^*$ of $\chi_{3j}$ which is larger than or equal to $\chi_{33}$. As there is none, $\alpha_1^* = \alpha_2^* = 0$, $\alpha_3^* = 0.15132$, and $P_{MCB} = 0.1513$. Note that the asymptotic p values are similar to the exact ones, both with method MC and with method MCB. Despite the small size of the samples, the asymptotic methods perform well (something which also occurs with the rest of the methods, as will be seen).
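The asymptotic Model 3 calculation can be retraced numerically. The sketch below assumes the signed-root form of the chi-squared statistic with continuity correction $c_j = N_j/2$ and factor $N_j - 1$, which reproduces the value 1.0308 quoted for stratum 3; `chi_cc`, `upper_p`, and `attainable_chis` are illustrative names.

```python
from math import prod, sqrt
from statistics import NormalDist

# Table 2, one tuple per stratum: (x, m, y, n).
STRATA = [(10, 11, 12, 13), (9, 9, 11, 12), (8, 8, 7, 10)]

def chi_cc(x, m, y, n, c):
    """Signed root of the chi-squared statistic with continuity correction c."""
    z, total = x + y, m + n
    cross = x * (n - y) - (m - x) * y
    return (cross - c) * sqrt((total - 1) / (m * n * z * (total - z)))

def upper_p(chi):
    return 1.0 - NormalDist().cdf(chi)

def attainable_chis(m, n, z):
    """Yates statistic for every table with the same margins (Model 3)."""
    lo, hi = max(0, z - n), min(m, z)
    return [chi_cc(x, m, z - x, n, (m + n) / 2) for x in range(lo, hi + 1)]

chis = [chi_cc(x, m, y, n, (m + n) / 2) for x, m, y, n in STRATA]
chi_max = max(chis)
p0 = upper_p(chi_max)
p_mc = 1.0 - (1.0 - p0) ** len(STRATA)

# MCB: in each stratum take the smallest attainable statistic >= chi_max.
alpha_star = []
for x, m, y, n in STRATA:
    candidates = [v for v in attainable_chis(m, n, x + y) if v >= chi_max]
    alpha_star.append(upper_p(min(candidates)) if candidates else 0.0)
p_mcb = 1.0 - prod(1.0 - a for a in alpha_star)
print(round(chi_max, 4), round(p_mc, 4), round(p_mcb, 4))
```

Changing the correction `c` passed to `chi_cc` is the only modification needed to move between the three models.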

2.5. MC and MCB Tests under Model 2

The data in the example in reality proceed from Model 2. In determining the p value $P_j$ of an observed table of Model 2, $(x_{j0}, y_{j0} \mid m_j, n_j)$, the same steps are followed as in Model 3 (except the last, which is special): (1) define an order statistic $S_j(x_j, y_j \mid m_j, n_j)$, which does not need to be the same one in each stratum; (2) determine the set of points $R_j = \{(x_j, y_j \mid m_j, n_j) \mid S_j(x_j, y_j \mid m_j, n_j) \ge S_j(x_{j0}, y_{j0} \mid m_j, n_j)\}$; (3) calculate the probability of $R_j$ under $H_j$: $p_j = q_j = \pi_j$, given by $\alpha_j(\pi_j) = \sum_{R_j} \binom{m_j}{x_j} \binom{n_j}{y_j} \pi_j^{z_j} (1 - \pi_j)^{\bar{z}_j}$; and (4) determine the p value as $P_j = \max_{\pi_j} \alpha_j(\pi_j)$, where $\pi_j$ is the nuisance parameter that is eliminated by maximization (the most complicated step). Note that $\pi_j$ is the marginal probability of columns under $H_j$. In the case of Model 3 there is only one possible order statistic $S_j$ [17], because the convexity of the region $R_j$ must be verified and the points are ordered “from the largest to the smallest value of $x_j$.” In the case of Model 2 there are many possible test statistics. One of these is the order F of Boschloo [18]: order the points from the smaller to the larger value of their one-tailed p value obtained using the Fisher exact test. It is already known [19] that the unconditional test based on the order F is uniformly more powerful (UMP) than Fisher's own exact test. Although no unconditional order is UMP compared to the rest, the generally most powerful order is [3] the complex statistic B of Barnard [20]. As far as we know, the only program that carries out the above calculations for the statistic B is SMP.EXE, which may be obtained free of charge at http://www.ugr.es/local/bioest/software.htm. The program also gives the solution for other simpler test statistics. Using this program, because the minimum p value is $P_3 = 0.05653$, then $P_{MC} = 0.1602$. In order to obtain $P_{MCB}$ one has to proceed as in the previous section, although the process is now somewhat more difficult. 
In stratum 1, the table $(x_1, y_1) = (11, 10)$ is the one that gives the largest p value, $\alpha_1^* = 0.05462$, that is smaller than or equal to $\alpha_3^* = 0.05653$. In stratum 2 the results are $(x_2, y_2) = (4, 1)$ and $\alpha_2^* = 0.05069$. So $P_{MCB} = 0.1533$, a value which is similar to that of $P_{Birch}$ (the results are alike if other order statistics of the program SMP.EXE are used). It can be seen that the use of the unconditional method allows the inherent conservatism in the definitions of methods MC and MCB to be reduced. In order to carry out the asymptotic test we shall use the optimal version of expression (6) for Model 2: $\chi_{2j}$ is the value of expression (6) when $c_j = 1$ (or 2) if $m_j \ne n_j$ (or $m_j = n_j$) [6]. Now the maximum value is $\chi_{23} = 1.5805$, whereby $P_3 = 0.05700$ and $P_{MC} = 0.1614$ (a value that is very near the 0.1602 of the exact method). Proceeding as above, the first values $\chi_{2j}^*$ of $\chi_{2j}$ ($j$ = 1 or 2) which are larger than or equal to $\chi_{23}$ are $\chi_{21}^* = 1.5822$ for $(x_1, y_1) = (10, 8)$ and $\chi_{22}^* = 1.6056$ for $(x_2, y_2) = (2, 0)$. This makes $\alpha_1^* = 0.05680$, $\alpha_2^* = 0.05418$, and $P_{MCB} = 0.1588$ (which is also very close to the 0.1533 of the exact method).
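Steps (1) to (4) can be sketched with the simpler order F of Boschloo, since Barnard's order B requires the dedicated program. This is only an illustration under two assumptions that are not in the paper: the maximization over the nuisance parameter is approximated on an evenly spaced grid, and ties are resolved with a small relative tolerance. The names `fisher_p` and `boschloo_p` are illustrative.

```python
from math import comb

def fisher_p(x, m, y, n):
    """One-sided Fisher p value, used here as the order statistic."""
    z, total = x + y, m + n
    tail = sum(comb(m, i) * comb(n, z - i) for i in range(x, min(m, z) + 1))
    return tail / comb(total, z)

def boschloo_p(x0, m, y0, n, grid=999):
    # Step 1: order every point of the Model 2 sample space by its Fisher p value.
    order = {(x, y): fisher_p(x, m, y, n) for x in range(m + 1) for y in range(n + 1)}
    # Step 2: region of points at least as extreme as the observed table.
    cutoff = order[(x0, y0)] * (1 + 1e-9)
    region = [pt for pt, s in order.items() if s <= cutoff]

    # Step 3: probability of the region under H for a given nuisance value pi.
    def alpha(pi):
        return sum(
            comb(m, x) * comb(n, y) * pi ** (x + y) * (1 - pi) ** (m + n - x - y)
            for x, y in region
        )

    # Step 4: eliminate pi by maximization (grid approximation, an assumption).
    return max(alpha(k / (grid + 1)) for k in range(1, grid + 1))

# Stratum 3 of Table 2: 8/8 successes with thymosin, 7/10 with placebo.
p_uncond = boschloo_p(8, 8, 7, 10)
p_cond = fisher_p(8, 8, 7, 10)
print(round(p_uncond, 5), round(p_cond, 5))
```

The unconditional value cannot exceed the conditional Fisher value, which illustrates the gain in power mentioned above; it will differ from the 0.05653 quoted in the text because that figure uses Barnard's order.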
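The asymptotic Model 2 search can be sketched by scanning every table of each stratum. The code assumes the signed-root chi-squared statistic with factor $N_j - 1$ and correction 1 (or 2 when $m_j = n_j$), and it skips tables whose column total is zero because the statistic is undefined there; that skipping rule and the names `chi2_model2` and `upper_p` are assumptions of this sketch.

```python
from math import prod, sqrt
from statistics import NormalDist

# Table 2, one tuple per stratum: (x, m, y, n).
STRATA = [(10, 11, 12, 13), (9, 9, 11, 12), (8, 8, 7, 10)]

def chi2_model2(x, m, y, n):
    """Signed-root statistic for Model 2; None when a column total is zero."""
    z, total = x + y, m + n
    if z == 0 or z == total:
        return None
    c = 2 if m == n else 1
    cross = x * (n - y) - (m - x) * y
    return (cross - c) * sqrt((total - 1) / (m * n * z * (total - z)))

def upper_p(chi):
    return 1.0 - NormalDist().cdf(chi)

chi_obs = [chi2_model2(*s) for s in STRATA]
chi_max = max(chi_obs)
p_mc = 1.0 - (1.0 - upper_p(chi_max)) ** len(STRATA)

# MCB: smallest attainable statistic >= chi_max in each stratum's full sample space.
alpha_star = []
for _, m, _, n in STRATA:
    values = [chi2_model2(x, m, y, n) for x in range(m + 1) for y in range(n + 1)]
    reachable = [v for v in values if v is not None and v >= chi_max]
    alpha_star.append(upper_p(min(reachable)) if reachable else 0.0)
p_mcb = 1.0 - prod(1.0 - a for a in alpha_star)
print(round(chi_max, 4), round(p_mc, 4), round(p_mcb, 4))
```

Unlike Model 3, strata 1 and 2 now contain tables whose statistic just exceeds the observed maximum, so they contribute non-zero terms to the MCB product.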

2.6. MC and MCB Tests under Models 1 and 0

Let us suppose now that the data contained in the example in Table 2 proceed from Model 1. The determination of the p value $P_j$ of an observed table $(x_{j0}, \bar{x}_{j0}, y_{j0}, \bar{y}_{j0} \mid N_j)$ is the same as in Model 2, but now the calculations are more complicated because two nuisance parameters must be eliminated (the marginal probabilities of rows and columns under $H_j$). Again there are many possible test statistics [1, 21], although none of them is UMP compared to the others. The generally most powerful statistic is again Barnard's B statistic [22] and, as far as we know, the only program to apply it is TMP.EXE, which may be obtained free of charge at http://www.ugr.es/local/bioest/software.htm. The program also gives the solution using other simpler test statistics. Using this program, the minimum p value is $P_3 = 0.04472$ and from this $P_{MC} = 0.1282$ (substantially smaller than $P_{Birch}$). In order to carry out the asymptotic test we shall use the optimal version of expression (6) for Model 1: $\chi_{1j}$ is the value of expression (6) when $c_j = 0.5$ $\forall j$ [6]. The statistic is given by Pirie and Hamdan [23]. Now the maximum value is $\chi_{13} = 1.6149$, with the result that $P_3 = 0.05317$ and $P_{MC} = 0.1512$. Method MCB (which is very laborious to calculate) is omitted here, because the large number of points in the sample space will make $\alpha_1^* \approx \alpha_2^* \approx \alpha_3^* = P_3$ and so $P_{MC} \approx P_{MCB}$. Note that stratum 1 under Model 2 consists of $(m_1 + 1)(n_1 + 1)$ = (11 + 1) × (13 + 1) = 168 points, but under Model 1 it consists of $(N_1 + 1)(N_1 + 2)(N_1 + 3)/6$ = 25 × 26 × 27/6 = 2,925 points. For similar reasons, Model 0 can be treated as if it were Model 1 (by conditioning on the values of $N_j$ actually obtained).
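The sample-space sizes quoted for the different models can be confirmed by direct counting. This small sketch uses illustrative function names and counts Model 1 points as non-negative integer quadruples summing to $N_j$.

```python
def points_model2(m, n):
    """Tables (x, y) with 0 <= x <= m and 0 <= y <= n."""
    return sum(1 for _ in range(m + 1) for _ in range(n + 1))

def points_model1(total):
    """Cell counts (a, b, c, d) >= 0 with a + b + c + d = total."""
    return sum(
        1
        for a in range(total + 1)
        for b in range(total + 1 - a)
        for c in range(total + 1 - a - b)
    )

sizes_model2 = [points_model2(11, 13), points_model2(9, 12), points_model2(8, 10)]
global_model2 = sizes_model2[0] * sizes_model2[1] * sizes_model2[2]
stratum1_model1 = points_model1(24)
print(sizes_model2, global_model2, stratum1_model1)
```

The count for Model 1 agrees with the closed form $(N_1 + 1)(N_1 + 2)(N_1 + 3)/6$ used in the text.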

3. Sample Size under Model 2

3.1. Example and Conditional Solutions Obtained by Classic Methods

Jung [10] proposes a sample size calculation for his stratified exact test. For the example described in Section 2.1, he accepts Model 2 and sets out a case study with $N_j = N/3$ and $m_j = n_j = N/6$. The aim is to determine the value of $N$ for the alternative hypothesis $(\theta_1, \theta_2, \theta_3) = (1, 30, 30)$, a type I error of $\alpha = 0.1$, and a power of $1 - \beta = 0.8$ (the values used in the calculations of Sections 3.1 and 3.3). Jung also assumes that $(q_1, q_2, q_3) = (0.9, 0.75, 0.6)$, so that under the alternative hypothesis $(p_1, p_2, p_3) = (0.9, 90/91, 45/46) \approx (0.9, 0.989, 0.978)$. His solution is $N_{Jung} = 62$. From what can be deduced from other parts of his paper, the detailed solution is $N_1 = N_2 = 20$, $N_3 = 22$, $m_1 = m_2 = 10$, and $m_3 = 11$ (with $n_j = m_j$). These values are included in Table 4 (as well as the most relevant ones obtained in all the following). This sample size provides a real error of … and a real power of ….
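The success probabilities under the alternative follow from the definition of the odds ratio, $\theta_j = p_j \bar{q}_j / (\bar{p}_j q_j)$. A minimal sketch, with the illustrative helper `success_prob`:

```python
def success_prob(theta, q):
    """Solve theta = p(1 - q) / ((1 - p) q) for p."""
    odds = theta * q / (1.0 - q)   # odds of success under treatment 1
    return odds / (1.0 + odds)

THETA = [1, 30, 30]
Q = [0.9, 0.75, 0.6]
P = [success_prob(t, q) for t, q in zip(THETA, Q)]
print([round(p, 4) for p in P])
```

In stratum 1 the odds ratio is 1, so $p_1 = q_1$ and that stratum carries no treatment effect.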

Table 4

Sample sizes by stratum ($m_j$, $n_j$) and global ($N$) obtained by various methods for the data of Jung's example [10] under Model 2. Each asymptotic method is placed immediately below the exact method from which it proceeds.

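Model	Test	Procedure	Stratum 1	Stratum 2	Stratum 3	N
Conditional	Exact	Jung	10, 10	10, 10	11, 11	62
Conditional	Asymptotic	$\chi_{MH}$ without cc	8, 8	8, 8	9, 9	50
Conditional	Asymptotic	$\chi_{MH}$ with cc	11, 11	11, 11	12, 12	68
Unconditional	Exact	MC (Barnard's order)	12, 12	12, 12	13, 13	74
Unconditional	Exact	MC (Barnard's order)	10, 11	11, 12	12, 13	69
Unconditional	Exact	MC (Barnard's order)	1, 2	11, 12	12, 13	51
Unconditional	Asymptotic	MC ($\chi_2$ with cc)	11, 11	12, 12	12, 12	70
Unconditional	Asymptotic	MC ($\chi_2$ with cc)	10, 11	11, 12	12, 13	69
Unconditional	Asymptotic	MC ($\chi_2$ with cc)	1, 2	11, 12	12, 13	51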

Note: $\chi_{MH}$ = chi statistic of Mantel-Haenszel; MC = multiple comparisons method; $\chi_2$ = chi statistic of Model 2.

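Let us suppose that in general $n_j = k_j m_j$, with $k_j$ known values, and that the aim is to determine the values $m_j$ which guarantee the desired power, which implies using Model 2. The reasoning that follows is the same as that with which Casagrande et al. [24] and Fleiss et al. [25] obtained the classic formula for sample size in the comparison of two independent proportions. The solutions without cc that follow are a special case of those of Jung et al. [11].

The test $\chi_{MHc}$ in Section 2.2 is based on the statistic $\bar{D} = \sum \bar{D}_j$, where $\bar{D}_j = x_j - E_j = (k_j x_j - y_j)/(k_j + 1)$. Because $\bar{D}_j$ is distributed asymptotically as a normal distribution with mean $D_j = k_j m_j (p_j - q_j)/(k_j + 1)$ and variance $S_j^2 = k_j m_j (k_j p_j \bar{p}_j + q_j \bar{q}_j)/(k_j + 1)^2$, $\bar{D}$ will be asymptotically normal with mean $D = \sum D_j$ and variance $S^2 = \sum S_j^2$. Under $H$, $p_j = q_j = \pi_j$ ($\forall j$), with the result that the mean and variance of $\bar{D}$ will be $D_0 = 0$ and $S_0^2 = \sum k_j m_j \pi_j \bar{\pi}_j / (k_j + 1)$, respectively, with $\bar{\pi}_j = 1 - \pi_j$. Because under $H$ the nuisance parameter $\pi_j$ is estimated by $z_j / N_j$, it is usual to substitute it by its average value under $K$, that is, by $\pi_j = (p_j + k_j q_j)/(k_j + 1)$; hence $S_0^2$ is evaluated at these values. Consequently the statistic $\bar{D}$ will reach significance at the critical value $D^*$ which verifies $(D^* - 0.5)/S_0 = z_{1-\alpha}$, in which the number 0.5 corresponds to the cc indicated above and $z$ refers to a standard normal variable. Therefore $D^* = z_{1-\alpha} S_0 + 0.5$, with $z_{1-\alpha}$ the $100(1 - \alpha)$-percentile of the standard normal distribution. Under $K$ the parameters $D = D_K$ and $S^2 = S_K^2$ are obtained at the values $p_j$ and $q_j$ which specify $K$: $D_K = \sum k_j m_j (p_j - q_j)/(k_j + 1)$ and $S_K^2 = \sum k_j m_j (k_j p_j \bar{p}_j + q_j \bar{q}_j)/(k_j + 1)^2$. Given the above, the beta error will be

$$\beta = \Pr\{\bar{D} < D^* \mid K\} = \Pr\left\{ z < \frac{z_{1-\alpha} S_0 + 0.5 - D_K}{S_K} \right\}. \qquad (7)$$

If the solution is restricted to the case of $m_j = m$ ($\forall j$), by making $-z_{1-\beta}$ equal to the fraction of expression (7) and solving for $m$, one obtains the equation $\delta m - (z_{1-\alpha} s_0 + z_{1-\beta} s)\sqrt{m} - 0.5 = 0$, where $\delta = \sum k_j (p_j - q_j)/(k_j + 1)$, $s_0^2 = \sum k_j \pi_j \bar{\pi}_j / (k_j + 1)$, and $s^2 = \sum k_j (k_j p_j \bar{p}_j + q_j \bar{q}_j)/(k_j + 1)^2$; therefore

$$m_0 = \left( \frac{z_{1-\alpha} s_0 + z_{1-\beta} s}{\delta} \right)^2, \qquad m = \frac{m_0}{4} \left( 1 + \sqrt{1 + \frac{2}{\delta m_0}} \right)^2. \qquad (8)$$

The solutions $m_0$ and $m$ are those of the tests $\chi_{MH}$ and $\chi_{MHc}$, respectively. Frequently $k_j = 1$ ($\forall j$); in this case expression (8) explicitly takes the following form:

$$m_0 = \frac{\left( z_{1-\alpha} \sqrt{2 \sum \pi_j \bar{\pi}_j} + z_{1-\beta} \sqrt{\sum (p_j \bar{p}_j + q_j \bar{q}_j)} \right)^2}{\left( \sum (p_j - q_j) \right)^2}, \qquad m = \frac{m_0}{4} \left( 1 + \sqrt{1 + \frac{4}{m_0 \sum (p_j - q_j)}} \right)^2. \qquad (9)$$

For the example at the beginning of this section (in which $k_j = 1$), if at first we restrict the solution to $m_1 = m_2 = m_3 = m$, expression (9) indicates that $m_0 = 8.27$ and $m = 11.3$. 
Assuming that in this example the values of $m_j$ are allowed to differ at most by 1, the solution that is sought must satisfy $8 \le m_j \le 9$ ($\forall j$) without cc or $11 \le m_j \le 12$ ($\forall j$) with cc. In the second phase, expression (7) indicates that $m_1 = m_2 = 11$ and $m_3 = 12$ is the first configuration in which $\beta$ (= 0.183) ≤ 0.2, so that this is the solution with cc that was being sought ($N = 68$). The solution without cc is obtained in the same way ($m_1 = m_2 = 8$, $m_3 = 9$, and $N = 50$), but it is too liberal.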
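The figures in this subsection can be retraced numerically for the case $k_j = 1$. The sketch below assumes a one-sided type I error of 0.1 and a power of 0.8 (values inferred from the calculations in this section, not quoted directly) and the success probabilities implied by the odds ratios (1, 30, 30); the functions `common_m` and `beta_mh` are illustrative names for the closed-form solution and for the beta error of the Mantel-Haenszel test with cc.

```python
from math import sqrt
from statistics import NormalDist

ND = NormalDist()
P = [0.9, 90 / 91, 45 / 46]   # success probabilities under K, treatment 1
Q = [0.9, 0.75, 0.6]          # success probabilities, treatment 2
ALPHA, BETA = 0.10, 0.20      # assumed nominal errors

def common_m(p=P, q=Q, alpha=ALPHA, beta=BETA):
    """Common stratum size m_j = n_j = m: (without cc, with cc = 0.5)."""
    delta = sum(pj - qj for pj, qj in zip(p, q)) / 2
    s0 = sqrt(sum((pj + qj) / 2 * (1 - (pj + qj) / 2) for pj, qj in zip(p, q)) / 2)
    s = sqrt(sum(pj * (1 - pj) + qj * (1 - qj) for pj, qj in zip(p, q)) / 4)
    a = ND.inv_cdf(1 - alpha) * s0 + ND.inv_cdf(1 - beta) * s
    m0 = (a / delta) ** 2
    # Positive root of delta*m - a*sqrt(m) - 0.5 = 0.
    m_cc = ((a + sqrt(a * a + 2 * delta)) / (2 * delta)) ** 2
    return m0, m_cc

def beta_mh(m, p=P, q=Q, alpha=ALPHA, cc=0.5):
    """Beta error for stratum sizes m (list) with n_j = m_j."""
    d_k = sum(mj * (pj - qj) / 2 for mj, pj, qj in zip(m, p, q))
    s0 = sqrt(sum(mj * (pj + qj) / 2 * (1 - (pj + qj) / 2) / 2
                  for mj, pj, qj in zip(m, p, q)))
    s_k = sqrt(sum(mj * (pj * (1 - pj) + qj * (1 - qj)) / 4
                   for mj, pj, qj in zip(m, p, q)))
    d_star = ND.inv_cdf(1 - alpha) * s0 + cc
    return ND.cdf((d_star - d_k) / s_k)

m0, m_cc = common_m()
print(round(m0, 2), round(m_cc, 2), round(beta_mh([11, 11, 12]), 3))
```

Evaluating `beta_mh` on neighbouring configurations, such as `[11, 11, 11]`, shows why the refinement step settles on sizes (11, 11, 12).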

3.2. Solution Using the Exact Method MC

For fixed values of the global error $\alpha$ and the sample sizes $(m_j, n_j)$, the method MC described in Section 2.3 allows one to obtain the critical region $R_{MC}$ and the real type I error $\alpha_{MC}$. Moreover, let $\beta_j$ be the beta error of each individual test, with $1 - \beta_j$ equal to the probability of the region $R_{jMC}$ under $K_j$. Because of the way method MC was defined, the real global beta error will be

$$\beta^* = \prod_{j=1}^{J} \beta_j = \prod_{j=1}^{J} \left[ 1 - \Pr\{R_{jMC} \mid K_j\} \right]. \qquad (10)$$

If $\beta^* \le \beta$, these values $\{(m_j, n_j)\}$ guarantee the desired power. If $\beta^* > \beta$, it is necessary to increase some values of $m_j$ and/or $n_j$ and to repeat the previous procedure. Let us initially assume that $m_j = n_j$. The process for determining the sample sizes $m_j$ may be shortened if it begins with a value $m_j = m$ ($\forall j$) like that of expression (8). With the method MC, one obtains that $m = 12$ is not a solution because $\beta^* > 0.2$, but $m = 13$ is a solution because $\beta^* \le 0.2$. The solution can now be refined by allowing the values $m_j$ to differ by a maximum of one. The final solution is $m_1 = m_2 = 12$, $m_3 = 13$ ($N = 74$), which satisfies both error requirements. Unconditional tests are more powerful when the sample sizes are slightly different [3], since the number of ties produced by any statistic $S_j$ that is used is reduced. By planning $n_j = m_j + 1$ and making the values of $m_j$ consecutive, the solution $m_1 = 10$, $m_2 = 11$, and $m_3 = 12$ ($N = 69$) is obtained (the solution based on $n_j = m_j - 1$ is worse). Actually, stratum 1 is of virtually no interest, since in it $H_1 = K_1$. Despite everything, if it is included, the configuration $n_j = m_j + 1$, $m_1 = 1$, $m_2 = 11$, and $m_3 = 12$ ($N = 51$) is correct because it also satisfies both error requirements.

3.3. Solution Using the Asymptotic Method MC Based on the Chi-Square Test with cc

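In the following the procedure is the same as in Section 3.1, assuming for the moment that $m_j$ and $n_j$ can take any values. The numerator of $\chi_{2j}$ may be written as $\bar{d}_j - c_j$, where $c_j$ is the cc of Model 2 ($c_j$ = 2 or 1 depending on whether $m_j$ and $n_j$ are equal or different, resp.) and $\bar{d}_j = x_j \bar{y}_j - \bar{x}_j y_j = n_j x_j - m_j y_j$ (the base statistic for the test) is asymptotically normal with mean $d_j = m_j n_j (p_j - q_j)$ and variance $s_j^2 = m_j n_j (n_j p_j \bar{p}_j + m_j q_j \bar{q}_j)$. Under $H_j$, $p_j = q_j = \pi_j$ and $\bar{d}_j$ is asymptotically normal with mean 0 and variance $s_{0j}^2 = m_j n_j N_j \pi_j \bar{\pi}_j$, with $\bar{\pi}_j = 1 - \pi_j$. Because under $H_j$ the nuisance parameter $\pi_j$ is estimated by $z_j / N_j$, it is usual to substitute it by its average value under $K_j$, that is, by $\pi_j = (m_j p_j + n_j q_j)/N_j$; hence $s_{0j}^2$ is evaluated at this value. If each individual test is carried out at the error $\alpha'$ of expression (3), the critical value $d_j^*$ for $\bar{d}_j$ will verify $(d_j^* - c_j)/s_{0j} = z_{1-\alpha'}$, in which the value $c_j$ corresponds to the cc indicated above; therefore $d_j^* = z_{1-\alpha'} s_{0j} + c_j$. Under $K_j$, $\bar{d}_j$ is asymptotically normal with mean $d_j = m_j n_j (p_j - q_j)$ and variance $s_j^2 = m_j n_j (n_j p_j \bar{p}_j + m_j q_j \bar{q}_j)$. Thus $\beta_j = \Pr\{ z < (d_j^* - d_j)/s_j \}$ and, applying the first equality in expression (10),

$$\beta^* = \prod_{j=1}^{J} \Pr\left\{ z < \frac{z_{1-\alpha'} s_{0j} + c_j - d_j}{s_j} \right\}; \qquad (11)$$

in particular, if $m_j = n_j = m$ ($\forall j$), then $c_j = 2$, $\pi_j = (p_j + q_j)/2$, and

$$\beta^* = \prod_{j=1}^{J} \Pr\left\{ z < \frac{z_{1-\alpha'} \sqrt{2 m^3 \pi_j \bar{\pi}_j} + 2 - m^2 (p_j - q_j)}{\sqrt{m^3 (p_j \bar{p}_j + q_j \bar{q}_j)}} \right\}. \qquad (12)$$

For the data in the example, $\alpha' = 1 - 0.9^{1/3} = 0.03451$ and, by making $m_j = m$ ($\forall j$), the solution based on expression (12) is $m = 12$. This solution can be refined by allowing the values of $m_j$ to differ by a maximum of one, in which case the new solution, now based on expression (11), is $m_1 = 11$, $m_2 = m_3 = 12$ ($N = 70$). If a cc is not carried out the solution is too liberal: $m_1 = m_2 = 10$, $m_3 = 11$ ($N = 62$). By planning $n_j = m_j + 1$ and making the values of $m_j$ consecutive, the solution $m_1 = 10$, $m_2 = 11$, and $m_3 = 12$ is obtained (as in the exact method). The same result is also obtained for the configuration at the end of the previous section ($N = 51$).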
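The asymptotic MC sample size search can be sketched as follows. The code assumes a global error of 0.1, a target beta of 0.2, the success probabilities implied by the odds ratios (1, 30, 30), and a normal approximation in which each stratum statistic $n_j x_j - m_j y_j$ has the moments described above; `beta_mc` and `smallest_common_m` are illustrative names.

```python
from math import sqrt
from statistics import NormalDist

ND = NormalDist()
P = [0.9, 90 / 91, 45 / 46]   # success probabilities under K, treatment 1
Q = [0.9, 0.75, 0.6]          # success probabilities, treatment 2
ALPHA, BETA = 0.10, 0.20      # assumed nominal errors

def beta_mc(m, n, p=P, q=Q, alpha=ALPHA):
    """Global beta error of the asymptotic MC test for sizes m_j, n_j."""
    alpha_ind = 1 - (1 - alpha) ** (1 / len(m))     # individual error per stratum
    z = ND.inv_cdf(1 - alpha_ind)
    beta = 1.0
    for mj, nj, pj, qj in zip(m, n, p, q):
        c = 2 if mj == nj else 1                    # cc of Model 2
        pi = (mj * pj + nj * qj) / (mj + nj)        # average nuisance value under K
        s0 = sqrt(mj * nj * (mj + nj) * pi * (1 - pi))
        s = sqrt(mj * nj * (nj * pj * (1 - pj) + mj * qj * (1 - qj)))
        d = mj * nj * (pj - qj)
        beta *= ND.cdf((z * s0 + c - d) / s)        # beta_j of stratum j
    return beta

def smallest_common_m(limit=200):
    """First common size m_j = n_j = m whose global beta is within BETA."""
    for m in range(2, limit):
        if beta_mc([m] * 3, [m] * 3) <= BETA:
            return m
    return None

print(smallest_common_m(), round(beta_mc([11, 12, 12], [11, 12, 12]), 3))
```

Calling `beta_mc` with unequal sizes, for instance `beta_mc([10, 11, 12], [11, 12, 13])`, switches the correction to 1 and evaluates the configuration with $n_j = m_j + 1$.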

1. Dealing with discreteness: making 'exact' confidence intervals for proportions, differences of proportions, and odds ratios more exact.

Authors: A Agresti
Journal: Stat Methods Med Res Date: 2003-01 Impact factor: 3.021

2. Statistical aspects of the analysis of data from retrospective studies of disease.

3. A simple approximation for calculating sample sizes for comparing independent proportions.

4. Comments on 'Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations'.

5. The choice of statistical tests illustrated on the interpretation of data classed in a 2 X 2 table.

6. Significance tests for 2 X 2 tables.

7. Stratified Fisher's exact test and its sample size calculation.

8. A note on sample size calculation based on propensity analysis in nonrandomized trials.

9. An improved approximate formula for calculating sample sizes for comparing two binomial distributions.