Cornelia M Borkhoff (1), Patrick R Johnston (2), Derek Stephens (3), Eshetu Atenafu (4)
1. Division of Pediatric Medicine and the Pediatric Outcomes Research Team (PORT), Department of Pediatrics and Child Health Evaluative Sciences, The Hospital for Sick Children, Peter Gilgan Centre for Research and Learning, 686 Bay St., Toronto, Ontario, M5G 0A4, Canada; Women's College Research Institute, Women's College Hospital, 7th Floor, 790 Bay St., Toronto, Ontario, M5G 1N8, Canada; Institute of Health Policy, Management and Evaluation, University of Toronto, 155 College St., Suite 425, Toronto, Ontario, M5T 3M6, Canada. Electronic address: cory.borkhoff@sickkids.ca.
2. Clinical Research Program, Children's Hospital Boston, 300 Longwood Avenue, Boston, MA 02115, USA.
3. Child Health Evaluative Sciences, The Hospital for Sick Children, Peter Gilgan Centre for Research and Learning, 686 Bay St., Toronto, Ontario, M5G 0A4, Canada; Department of Biostatistics, Dalla Lana School of Public Health, University of Toronto, 6th Floor, 155 College St., Toronto, Ontario, M5T 3M7, Canada.
4. Department of Biostatistics, Dalla Lana School of Public Health, University of Toronto, 6th Floor, 155 College St., Toronto, Ontario, M5T 3M7, Canada; Department of Biostatistics, Princess Margaret Cancer Center, University Health Network, 610 University Avenue, Toronto, Ontario, M5G 2M9, Canada.
Abstract
OBJECTIVES: Aligning the method used to estimate sample size with the planned analytic method ensures that the estimated sample size achieves the planned power. When using generalized estimating equations (GEE) to analyze a paired binary primary outcome with no covariates, many investigators use an exact McNemar test to calculate sample size. We reviewed the approaches to sample size estimation for paired binary data and compared the sample size estimates on the same numerical examples. STUDY DESIGN AND SETTING: We used the hypothesized sample proportions for the 2 × 2 table to calculate the correlation between the marginal proportions to estimate sample size based on GEE. We solved for the inside (cell) proportions based on the correlation and the marginal proportions to estimate sample size based on exact McNemar, asymptotic unconditional McNemar, and asymptotic conditional McNemar. RESULTS: The asymptotic unconditional McNemar test is a good approximation of the GEE method by Pan. The exact McNemar test is too conservative and yields unnecessarily larger sample size estimates than all other methods. CONCLUSION: In the special case of a 2 × 2 table, even when a GEE approach to binary logistic regression is the planned analytic method, the asymptotic unconditional McNemar test can be used to estimate sample size. We do not recommend using an exact McNemar test.
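The two conversions described under STUDY DESIGN AND SETTING, and a sample size calculation for the asymptotic unconditional McNemar test, can be sketched in Python. This is a minimal illustration, not the authors' code: the abstract does not state any formula, so the sketch assumes the usual phi correlation for two binary variables and a commonly used normal-approximation sample size formula based on the discordant proportions. The function names, the default alpha and power, and the example proportions are all hypothetical.

```python
import math
from statistics import NormalDist


def correlation_from_cells(p11, p10, p01, p00):
    """Phi correlation between the two paired binary outcomes.

    p11, p10, p01, p00 are the hypothesized cell proportions of the
    2 x 2 table (they should sum to 1).
    """
    p1 = p11 + p10  # marginal proportion, first measurement
    p2 = p11 + p01  # marginal proportion, second measurement
    return (p11 - p1 * p2) / math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))


def cells_from_correlation(p1, p2, rho):
    """Solve for the inside (cell) proportions given marginals and rho."""
    p11 = p1 * p2 + rho * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    p10 = p1 - p11
    p01 = p2 - p11
    p00 = 1 - p11 - p10 - p01
    cells = (p11, p10, p01, p00)
    if min(cells) < -1e-12:
        raise ValueError("rho is not attainable for these marginals")
    return cells


def n_pairs_unconditional_mcnemar(p10, p01, alpha=0.05, power=0.80):
    """Number of pairs for a two-sided asymptotic unconditional McNemar test.

    Assumed formula (not taken from the abstract):
    n = (z_a * sqrt(pd) + z_b * sqrt(pd - d^2))^2 / d^2,
    with pd = p10 + p01 (discordant proportion) and d = p10 - p01.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    discordant = p10 + p01
    diff = p10 - p01
    n = (z_a * math.sqrt(discordant)
         + z_b * math.sqrt(discordant - diff ** 2)) ** 2 / diff ** 2
    return math.ceil(n)


# Hypothetical example: marginals 0.7 and 0.6 with a chosen correlation.
rho = correlation_from_cells(0.5, 0.2, 0.1, 0.2)
p11, p10, p01, p00 = cells_from_correlation(0.7, 0.6, rho)
n_pairs = n_pairs_unconditional_mcnemar(p10, p01)
```

Because the formula depends only on the discordant cells, a higher assumed correlation for fixed marginals shrinks the discordant proportion and therefore the required number of pairs.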