OBJECTIVE: P values for model-free linkage analysis using the conditional logistic model are inaccurate when the LOD score is assumed to be asymptotically distributed as a simple mixture of chi-square distributions. When analyzing affected relative pairs alone, permuting the allele sharing of relative pairs does not lead to a useful permutation distribution. As an alternative, we have developed regression prediction models that provide more accurate p values.
METHODS: Let E(alpha) be the empirical p value, defined as the proportion of statistical tests whose LOD score under the null hypothesis exceeds a threshold determined by alpha, the nominal single-test significance level. We used simulated data to obtain values of E(alpha) and compared them with alpha. We also developed a regression model, based on sample size, number of covariates in the model, alpha and marker density, to derive predicted p values for both single-point and multipoint analyses. To evaluate our predictions, we used another set of simulated data, comparing the E(alpha) for these data with the values obtained from the prediction model, referred to as predicted p values, P(alpha).
RESULTS: Under almost all circumstances the values of P(alpha) were closer to E(alpha) than were the values of alpha.
CONCLUSION: The regression models suggested by our analysis provide more accurate alternative p values for model-free linkage analysis when using the conditional logistic model.
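The definition of E(alpha) can be illustrated with a minimal sketch. It is not the authors' code, and it rests on assumptions that the abstract does not state: the "simple mixture" is taken to be a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom (the usual form for a single constrained parameter), and the null LOD scores are drawn directly from that mixture rather than from simulated linkage data. All function names are hypothetical. Under these assumptions E(alpha) should land close to alpha; the abstract's point is that with real conditional logistic null simulations it does not.

```python
import math
import random

# A LOD score times 2*ln(10) gives the likelihood-ratio chi-square statistic.
LOD_TO_CHISQ = 2.0 * math.log(10.0)


def nominal_p(lod):
    """Nominal p value of a LOD score under an ASSUMED 50:50 mixture of
    a point mass at 0 and a chi-square with 1 df."""
    if lod <= 0.0:
        return 1.0
    # For chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(LOD_TO_CHISQ * lod / 2.0))


def lod_threshold(alpha, lo=0.0, hi=50.0, iters=200):
    """LOD threshold whose nominal p value equals alpha (bisection).
    Valid for 0 < alpha < 0.5 under the assumed mixture."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if nominal_p(mid) > alpha:
            lo = mid  # threshold too lenient, move up
        else:
            hi = mid  # threshold strict enough, move down
    return (lo + hi) / 2.0


def empirical_p(null_lods, alpha):
    """E(alpha): proportion of null LOD scores exceeding the threshold
    determined by alpha."""
    threshold = lod_threshold(alpha)
    return sum(lod > threshold for lod in null_lods) / len(null_lods)


def simulate_null_lods(n, seed=0):
    """Stand-in null LOD scores drawn from the assumed mixture itself.
    A real study would simulate linkage data under the null instead."""
    rng = random.Random(seed)
    lods = []
    for _ in range(n):
        if rng.random() < 0.5:
            lods.append(0.0)  # point-mass component
        else:
            z = rng.gauss(0.0, 1.0)
            lods.append(z * z / LOD_TO_CHISQ)  # chi-square(1) component
    return lods


if __name__ == "__main__":
    null_lods = simulate_null_lods(100_000)
    for alpha in (0.05, 0.01, 0.001):
        print(alpha, lod_threshold(alpha), empirical_p(null_lods, alpha))
```

Replacing `simulate_null_lods` with LOD scores from actual null simulations would give the E(alpha) values that the abstract compares against alpha and P(alpha).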
Authors: Brion S Maher; Hugh B Hughes; Wendy N Zubenko; George S Zubenko Journal: Am J Med Genet B Neuropsychiatr Genet Date: 2010-01-05 Impact factor: 3.568