Two-sample comparison based on prediction error, with applications to candidate gene association studies.

K Yu, R Martin, N Rothman, T Zheng, Q Lan.

Abstract

To take advantage of the increasingly available high-density SNP maps across the genome, various tests that compare multilocus genotypes or estimated haplotypes between cases and controls have been developed for candidate gene association studies. Here we view this two-sample testing problem from the perspective of supervised machine learning and propose a new association test. The approach adopts the flexible and easy-to-understand classification tree model as the learning machine, and uses the estimated prediction error of the resulting prediction rule as the test statistic. This procedure not only provides an association test but also generates a prediction rule that can be useful in understanding the mechanisms underlying complex disease. Under the set-up of a haplotype-based transmission/disequilibrium test (TDT) type of analysis, we find through simulation studies that the proposed procedure has the correct type I error rates and is robust to population stratification. The power of the proposed procedure is sensitive to the chosen prediction error estimator. Among commonly used prediction error estimators, the .632+ estimator results in a test that has the best overall performance. We also find that the test using the .632+ estimator is more powerful than the standard single-point TDT analysis, Pearson's goodness-of-fit test based on estimated haplotype frequencies, and two haplotype-based global tests implemented in the genetic analysis package FBAT. To illustrate the application of the proposed method in population-based association studies, we use the procedure to study the association between non-Hodgkin lymphoma and the IL10 gene.
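
The idea of using an estimated prediction error as a two-sample test statistic can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: a one-split decision stump stands in for a full classification tree, genotypes are coded 0/1/2, and significance is assessed by permuting case/control labels. The function names (`fit_stump`, `err632plus`, `permutation_test`) are hypothetical. The .632+ estimate follows the usual construction, combining the training error with the leave-one-out bootstrap error through a weight that depends on the relative overfitting rate.

```python
import random

def fit_stump(X, y):
    """Fit a one-split rule (assumed stand-in for a classification tree)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            for left in (0, 1):
                # Training errors of the rule "predict `left` if x[j] <= t".
                errs = sum((left if row[j] <= t else 1 - left) != yi
                           for row, yi in zip(X, y))
                if best is None or errs < best[0]:
                    best = (errs, j, t, left)
    _, j, t, left = best
    return lambda row: left if row[j] <= t else 1 - left

def err632plus(X, y, n_boot=25, rng=None):
    """.632+ prediction error estimate for binary labels coded 0/1."""
    rng = rng or random.Random(0)
    n = len(y)
    rule = fit_stump(X, y)
    preds = [rule(row) for row in X]
    err_train = sum(p != yi for p, yi in zip(preds, y)) / n

    # Leave-one-out bootstrap error: score each observation only with
    # rules fitted on bootstrap samples that do not contain it.
    wrong, count = [0] * n, [0] * n
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        r = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        for i in range(n):
            if i not in in_bag:
                count[i] += 1
                wrong[i] += r(X[i]) != y[i]
    rates = [w / c for w, c in zip(wrong, count) if c > 0]
    if not rates:  # no out-of-bag observations at all; fall back
        return err_train
    err1 = sum(rates) / len(rates)

    # No-information error rate and relative overfitting rate.
    p1, q1 = sum(y) / n, sum(preds) / n
    gamma = p1 * (1 - q1) + (1 - p1) * q1
    err1 = min(err1, gamma)
    if err1 > err_train and gamma > err_train:
        overfit = (err1 - err_train) / (gamma - err_train)
    else:
        overfit = 0.0
    weight = 0.632 / (1 - 0.368 * overfit)
    return (1 - weight) * err_train + weight * err1

def permutation_test(X, y, n_perm=99, n_boot=25, seed=0):
    """Return (observed error, permutation p-value); small error = association."""
    rng = random.Random(seed)
    observed = err632plus(X, y, n_boot, rng)
    hits = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break any genotype/label association
        if err632plus(X, y_perm, n_boot, rng) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `permutation_test(X, y)` with `X` as a list of genotype rows and `y` as a list of 0/1 case/control labels returns the observed error estimate and a p-value. A low estimated error relative to the permuted-label distribution suggests that genotype carries information about disease status.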

Year:  2007        PMID: 17227481     DOI: 10.1111/j.1469-1809.2006.00306.x

Source DB:  PubMed          Journal:  Ann Hum Genet        ISSN: 0003-4800            Impact factor:   1.670


  4 in total

1.  A partially linear tree-based regression model for multivariate outcomes.

Authors:  Kai Yu; William Wheeler; Qizhai Li; Andrew W Bergen; Neil Caporaso; Nilanjan Chatterjee; Jinbo Chen
Journal:  Biometrics       Date:  2009-05-07       Impact factor: 2.571

2.  A fast and powerful tree-based association test for detecting complex joint effects in case-control studies.

Authors:  Han Zhang; William Wheeler; Zhaoming Wang; Philip R Taylor; Kai Yu
Journal:  Bioinformatics       Date:  2014-04-09       Impact factor: 6.937

3.  Better-than-chance classification for signal detection.

Authors:  Jonathan D Rosenblatt; Yuval Benjamini; Roee Gilron; Roy Mukamel; Jelle J Goeman
Journal:  Biostatistics       Date:  2021-04-10       Impact factor: 5.899

4.  The future of primary intraocular lymphoma (retinal lymphoma).

Authors:  Chi-Chao Chan; Sylvain Fisson; Bahram Bodaghi
Journal:  Ocul Immunol Inflamm       Date:  2009 Nov-Dec       Impact factor: 3.070
