Han Zhang1, William Wheeler1, Zhaoming Wang2, Philip R Taylor1, Kai Yu1. 1. Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20850, USA, Information Management Services, Inc., Silver Spring, Maryland 20904, USA, and Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Gaithersburg, Maryland 20877, USA. 2. Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20850, USA, Information Management Services, Inc., Silver Spring, Maryland 20904, USA, and Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Gaithersburg, Maryland 20877, USA.
Abstract
MOTIVATION: Multivariate tests derived from the logistic regression model are widely used to assess the joint effect of multiple predictors on a disease outcome in case-control studies. These tests become less optimal if the joint effect cannot be approximated adequately by the additive model. The tree-structure model is an attractive alternative, as it is more apt to capture non-additive effects. However, the tree model is used most commonly for prediction and seldom for hypothesis testing, mainly because of the computational burden associated with the resampling-based procedure required for estimating the significance level.
RESULTS: We designed a fast algorithm for building the tree-structure model and proposed a robust TREe-based Association Test (TREAT) that incorporates an adaptive model selection procedure to identify the optimal tree model representing the joint effect. We applied TREAT as a multilocus association test on >20 000 genes/regions in a study of esophageal squamous cell carcinoma (ESCC) and detected a highly significant novel association between the gene CDKN2B and ESCC ([Formula: see text]). We also demonstrated, through simulation studies, the power advantage of TREAT over other commonly used tests.
AVAILABILITY AND IMPLEMENTATION: The package TREAT is freely available for download at http://www.hanzhang.name/softwares/treat, implemented in C++ and R and supported on 64-bit Linux and 64-bit MS Windows.
CONTACT: yuka@mail.nih.gov
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.
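To illustrate why a resampling-based significance level is computationally demanding for tree models, the following is a minimal sketch of a permutation test around a tree-style statistic. It is not the TREAT algorithm: the statistic is assumed to be the largest chi-square over single binary splits (a depth-1 tree), and the function names `chi2_2x2`, `best_split_stat` and `permutation_pvalue`, the toy data and the number of permutations are all hypothetical choices made for this example only.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]]; 0.0 if a margin is empty."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def best_split_stat(X, y):
    """Largest chi-square over all single-predictor binary splits (a depth-1 tree)."""
    best = 0.0
    for j in range(len(X[0])):
        a = sum(1 for row, yi in zip(X, y) if row[j] == 1 and yi == 1)
        b = sum(1 for row, yi in zip(X, y) if row[j] == 1 and yi == 0)
        c = sum(1 for row, yi in zip(X, y) if row[j] == 0 and yi == 1)
        d = sum(1 for row, yi in zip(X, y) if row[j] == 0 and yi == 0)
        best = max(best, chi2_2x2(a, b, c, d))
    return best


def permutation_pvalue(X, y, n_perm=199, seed=0):
    """Permutation p-value: the split search is repeated on every shuffled outcome."""
    rng = random.Random(seed)
    observed = best_split_stat(X, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any predictor-outcome association
        if best_split_stat(X, y_perm) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Toy case-control data (assumed): predictor 0 tracks the outcome, predictor 1 is noise.
y_toy = [1] * 10 + [0] * 10
X_toy = [[yi, i % 2] for i, yi in enumerate(y_toy)]
p_toy = permutation_pvalue(X_toy, y_toy)
```

Because the model search is rerun for every permutation, the cost grows as the number of permutations times the cost of one tree fit. Repeating that over tens of thousands of genes or regions is the kind of burden a fast tree-building algorithm is meant to reduce.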