Hypothesis testing at the extremes: fast and robust association for high-throughput data.

Yi-Hui Zhou, Fred A Wright.

Abstract

A number of biomedical problems require performing many hypothesis tests, with an attendant need to apply stringent thresholds. Often the data take the form of a series of predictor vectors, each of which must be compared with a single response vector, perhaps with nuisance covariates. Parametric tests of association are often used, but can result in inaccurate type I error at the extreme thresholds, even for large sample sizes. Furthermore, standard two-sided testing can reduce power compared with the doubled p-value, due to asymmetry in the null distribution. Exact (permutation) testing is attractive, but can be computationally intensive and cumbersome. We present an approximation to exact association tests of trend that is accurate and fast enough for standard use in high-throughput settings, and can easily provide standard two-sided or doubled p-values. The approach is shown to be equivalent under permutation to likelihood ratio tests for the most commonly used generalized linear models (GLMs). For linear regression, covariates are handled by working with covariate-residualized responses and predictors. For GLMs, stratified covariates can be handled in a manner similar to exact conditional testing. Simulations and examples illustrate the wide applicability of the approach. The accompanying mcc package is available on CRAN at http://cran.r-project.org/web/packages/mcc/index.html.
© The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
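
To make the distinction between the standard two-sided and the doubled p-value concrete, the following is a minimal sketch of a brute-force Monte Carlo permutation test for a trend statistic of the form sum(x * y). It is not the fast approximation described in the abstract and does not use the mcc package; the function name `permutation_trend_test`, its parameters, and the add-one correction are assumptions made purely for illustration.

```python
import numpy as np


def permutation_trend_test(x, y, n_perm=10000, seed=0):
    """Monte Carlo permutation test for the trend statistic sum(x * y).

    Hypothetical helper for illustration only: a brute-force stand-in,
    not the approximation described in the abstract.
    Returns (two_sided_p, doubled_p).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = float(np.dot(x, y))

    # Null distribution: permute the response, keep the predictor fixed.
    null = np.array([np.dot(x, rng.permutation(y)) for _ in range(n_perm)])

    # The exact permutation mean of sum(x * y) is n * mean(x) * mean(y).
    center = x.size * x.mean() * y.mean()

    # Add-one correction (an assumption of this sketch) so that a
    # Monte Carlo p-value is never exactly zero.
    def tail(count):
        return (count + 1) / (n_perm + 1)

    p_left = tail(np.sum(null <= observed))
    p_right = tail(np.sum(null >= observed))

    # Standard two-sided: deviation from the null mean in either direction.
    two_sided = tail(np.sum(np.abs(null - center) >= abs(observed - center)))
    # Doubled: twice the smaller one-sided tail, capped at 1.
    doubled = min(1.0, 2.0 * min(p_left, p_right))
    return two_sided, doubled


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.binomial(1, 0.1, size=200)   # skewed predictor
    y = rng.exponential(size=200)        # skewed response
    print(permutation_trend_test(x, y, n_perm=2000))
```

When the permutation null distribution is symmetric the two values tend to agree closely; with skewed predictors or responses they can differ, which is the asymmetry the abstract refers to. Resolving p-values at very stringent thresholds this way would need an impractically large number of permutations, which is the motivation for an approximation.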

Keywords:  Density approximation; Exact testing; Permutation

Year:  2015        PMID: 25792622      PMCID: PMC4804120          DOI: 10.1093/biostatistics/kxv007

Source DB:  PubMed          Journal:  Biostatistics        ISSN: 1465-4644            Impact factor:   5.899


  14 in total

1.  Computational tools for exact conditional logistic regression. (Review)

Authors:  C Corcoran; C Mehta; N Patel; P Senchaudhuri
Journal:  Stat Med       Date:  2001 Sep 15-30       Impact factor: 2.373

2.  A fast method for computing high-significance disease association in large population-based studies.

Authors:  Gad Kimmel; Ron Shamir
Journal:  Am J Hum Genet       Date:  2006-07-24       Impact factor: 11.025

3.  A powerful and flexible approach to the analysis of RNA sequence count data.

Authors:  Yi-Hui Zhou; Kai Xia; Fred A Wright
Journal:  Bioinformatics       Date:  2011-08-02       Impact factor: 6.937

4.  Including additional controls from public databases improves the power of a genome-wide association study.

Authors:  Semanti Mukherjee; Jennifer Simon; Sharon Bayuga; Emmy Ludwig; Sarah Yoo; Irene Orlow; Agnes Viale; Kenneth Offit; Robert C Kurtz; Sara H Olson; Robert J Klein
Journal:  Hum Hered       Date:  2011-08-17       Impact factor: 0.444

5.  MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes.

Authors:  Yun Li; Cristen J Willer; Jun Ding; Paul Scheet; Gonçalo R Abecasis
Journal:  Genet Epidemiol       Date:  2010-12       Impact factor: 2.135

6.  An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival.

Authors:  Lance D Miller; Johanna Smeds; Joshy George; Vinsensius B Vega; Liza Vergara; Alexander Ploner; Yudi Pawitan; Per Hall; Sigrid Klaar; Edison T Liu; Jonas Bergh
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-02       Impact factor: 11.205

7.  Statistical methods in cancer research. Volume I - The analysis of case-control studies.

Authors:  N E Breslow; N E Day
Journal:  IARC Sci Publ       Date:  1980

8.  Efficient Moments-based Permutation Tests.

Authors:  Chunxiao Zhou; Huixia Judy Wang; Yongmei Michelle Wang
Journal:  Adv Neural Inf Process Syst       Date:  2009

9.  A genome-wide approach to identify genetic variants that contribute to etoposide-induced cytotoxicity.

Authors:  R Stephanie Huang; Shiwei Duan; Wasim K Bleibel; Emily O Kistner; Wei Zhang; Tyson A Clark; Tina X Chen; Anthony C Schweitzer; John E Blume; Nancy J Cox; M Eileen Dolan
Journal:  Proc Natl Acad Sci U S A       Date:  2007-05-30       Impact factor: 11.205

10.  Genome-wide association and linkage identify modifier loci of lung disease severity in cystic fibrosis at 11p13 and 20q13.2.

Authors:  Fred A Wright; Lisa J Strug; Vishal K Doshi; Clayton W Commander; Scott M Blackman; Lei Sun; Yves Berthiaume; David Cutler; Andreea Cojocaru; J Michael Collaco; Mary Corey; Ruslan Dorfman; Katrina Goddard; Deanna Green; Jack W Kent; Ethan M Lange; Seunggeun Lee; Weili Li; Jingchun Luo; Gregory M Mayhew; Kathleen M Naughton; Rhonda G Pace; Peter Paré; Johanna M Rommens; Andrew Sandford; Jaclyn R Stonebraker; Wei Sun; Chelsea Taylor; Lori L Vanscoy; Fei Zou; John Blangero; Julian Zielenski; Wanda K O'Neal; Mitchell L Drumm; Peter R Durie; Michael R Knowles; Garry R Cutting
Journal:  Nat Genet       Date:  2011-05-22       Impact factor: 38.330

  8 in total

1.  A Zero-inflated Beta-binomial Model for Microbiome Data Analysis.

Authors:  Tao Hu; Paul Gallins; Yi-Hui Zhou
Journal:  Stat (Int Stat Inst)       Date:  2018-06-19

2.  A hybrid method of the sequential Monte Carlo and the Edgeworth expansion for computation of very small p-values in permutation tests.

Authors:  James J Yang; Elisa M Trucco; Anne Buu
Journal:  Stat Methods Med Res       Date:  2018-08-03       Impact factor: 3.021

3.  Pathway analysis for RNA-Seq data using a score-based approach.

Authors:  Yi-Hui Zhou
Journal:  Biometrics       Date:  2015-08-10       Impact factor: 2.571

4.  Estimation of cis-eQTL effect sizes using a log of linear model.

Authors:  John Palowitch; Andrey Shabalin; Yi-Hui Zhou; Andrew B Nobel; Fred A Wright
Journal:  Biometrics       Date:  2017-10-26       Impact factor: 2.571

5.  Changes in vaginal community state types reflect major shifts in the microbiome.

Authors:  J Paul Brooks; Gregory A Buck; Guanhua Chen; Liyang Diao; David J Edwards; Jennifer M Fettweis; Snehalata Huzurbazar; Alexander Rakitin; Glen A Satten; Ekaterina Smirnova; Zeev Waks; Michelle L Wright; Chen Yanover; Yi-Hui Zhou
Journal:  Microb Ecol Health Dis       Date:  2017-04-10

6.  Set-based differential covariance testing for genomics.

Authors:  Yi-Hui Zhou
Journal:  Stat (Int Stat Inst)       Date:  2019-08-06

7.  Improve the Colorectal Cancer Diagnosis Using Gut Microbiome Data.

Authors:  Yi-Hui Zhou; George Sun
Journal:  Front Mol Biosci       Date:  2022-08-12

8.  A Pipeline for High-Throughput Concentration Response Modeling of Gene Expression for Toxicogenomics.

Authors:  John S House; Fabian A Grimm; Dereje D Jima; Yi-Hui Zhou; Ivan Rusyn; Fred A Wright
Journal:  Front Genet       Date:  2017-11-01       Impact factor: 4.599
