MULTILAYER KNOCKOFF FILTER: CONTROLLED VARIABLE SELECTION AT MULTIPLE RESOLUTIONS.

Eugene Katsevich, Chiara Sabatti.

Abstract

We tackle the problem of selecting from among a large number of variables those that are "important" for an outcome. We consider situations where groups of variables are also of interest. For example, each variable might be a genetic polymorphism, and we might want to study how a trait depends on variability in genes, segments of DNA that typically contain multiple such polymorphisms. In this context, to discover that a variable is relevant for the outcome implies discovering that the larger entity it represents is also important. To guarantee meaningful results with high chance of replicability, we suggest controlling the rate of false discoveries for findings at the level of individual variables and at the level of groups. Building on the knockoff construction of Barber and Candès [Ann. Statist. 43 (2015) 2055-2085] and the multilayer testing framework of Barber and Ramdas [J. Roy. Statist. Soc. Ser. B 79 (2017) 1247-1268], we introduce the multilayer knockoff filter (MKF). We prove that MKF simultaneously controls the FDR at each resolution and use simulations to show that it incurs little power loss compared to methods that provide guarantees only for the discoveries of individual variables. We apply MKF to analyze a genetic dataset and find that it successfully reduces the number of false gene discoveries without a significant reduction in power.
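As a rough illustration of the single-resolution building block the abstract refers to, the sketch below applies the standard knockoff+ selection rule of Barber and Candès to a vector of knockoff statistics. It is not the multilayer procedure itself, which couples thresholds across the individual and group layers. The function names and the assumption that the statistics `w` have already been computed (large positive values favoring the original variable over its knockoff) are illustrative choices, not taken from the paper.

```python
from typing import List


def knockoff_plus_threshold(w: List[float], q: float) -> float:
    """Smallest threshold t whose estimated false discovery proportion is <= q.

    Assumes w holds precomputed knockoff statistics, one per variable.
    Returns infinity when no candidate threshold meets the target level.
    """
    # Candidate thresholds are the distinct nonzero magnitudes of the statistics.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        # Negative statistics beyond -t estimate the number of false discoveries;
        # the extra 1 is the knockoff+ correction.
        estimated_false = 1 + sum(1 for x in w if x <= -t)
        selected = max(1, sum(1 for x in w if x >= t))
        if estimated_false / selected <= q:
            return t
    return float("inf")


def knockoff_select(w: List[float], q: float) -> List[int]:
    """Indices of variables whose statistic reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


if __name__ == "__main__":
    # Hypothetical statistics for eight variables, for illustration only.
    w_example = [5.0, 4.0, 3.0, -1.0, 2.5, 0.5, -0.2, 6.0]
    print(knockoff_select(w_example, q=0.25))
```

In a multiresolution setting one would compute a separate statistic vector per layer (for example, one for polymorphisms and one for genes) and choose the thresholds jointly so that each layer meets its own target level; that joint search is the part this sketch leaves out.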

Keywords:  Variable selection; false discovery rate (FDR); genomewide association study (GWAS); group FDR; knockoff filter; multiresolution; p-filter

Year:  2019        PMID: 31687060      PMCID: PMC6827557          DOI: 10.1214/18-AOAS1185

Source DB:  PubMed          Journal:  Ann Appl Stat        ISSN: 1932-6157            Impact factor:   2.083

