
Contra: Contrarian statistics for controlled variable selection.

Mukund Sudarshan, Aahlad Puli, Lakshmi Subramanian, Sriram Sankararaman, Rajesh Ranganath.

Abstract

The holdout randomization test (HRT) discovers a set of covariates most predictive of a response. Given the covariate distribution, HRTs can explicitly control the false discovery rate (FDR). However, if this distribution is unknown and must be estimated from data, HRTs can inflate the FDR. To mitigate this inflation, we propose the contrarian randomization test (CONTRA), which is designed explicitly for scenarios where the covariate distribution must be estimated from data and may even be misspecified. Our key insight is to use an equal mixture of two "contrarian" probabilistic models in determining the importance of a covariate. One model is fit with the real data, while the other is fit using the same data, but with the covariate being tested replaced with samples from an estimate of the covariate distribution. CONTRA is flexible enough to achieve a power of 1 asymptotically, can reduce the FDR compared to state-of-the-art controlled variable selection (CVS) methods when the covariate distribution is misspecified, and is computationally efficient in high dimensions and with large sample sizes. We further demonstrate the effectiveness of CONTRA on numerous synthetic benchmarks, and highlight its capabilities on a genetic dataset.
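The mixture idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes linear-Gaussian models for both the response and the covariate distribution, and all function names (`fit_linear_gaussian`, `mixture_statistic`, `contrarian_p_value`), the held-out split, and the p-value formula are hypothetical choices made for illustration only.

```python
import numpy as np

def fit_linear_gaussian(X, y):
    """Least-squares fit of y ~ N(X w + b, s^2); returns (w, b, s)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    s = max((y - A @ coef).std(), 1e-8)
    return coef[:-1], coef[-1], s

def log_density(params, X, y):
    """Pointwise Gaussian log-density of y given X under a fitted model."""
    w, b, s = params
    return -0.5 * np.log(2 * np.pi * s**2) - 0.5 * ((y - X @ w - b) / s) ** 2

def sample_covariate(cov_params, X, j, rng):
    """Copy X with column j redrawn from the estimated conditional x_j | x_-j."""
    w, b, s = cov_params
    X_new = X.copy()
    X_new[:, j] = np.delete(X, j, axis=1) @ w + b + s * rng.standard_normal(len(X))
    return X_new

def mixture_statistic(model_real, model_null, X, y):
    """Mean log-likelihood under an equal (0.5 / 0.5) mixture of two models."""
    ll = np.logaddexp(log_density(model_real, X, y), log_density(model_null, X, y))
    return np.mean(ll + np.log(0.5))

def contrarian_p_value(X_tr, y_tr, X_te, y_te, j, n_null=100, seed=0):
    rng = np.random.default_rng(seed)
    # Estimate the covariate distribution (assumption: linear-Gaussian).
    cov_params = fit_linear_gaussian(np.delete(X_tr, j, axis=1), X_tr[:, j])
    # Model 1: fit on the real data.
    model_real = fit_linear_gaussian(X_tr, y_tr)
    # Model 2: fit on the same data with covariate j replaced by samples.
    model_null = fit_linear_gaussian(sample_covariate(cov_params, X_tr, j, rng), y_tr)
    # Compare the held-out statistic against statistics on resampled covariates.
    observed = mixture_statistic(model_real, model_null, X_te, y_te)
    null_stats = [
        mixture_statistic(model_real, model_null,
                          sample_covariate(cov_params, X_te, j, rng), y_te)
        for _ in range(n_null)
    ]
    return (1 + sum(t >= observed for t in null_stats)) / (1 + n_null)

# Synthetic example: only covariate 0 influences the response.
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((600, 5))
y = 2.0 * X[:, 0] + data_rng.standard_normal(600)
X_tr, X_te, y_tr, y_te = X[:300], X[300:], y[:300], y[300:]

p_signal = contrarian_p_value(X_tr, y_tr, X_te, y_te, j=0)
p_noise = contrarian_p_value(X_tr, y_tr, X_te, y_te, j=3)
print("p-value, covariate 0 (influences y):", p_signal)
print("p-value, covariate 3 (no influence):", p_noise)
```

In this toy setting the p-value for covariate 0 should be small, while the p-value for covariate 3 carries no such expectation. Controlling the FDR across all covariates would additionally require a multiple-testing procedure such as Benjamini-Hochberg applied to the per-covariate p-values, which the sketch omits.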


Year:  2021        PMID: 34522887      PMCID: PMC8436172     

Source DB:  PubMed          Journal:  Proc Mach Learn Res


