Adjusting for Principal Components of Molecular Phenotypes Induces Replicating False Positives.

Andy Dahl, Vincent Guillemot, Joel Mefford, Hugues Aschard, Noah Zaitlen.

Abstract

High-throughput measurements of molecular phenotypes provide an unprecedented opportunity to model cellular processes and their impact on disease. These highly structured datasets are usually strongly confounded, creating false positives and reducing power. This has motivated many approaches based on principal components analysis (PCA) to estimate and correct for confounders, which have become indispensable elements of association tests between molecular phenotypes and both genetic and nongenetic factors. Here, we show that these correction approaches induce a bias, and that it persists for large sample sizes and replicates out-of-sample. We prove this theoretically for PCA by deriving an analytic, deterministic, and intuitive bias approximation. We assess other methods with realistic simulations, which show that perturbing any of several basic parameters can cause false positive rate (FPR) inflation. Our experiments show the bias depends on covariate and confounder sparsity, effect sizes, and their correlation. Surprisingly, when the covariate and confounder have [Formula: see text], standard two-step methods all have [Formula: see text]-fold FPR inflation. Our analysis informs best practices for confounder correction in genomic studies, and suggests many false discoveries have been made and replicated in some differential expression analyses.
Copyright © 2019 by the Genetics Society of America.
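
The mechanism described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' simulation design: it assumes a single confounder `z`, a covariate `x` correlated with it, covariate effects on a subset of traits, and a two-step correction that adjusts for the first principal component. All names and parameter values (`n`, `p`, `rho`, `frac_affected`, the 1.96 cutoff) are hypothetical choices for illustration.

```python
import numpy as np


def x_tstats(Y, x, adjust):
    """OLS t-statistics for the x coefficient in y ~ 1 + x + adjust, one per column of Y."""
    n = Y.shape[0]
    X = np.column_stack([np.ones(n), x, adjust])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    sigma2 = (resid ** 2).sum(axis=0) / (n - X.shape[1])
    var_x = np.linalg.inv(X.T @ X)[1, 1]
    return coef[1] / np.sqrt(sigma2 * var_x)


def simulate(n=500, p=1000, rho=0.5, frac_affected=0.2, seed=0):
    """Return (FPR adjusting for PC1, FPR adjusting for the true confounder)."""
    rng = np.random.default_rng(seed)
    # Assumed model: confounder z and covariate x with correlation rho.
    z = rng.standard_normal(n)
    x = rho * z + np.sqrt(1 - rho ** 2) * rng.standard_normal(n)
    # The confounder affects every trait; the covariate affects only the first n_alt.
    alpha = rng.standard_normal(p)
    beta = np.zeros(p)
    n_alt = int(frac_affected * p)
    beta[:n_alt] = rng.standard_normal(n_alt)
    Y = np.outer(x, beta) + np.outer(z, alpha) + rng.standard_normal((n, p))

    # Step 1 of the two-step approach: estimate the confounder as the top PC.
    Yc = Y - Y.mean(axis=0)
    U, _, _ = np.linalg.svd(Yc, full_matrices=False)
    pc1 = U[:, 0]

    # Step 2: test x on the null traits (beta = 0), adjusting for PC1 or for z.
    null = Y[:, n_alt:]
    crit = 1.96  # normal approximation to the two-sided 5% threshold
    fpr_pc = np.mean(np.abs(x_tstats(null, x, pc1)) > crit)
    fpr_oracle = np.mean(np.abs(x_tstats(null, x, z)) > crit)
    return fpr_pc, fpr_oracle


fpr_pc, fpr_oracle = simulate()
print(f"FPR adjusting for PC1: {fpr_pc:.3f}; adjusting for true confounder: {fpr_oracle:.3f}")
```

In this toy setting the top PC is a mixture of `x` and `z` rather than `z` alone, so adjusting for it leaves a nonzero `x` coefficient on null traits. The PC-adjusted false positive rate should therefore sit well above the nominal 5%, while the oracle adjustment stays close to it.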

Keywords:  confounder; eigenvector perturbation; molecular trait; quantitative trait loci

Year:  2019        PMID: 30692194      PMCID: PMC6456307          DOI: 10.1534/genetics.118.301768

Source DB:  PubMed          Journal:  Genetics        ISSN: 0016-6731            Impact factor:   4.562


  7 in total

1.  Genomic and phenotypic insights from an atlas of genetic effects on DNA methylation.

Authors:  Josine L Min; Gibran Hemani; Eilis Hannon; Koen F Dekkers; Juan Castillo-Fernandez; René Luijk; Elena Carnero-Montoro; Daniel J Lawson; Kimberley Burrows; Matthew Suderman; Andrew D Bretherick; Tom G Richardson; Johanna Klughammer; Valentina Iotchkova; Gemma Sharp; Ahmad Al Khleifat; Aleksey Shatunov; Alfredo Iacoangeli; Wendy L McArdle; Karen M Ho; Ashish Kumar; Cilla Söderhäll; Carolina Soriano-Tárraga; Eva Giralt-Steinhauer; Nabila Kazmi; Dan Mason; Allan F McRae; David L Corcoran; Karen Sugden; Silva Kasela; Alexia Cardona; Felix R Day; Giovanni Cugliari; Clara Viberti; Simonetta Guarrera; Michael Lerro; Richa Gupta; Sailalitha Bollepalli; Pooja Mandaviya; Yanni Zeng; Toni-Kim Clarke; Rosie M Walker; Vanessa Schmoll; Darina Czamara; Carlos Ruiz-Arenas; Faisal I Rezwan; Riccardo E Marioni; Tian Lin; Yvonne Awaloff; Marine Germain; Dylan Aïssi; Ramona Zwamborn; Kristel van Eijk; Annelot Dekker; Jenny van Dongen; Jouke-Jan Hottenga; Gonneke Willemsen; Cheng-Jian Xu; Guillermo Barturen; Francesc Català-Moll; Martin Kerick; Carol Wang; Phillip Melton; Hannah R Elliott; Jean Shin; Manon Bernard; Idil Yet; Melissa Smart; Tyler Gorrie-Stone; Chris Shaw; Ammar Al Chalabi; Susan M Ring; Göran Pershagen; Erik Melén; Jordi Jiménez-Conde; Jaume Roquer; Deborah A Lawlor; John Wright; Nicholas G Martin; Grant W Montgomery; Terrie E Moffitt; Richie Poulton; Tõnu Esko; Lili Milani; Andres Metspalu; John R B Perry; Ken K Ong; Nicholas J Wareham; Giuseppe Matullo; Carlotta Sacerdote; Salvatore Panico; Avshalom Caspi; Louise Arseneault; France Gagnon; Miina Ollikainen; Jaakko Kaprio; Janine F Felix; Fernando Rivadeneira; Henning Tiemeier; Marinus H van IJzendoorn; André G Uitterlinden; Vincent W V Jaddoe; Chris Haley; Andrew M McIntosh; Kathryn L Evans; Alison Murray; Katri Räikkönen; Jari Lahti; Ellen A Nohr; Thorkild I A Sørensen; Torben Hansen; Camilla S Morgen; Elisabeth B Binder; Susanne Lucae; Juan Ramon Gonzalez; Mariona Bustamante; Jordi Sunyer; John W Holloway; Wilfried Karmaus; Hongmei Zhang; Ian J Deary; Naomi R Wray; John M Starr; Marian Beekman; Diana van Heemst; P Eline Slagboom; Pierre-Emmanuel Morange; David-Alexandre Trégouët; Jan H Veldink; Gareth E Davies; Eco J C de Geus; Dorret I Boomsma; Judith M Vonk; Bert Brunekreef; Gerard H Koppelman; Marta E Alarcón-Riquelme; Rae-Chi Huang; Craig E Pennell; Joyce van Meurs; M Arfan Ikram; Alun D Hughes; Therese Tillin; Nish Chaturvedi; Zdenka Pausova; Tomas Paus; Timothy D Spector; Meena Kumari; Leonard C Schalkwyk; Peter M Visscher; George Davey Smith; Christoph Bock; Tom R Gaunt; Jordana T Bell; Bastiaan T Heijmans; Jonathan Mill; Caroline L Relton
Journal:  Nat Genet       Date:  2021-09-06       Impact factor: 41.307

2.  Genetic effects on the commensal microbiota in inflammatory bowel disease patients.

Authors:  Hugues Aschard; Vincent Laville; Eric Tchetgen Tchetgen; Dan Knights; Floris Imhann; Philippe Seksik; Noah Zaitlen; Mark S Silverberg; Jacques Cosnes; Rinse K Weersma; Ramnik Xavier; Laurent Beaugerie; David Skurnik; Harry Sokol
Journal:  PLoS Genet       Date:  2019-03-08       Impact factor: 5.917

3.  Cell-Type Heterogeneity in Adipose Tissue Is Associated with Complex Traits and Reveals Disease-Relevant Cell-Specific eQTLs.

Authors:  Craig A Glastonbury; Alexessander Couto Alves; Julia S El-Sayed Moustafa; Kerrin S Small
Journal:  Am J Hum Genet       Date:  2019-05-23       Impact factor: 11.025

4.  Genetic regulation of gene expression and splicing during a 10-year period of human aging.

Authors:  Brunilda Balliu; Matthew Durrant; Olivia de Goede; Nathan Abell; Xin Li; Boxiang Liu; Michael J Gloudemans; Naomi L Cook; Kevin S Smith; David A Knowles; Mauro Pala; Francesco Cucca; David Schlessinger; Siddhartha Jaiswal; Chiara Sabatti; Lars Lind; Erik Ingelsson; Stephen B Montgomery
Journal:  Genome Biol       Date:  2019-11-04       Impact factor: 13.583

5.  Co-expression analysis reveals interpretable gene modules controlled by trans-acting genetic variants.

Authors:  Liis Kolberg; Nurlan Kerimov; Hedi Peterson; Kaur Alasoo
Journal:  Elife       Date:  2020-09-03       Impact factor: 8.140

6.  GBAT: a gene-based association test for robust detection of trans-gene regulation.

Authors:  Xuanyao Liu; Joel A Mefford; Andrew Dahl; Yuan He; Meena Subramaniam; Alexis Battle; Alkes L Price; Noah Zaitlen
Journal:  Genome Biol       Date:  2020-08-24       Impact factor: 13.583

7.  Significant out-of-sample classification from methylation profile scoring for amyotrophic lateral sclerosis.

Authors:  Jian Yang; Ian P Blair; Allan F McRae; Naomi R Wray; Marta F Nabais; Tian Lin; Beben Benyamin; Kelly L Williams; Fleur C Garton; Anna A E Vinkhuyzen; Futao Zhang; Costanza L Vallerga; Restuadi Restuadi; Anna Freydenzon; Ramona A J Zwamborn; Paul J Hop; Matthew R Robinson; Jacob Gratten; Peter M Visscher; Eilis Hannon; Jonathan Mill; Matthew A Brown; Nigel G Laing; Karen A Mather; Perminder S Sachdev; Shyuan T Ngo; Frederik J Steyn; Leanne Wallace; Anjali K Henders; Merrilee Needham; Jan H Veldink; Susan Mathers; Garth Nicholson; Dominic B Rowe; Robert D Henderson; Pamela A McCombe; Roger Pamphlett
Journal:  NPJ Genom Med       Date:  2020-02-27       Impact factor: 8.617
