MOTIVATION: Genome-wide association studies (GWASs) have been widely used to map loci contributing to variation in complex traits and risk of diseases in humans. Accurate specification of familial relationships is crucial for family-based GWAS, as well as in population-based GWAS with unknown (or unrecognized) family structure. The family structure in a GWAS should be routinely investigated using the SNP data prior to the analysis of population structure or phenotype. Existing algorithms for relationship inference have a major weakness of estimating allele frequencies at each SNP from the entire sample, under a strong assumption of homogeneous population structure. This assumption is often untenable.

RESULTS: Here, we present a rapid algorithm for relationship inference using high-throughput genotype data typical of GWAS that allows the presence of unknown population substructure. The relationship of any pair of individuals can be precisely inferred by robust estimation of their kinship coefficient, independent of sample composition or population structure (sample invariance). We present simulation experiments to demonstrate that the algorithm has sufficient power to provide reliable inference on millions of unrelated pairs and thousands of relative pairs (up to 3rd-degree relationships). Application of our robust algorithm to HapMap and GWAS datasets demonstrates that it performs properly even under extreme population stratification, while algorithms assuming a homogeneous population give systematically biased results. Our extremely efficient implementation performs relationship inference on millions of pairs of individuals in a matter of minutes, dozens of times faster than the most efficient existing algorithm known to us.

AVAILABILITY: Our robust relationship inference algorithm is implemented in a freely available software package, KING, available for download at http://people.virginia.edu/~wc9c/KING.
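The abstract does not state the estimator itself, so the following is only a minimal sketch of how a pairwise, sample-invariant kinship estimate can be computed. It assumes the robust between-family estimator commonly associated with KING, which uses only the two individuals' own genotype counts (heterozygote counts and opposite-homozygote counts) and no sample-wide allele frequencies. The function names `robust_kinship` and `classify_relationship`, the 0/1/2 genotype coding with negative values for missing calls, and the power-of-two cutoffs for relationship degree are illustrative assumptions, not the package's actual interface.

```python
from typing import Sequence


def robust_kinship(g_i: Sequence[int], g_j: Sequence[int]) -> float:
    """Estimate the kinship coefficient of one pair from their genotypes alone.

    Genotypes are coded as minor-allele counts 0/1/2; a negative value marks
    a missing call (an assumption of this sketch). No allele frequencies from
    the wider sample are used, which is what makes the estimate independent
    of sample composition.
    """
    if len(g_i) != len(g_j):
        raise ValueError("genotype vectors must have the same length")

    het_i = het_j = both_het = opposite_hom = 0
    for a, b in zip(g_i, g_j):
        if a < 0 or b < 0:          # skip SNPs missing in either individual
            continue
        het_i += (a == 1)           # heterozygous SNPs in individual i
        het_j += (b == 1)           # heterozygous SNPs in individual j
        both_het += (a == 1 and b == 1)
        opposite_hom += (abs(a - b) == 2)   # AA in one, aa in the other

    m = min(het_i, het_j)
    if m == 0:
        raise ValueError("no heterozygous SNPs in at least one individual")

    # Assumed robust estimator: scale by the smaller heterozygote count.
    return (both_het - 2 * opposite_hom) / (2 * m) + 0.5 - (het_i + het_j) / (4 * m)


def classify_relationship(phi: float) -> str:
    """Map a kinship estimate to a relationship degree (assumed cutoffs)."""
    if phi > 2 ** -1.5:     # about 0.354
        return "duplicate/MZ twin"
    if phi > 2 ** -2.5:     # about 0.177
        return "1st-degree"
    if phi > 2 ** -3.5:     # about 0.0884
        return "2nd-degree"
    if phi > 2 ** -4.5:     # about 0.0442
        return "3rd-degree"
    return "unrelated"
```

Because each pair is scored from its own genotype counts, adding or removing other individuals from the dataset would not change the estimate for that pair, which is the sample-invariance property the abstract describes. A real analysis would use hundreds of thousands of SNPs per pair; the short vectors accepted here are only convenient for checking the arithmetic.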