Utilizing the Jaccard index to reveal population stratification in sequencing data: a simulation study and an application to the 1000 Genomes Project.

Dmitry Prokopenko, Julian Hecker, Edwin K Silverman, Marcello Pagano, Markus M Nöthen, Christian Dina, Christoph Lange, Heide Loehlein Fier.

Abstract

MOTIVATION: Population stratification is one of the major sources of confounding in genetic association studies, potentially causing false-positive and false-negative results. Here, we present a novel approach for identifying population substructure in high-density genotyping data or next-generation sequencing data. The approach exploits the co-appearance of rare genetic variants in individuals. The method can be applied to all available genetic loci and is computationally fast. Using sequencing data from the 1000 Genomes Project, we illustrate the features of the approach and compare it with existing methodology (i.e. EIGENSTRAT). We examine how different cutoffs for the minor allele frequency affect the performance of the approach, and find that it works particularly well for genetic loci with very small minor allele frequencies. The results suggest that including rare-variant or sequencing data in our approach provides a much higher-resolution picture of population substructure than can be obtained with existing methodology. Furthermore, in simulation studies, we find scenarios in which our method controlled the type 1 error more precisely and showed higher power.

CONTACT: dmitry.prokopenko@uni-bonn.de

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
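The abstract describes the approach only in words: population substructure is detected from the co-appearance of rare variants in pairs of individuals, using the Jaccard index named in the title. The following is a minimal sketch of that idea, not the authors' implementation. The function name `jaccard_similarity_matrix`, the 0/1/2 genotype coding, the carrier definition (at least one copy of the minor allele), and the `maf_cutoff` parameter are all assumptions made for illustration.

```python
import numpy as np


def jaccard_similarity_matrix(genotypes, maf_cutoff=0.01):
    """Pairwise Jaccard index between individuals over rare-variant carrier sets.

    genotypes: (n_individuals, n_variants) array of minor-allele counts (0, 1, 2).
    maf_cutoff: hypothetical upper bound on the minor allele frequency.
    """
    g = np.asarray(genotypes)
    n_individuals = g.shape[0]

    # Minor allele frequency per variant, assuming diploid genotypes.
    maf = g.sum(axis=0) / (2.0 * n_individuals)

    # Keep only variants that are observed and rare (assumed filtering rule).
    keep = (maf > 0) & (maf <= maf_cutoff)

    # Binary carrier matrix: 1 if the individual carries the variant at all.
    carriers = (g[:, keep] > 0).astype(np.int64)

    # |A ∩ B| for every pair of individuals.
    shared = carriers @ carriers.T
    # |A ∪ B| = |A| + |B| - |A ∩ B|.
    counts = carriers.sum(axis=1)
    union = counts[:, None] + counts[None, :] - shared

    # Jaccard index; pairs with an empty union are set to 0 by convention here.
    return np.divide(
        shared, union, out=np.zeros(shared.shape, dtype=float), where=union > 0
    )


# Toy data: 4 individuals, 4 variants; the last variant is too common to be kept.
toy = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
])
similarity = jaccard_similarity_matrix(toy, maf_cutoff=0.3)
print(similarity)
```

In this toy example the first two individuals share one of two rare variants and the last two share their only rare variant, so two groups emerge from the matrix. How the resulting matrix is analysed further (for example, by an eigen-decomposition comparable to EIGENSTRAT) is not specified in the abstract and is left open here.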

Year:  2015        PMID: 26722118      PMCID: PMC5860507          DOI: 10.1093/bioinformatics/btv752

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  15 in total

1.  Effect of population stratification on SNP-by-environment interaction.

Authors:  Jaehoon An; Sungho Won; Sharon M Lutz; Julian Hecker; Christoph Lange
Journal:  Genet Epidemiol       Date:  2019-08-20       Impact factor: 2.135

2.  Whole-Genome Sequencing in Severe Chronic Obstructive Pulmonary Disease.

Authors:  Dmitry Prokopenko; Phuwanat Sakornsakolpat; Heide Loehlein Fier; Dandi Qiao; Margaret M Parker; Merry-Lynn N McDonald; Ani Manichaikul; Stephen S Rich; R Graham Barr; Christopher J Williams; Mark L Brantly; Christoph Lange; Terri H Beaty; James D Crapo; Edwin K Silverman; Michael H Cho
Journal:  Am J Respir Cell Mol Biol       Date:  2018-11       Impact factor: 6.914

Review 3.  Genetic Advances in Chronic Obstructive Pulmonary Disease. Insights from COPDGene.

Authors:  Margaret F Ragland; Christopher J Benway; Sharon M Lutz; Russell P Bowler; Julian Hecker; John E Hokanson; James D Crapo; Peter J Castaldi; Dawn L DeMeo; Craig P Hersh; Brian D Hobbs; Christoph Lange; Terri H Beaty; Michael H Cho; Edwin K Silverman
Journal:  Am J Respir Crit Care Med       Date:  2019-09-15       Impact factor: 21.405

4.  A flexible and nearly optimal sequential testing approach to randomized testing: QUICK-STOP.

Authors:  Julian Hecker; Ingo Ruczinski; Michael H Cho; Edwin K Silverman; Brent Coull; Christoph Lange
Journal:  Genet Epidemiol       Date:  2019-11-11       Impact factor: 2.135

5.  Identification of genetic outliers due to sub-structure and cryptic relationships.

Authors:  Daniel Schlauch; Heide Fier; Christoph Lange
Journal:  Bioinformatics       Date:  2017-07-01       Impact factor: 6.937

6.  Unsupervised cluster analysis of SARS-CoV-2 genomes reflects its geographic progression and identifies distinct genetic subgroups of SARS-CoV-2 virus.

Authors:  Georg Hahn; Sanghun Lee; Scott T Weiss; Christoph Lange
Journal:  Genet Epidemiol       Date:  2021-01-08       Impact factor: 2.135

7.  Determining population stratification and subgroup effects in association studies of rare genetic variants for nicotine dependence.

Authors:  Ai-Ru Hsieh; Li-Shiun Chen; Ying-Ju Li; Cathy S J Fann
Journal:  Psychiatr Genet       Date:  2019-08       Impact factor: 2.458

8.  Fuzzy set-based generalized multifactor dimensionality reduction analysis of gene-gene interactions.

Authors:  Hye-Young Jung; Sangseob Leem; Taesung Park
Journal:  BMC Med Genomics       Date:  2018-04-20       Impact factor: 3.063

9.  Exploring the OncoGenomic Landscape of cancer.

Authors:  Lidia Mateo; Oriol Guitart-Pla; Miquel Duran-Frigola; Patrick Aloy
Journal:  Genome Med       Date:  2018-08-03       Impact factor: 11.117

10.  Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.

Authors:  Daniel Taliun; Daniel N Harris; Michael D Kessler; Jedidiah Carlson; Zachary A Szpiech; Raul Torres; Sarah A Gagliano Taliun; André Corvelo; Stephanie M Gogarten; Hyun Min Kang; Achilleas N Pitsillides; Jonathon LeFaive; Seung-Been Lee; Xiaowen Tian; Brian L Browning; Sayantan Das; Anne-Katrin Emde; Wayne E Clarke; Douglas P Loesch; Amol C Shetty; Thomas W Blackwell; Albert V Smith; Quenna Wong; Xiaoming Liu; Matthew P Conomos; Dean M Bobo; François Aguet; Christine Albert; Alvaro Alonso; Kristin G Ardlie; Dan E Arking; Stella Aslibekyan; Paul L Auer; John Barnard; R Graham Barr; Lucas Barwick; Lewis C Becker; Rebecca L Beer; Emelia J Benjamin; Lawrence F Bielak; John Blangero; Michael Boehnke; Donald W Bowden; Jennifer A Brody; Esteban G Burchard; Brian E Cade; James F Casella; Brandon Chalazan; Daniel I Chasman; Yii-Der Ida Chen; Michael H Cho; Seung Hoan Choi; Mina K Chung; Clary B Clish; Adolfo Correa; Joanne E Curran; Brian Custer; Dawood Darbar; Michelle Daya; Mariza de Andrade; Dawn L DeMeo; Susan K Dutcher; Patrick T Ellinor; Leslie S Emery; Celeste Eng; Diane Fatkin; Tasha Fingerlin; Lukas Forer; Myriam Fornage; Nora Franceschini; Christian Fuchsberger; Stephanie M Fullerton; Soren Germer; Mark T Gladwin; Daniel J Gottlieb; Xiuqing Guo; Michael E Hall; Jiang He; Nancy L Heard-Costa; Susan R Heckbert; Marguerite R Irvin; Jill M Johnsen; Andrew D Johnson; Robert Kaplan; Sharon L R Kardia; Tanika Kelly; Shannon Kelly; Eimear E Kenny; Douglas P Kiel; Robert Klemmer; Barbara A Konkle; Charles Kooperberg; Anna Köttgen; Leslie A Lange; Jessica Lasky-Su; Daniel Levy; Xihong Lin; Keng-Han Lin; Chunyu Liu; Ruth J F Loos; Lori Garman; Robert Gerszten; Steven A Lubitz; Kathryn L Lunetta; Angel C Y Mak; Ani Manichaikul; Alisa K Manning; Rasika A Mathias; David D McManus; Stephen T McGarvey; James B Meigs; Deborah A Meyers; Julie L Mikulla; Mollie A Minear; Braxton D Mitchell; Sanghamitra Mohanty; May E Montasser; Courtney Montgomery; Alanna C Morrison; Joanne M Murabito; Andrea Natale; Pradeep Natarajan; Sarah C Nelson; Kari E North; Jeffrey R O'Connell; Nicholette D Palmer; Nathan Pankratz; Gina M Peloso; Patricia A Peyser; Jacob Pleiness; Wendy S Post; Bruce M Psaty; D C Rao; Susan Redline; Alexander P Reiner; Dan Roden; Jerome I Rotter; Ingo Ruczinski; Chloé Sarnowski; Sebastian Schoenherr; David A Schwartz; Jeong-Sun Seo; Sudha Seshadri; Vivien A Sheehan; Wayne H Sheu; M Benjamin Shoemaker; Nicholas L Smith; Jennifer A Smith; Nona Sotoodehnia; Adrienne M Stilp; Weihong Tang; Kent D Taylor; Marilyn Telen; Timothy A Thornton; Russell P Tracy; David J Van Den Berg; Ramachandran S Vasan; Karine A Viaud-Martinez; Scott Vrieze; Daniel E Weeks; Bruce S Weir; Scott T Weiss; Lu-Chen Weng; Cristen J Willer; Yingze Zhang; Xutong Zhao; Donna K Arnett; Allison E Ashley-Koch; Kathleen C Barnes; Eric Boerwinkle; Stacey Gabriel; Richard Gibbs; Kenneth M Rice; Stephen S Rich; Edwin K Silverman; Pankaj Qasba; Weiniu Gan; George J Papanicolaou; Deborah A Nickerson; Sharon R Browning; Michael C Zody; Sebastian Zöllner; James G Wilson; L Adrienne Cupples; Cathy C Laurie; Cashell E Jaquish; Ryan D Hernandez; Timothy D O'Connor; Gonçalo R Abecasis
Journal:  Nature       Date:  2021-02-10       Impact factor: 69.504
