
Identification of genetic outliers due to sub-structure and cryptic relationships.

Daniel Schlauch, Heide Fier, Christoph Lange.

Abstract

MOTIVATION: To minimize the effects of genetic confounding on the analysis of high-throughput genetic association studies, e.g. (whole-genome) sequencing (WGS) studies and genome-wide association studies (GWAS), we propose a general framework to assess, and to test formally for, genetic heterogeneity among study subjects. Because the approach fully utilizes the recent ancestor information captured by rare variants, it is especially powerful in WGS studies. Even for relatively moderate sample sizes, the proposed testing framework is able to identify study subjects that are genetically too similar, e.g. cryptic relationships, or genetically too different, e.g. population substructure. The approach is computationally fast, which enables its application to whole-genome sequencing data, and it is straightforward to implement.
RESULTS: Simulation studies illustrate the overall performance of our approach. In an application to the 1000 Genomes Project, we outline an analysis/cleaning pipeline that uses our approach to formally assess whether study subjects are related and whether population substructure is present. In the analysis of the 1000 Genomes Project data, our approach revealed subjects that are most likely related but had previously passed standard QC filters.
AVAILABILITY AND IMPLEMENTATION: An implementation of our method, Similarity Test for Estimating Genetic Outliers (STEGO), is available in the R package stego from GitHub at https://github.com/dschlauch/stego.
CONTACT: dschlauch@fas.harvard.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com


Year:  2017        PMID: 28334167      PMCID: PMC5870703          DOI: 10.1093/bioinformatics/btx109

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


