Fast spatial ancestry via flexible allele frequency surfaces.

John Michael Rañola, John Novembre, Kenneth Lange.

Abstract

MOTIVATION: Unique modeling and computational challenges arise in locating the geographic origin of individuals based on their genetic backgrounds. Single-nucleotide polymorphisms (SNPs) vary widely in informativeness, allele frequencies change non-linearly with geography, and reliable localization requires evidence to be integrated across a multitude of SNPs. These problems become even more acute for individuals of mixed ancestry. It is hardly surprising that matching genetic models to computational constraints has limited the development of methods for estimating geographic origins. We attack these related problems by borrowing ideas from image processing and optimization theory. Our proposed model divides the region of interest into pixels and operates SNP by SNP. We estimate allele frequencies across the landscape by maximizing a product of binomial likelihoods penalized by nearest-neighbor interactions. Penalization smooths allele frequency estimates and promotes estimation at pixels with no data. Maximization is accomplished by a minorize-maximize (MM) algorithm. Once allele frequency surfaces are available, one can apply Bayes' rule to compute the posterior probability that each pixel is the pixel of origin of a given person. Placement of admixed individuals on the landscape is more complicated and requires estimation of the fractional contribution of each pixel to a person's genome. This estimation problem also succumbs to a penalized MM algorithm.
RESULTS: We applied the model to the Population Reference Sample (POPRES) data. The model gives better localization for both unmixed and admixed individuals than existing methods despite using just a small fraction of the available SNPs. Computing times are comparable with the best competing software.
AVAILABILITY AND IMPLEMENTATION: Software will be freely available as the OriGen package in R.
CONTACT: ranolaj@uw.edu or klange@ucla.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
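
The placement step described in the abstract (Bayes' rule applied to fitted allele frequency surfaces) can be illustrated with a minimal sketch. This is not the OriGen implementation. It assumes diploid genotypes coded as allele counts 0, 1 or 2, independence across SNPs, a binomial likelihood with two trials per SNP, and a uniform prior over pixels unless one is supplied. The names `pixel_posterior`, `log_binom_pmf` and the toy surfaces are hypothetical and chosen only for illustration.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability of observing k successes."""
    return (math.log(math.comb(n, k))
            + k * math.log(p)
            + (n - k) * math.log(1.0 - p))


def pixel_posterior(genotypes, freq_surfaces, prior=None):
    """Posterior probability that each pixel is the pixel of origin.

    genotypes: allele counts (0, 1 or 2), one per SNP.
    freq_surfaces: freq_surfaces[s][j] is the estimated allele frequency
        of SNP s at pixel j, strictly between 0 and 1.
    prior: positive prior weights over pixels; uniform when omitted
        (an assumption of this sketch, not a detail from the abstract).
    """
    n_pixels = len(freq_surfaces[0])
    if prior is None:
        prior = [1.0 / n_pixels] * n_pixels

    # Accumulate log prior + log likelihood per pixel, SNP by SNP.
    log_post = []
    for j in range(n_pixels):
        total = math.log(prior[j])
        for g, surface in zip(genotypes, freq_surfaces):
            total += log_binom_pmf(g, 2, surface[j])
        log_post.append(total)

    # Normalize in a numerically stable way (subtract the maximum first).
    m = max(log_post)
    weights = [math.exp(v - m) for v in log_post]
    z = sum(weights)
    return [w / z for w in weights]


# Toy example: two SNPs on a landscape of three pixels.
surfaces = [
    [0.1, 0.5, 0.9],  # SNP 1 allele frequency at pixels 0, 1, 2
    [0.2, 0.5, 0.8],  # SNP 2 allele frequency at pixels 0, 1, 2
]
posterior = pixel_posterior([2, 2], surfaces)
best_pixel = max(range(len(posterior)), key=posterior.__getitem__)
```

Working in log space keeps the product over many SNPs from underflowing, which matters once evidence is integrated across thousands of markers. Admixed individuals are not covered by this sketch, since the abstract states that they require a separate penalized estimation of per-pixel contributions.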

Year:  2014        PMID: 25012181      PMCID: PMC4184261          DOI: 10.1093/bioinformatics/btu418

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


