Inference and Analysis of Population Structure Using Genetic Data and Network Theory.

Gili Greenbaum, Alan R Templeton, Shirli Bar-David.

Abstract

Clustering individuals into subpopulations based on genetic data has become commonplace in many genetic studies. Inference about population structure is most often done by applying model-based approaches, aided by visualization using distance-based approaches such as multidimensional scaling. While existing distance-based approaches suffer from a lack of statistical rigor, model-based approaches entail assumptions about prior conditions, such as the subpopulations being at Hardy-Weinberg equilibrium. Here we present a distance-based approach for inference about population structure using genetic data by defining population structure using network theory terminology and methods. A network is constructed from a pairwise genetic-similarity matrix of all sampled individuals. The community partition, a partition of a network into dense subgraphs, is equated with population structure, a partition of the population into genetically related groups. Community-detection algorithms are used to partition the network into communities, interpreted as a partition of the population into subpopulations. The statistical significance of the structure can be estimated by using permutation tests to evaluate the significance of the partition's modularity, a network theory measure indicating the quality of community partitions. To further characterize population structure, a new measure of the strength of association (SA) for an individual to its assigned community is presented. The strength of association distribution (SAD) of the communities is analyzed to provide additional population structure characteristics, such as the relative amount of gene flow experienced by the different subpopulations and identification of hybrid individuals. Human genetic data and simulations are used to demonstrate the applicability of the analyses.
The approach presented here provides a novel, computationally efficient model-free method for inference about population structure that does not entail assumption of prior conditions. The method is implemented in the software NetStruct (available at https://giligreenbaum.wordpress.com/software/).
Copyright © 2016 by the Genetics Society of America.
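
The pipeline outlined in the abstract (similarity matrix, network, community partition, permutation test on modularity) can be illustrated with a minimal sketch. It is not the NetStruct implementation, and every name in it is hypothetical. It makes three assumptions that do not come from the abstract: similarity is the plain fraction of loci with identical genotypes, communities are found by a naive greedy merge that maximizes weighted modularity, and the null distribution is generated by shuffling genotypes among individuals independently at each locus.

```python
import random
from itertools import combinations


def similarity(g1, g2):
    """Fraction of loci with identical genotypes (assumed, simplistic measure)."""
    return sum(a == b for a, b in zip(g1, g2)) / len(g1)


def build_network(genotypes, threshold=0.0):
    """Weighted adjacency matrix; pairs with similarity <= threshold get no edge."""
    n = len(genotypes)
    adj = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        s = similarity(genotypes[i], genotypes[j])
        if s > threshold:
            adj[i][j] = adj[j][i] = s
    return adj


def modularity(adj, labels):
    """Weighted modularity: Q = (1/2W) * sum_ij (A_ij - k_i*k_j/2W) * [c_i == c_j]."""
    strength = [sum(row) for row in adj]
    two_w = sum(strength)
    if two_w == 0:
        return 0.0  # no edges: modularity is undefined, report 0
    q = 0.0
    for i in range(len(adj)):
        for j in range(len(adj)):
            if labels[i] == labels[j]:
                q += adj[i][j] - strength[i] * strength[j] / two_w
    return q / two_w


def greedy_communities(adj):
    """Stand-in community detection: start from singletons and repeatedly apply
    the merge of two communities that raises Q the most, until none does."""
    labels = list(range(len(adj)))
    best_q = modularity(adj, labels)
    while True:
        best_merge = None
        for a, b in combinations(sorted(set(labels)), 2):
            trial = [a if c == b else c for c in labels]
            q = modularity(adj, trial)
            if q > best_q + 1e-12:
                best_q, best_merge = q, trial
        if best_merge is None:
            return labels, best_q
        labels = best_merge


def permutation_p_value(genotypes, threshold=0.0, n_perm=99, seed=0):
    """Compare observed Q with Q from data shuffled per locus (assumed null model)."""
    rng = random.Random(seed)
    _, observed = greedy_communities(build_network(genotypes, threshold))
    hits = 0
    for _ in range(n_perm):
        columns = []
        for locus in range(len(genotypes[0])):
            col = [g[locus] for g in genotypes]
            rng.shuffle(col)
            columns.append(col)
        shuffled = [tuple(col[i] for col in columns) for i in range(len(genotypes))]
        _, q = greedy_communities(build_network(shuffled, threshold))
        hits += q >= observed
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (invented for illustration): two groups of four individuals, six loci.
genotypes = [
    ("AA", "AA", "AA", "AA", "AA", "AA"),
    ("AA", "AA", "AA", "AA", "AA", "AB"),
    ("AA", "AA", "AA", "AA", "AB", "AA"),
    ("AA", "AA", "AA", "AB", "AA", "AA"),
    ("BB", "BB", "BB", "BB", "BB", "BB"),
    ("BB", "BB", "BB", "BB", "BB", "AB"),
    ("BB", "BB", "BB", "BB", "AB", "BB"),
    ("BB", "BB", "BB", "AB", "BB", "BB"),
]
labels, q = greedy_communities(build_network(genotypes))
observed_q, p_value = permutation_p_value(genotypes)
print(labels, round(q, 4), p_value)
```

The greedy merge is quadratic in the number of communities at each step and is meant only for small examples; for real data a dedicated community-detection routine would be the natural replacement. The strength-of-association measure is not sketched here because the abstract does not define it.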

Keywords:  community detection; hierarchical population structure; modularity; subpopulations

Year:  2016        PMID: 26888080      PMCID: PMC4905528          DOI: 10.1534/genetics.115.182626

Source DB:  PubMed          Journal:  Genetics        ISSN: 0016-6731            Impact factor:   4.562


Cited by: 11 in total

1.  CONE: Community Oriented Network Estimation Is a Versatile Framework for Inferring Population Structure in Large-Scale Sequencing Data.

Authors:  Markku O Kuismin; Jon Ahlinder; Mikko J Sillanpää
Journal:  G3 (Bethesda)       Date:  2017-10-05       Impact factor: 3.154

2.  Clustering of 770,000 genomes reveals post-colonial population structure of North America.

Authors:  Eunjung Han; Peter Carbonetto; Ross E Curtis; Yong Wang; Julie M Granka; Jake Byrnes; Keith Noto; Amir R Kermany; Natalie M Myres; Mathew J Barber; Kristin A Rand; Shiya Song; Theodore Roman; Erin Battat; Eyal Elyashiv; Harendra Guturu; Eurie L Hong; Kenneth G Chahine; Catherine A Ball
Journal:  Nat Commun       Date:  2017-02-07       Impact factor: 14.919

3.  Seascape genetics and biophysical connectivity modelling support conservation of the seagrass Zostera marina in the Skagerrak-Kattegat region of the eastern North Sea.

Authors:  Marlene Jahnke; Per R Jonsson; Per-Olav Moksnes; Lars-Ove Loo; Martin Nilsson Jacobi; Jeanine L Olsen
Journal:  Evol Appl       Date:  2018-01-26       Impact factor: 5.183

4.  GRAF-pop: A Fast Distance-Based Method To Infer Subject Ancestry from Multiple Genotype Datasets Without Principal Components Analysis.

Authors:  Yumi Jin; Alejandro A Schaffer; Michael Feolo; J Bradley Holmes; Brandi L Kattman
Journal:  G3 (Bethesda)       Date:  2019-08-08       Impact factor: 3.154

5.  Genome-Wide Association Study of Yield Component Traits in Intermediate Wheatgrass and Implications in Genomic Selection and Breeding.

Authors:  Prabin Bajgain; Xiaofei Zhang; James A Anderson
Journal:  G3 (Bethesda)       Date:  2019-08-08       Impact factor: 3.154

6.  Coalescent Theory of Migration Network Motifs.

Authors:  Nicolas Alcala; Amy Goldberg; Uma Ramakrishnan; Noah A Rosenberg
Journal:  Mol Biol Evol       Date:  2019-10-01       Impact factor: 16.240

7.  Network-based hierarchical population structure analysis for large genomic data sets.

Authors:  Gili Greenbaum; Amir Rubin; Alan R Templeton; Noah A Rosenberg
Journal:  Genome Res       Date:  2019-11-06       Impact factor: 9.043

8.  Life history and past demography maintain genetic structure, outcrossing rate, contemporary pollen gene flow of an understory herb in a highly fragmented rainforest.

Authors:  Pilar Suárez-Montes; Mariana Chávez-Pesqueira; Juan Núñez-Farfán
Journal:  PeerJ       Date:  2016-12-22       Impact factor: 3.061

9.  Detecting hierarchical levels of connectivity in a population of Acacia tortilis at the northern edge of the species' global distribution: Combining classical population genetics and network analyses.

Authors:  Yael S Rodger; Gili Greenbaum; Micha Silver; Shirli Bar-David; Gidon Winters
Journal:  PLoS One       Date:  2018-04-12       Impact factor: 3.240

10.  Conservation priorities for endangered Indian tigers through a genomic lens.

Authors:  Meghana Natesh; Goutham Atla; Parag Nigam; Yadvendradev V Jhala; Arun Zachariah; Udayan Borthakur; Uma Ramakrishnan
Journal:  Sci Rep       Date:  2017-08-29       Impact factor: 4.379
