
CONE: Community Oriented Network Estimation Is a Versatile Framework for Inferring Population Structure in Large-Scale Sequencing Data.

Markku O Kuismin, Jon Ahlinder, Mikko J Sillanpää.

Abstract

Estimation of genetic population structure based on molecular markers is a common task in population genetics and ecology. We apply a generalized linear model with LASSO regularization to infer relationships between individuals and populations from molecular marker data. Specifically, we apply a neighborhood selection algorithm to infer population genetic structure and gene flow between populations. The resulting relationships are used to construct an individual-level population graph. Different network substructures known as communities are then separated from each other using a community detection algorithm. Inference of population structure using networks combines the good properties of: (i) network theory (broad collection of tools, including aesthetically pleasing visualization), (ii) principal component analysis (dimension reduction together with simple visual inspection), and (iii) model-based methods (e.g., ancestry coefficient estimates). We have named our process CONE (for community oriented network estimation). CONE has fewer restrictions than conventional assignment methods in that properties such as the number of subpopulations need not be fixed before the analysis and the sample may include close relatives or involve uneven sampling. Applying CONE to simulated data sets resulted in more accurate estimates of the true number of subpopulations than model-based methods, and provided comparable ancestry coefficient estimates. Analyses of empirical data sets (teosinte single nucleotide polymorphisms, a bacterial disease outbreak, and the Human Genome Diversity Panel) illustrate that population structures estimated with CONE are consistent with earlier findings.
Copyright © 2017 Kuismin et al.
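
The pipeline outlined in the abstract (LASSO-based neighborhood selection, an individual-level graph, then grouping into communities) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a Gaussian linear model instead of a general GLM, a hand-written coordinate descent solver, an "OR" rule for edges, and connected components as a simple stand-in for a real community detection algorithm. The toy genotypes, the penalty `lam = 0.2`, and all function names are hypothetical.

```python
import math


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of n rows with p entries each; y has n entries.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of predictor j with the partial residual.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def standardize(v):
    """Centre and scale one individual's marker vector (zeros if constant)."""
    m = sum(v) / len(v)
    sd = math.sqrt(sum((x - m) ** 2 for x in v) / len(v))
    return [(x - m) / sd if sd > 0 else 0.0 for x in v]


def neighborhood_graph(genotypes, lam):
    """Regress each individual on all others; nonzero coefficients become edges."""
    Z = [standardize(g) for g in genotypes]
    k, m = len(Z), len(Z[0])
    adj = {i: set() for i in range(k)}
    for i in range(k):
        others = [j for j in range(k) if j != i]
        X = [[Z[j][t] for j in others] for t in range(m)]  # markers x other individuals
        beta = lasso_cd(X, Z[i], lam)
        for j, b in zip(others, beta):
            if b != 0.0:  # "OR" rule (assumption): one nonzero coefficient suffices
                adj[i].add(j)
                adj[j].add(i)
    return adj


def connected_components(adj):
    """Group nodes by connectivity (stand-in for community detection)."""
    seen, comps = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(sorted(comp))
    return comps


# Hypothetical toy data: 6 individuals x 16 markers, genotypes coded 0/1/2.
# Individuals 0-2 vary only on the first 8 markers, individuals 3-5 only on the last 8.
patterns = [[2, 2, 2, 2, 0, 0, 0, 0], [2, 2, 2, 0, 2, 0, 0, 0], [2, 2, 0, 2, 0, 2, 0, 0]]
genotypes = [p + [1] * 8 for p in patterns] + [[1] * 8 + p for p in patterns]

adjacency = neighborhood_graph(genotypes, lam=0.2)
communities = connected_components(adjacency)
print(adjacency)
print(communities)
```

In this sketch the number of groups is not fixed in advance; it follows from the sparsity of the estimated graph, which the penalty `lam` controls. A real analysis would need a principled way to choose the penalty and a proper community detection method on a much larger graph.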

Keywords:  community detection; graphical models; neighborhood selection; population genetic structure; population graph

Year: 2017   PMID: 28830924   PMCID: PMC5633386   DOI: 10.1534/g3.117.300131

Source DB: PubMed   Journal: G3 (Bethesda)   ISSN: 2160-1836   Impact factor: 3.154


  2 in total

1.  Network-based hierarchical population structure analysis for large genomic data sets.

Authors:  Gili Greenbaum; Amir Rubin; Alan R Templeton; Noah A Rosenberg
Journal:  Genome Res       Date:  2019-11-06       Impact factor: 9.043

2.  Gap-com: general model selection criterion for sparse undirected gene networks with nontrivial community structure.

Authors:  Markku Kuismin; Fatemeh Dodangeh; Mikko J Sillanpää
Journal:  G3 (Bethesda)       Date:  2022-02-04       Impact factor: 3.542
