Iterative pruning PCA improves resolution of highly structured populations.

Apichart Intarapanich, Philip J Shaw, Anunchai Assawamakin, Pongsakorn Wangkumhang, Chumpol Ngamphiw, Kridsadakorn Chaichoompu, Jittima Piriyapongsa, Sissades Tongsima.

Abstract

BACKGROUND: Non-random patterns of genetic variation exist among individuals in a population owing to a variety of evolutionary factors. Populations are therefore structured into genetically distinct subpopulations. As genotypic datasets become ever larger, it is increasingly difficult to correctly estimate the number of subpopulations and assign individuals to them. Computationally efficient non-parametric methods, chiefly those based on Principal Components Analysis (PCA), are thus increasingly relied upon for population structure analysis. Current PCA-based methods can accurately detect structure; however, their accuracy in resolving subpopulations and assigning individuals to them is wanting. When subpopulations are closely related to one another, they overlap in PCA space and appear as a conglomerate. This problem is exacerbated when some subpopulations in the dataset are genetically far removed from others. We propose a novel PCA-based framework which addresses this shortcoming.
RESULTS: A novel population structure analysis algorithm called iterative pruning PCA (ipPCA) was developed, which assigns individuals to subpopulations and infers the total number of subpopulations present. Genotypic data from simulated and real population datasets with different degrees of structure were analyzed. For datasets with simple structures, the subpopulation assignments of individuals made by ipPCA were largely consistent with those of the STRUCTURE, BAPS and AWclust algorithms. On the other hand, highly structured populations containing many closely related subpopulations could be accurately resolved only by ipPCA, and not by the other methods.
CONCLUSION: The algorithm is computationally efficient and not constrained by dataset complexity. This systematic subpopulation assignment approach removes the need for prior population labels, which could be advantageous when cryptic stratification is encountered in datasets containing individuals otherwise assumed to belong to a homogeneous population.
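
The iterative pruning idea described above can be illustrated with a minimal sketch: run PCA on a group of individuals, split the group in two if it still looks structured, then re-run PCA separately on each half so that closely related subpopulations are no longer masked by a distant one. The abstract does not state the termination test or the clustering step, so both are assumptions here: the sketch stops when the top eigenvalue is below a multiple of the Marchenko-Pastur upper edge (a crude heuristic chosen for illustration) and splits with plain 2-means on the leading PC scores. The function names (`pca_scores`, `two_means`, `ippca`, `simulate`), the parameters `factor` and `min_size`, and the simulated allele frequencies are all hypothetical and are not taken from the paper.

```python
import numpy as np


def pca_scores(block, n_components=2):
    """Leading PC scores and eigenvalues of a standardised genotype block."""
    sd = block.std(axis=0)
    keep = sd > 0  # SNPs monomorphic within this block carry no information
    if not keep.any():
        return None
    z = (block[:, keep] - block[:, keep].mean(axis=0)) / sd[keep]
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    # Column centring removes one degree of freedom, so keep n - 1 eigenvalues.
    eigvals = s[: len(z) - 1] ** 2 / z.shape[1]
    return u[:, :n_components] * s[:n_components], eigvals, z.shape[1]


def two_means(points, n_iter=100):
    """Plain 2-means, seeded deterministically at the two extremes of PC1."""
    centres = points[[points[:, 0].argmin(), points[:, 0].argmax()]]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        if labels.min() == labels.max():  # one cluster is empty: give up
            break
        updated = np.array([points[labels == k].mean(axis=0) for k in (0, 1)])
        if np.allclose(updated, centres):
            break
        centres = updated
    return labels


def ippca(genotypes, min_size=5, factor=1.5):
    """Assign individuals to clusters by repeated PCA and bisection (sketch)."""
    assignment = np.zeros(len(genotypes), dtype=int)
    pending = [np.arange(len(genotypes))]
    next_label = 0
    while pending:
        idx = pending.pop()
        parts = None
        result = pca_scores(genotypes[idx]) if len(idx) >= 2 * min_size else None
        if result is not None:
            scores, eigvals, n_snps = result
            # ASSUMED stopping rule: compare the top eigenvalue with the
            # Marchenko-Pastur upper edge expected for unstructured data.
            edge = (1 + np.sqrt((len(idx) - 1) / n_snps)) ** 2
            if eigvals[0] > factor * edge * eigvals.mean():
                labels = two_means(scores)
                halves = [idx[labels == k] for k in (0, 1)]
                if min(len(h) for h in halves) >= min_size:
                    parts = halves
        if parts is None:  # no further structure detected: final cluster
            assignment[idx] = next_label
            next_label += 1
        else:              # prune: analyse each half on its own
            pending.extend(parts)
    return assignment


def simulate(seed=0, n_snps=2000, n_per_group=40):
    """Made-up data: groups A and B are close, group C is far removed."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.3, 0.7, n_snps)
    blocks, truth = [], []
    for k, drift_sd in enumerate((0.12, 0.12, 0.30)):
        freq = np.clip(base + rng.normal(0, drift_sd, n_snps), 0.1, 0.9)
        blocks.append(rng.binomial(2, freq, size=(n_per_group, n_snps)))
        truth += [k] * n_per_group
    return np.vstack(blocks).astype(float), np.array(truth)


genotypes, truth = simulate()
labels = ippca(genotypes)
print("clusters found:", len(set(labels.tolist())))
```

Re-running PCA inside each half is the point of the sketch: once the distant group has been pruned away, the leading component of the remaining individuals is free to capture the smaller difference between the two closely related groups.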

Year:  2009        PMID: 19930644      PMCID: PMC2790469          DOI: 10.1186/1471-2105-10-382

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169

