
Genetic classification of populations using supervised learning.

Michael Bridges, Elizabeth A Heron, Colm O'Dushlaine, Ricardo Segurado, Derek Morris, Aiden Corvin, Michael Gill, Carlos Pinto.

Abstract

There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case-control studies and quality control (when participants in a study have been genotyped at different laboratories). This last application is of particular importance in the era of large-scale genome-wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations, are termed unsupervised. Supervised methods, on the other hand, are able to utilise this prior knowledge when it is available. In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results, that a supervised learning approach should be the method of choice when classifying individuals into pre-defined populations, particularly in quality control for large-scale genome-wide association studies.
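
The supervised idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the genotypes are simulated rather than taken from the Scottish or Bulgarian samples, and a small linear support vector machine trained by Pegasos-style stochastic subgradient descent stands in for the neural network and support vector machine implementations, which the abstract does not specify. The function names, the number of SNPs, the allele-frequency shift of 0.1 and the shuffled-label baseline are all assumptions made for illustration only.

```python
import random

def simulate(rng, freqs, n):
    """Draw n individuals; each entry is the allele count (0, 1 or 2) at one SNP."""
    return [[int(rng.random() < p) + int(rng.random() < p) for p in freqs]
            for _ in range(n)]

def center(rows, means):
    """Subtract per-SNP means (estimated on the training set) from every row."""
    return [[x - m for x, m in zip(row, means)] for row in rows]

def train_linear_svm(X, y, rng, lam=0.01, epochs=10):
    """Pegasos-style subgradient descent on the hinge loss; labels are +1 / -1.

    No bias term is fitted: the data are centred and the classes balanced.
    """
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            shrink = 1.0 - eta * lam       # effect of the L2 penalty
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            if margin < 1.0:               # inside the margin: hinge loss is active
                w = [shrink * wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
            else:
                w = [shrink * wj for wj in w]
    return w

def accuracy(w, X, y):
    """Fraction of individuals whose predicted sign matches the label."""
    hits = sum(1 for row, label in zip(X, y)
               if label * sum(wj * xj for wj, xj in zip(w, row)) > 0)
    return hits / len(y)

rng = random.Random(0)
n_snps = 200  # hypothetical marker count, far smaller than a real GWAS panel

# Two hypothetical populations whose allele frequencies differ by 0.1 per SNP.
freqs_a = [rng.uniform(0.2, 0.8) for _ in range(n_snps)]
freqs_b = [p + rng.choice((-0.1, 0.1)) for p in freqs_a]

def make_split(n_per_pop):
    """Simulate a balanced labelled sample: +1 for population A, -1 for B."""
    X = simulate(rng, freqs_a, n_per_pop) + simulate(rng, freqs_b, n_per_pop)
    y = [1] * n_per_pop + [-1] * n_per_pop
    return X, y

X_train, y_train = make_split(150)
X_test, y_test = make_split(100)
means = [sum(col) / len(col) for col in zip(*X_train)]
X_train, X_test = center(X_train, means), center(X_test, means)

# Supervised fit: the known population labels guide the classifier.
w = train_linear_svm(X_train, y_train, rng)
held_out = accuracy(w, X_test, y_test)

# Baseline: the same classifier trained on shuffled (uninformative) labels.
shuffled = y_train[:]
rng.shuffle(shuffled)
baseline = accuracy(train_linear_svm(X_train, shuffled, rng), X_test, y_test)

print(f"held-out accuracy: {held_out:.2f}, shuffled-label baseline: {baseline:.2f}")
```

If the held-out accuracy sits well above the shuffled-label baseline, the two simulated populations are distinguishable from their genotypes. In a quality-control setting the labels would instead be the genotyping laboratories, and accuracy near the baseline would suggest that merging the batches introduces no detectable structure.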

Year:  2011        PMID: 21589856      PMCID: PMC3093382          DOI: 10.1371/journal.pone.0014802

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


  11 in total

1.  PLINK: a tool set for whole-genome association and population-based linkage analyses.

Authors:  Shaun Purcell; Benjamin Neale; Kathe Todd-Brown; Lori Thomas; Manuel A R Ferreira; David Bender; Julian Maller; Pamela Sklar; Paul I W de Bakker; Mark J Daly; Pak C Sham
Journal:  Am J Hum Genet       Date:  2007-07-25       Impact factor: 11.025

2.  Correlation between genetic and geographic structure in Europe.

Authors:  Oscar Lao; Timothy T Lu; Michael Nothnagel; Olaf Junge; Sandra Freitag-Wolf; Amke Caliebe; Miroslava Balascakova; Jaume Bertranpetit; Laurence A Bindoff; David Comas; Gunilla Holmlund; Anastasia Kouvatsi; Milan Macek; Isabelle Mollet; Walther Parson; Jukka Palo; Rafal Ploski; Antti Sajantila; Adriano Tagliabracci; Ulrik Gether; Thomas Werge; Fernando Rivadeneira; Albert Hofman; André G Uitterlinden; Christian Gieger; Heinz-Erich Wichmann; Andreas Rüther; Stefan Schreiber; Christian Becker; Peter Nürnberg; Matthew R Nelson; Michael Krawczak; Manfred Kayser
Journal:  Curr Biol       Date:  2008-08-07       Impact factor: 10.834

3.  Assessment of the role of genetic polymorphism in venous thrombosis through artificial neural networks.

Authors:  S Penco; E Grossi; S Cheng; M Intraligi; G Maurelli; M C Patrosso; A Marocchi; M Buscema
Journal:  Ann Hum Genet       Date:  2005-11       Impact factor: 1.670

4.  Assessing optimal neural network architecture for identifying disease-associated multi-marker genotypes using a permutation test, and application to calpain 10 polymorphisms associated with diabetes.

Authors:  B V North; D Curtis; P G Cassell; G A Hitman; P C Sham
Journal:  Ann Hum Genet       Date:  2003-07       Impact factor: 1.670

5.  Population structure and eigenanalysis.

Authors:  Nick Patterson; Alkes L Price; David Reich
Journal:  PLoS Genet       Date:  2006-12       Impact factor: 5.917

6.  Neural network analysis in pharmacogenetics of mood disorders.

Authors:  Alessandro Serretti; Enrico Smeraldi
Journal:  BMC Med Genet       Date:  2004-12-09       Impact factor: 2.103

7.  Reconstructing Indian population history.

Authors:  David Reich; Kumarasamy Thangaraj; Nick Patterson; Alkes L Price; Lalji Singh
Journal:  Nature       Date:  2009-09-24       Impact factor: 49.962

8.  Common polygenic variation contributes to risk of schizophrenia and bipolar disorder.

Authors:  Shaun M Purcell; Naomi R Wray; Jennifer L Stone; Peter M Visscher; Michael C O'Donovan; Patrick F Sullivan; Pamela Sklar
Journal:  Nature       Date:  2009-07-01       Impact factor: 49.962

9.  Genetic structure of Europeans: a view from the North-East.

Authors:  Mari Nelis; Tõnu Esko; Reedik Mägi; Fritz Zimprich; Alexander Zimprich; Draga Toncheva; Sena Karachanak; Tereza Piskácková; Ivan Balascák; Leena Peltonen; Eveliina Jakkula; Karola Rehnström; Mark Lathrop; Simon Heath; Pilar Galan; Stefan Schreiber; Thomas Meitinger; Arne Pfeufer; H-Erich Wichmann; Béla Melegh; Noémi Polgár; Daniela Toniolo; Paolo Gasparini; Pio D'Adamo; Janis Klovins; Liene Nikitina-Zake; Vaidutis Kucinskas; Jūrate Kasnauskiene; Jan Lubinski; Tadeusz Debniak; Svetlana Limborska; Andrey Khrunin; Xavier Estivill; Raquel Rabionet; Sara Marsal; Antonio Julià; Stylianos E Antonarakis; Samuel Deutsch; Christelle Borel; Homa Attar; Maryline Gagnebin; Milan Macek; Michael Krawczak; Maido Remm; Andres Metspalu
Journal:  PLoS One       Date:  2009-05-08       Impact factor: 3.240

10.  Neural networks for genetic epidemiology: past, present, and future.

Authors:  Marylyn D Ritchie; Alison A Motsinger-Reif
Journal:  BioData Min       Date:  2008-07-17       Impact factor: 2.522

  6 in total

1.  The genomic psychiatry cohort: partners in discovery.

Authors:  Michele T Pato; Janet L Sobell; Helena Medeiros; Colony Abbott; Brooke M Sklar; Peter F Buckley; Evelyn J Bromet; Michael A Escamilla; Ayman H Fanous; Douglas S Lehrer; Fabio Macciardi; Dolores Malaspina; Steve A McCarroll; Stephen R Marder; Jennifer Moran; Christopher P Morley; Humberto Nicolini; Diana O Perkins; Shaun M Purcell; Mark H Rapaport; Pamela Sklar; Jordan W Smoller; James A Knowles; Carlos N Pato
Journal:  Am J Med Genet B Neuropsychiatr Genet       Date:  2013-05-03       Impact factor: 3.568

2.  Machine learning for genetic prediction of psychiatric disorders: a systematic review.

Authors:  Matthew Bracher-Smith; Karen Crawford; Valentina Escott-Price
Journal:  Mol Psychiatry       Date:  2020-06-26       Impact factor: 15.992

3.  From genes to behavior: placing cognitive models in the context of biological pathways.

Authors:  Ignacio Saez; Eric Set; Ming Hsu
Journal:  Front Neurosci       Date:  2014-11-04       Impact factor: 4.677

4.  Prediction of treatment response in rheumatoid arthritis patients using genome-wide SNP data.

Authors:  Svetlana Cherlin; Darren Plant; John C Taylor; Marco Colombo; Athina Spiliopoulou; Evan Tzanis; Ann W Morgan; Michael R Barnes; Paul McKeigue; Jennifer H Barrett; Costantino Pitzalis; Anne Barton; Matura Consortium; Heather J Cordell
Journal:  Genet Epidemiol       Date:  2018-10-12       Impact factor: 2.135

5.  Predictive modeling of schizophrenia from genomic data: Comparison of polygenic risk score with kernel support vector machines approach.

Authors:  Timothy Vivian-Griffiths; Emily Baker; Karl M Schmidt; Matthew Bracher-Smith; James Walters; Andreas Artemiou; Peter Holmans; Michael C O'Donovan; Michael J Owen; Andrew Pocklington; Valentina Escott-Price
Journal:  Am J Med Genet B Neuropsychiatr Genet       Date:  2018-12-04       Impact factor: 3.568

6.  Detecting responses to treatment with fenofibrate in pedigrees.

Authors:  Svetlana Cherlin; Maggie Haitian Wang; Heike Bickeböller; Rita M Cantor
Journal:  BMC Genet       Date:  2018-09-17       Impact factor: 2.797

