Andrei S Rodin (1), Eric Boerwinkle. (1) Human Genetics Center, School of Public Health, University of Texas Health Science Center, Houston, TX 77030, USA. arodin@uth.tmc.edu
Abstract
MOTIVATION: The wealth of single nucleotide polymorphism (SNP) data within candidate genes and anticipated across the genome poses enormous analytical problems for studies of genotype-to-phenotype relationships, and modern data mining methods may be particularly well suited to meet the swelling challenges. In this paper, we introduce the method of Belief (Bayesian) networks to the domain of genotype-to-phenotype analyses and provide an example application.

RESULTS: A Belief network is a graphical model of a probabilistic nature that represents a joint multivariate probability distribution and reflects conditional independences between variables. Given the data, optimal network topology can be estimated with the assistance of heuristic search algorithms and scoring criteria. Statistical significance of edge strengths can be evaluated using Bayesian methods and bootstrapping. As an example application, the method of Belief networks was applied to 20 SNPs in the apolipoprotein (apo) E gene and plasma apoE levels in a sample of 702 individuals from Jackson, MS. Plasma apoE level was the primary target variable. These analyses indicate that the edge between SNP 4075, coding for the well-known epsilon2 allele, and plasma apoE level was strong. Belief networks can effectively describe complex uncertain processes and can both learn from data and incorporate prior knowledge.

AVAILABILITY: Various alternative and supplemental networks (not given in the text), as well as source code extensions, are available from the authors.

SUPPLEMENTARY INFORMATION: http://bioinformatics.oxfordjournals.org.
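The workflow outlined in the abstract (score candidate structures, search heuristically, then bootstrap to gauge edge strength) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes BIC as a stand-in scoring criterion and greedy forward selection of parents for a single target node as a stand-in heuristic search, and it runs on synthetic data. All names (`local_bic`, `select_parents`, `bootstrap_edge_strength`, `snp_a`, `snp_b`, `level`) are hypothetical.

```python
import math
import random
from collections import Counter


def local_bic(rows, child, parents):
    """BIC score of one discrete node given a candidate parent set."""
    n = len(rows)

    def config(row):
        return tuple(row[p] for p in parents)

    parent_counts = Counter(config(r) for r in rows)
    joint_counts = Counter((config(r), r[child]) for r in rows)
    # Maximum-likelihood log-likelihood of the child given its parents.
    loglik = sum(c * math.log(c / parent_counts[pa])
                 for (pa, _), c in joint_counts.items())
    # Free parameters: (child states - 1) per observed parent configuration.
    n_states = len({r[child] for r in rows})
    n_params = (n_states - 1) * len(parent_counts)
    return loglik - 0.5 * n_params * math.log(n)


def select_parents(rows, child, candidates):
    """Greedy forward search: keep adding the parent that most improves BIC."""
    parents = []
    best = local_bic(rows, child, parents)
    while True:
        scored = [(local_bic(rows, child, parents + [c]), c)
                  for c in candidates if c not in parents]
        if not scored or max(scored)[0] <= best:
            return parents
        best, chosen = max(scored)
        parents.append(chosen)


def bootstrap_edge_strength(rows, child, candidates, n_boot=100, seed=0):
    """Fraction of bootstrap resamples in which each edge is selected."""
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(n_boot):
        sample = rng.choices(rows, k=len(rows))  # resample with replacement
        hits.update(select_parents(sample, child, candidates))
    return {c: hits[c] / n_boot for c in candidates}


# Synthetic data (an assumption for illustration only): snp_a shifts the
# probability of a "high" phenotype level, snp_b is unrelated noise.
data_rng = random.Random(1)
p_high = {0: 0.1, 1: 0.5, 2: 0.9}
rows = []
for _ in range(500):
    a, b = data_rng.randrange(3), data_rng.randrange(3)
    level = "high" if data_rng.random() < p_high[a] else "low"
    rows.append({"snp_a": a, "snp_b": b, "level": level})

strengths = bootstrap_edge_strength(rows, "level", ["snp_a", "snp_b"])
print(strengths)
```

Restricting the search to the parents of one target variable sidesteps the acyclicity checks that a full network search would need. The resulting bootstrap frequencies are only a rough proxy for the edge-strength assessment described in the abstract.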