Estimating haplotype frequencies by combining data from large DNA pools with database information.

Dario Gasbarra, Sangita Kulathinal, Matti Pirinen, Mikko J Sillanpää.

Abstract

We assume that allele frequency data have been extracted from several large DNA pools, each containing genetic material from up to hundreds of sampled individuals. Our goal is to estimate the haplotype frequencies among the sampled individuals by combining the pooled allele frequency data with prior knowledge about the set of possible haplotypes. Such prior information can be obtained, for example, from a database such as HapMap. We present a Bayesian haplotyping method for pooled DNA based on a continuous approximation of the multinomial distribution. The proposed method is applicable when the sizes of the DNA pools and/or the number of loci considered exceed the limits of several earlier methods. In the example analyses, the proposed model clearly outperforms a deterministic greedy algorithm on real data from the HapMap database. With a small number of loci, the performance of the proposed method is similar to that of an EM algorithm that uses a multinormal approximation for the pooled allele frequencies but does not use prior information about the haplotypes. The method has been implemented in Matlab, and the code is available from the authors upon request.
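
To make the idea of a continuous approximation concrete, the sketch below computes the mean and covariance of pooled allele frequencies when haplotype counts in a pool are multinomial. This is a generic moment-matching normal approximation, not the authors' Matlab implementation or their exact model. The function name `pooled_allele_moments`, the 0/1 coding of alleles, and the example haplotypes and frequencies are all assumptions made for illustration.

```python
def pooled_allele_moments(haplotypes, freqs, n_chromosomes):
    """Moments of pooled allele frequencies under multinomial sampling.

    Hypothetical helper for illustration only.
    haplotypes    : list of tuples of 0/1 alleles, one tuple per haplotype
    freqs         : haplotype frequencies, same order, summing to 1
    n_chromosomes : number of chromosomes in the pool (2 x individuals
                    for diploids)
    Returns (mean, cov) of the per-locus frequency of allele 1. A normal
    distribution with these moments approximates the sampling distribution
    when the pool is large.
    """
    n_loci = len(haplotypes[0])

    # Expected frequency of allele 1 at each locus: sum of the frequencies
    # of the haplotypes that carry it.
    mean = [
        sum(f * h[j] for h, f in zip(haplotypes, freqs))
        for j in range(n_loci)
    ]

    # Covariance between loci j and k:
    # (P(allele 1 at both j and k) - mean_j * mean_k) / n_chromosomes.
    cov = [[0.0] * n_loci for _ in range(n_loci)]
    for j in range(n_loci):
        for k in range(n_loci):
            joint = sum(f * h[j] * h[k] for h, f in zip(haplotypes, freqs))
            cov[j][k] = (joint - mean[j] * mean[k]) / n_chromosomes

    return mean, cov


# Assumed toy input: three candidate haplotypes over two loci,
# pooled from 100 diploid individuals (200 chromosomes).
example_haplotypes = [(0, 0), (0, 1), (1, 1)]
example_freqs = [0.5, 0.3, 0.2]
example_mean, example_cov = pooled_allele_moments(
    example_haplotypes, example_freqs, 200
)
```

The covariance shrinks in proportion to the number of chromosomes in the pool, which is why an approximation of this kind becomes more accurate as pools grow larger.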

Year: 2011    PMID: 21071795    DOI: 10.1109/TCBB.2009.71

Source DB: PubMed    Journal: IEEE/ACM Trans Comput Biol Bioinform    ISSN: 1545-5963    Impact factor: 3.710


  7 in total

1.  Bayesian adaptive Markov chain Monte Carlo estimation of genetic parameters.

Authors:  B Mathew; A M Bauer; P Koistinen; T C Reetz; J Léon; M J Sillanpää
Journal:  Heredity (Edinb)       Date:  2012-07-18       Impact factor: 3.821

2.  Estimating the effect of SNP genotype on quantitative traits from pooled DNA samples.

Authors:  John M Henshall; Rachel J Hawken; Sonja Dominik; William Barendse
Journal:  Genet Sel Evol       Date:  2012-04-17       Impact factor: 4.297

3.  Maximum-parsimony haplotype frequencies inference based on a joint constrained sparse representation of pooled DNA.

Authors:  Guido H Jajamovich; Alexandros Iliadis; Dimitris Anastassiou; Xiaodong Wang
Journal:  BMC Bioinformatics       Date:  2013-09-08       Impact factor: 3.169

4.  An efficient pipeline to generate data for studies in plastid population genomics and phylogeography.

Authors:  Brendan F Kohrn; Jessica M Persinger; Mitchell B Cruzan
Journal:  Appl Plant Sci       Date:  2017-11-14       Impact factor: 1.936

5.  Fast and accurate haplotype frequency estimation for large haplotype vectors from pooled DNA data.

Authors:  Alexandros Iliadis; Dimitris Anastassiou; Xiaodong Wang
Journal:  BMC Genet       Date:  2012-10-30       Impact factor: 2.797

6.  Maximum likelihood estimation of frequencies of known haplotypes from pooled sequence data.

Authors:  Darren Kessner; Thomas L Turner; John Novembre
Journal:  Mol Biol Evol       Date:  2013-01-30       Impact factor: 16.240

7.  An EM algorithm based on an internal list for estimating haplotype distributions of rare variants from pooled genotype data.

Authors:  Anthony Y C Kuk; Xiang Li; Jinfeng Xu
Journal:  BMC Genet       Date:  2013-09-13       Impact factor: 2.797
