PoooL: an efficient method for estimating haplotype frequencies from large DNA pools.

Han Zhang, Hsin-Chou Yang, Yaning Yang.

Abstract

MOTIVATION: Pooling DNA is a cost-effective alternative to individual genotyping and is often used for initial screening in genome-wide association analysis. In some studies, large pools of up to several hundred individuals have been used to substantially reduce genotyping cost. However, no method for estimating haplotype frequencies from large DNA pools has been available, owing to the computational complexity involved.
METHODS: We propose a novel constrained EM algorithm, PoooL, to estimate frequencies of single-nucleotide polymorphism (SNP) haplotypes from DNA pools. A quantity called the importance factor is introduced to measure the contribution of a haplotype to the likelihood. Under the assumption of asymptotic normality of the estimated allele frequencies and a system of linear constraints on haplotype frequencies, the importance factor remains constant throughout the iterative maximization process. The maximization problem in the EM algorithm is then formulated as a constrained maximum entropy model and solved by the improved iterative scaling method.
RESULTS: A simulation study shows that our algorithm can efficiently estimate haplotype frequencies from DNA pools of arbitrarily large size. The algorithm works equally well for large pools of hundreds or thousands of individuals and for pools as small as one or two individuals. The computational complexity of the PoooL algorithm is independent of pool size, so its computational efficiency for large pools is substantially improved over existing estimation methods. Simulation results also show that the proposed method is robust to genotyping errors and population admixture.
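
ILLUSTRATION: The METHODS paragraph above mentions linear constraints on haplotype frequencies and a constrained maximum entropy model. The sketch below is not the PoooL algorithm: it omits the EM step and the importance factor, and it uses plain iterative proportional fitting in place of the improved iterative scaling method. It only shows, under those assumptions, how per-SNP allele frequencies act as linear constraints on a haplotype distribution and how a scaling procedure finds the maximum entropy solution. The function name `max_entropy_haplotypes` and the example frequencies are hypothetical.

```python
from itertools import product


def max_entropy_haplotypes(allele_freqs, sweeps=50, tol=1e-12):
    """Return haplotype frequencies matching the given per-SNP allele frequencies.

    Illustrative only: iterative proportional fitting, started from the uniform
    distribution, converges to the maximum entropy distribution that satisfies
    the linear constraints  sum_{h : h[j] == 1} p_h = allele_freqs[j].
    Each allele frequency must lie strictly between 0 and 1.
    """
    n_snps = len(allele_freqs)
    # Enumerate all 2^L binary haplotypes (1 = allele of interest at that SNP).
    haplotypes = list(product((0, 1), repeat=n_snps))
    freqs = {h: 1.0 / len(haplotypes) for h in haplotypes}

    for _ in range(sweeps):
        worst = 0.0
        for j, target in enumerate(allele_freqs):
            # Current allele frequency at SNP j implied by the haplotype frequencies.
            current = sum(p for h, p in freqs.items() if h[j] == 1)
            worst = max(worst, abs(current - target))
            # Rescale both allele classes so that the constraint for SNP j holds.
            for h in haplotypes:
                if h[j] == 1:
                    freqs[h] *= target / current
                else:
                    freqs[h] *= (1.0 - target) / (1.0 - current)
        if worst < tol:
            break
    return freqs


# Hypothetical allele frequencies at three SNPs.
estimated = max_entropy_haplotypes([0.2, 0.5, 0.7])
```

With only single-SNP constraints, the maximum entropy solution is the product of the allele frequencies (linkage equilibrium), so the haplotype `(1, 0, 1)` in this example receives frequency 0.2 × 0.5 × 0.7 = 0.07. Information about linkage disequilibrium has to come from additional terms, which in the paper is the role of the EM likelihood.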

Year:  2008        PMID: 18573795     DOI: 10.1093/bioinformatics/btn324

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  9 in total

1.  Rapid inexpensive genome-wide association using pooled whole blood.

Authors:  Jamie E Craig; Alex W Hewitt; Amy E McMellon; Anjali K Henders; Lingjun Ma; Leanne Wallace; Shiwani Sharma; Kathryn P Burdon; Peter M Visscher; Grant W Montgomery; Stuart MacGregor
Journal:  Genome Res       Date:  2009-10-03       Impact factor: 9.043

2.  CSHAP: efficient haplotype frequency estimation based on sparse representation.

Authors:  Yinsheng Zhou; Han Zhang; Yaning Yang
Journal:  Bioinformatics       Date:  2019-08-15       Impact factor: 6.937

3.  The efficacy of detecting variants with small effects on the Affymetrix 6.0 platform using pooled DNA.

Authors:  Charleston W K Chiang; Zofia K Z Gajdos; Joshua M Korn; Johannah L Butler; Rachel Hackett; Candace Guiducci; Thutrang T Nguyen; Rainford Wilks; Terrence Forrester; Katherine D Henderson; Loic Le Marchand; Brian E Henderson; Christopher A Haiman; Richard S Cooper; Helen N Lyon; Xiaofeng Zhu; Colin A McKenzie; Mark R Palmert; Joel N Hirschhorn
Journal:  Hum Genet       Date:  2011-03-22       Impact factor: 4.132

4.  Rapid assessment of genetic ancestry in populations of unknown origin by genome-wide genotyping of pooled samples.

Authors:  Charleston W K Chiang; Zofia K Z Gajdos; Joshua M Korn; Finny G Kuruvilla; Johannah L Butler; Rachel Hackett; Candace Guiducci; Thutrang T Nguyen; Rainford Wilks; Terrence Forrester; Christopher A Haiman; Katherine D Henderson; Loic Le Marchand; Brian E Henderson; Mark R Palmert; Colin A McKenzie; Helen N Lyon; Richard S Cooper; Xiaofeng Zhu; Joel N Hirschhorn
Journal:  PLoS Genet       Date:  2010-03-05       Impact factor: 5.917

5.  Maximum-parsimony haplotype frequencies inference based on a joint constrained sparse representation of pooled DNA.

Authors:  Guido H Jajamovich; Alexandros Iliadis; Dimitris Anastassiou; Xiaodong Wang
Journal:  BMC Bioinformatics       Date:  2013-09-08       Impact factor: 3.169

6.  Fast and accurate haplotype frequency estimation for large haplotype vectors from pooled DNA data.

Authors:  Alexandros Iliadis; Dimitris Anastassiou; Xiaodong Wang
Journal:  BMC Genet       Date:  2012-10-30       Impact factor: 2.797

7.  Cost-effective genome-wide estimation of allele frequencies from pooled DNA in Atlantic salmon (Salmo salar L.).

Authors:  Mikhail Ozerov; Anti Vasemägi; Vidar Wennevik; Eero Niemelä; Sergey Prusov; Matthew Kent; Juha-Pekka Vähä
Journal:  BMC Genomics       Date:  2013-01-16       Impact factor: 3.969

8.  Maximum likelihood estimation of frequencies of known haplotypes from pooled sequence data.

Authors:  Darren Kessner; Thomas L Turner; John Novembre
Journal:  Mol Biol Evol       Date:  2013-01-30       Impact factor: 16.240

9.  An EM algorithm based on an internal list for estimating haplotype distributions of rare variants from pooled genotype data.

Authors:  Anthony Y C Kuk; Xiang Li; Jinfeng Xu
Journal:  BMC Genet       Date:  2013-09-13       Impact factor: 2.797
