Highly scalable genotype phasing by entropy minimization.

Alexander Gusev, Ion I Măndoiu, Bogdan Paşaniuc.

Abstract

A Single Nucleotide Polymorphism (SNP) is a position in the genome at which two or more of the possible four nucleotides occur in a large percentage of the population. SNPs account for most of the genetic variability between individuals, and mapping SNPs in the human population has become the next high priority in genomics after the completion of the Human Genome Project. In diploid organisms such as humans, there are two non-identical copies of each autosomal chromosome. A description of the SNPs in a chromosome is called a haplotype. At present, it is prohibitively expensive to directly determine the haplotypes of an individual, but it is relatively easy to obtain the conflated SNP information in the so-called genotype. Computational methods for genotype phasing, i.e., inferring haplotypes from genotype data, have received much attention in recent years, as haplotype information leads to increased statistical power of disease association tests. However, many of the existing algorithms have impractical running times for phasing large genotype data sets such as those generated by the International HapMap Project. In this paper we propose a highly scalable algorithm based on entropy minimization. Our algorithm is capable of phasing both unrelated and related genotypes coming from complex pedigrees. Experimental results on both real and simulated datasets show that our algorithm achieves a phasing accuracy close to, though worse than, that of the best existing methods while being several orders of magnitude faster. The open-source implementation of the algorithm and a web interface are publicly available at http://dna.engr.uconn.edu/~software/ent/.
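To make the terms in the abstract concrete, the following is a minimal toy sketch of genotype phasing by entropy minimization. It is not the authors' algorithm, whose details the abstract does not give: it is a brute-force search that only works on tiny inputs. The encoding (haplotypes as strings of 0/1 alleles, genotypes as strings where 2 marks a heterozygous site) and all function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def conflate(h1, h2):
    """Genotype of two haplotypes: the shared allele if equal, '2' if they differ."""
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def compatible_pairs(genotype):
    """All unordered haplotype pairs whose conflation equals the genotype."""
    het = [i for i, c in enumerate(genotype) if c == "2"]
    pairs = set()
    for bits in product("01", repeat=len(het)):
        h1, h2 = list(genotype), list(genotype)
        for i, b in zip(het, bits):
            h1[i] = b
            h2[i] = "1" if b == "0" else "0"  # complementary allele
        pairs.add(tuple(sorted(("".join(h1), "".join(h2)))))
    return sorted(pairs)


def entropy(haplotypes):
    """Empirical Shannon entropy (bits) of a list of haplotypes."""
    counts = Counter(haplotypes)
    n = len(haplotypes)
    return -sum(c / n * log2(c / n) for c in counts.values())


def phase_min_entropy(genotypes):
    """Exhaustively pick one compatible pair per genotype, minimizing entropy.

    Exponential in the number of heterozygous sites: a toy, not a scalable method.
    """
    best = None
    for choice in product(*(compatible_pairs(g) for g in genotypes)):
        haps = [h for pair in choice for h in pair]
        e = entropy(haps)
        if best is None or e < best[0]:
            best = (e, list(choice))
    return best


if __name__ == "__main__":
    score, phasing = phase_min_entropy(["00", "11", "22"])
    print(score, phasing)
```

For the genotypes `00`, `11`, and `22`, the ambiguous genotype `22` can be explained by either (`00`, `11`) or (`01`, `10`). The sketch prefers (`00`, `11`) because it reuses haplotypes already present in the other individuals, giving an entropy of 1.0 bit over the six haplotypes.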

Year:  2008        PMID: 18451434     DOI: 10.1109/TCBB.2007.70223

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  8 in total

1.  Fast and robust association tests for untyped SNPs in case-control studies.

Authors:  Andrew S Allen; Glen A Satten; Sarah L Bray; Frank Dudbridge; Michael P Epstein
Journal:  Hum Hered       Date:  2010-07-30       Impact factor: 0.444

2.  Efficient whole-genome association mapping using local phylogenies for unphased genotype data.

Authors:  Zhihong Ding; Thomas Mailund; Yun S Song
Journal:  Bioinformatics       Date:  2008-07-30       Impact factor: 6.937

3.  Improved IBD detection using incomplete haplotype information.

Authors:  Giulio Genovese; Gregory Leibon; Martin R Pollak; Daniel N Rockmore
Journal:  BMC Genet       Date:  2010-06-30       Impact factor: 2.797

4.  A novel haplotype-sharing approach for genome-wide case-control association studies implicates the calpastatin gene in Parkinson's disease.

Authors:  Andrew S Allen; Glen A Satten
Journal:  Genet Epidemiol       Date:  2009-12       Impact factor: 2.135

5.  The National Institutes of Health Undiagnosed Diseases Program: insights into rare diseases.

Authors:  William A Gahl; Thomas C Markello; Camilo Toro; Karin Fuentes Fajardo; Murat Sincan; Fred Gill; Hannah Carlson-Donohoe; Andrea Gropman; Tyler Mark Pierson; Gretchen Golas; Lynne Wolfe; Catherine Groden; Rena Godfrey; Michele Nehrebecky; Colleen Wahl; Dennis M D Landis; Sandra Yang; Anne Madeo; James C Mullikin; Cornelius F Boerkoel; Cynthia J Tifft; David Adams
Journal:  Genet Med       Date:  2011-09-26       Impact factor: 8.822

6.  Linkage disequilibrium based genotype calling from low-coverage shotgun sequencing reads.

Authors:  Jorge Duitama; Justin Kennedy; Sanjiv Dinakar; Yözen Hernández; Yufeng Wu; Ion I Măndoiu
Journal:  BMC Bioinformatics       Date:  2011-02-15       Impact factor: 3.169

7.  Effective selection of informative SNPs and classification on the HapMap genotype data.

Authors:  Nina Zhou; Lipo Wang
Journal:  BMC Bioinformatics       Date:  2007-12-20       Impact factor: 3.169

8.  Genome-wide association analysis of rheumatoid arthritis data via haplotype sharing.

Authors:  Andrew S Allen; Glen A Satten
Journal:  BMC Proc       Date:  2009-12-15