HapCompass: a fast cycle basis algorithm for accurate haplotype assembly of sequence data.

Derek Aguiar, Sorin Istrail.

Abstract

Genome assembly methods produce haplotype-phase-ambiguous assemblies due to limitations in current sequencing technologies. Determining the haplotype phase of an individual is computationally challenging and experimentally expensive. However, haplotype phase information is crucial in many bioinformatics workflows such as genetic association studies and genomic imputation. Current computational methods of determining haplotype phase from sequence data, known as haplotype assembly, either have difficulty producing accurate results for large (1000 Genomes-type) data or operate on restricted optimizations that are unrealistic considering modern high-throughput sequencing technologies. We present a novel algorithm, HapCompass, for haplotype assembly of densely sequenced human genome data. The HapCompass algorithm operates on a graph in which single nucleotide polymorphisms (SNPs) are nodes and edges are defined by sequence reads, which are viewed as supporting evidence of co-occurring SNP alleles in a haplotype. In our graph model, haplotype phasings correspond to spanning trees. We define the minimum weighted edge removal optimization on this graph and develop an algorithm based on cycle basis local optimizations for resolving conflicting evidence. We then estimate the amount of sequencing required to produce a complete haplotype assembly of a chromosome. Using these estimates together with metrics borrowed from genome assembly and haplotype phasing, we compare the accuracy of HapCompass, the Genome Analysis ToolKit, and HapCut on 1000 Genomes Project and simulated data. We show that HapCompass performs significantly better for a variety of data and metrics. HapCompass is freely available for download (www.brown.edu/Research/Istrail_Lab/).
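
The graph model described in the abstract can be illustrated with a small sketch. This is not the HapCompass implementation: the read encoding (a dict from SNP index to allele 0 or 1), the signed integer edge weights, and the function names build_graph and phase_by_spanning_tree are all assumptions made for illustration. In place of the paper's cycle basis local optimizations, the sketch uses a simpler stand-in, a greedy maximum-absolute-weight spanning forest (Kruskal's algorithm), and then reads one haplotype off the tree.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(reads):
    """Return {(i, j): weight} for SNP pairs i < j covered by a common read.

    Assumed convention: +1 when a read shows the same allele at both SNPs,
    -1 when it shows different alleles.
    """
    weights = defaultdict(int)
    for read in reads:
        for i, j in combinations(sorted(read), 2):
            weights[(i, j)] += 1 if read[i] == read[j] else -1
    return dict(weights)


def phase_by_spanning_tree(weights):
    """Stand-in heuristic: keep the strongest edges that form a spanning
    forest, then propagate alleles along the kept edges."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = defaultdict(list)
    for (i, j), w in sorted(weights.items(), key=lambda kv: -abs(kv[1])):
        if w == 0:
            continue  # evidence cancels out, so the edge carries no phase
        ri, rj = find(i), find(j)
        if ri != rj:  # edge joins two components, so it closes no cycle
            parent[ri] = rj
            tree[i].append((j, w))
            tree[j].append((i, w))

    haplotype = {}
    for root in sorted(tree):
        if root in haplotype:
            continue
        haplotype[root] = 0  # arbitrary choice; the complement is the other haplotype
        stack = [root]
        while stack:
            u = stack.pop()
            for v, w in tree[u]:
                if v not in haplotype:
                    haplotype[v] = haplotype[u] if w > 0 else 1 - haplotype[u]
                    stack.append(v)
    return haplotype


# Hypothetical reads over four SNPs; the fifth read conflicts with the others.
reads = [
    {0: 0, 1: 0},
    {0: 1, 1: 1},
    {1: 0, 2: 1},
    {1: 1, 2: 0},
    {0: 0, 2: 0},
    {2: 1, 3: 1},
]
graph = build_graph(reads)
haplotype = phase_by_spanning_tree(graph)
```

In this toy input the cycle through SNPs 0, 1 and 2 has edge signs whose product is negative, so no phasing can agree with all three edges. The sketch resolves the conflict by leaving the weakest edge (0, 2) out of the tree, which is only a rough analogue of removing conflicting evidence at minimum weight.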

Year:  2012        PMID: 22697235      PMCID: PMC3375639          DOI: 10.1089/cmb.2012.0084

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


