WhatsHap: Weighted Haplotype Assembly for Future-Generation Sequencing Reads.

Murray Patterson, Tobias Marschall, Nadia Pisanti, Leo van Iersel, Leen Stougie, Gunnar W Klau, Alexander Schönhuth.

Abstract

The human genome is diploid, which requires assigning heterozygous single nucleotide polymorphisms (SNPs) to the two copies of the genome. The resulting haplotypes, lists of SNPs belonging to each copy, are crucial for downstream analyses in population genetics. Currently, statistical approaches, which are oblivious to direct read information, constitute the state of the art. Haplotype assembly, which addresses phasing directly from sequencing reads, suffers from the fact that sequencing reads of the current generation are too short to serve the purposes of genome-wide phasing. While future-technology sequencing reads will contain a sufficient number of SNPs per read for phasing, they are also likely to suffer from higher sequencing error rates. Currently, no haplotype assembly approaches exist that take both increasing read length and sequencing error information into account. Here, we suggest WhatsHap, the first approach that yields provably optimal solutions to the weighted minimum error correction problem in runtime linear in the number of SNPs. WhatsHap is a fixed-parameter tractable (FPT) approach with coverage as the parameter. We demonstrate that WhatsHap can handle datasets of coverage up to 20×, and that 15× is generally enough for reliably phasing long reads, even at significantly elevated sequencing error rates. We also find that the switch and flip error rates of the haplotypes we output compare favorably with those of state-of-the-art statistical phasers.
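
To make the weighted minimum error correction (wMEC) objective named in the abstract concrete, the following is a minimal illustrative sketch, not the WhatsHap algorithm itself. It assumes reads are given as dictionaries mapping a SNP index to an `(allele, weight)` pair, with alleles coded 0/1 and the weight standing for the cost of flipping that entry; this data layout and the function names `wmec_cost` and `brute_force_wmec` are hypothetical. It also treats the two haplotypes independently, without forcing them to be complementary. The search enumerates every bipartition of the reads, so it is exponential in the number of reads and only usable on toy inputs, whereas the abstract describes a method whose exponential dependence is confined to the coverage.

```python
from itertools import product


def wmec_cost(reads, bipartition):
    """Cost of one read bipartition under the weighted MEC objective.

    reads: list of dicts {snp_index: (allele, weight)}, alleles are 0 or 1
           (hypothetical layout chosen for this sketch).
    bipartition: tuple with one 0/1 entry per read, naming its haplotype.
    Returns (total_cost, (haplotype_0, haplotype_1)).
    """
    positions = sorted({p for read in reads for p in read})
    total = 0
    haplotypes = ([], [])
    for side in (0, 1):
        for p in positions:
            # cost[a] = total weight that must be flipped if the haplotype
            # on this side carries allele a at position p
            cost = [0, 0]
            for read, s in zip(reads, bipartition):
                if s == side and p in read:
                    allele, weight = read[p]
                    cost[1 - allele] += weight
            best = 0 if cost[0] <= cost[1] else 1
            haplotypes[side].append(best)
            total += cost[best]
    return total, haplotypes


def brute_force_wmec(reads):
    """Try every bipartition of the reads and keep the cheapest one."""
    best = None
    for bipartition in product((0, 1), repeat=len(reads)):
        cost, haplotypes = wmec_cost(reads, bipartition)
        if best is None or cost < best[0]:
            best = (cost, bipartition, haplotypes)
    return best


# Toy instance: five reads over three SNPs; the last read carries one
# low-confidence (weight 2) allele that conflicts with its natural partner.
toy_reads = [
    {0: (0, 10), 1: (0, 10)},
    {0: (1, 10), 1: (1, 10)},
    {1: (0, 10), 2: (0, 10)},
    {1: (1, 10), 2: (1, 10)},
    {0: (0, 10), 2: (1, 2)},
]
best_cost, best_bipartition, best_haplotypes = brute_force_wmec(toy_reads)
print(best_cost, best_bipartition, best_haplotypes)
```

In this toy instance the cheapest solution flips only the weight-2 entry, which illustrates why weighting by confidence matters: an unweighted count would treat that entry the same as a high-confidence one.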

Keywords:  algorithms; combinatorial optimization; dynamic programming; haplotypes; next generation sequencing

Year:  2015        PMID: 25658651     DOI: 10.1089/cmb.2014.0157

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  74 in total

1.  Assessment of human diploid genome assembly with 10x Linked-Reads data.

Authors:  Lu Zhang; Xin Zhou; Ziming Weng; Arend Sidow
Journal:  Gigascience       Date:  2019-11-01       Impact factor: 6.524

2.  Differential gene expression associated with a floral scent polymorphism in the evening primrose Oenothera harringtonii (Onagraceae).

Authors:  Lindsey L Bechen; Matthew G Johnson; Geoffrey T Broadhead; Rachel A Levin; Rick P Overson; Tania Jogesh; Jeremie B Fant; Robert A Raguso; Krissa A Skogen; Norman J Wickett
Journal:  BMC Genomics       Date:  2022-02-12       Impact factor: 3.969

3.  Adaptive changes of the autosomal part of the genome in a dioecious clade of Silene.

Authors:  Jitka Zluvova; Zdenek Kubat; Roman Hobza; Bohuslav Janousek
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2022-03-21       Impact factor: 6.237

4.  Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies.

Authors:  Arang Rhie; Brian P Walenz; Sergey Koren; Adam M Phillippy
Journal:  Genome Biol       Date:  2020-09-14       Impact factor: 13.583

5.  Defining Blood Group Gene Reference Alleles by Long-Read Sequencing: Proof of Concept in the ACKR1 Gene Encoding the Duffy Antigens.

Authors:  Yann Fichou; Isabelle Berlivet; Gaëlle Richard; Christophe Tournamille; Lilian Castilho; Claude Férec
Journal:  Transfus Med Hemother       Date:  2019-12-11       Impact factor: 3.747

6.  Haplotype-resolved diverse human genomes and integrated analysis of structural variation.

Authors:  Peter Ebert; Peter A Audano; Qihui Zhu; Bernardo Rodriguez-Martin; Charles Lee; Jan O Korbel; Tobias Marschall; Evan E Eichler; David Porubsky; Marc Jan Bonder; Arvis Sulovari; Jana Ebler; Weichen Zhou; Rebecca Serra Mari; Feyza Yilmaz; Xuefang Zhao; PingHsun Hsieh; Joyce Lee; Sushant Kumar; Jiadong Lin; Tobias Rausch; Yu Chen; Jingwen Ren; Martin Santamarina; Wolfram Höps; Hufsah Ashraf; Nelson T Chuang; Xiaofei Yang; Katherine M Munson; Alexandra P Lewis; Susan Fairley; Luke J Tallon; Wayne E Clarke; Anna O Basile; Marta Byrska-Bishop; André Corvelo; Uday S Evani; Tsung-Yu Lu; Mark J P Chaisson; Junjie Chen; Chong Li; Harrison Brand; Aaron M Wenger; Maryam Ghareghani; William T Harvey; Benjamin Raeder; Patrick Hasenfeld; Allison A Regier; Haley J Abel; Ira M Hall; Paul Flicek; Oliver Stegle; Mark B Gerstein; Jose M C Tubio; Zepeng Mu; Yang I Li; Xinghua Shi; Alex R Hastie; Kai Ye; Zechen Chong; Ashley D Sanders; Michael C Zody; Michael E Talkowski; Ryan E Mills; Scott E Devine
Journal:  Science       Date:  2021-02-25       Impact factor: 47.728

7.  Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes.

Authors:  Kishwar Shafin; Trevor Pesout; Ryan Lorig-Roach; Marina Haukness; Hugh E Olsen; Colleen Bosworth; Joel Armstrong; Kristof Tigyi; Nicholas Maurer; Sergey Koren; Fritz J Sedlazeck; Tobias Marschall; Simon Mayes; Vania Costa; Justin M Zook; Kelvin J Liu; Duncan Kilburn; Melanie Sorensen; Katy M Munson; Mitchell R Vollger; Jean Monlong; Erik Garrison; Evan E Eichler; Sofie Salama; David Haussler; Richard E Green; Mark Akeson; Adam Phillippy; Karen H Miga; Paolo Carnevali; Miten Jain; Benedict Paten
Journal:  Nat Biotechnol       Date:  2020-05-04       Impact factor: 54.908

8.  Overcoming uncollapsed haplotypes in long-read assemblies of non-model organisms.

Authors:  Nadège Guiglielmoni; Antoine Houtain; Alessandro Derzelle; Karine Van Doninck; Jean-François Flot
Journal:  BMC Bioinformatics       Date:  2021-06-05       Impact factor: 3.169

9.  Telomere-to-telomere assembly of a fish Y chromosome reveals the origin of a young sex chromosome pair.

Authors:  Lingzhan Xue; Yu Gao; Meiying Wu; Tian Tian; Haiping Fan; Yongji Huang; Zhen Huang; Dapeng Li; Luohao Xu
Journal:  Genome Biol       Date:  2021-07-12       Impact factor: 17.906

10.  An ancestral recombination graph of human, Neanderthal, and Denisovan genomes.

Authors:  Nathan K Schaefer; Beth Shapiro; Richard E Green
Journal:  Sci Adv       Date:  2021-07-16       Impact factor: 14.136
