
Strand-seq enables reliable separation of long reads by chromosome via expectation maximization.

Maryam Ghareghani, David Porubský, Ashley D Sanders, Sascha Meiers, Evan E Eichler, Jan O Korbel, Tobias Marschall.

Abstract

Motivation: Current sequencing technologies are able to produce reads orders of magnitude longer than ever possible before. Such long reads have sparked a new interest in de novo genome assembly, which removes reference biases inherent to re-sequencing approaches and allows for a direct characterization of complex genomic variants. However, even with the latest algorithmic advances, assembling a mammalian genome from long error-prone reads incurs a significant computational burden and does not preclude occasional misassemblies. Both problems could potentially be mitigated if assembly could commence for each chromosome separately.
Results: To address this, we show how single-cell template strand sequencing (Strand-seq) data can be leveraged to separate long reads by chromosome. We introduce a novel latent variable model and a corresponding Expectation Maximization algorithm, termed SaaRclust, and demonstrate its ability to reliably cluster long reads by chromosome. For each long read, this approach produces a posterior probability distribution over all chromosomes of origin and read directionalities. In this way, it allows the uncertainty inherent to sparse Strand-seq data to be assessed at the level of individual reads. Among the reads that our algorithm confidently assigns to a chromosome, we observed more than 99% correct assignments on a subset of Pacific Biosciences reads with 30.1× coverage. To our knowledge, SaaRclust is the first approach for the in silico separation of long reads by chromosome prior to assembly.
Availability and implementation: https://github.com/daewoooo/SaaRclust.
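
The abstract does not spell out the latent variable model, so the following is only a minimal sketch of the general idea: expectation maximization that soft-assigns long reads to clusters from per-cell Watson/Crick read counts. It is not the SaaRclust implementation. The three strand states, the emission probabilities in `P_WATSON`, the pseudocount `EPS`, the deterministic initialisation, and all function names are assumptions made for illustration. Read directionality is not modelled, and counts are assumed to be small, as in sparse Strand-seq data.

```python
import math

# Assumed probability that a Strand-seq read maps to the Watson strand under
# each strand state of a chromosome in one single cell (illustrative values).
P_WATSON = {"WW": 0.95, "CC": 0.05, "WC": 0.5}
STATES = list(P_WATSON)
EPS = 1e-6  # pseudocount that keeps every probability strictly positive


def emission(w, c, state):
    """Likelihood of w Watson and c Crick reads under a strand state.

    The binomial coefficient is dropped because it cancels in every posterior.
    """
    p = P_WATSON[state]
    return p ** w * (1.0 - p) ** c


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def em_cluster(counts, n_clusters, n_iter=30):
    """Soft-cluster long reads from per-cell strand counts.

    counts[i][j] is a (watson, crick) pair for long read i in single cell j.
    Returns posteriors[i][k] = P(cluster k | read i).
    """
    n_reads, n_cells = len(counts), len(counts[0])
    pi = [1.0 / n_clusters] * n_clusters
    # Deterministic start (a choice made for this sketch): seed cluster k
    # from one evenly spaced read.
    theta = []
    for k in range(n_clusters):
        seed = counts[k * n_reads // n_clusters]
        theta.append([normalize([emission(w, c, s) + EPS for s in STATES])
                      for (w, c) in seed])

    for _ in range(n_iter):
        # E-step: posterior over clusters for each read, plus the posterior
        # over strand states for each (read, cluster, cell).
        post, state_post = [], []
        for read in counts:
            logs, per_cluster = [], []
            for k in range(n_clusters):
                lp = math.log(pi[k])
                cells = []
                for j, (w, c) in enumerate(read):
                    joint = [theta[k][j][s] * emission(w, c, st)
                             for s, st in enumerate(STATES)]
                    lp += math.log(sum(joint))
                    cells.append(normalize(joint))
                logs.append(lp)
                per_cluster.append(cells)
            top = max(logs)  # subtract the maximum for numerical stability
            post.append(normalize([math.exp(l - top) for l in logs]))
            state_post.append(per_cluster)

        # M-step: re-estimate mixing weights and per-cell state probabilities.
        weight = [sum(p[k] for p in post) for k in range(n_clusters)]
        pi = normalize([w + EPS for w in weight])
        for k in range(n_clusters):
            for j in range(n_cells):
                theta[k][j] = normalize([
                    EPS + sum(post[i][k] * state_post[i][k][j][s]
                              for i in range(n_reads))
                    for s in range(len(STATES))])
    return post


# Toy data: two hypothetical chromosomes with different strand states across
# six single cells; every long read inherits the pattern of its chromosome.
TOY_COUNTS = {"WW": (5, 0), "CC": (0, 5), "WC": (3, 2)}
chrom_a = ["WW", "CC", "WC", "WW", "CC", "WW"]
chrom_b = ["CC", "WW", "WW", "WC", "CC", "CC"]
reads = [[TOY_COUNTS[s] for s in chrom]
         for chrom in (chrom_a,) * 4 + (chrom_b,) * 4]
posteriors = em_cluster(reads, n_clusters=2)
for row in posteriors:
    print([round(p, 3) for p in row])
```

Each row of `posteriors` is a probability distribution over clusters for one long read, so a read can be called "confidently assigned" by thresholding its largest entry, which mirrors the per-read uncertainty assessment described above.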

Year:  2018        PMID: 29949971      PMCID: PMC6022540          DOI: 10.1093/bioinformatics/bty290

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


