When less is more: 'slicing' sequencing data improves read decoding accuracy and de novo assembly quality.

Stefano Lonardi, Hamid Mirebrahim, Steve Wanamaker, Matthew Alpert, Gianfranco Ciardo, Denisa Duma, Timothy J Close.

Abstract

MOTIVATION: Since the invention of DNA sequencing in the 1970s, computational biologists have had to deal with the problem of de novo genome assembly with limited (or insufficient) depth of sequencing. In this work, we investigate the opposite problem, that is, the challenge of dealing with excessive depth of sequencing.
RESULTS: We explore the effect of ultra-deep sequencing data in two domains: (i) the problem of decoding reads to bacterial artificial chromosome (BAC) clones (in the context of the combinatorial pooling design we have recently proposed), and (ii) the problem of de novo assembly of BAC clones. Using real ultra-deep sequencing data, we show that when the depth of sequencing increases over a certain threshold, sequencing errors make these two problems harder and harder (instead of easier, as one would expect with error-free data), and as a consequence the quality of the solution degrades with more and more data. For the first problem, we propose an effective solution based on 'divide and conquer': we 'slice' a large dataset into smaller samples of optimal size, decode each slice independently, and then merge the results. Experimental results on over 15 000 barley BACs and over 4000 cowpea BACs demonstrate a significant improvement in the quality of the decoding and the final assembly. For the second problem, we show for the first time that modern de novo assemblers cannot take advantage of ultra-deep sequencing data.
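The 'slice, decode, merge' idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `decode_slice` callback and the majority-vote merge rule are all assumptions made for illustration, and the abstract does not specify how slices are sized or how decoding conflicts are actually resolved.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical type: a decoder maps the items in one slice (for example
# read IDs or k-mers) to a BAC clone label. The real decoder is not
# described in the abstract.
Decoder = Callable[[Sequence[str]], Dict[str, str]]


def slice_reads(reads: Sequence[str], slice_size: int) -> List[Sequence[str]]:
    """Split the reads into consecutive slices of at most slice_size items."""
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    return [reads[i:i + slice_size] for i in range(0, len(reads), slice_size)]


def merge_assignments(per_slice: List[Dict[str, str]]) -> Dict[str, Optional[str]]:
    """Merge per-slice assignments by majority vote (an assumed rule).

    An item whose top two clone labels are tied is marked None (unresolved).
    """
    votes: Dict[str, Counter] = {}
    for assignment in per_slice:
        for item, clone in assignment.items():
            votes.setdefault(item, Counter())[clone] += 1

    merged: Dict[str, Optional[str]] = {}
    for item, counter in votes.items():
        ranked = counter.most_common(2)
        tied = len(ranked) == 2 and ranked[0][1] == ranked[1][1]
        merged[item] = None if tied else ranked[0][0]
    return merged


def slice_decode_merge(reads: Sequence[str], slice_size: int,
                       decode_slice: Decoder) -> Dict[str, Optional[str]]:
    """Decode each slice independently, then merge the per-slice results."""
    slices = slice_reads(reads, slice_size)
    return merge_assignments([decode_slice(s) for s in slices])
```

In this sketch the slice size is a plain parameter; the paper's point is that a suitable slice size exists beyond which more data degrades decoding, and finding that size is outside the scope of the example.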
AVAILABILITY AND IMPLEMENTATION: Python scripts to process slices and resolve decoding conflicts are available from http://goo.gl/YXgdHT; software Hashfilter can be downloaded from http://goo.gl/MIyZHs.
CONTACT: stelo@cs.ucr.edu or timothy.close@ucr.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Year:  2015        PMID: 25995232     DOI: 10.1093/bioinformatics/btv311

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  8 in total

1.  Complete genome sequence of Pseudoalteromonas phage vB_PspS-H40/1 (formerly H40/1) that infects Pseudoalteromonas sp. strain H40 and is used as biological tracer in hydrological transport studies.

Authors:  René Kallies; Bärbel Kiesel; Matthias Schmidt; Johannes Kacza; Nawras Ghanem; Anja Narr; Jakob Zopfi; Lukas Y Wick; Jörg Hackermüller; Hauke Harms; Antonis Chatzinotas
Journal:  Stand Genomic Sci       Date:  2017-02-02

Review 2.  Interpreting Microbial Biosynthesis in the Genomic Age: Biological and Practical Considerations.

Authors:  Ian J Miller; Marc G Chevrette; Jason C Kwan
Journal:  Mar Drugs       Date:  2017-06-06       Impact factor: 5.118

3.  Comparative analysis of de novo assemblers for variation discovery in personal genomes.

Authors:  Shulan Tian; Huihuang Yan; Eric W Klee; Michael Kalmbach; Susan L Slager
Journal:  Brief Bioinform       Date:  2018-09-28       Impact factor: 11.622

4.  The Genome Sequence of the Octocoral Paramuricea clavata - A Key Resource To Study the Impact of Climate Change in the Mediterranean.

Authors:  Jean-Baptiste Ledoux; Fernando Cruz; Jèssica Gómez-Garrido; Regina Antoni; Julie Blanc; Daniel Gómez-Gras; Silvija Kipson; Paula López-Sendino; Agostinho Antunes; Cristina Linares; Marta Gut; Tyler Alioto; Joaquim Garrabou
Journal:  G3 (Bethesda)       Date:  2020-09-02       Impact factor: 3.154

5.  Complete genome sequence of bacteriophage P26218 infecting Rhodoferax sp. strain IMCC26218.

Authors:  Kira Moon; Ilnam Kang; Suhyun Kim; Jang-Cheon Cho; Sang-Jong Kim
Journal:  Stand Genomic Sci       Date:  2015-11-24

6.  Comparing Apples and Oranges?: Next Generation Sequencing and Its Impact on Microbiome Analysis.

Authors:  Adam G Clooney; Fiona Fouhy; Roy D Sleator; Aisling O' Driscoll; Catherine Stanton; Paul D Cotter; Marcus J Claesson
Journal:  PLoS One       Date:  2016-02-05       Impact factor: 3.240

7.  Sequencing of 15 622 gene-bearing BACs clarifies the gene-dense regions of the barley genome.

Authors:  María Muñoz-Amatriaín; Stefano Lonardi; MingCheng Luo; Kavitha Madishetty; Jan T Svensson; Matthew J Moscou; Steve Wanamaker; Tao Jiang; Andris Kleinhofs; Gary J Muehlbauer; Roger P Wise; Nils Stein; Yaqin Ma; Edmundo Rodriguez; Dave Kudrna; Prasanna R Bhat; Shiaoman Chao; Pascal Condamine; Shane Heinen; Josh Resnik; Rod Wing; Heather N Witt; Matthew Alpert; Marco Beccuti; Serdar Bozdag; Francesca Cordero; Hamid Mirebrahim; Rachid Ounit; Yonghui Wu; Frank You; Jie Zheng; Hana Simková; Jaroslav Dolezel; Jane Grimwood; Jeremy Schmutz; Denisa Duma; Lothar Altschmied; Tom Blake; Phil Bregitzer; Laurel Cooper; Muharrem Dilbirligi; Anders Falk; Leila Feiz; Andreas Graner; Perry Gustafson; Patrick M Hayes; Peggy Lemaux; Jafar Mammadov; Timothy J Close
Journal:  Plant J       Date:  2015-09-21       Impact factor: 6.417

Review 8.  Studying the gut virome in the metagenomic era: challenges and perspectives.

Authors:  Sanzhima Garmaeva; Trishla Sinha; Alexander Kurilshikov; Jingyuan Fu; Cisca Wijmenga; Alexandra Zhernakova
Journal:  BMC Biol       Date:  2019-10-28       Impact factor: 7.431
