The MaSuRCA genome assembler.

Aleksey V Zimin, Guillaume Marçais, Daniela Puiu, Michael Roberts, Steven L Salzberg, James A Yorke.

Abstract

MOTIVATION: Second-generation sequencing technologies produce high coverage of the genome by short reads at a low cost, which has prompted development of new assembly methods. In particular, multiple algorithms based on de Bruijn graphs have been shown to be effective for the assembly problem. In this article, we describe a new hybrid approach that has the computational efficiency of de Bruijn graph methods and the flexibility of overlap-based assembly strategies, and which allows variable read lengths while tolerating a significant level of sequencing error. Our method transforms large numbers of paired-end reads into a much smaller number of longer 'super-reads'. The use of super-reads allows us to assemble combinations of Illumina reads of differing lengths together with longer reads from 454 and Sanger sequencing technologies, making it one of the few assemblers capable of handling such mixtures. We call our system the Maryland Super-Read Celera Assembler (abbreviated MaSuRCA and pronounced 'mazurka').
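The super-read idea can be illustrated with a small sketch. The abstract does not specify the construction, so the code below rests on a stated assumption: a read is extended one base at a time through the set of k-mers seen in the data, and the extension stops as soon as the next base is not unique. All function names are hypothetical, and the sketch ignores reverse complements, sequencing errors, and mate-pair information, so it is not the MaSuRCA implementation.

```python
def kmer_set(reads, k):
    """Collect every k-mer that occurs in any read."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}


def extend_right(seq, kmers, k):
    """Append bases while exactly one k-mer continues the current suffix."""
    seen = set()
    while True:
        suffix = seq[-(k - 1):]
        candidates = [b for b in "ACGT" if suffix + b in kmers]
        if len(candidates) != 1:
            return seq  # dead end or fork: the extension is not unique
        kmer = suffix + candidates[0]
        if kmer in seen:
            return seq  # guard against walking around a cycle forever
        seen.add(kmer)
        seq += candidates[0]


def extend_left(seq, kmers, k):
    """Prepend bases while exactly one k-mer precedes the current prefix."""
    seen = set()
    while True:
        prefix = seq[:k - 1]
        candidates = [b for b in "ACGT" if b + prefix in kmers]
        if len(candidates) != 1:
            return seq
        kmer = candidates[0] + prefix
        if kmer in seen:
            return seq
        seen.add(kmer)
        seq = candidates[0] + seq


def super_read(read, kmers, k):
    """Extend a read in both directions as far as the extension is unique."""
    if k < 2 or len(read) < k - 1:
        raise ValueError("need k >= 2 and a read of at least k-1 bases")
    return extend_left(extend_right(read, kmers, k), kmers, k)


# Toy data: five overlapping reads drawn from one 13-base sequence.
reads = ["ATGGCG", "GGCGTA", "CGTACC", "TACCTG", "ACCTGA"]
k = 4
kmers = kmer_set(reads, k)
super_reads = {super_read(r, kmers, k) for r in reads}
print(len(reads), "reads ->", len(super_reads), "super-read(s)")
```

In this toy input every read extends to the same sequence, which is how many reads can collapse into far fewer, longer super-reads. Where two different bases could follow (a fork in the k-mer graph), the extension stops and the read is left unextended on that side.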
RESULTS: We evaluate the performance of MaSuRCA against two of the most widely used assemblers for Illumina data, Allpaths-LG and SOAPdenovo2, on two datasets from organisms for which high-quality assemblies are available: the bacterium Rhodobacter sphaeroides and chromosome 16 of the mouse genome. We show that MaSuRCA performs on par with or better than Allpaths-LG and significantly better than SOAPdenovo2 on these data, when evaluated against the finished sequence. We then show that MaSuRCA can significantly improve its assemblies when the original data are augmented with long reads.
AVAILABILITY: MaSuRCA is available as open-source code at ftp://ftp.genome.umd.edu/pub/MaSuRCA/. Previous (pre-publication) releases have been publicly available for over a year.
CONTACT: alekseyz@ipst.umd.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year: 2013; PMID: 23990416; PMCID: PMC3799473; DOI: 10.1093/bioinformatics/btt476

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937


