Assembler for de novo assembly of large genomes.

Te-Chin Chu, Chen-Hua Lu, Tsunglin Liu, Greg C Lee, Wen-Hsiung Li, Arthur Chun-Chieh Shih.

Abstract

Assembling a large genome from next-generation sequencing reads requires a large amount of computer memory and a long execution time. To reduce these requirements, we propose an extension-based assembler called JR-Assembler, where J and R stand for "jumping" extension and read "remapping." First, it uses the read count to select good-quality reads as seeds. Second, it extends each seed by a whole-read extension process, which speeds up extension and can jump over short repeats. Third, it uses a dynamic back-trimming process to avoid extension termination due to sequencing errors. Fourth, it remaps reads to each assembled sequence and, if an assembly error has occurred because of a repeat, breaks the contig at the repeat boundaries. Fifth, it applies a less stringent extension criterion to connect low-coverage regions. Finally, it merges contigs using unused reads. An extensive comparison of JR-Assembler with current assemblers on datasets from small, medium, and large genomes shows that JR-Assembler achieves better or comparable overall assembly quality while requiring less memory and less central processing unit (CPU) time, especially for large genomes. In addition, a simulation study shows that JR-Assembler outperforms most current assemblers in memory use and CPU time when the read length is 150 bp or longer, indicating that the advantages of JR-Assembler over current assemblers will increase as read lengths increase with advances in next-generation sequencing technology.
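The first two steps of the abstract (seed selection by read count, then whole-read extension) can be illustrated with a toy sketch. This is not JR-Assembler's implementation: the function names `select_seeds` and `extend_right`, the `min_count` and `min_overlap` parameters, and the example sequence are all hypothetical, and the extension shown is a plain greedy suffix-prefix overlap that appends a whole read's overhang at a time. Jumping over repeats, back trimming, remapping, and contig merging are not modeled.

```python
from collections import Counter


def select_seeds(reads, min_count=2):
    """Return distinct reads seen at least min_count times, most frequent first.

    Assumption: a read that occurs several times is less likely to contain
    a sequencing error, so it is treated as a good-quality seed.
    """
    counts = Counter(reads)
    return [read for read, n in counts.most_common() if n >= min_count]


def extend_right(seed, reads, min_overlap=4):
    """Greedily extend seed to the right, one whole read at a time.

    At each step, pick the unused read whose prefix has the longest exact
    match with the current contig suffix, and append its overhang.
    """
    contig = seed
    unused = sorted(set(reads) - {seed})  # sorted for deterministic ties
    while True:
        best_read, best_overlap = None, 0
        for read in unused:
            # The read must contribute at least one new base.
            longest = min(len(contig), len(read) - 1)
            for k in range(longest, min_overlap - 1, -1):
                if contig.endswith(read[:k]):
                    if k > best_overlap:
                        best_read, best_overlap = read, k
                    break
        if best_read is None:
            return contig  # no read overlaps the contig end any more
        contig += best_read[best_overlap:]
        unused.remove(best_read)


# Toy data: 10 bp reads tiled every 5 bp, with the first read duplicated.
genome = "ACGTTGCAAGGCTTAACCGGATCGA"
reads = [genome[i:i + 10] for i in (0, 5, 10, 15)] + [genome[0:10]]

seeds = select_seeds(reads)
contig = extend_right(seeds[0], reads)
print(contig == genome)
```

Appending a whole read's overhang per step, rather than one base, is what makes extension-based approaches of this kind cheap when reads are long; a real assembler would also need to tolerate mismatches and handle both strands and both directions.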

Year:  2013        PMID: 23966565      PMCID: PMC3767511          DOI: 10.1073/pnas.1314090110

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


