Comparative studies of de novo assembly tools for next-generation sequencing technologies.

Yong Lin, Jian Li, Hui Shen, Lei Zhang, Christopher J Papasian, Hong-Wen Deng.

Abstract

MOTIVATION: Several new de novo assembly tools have recently been developed to assemble the short reads generated by next-generation sequencing platforms. However, the performance of these tools under various conditions has not been fully investigated, and too little information is currently available to make an informed choice of the tool most likely to perform best under a specific set of conditions.
RESULTS: We studied and compared the performance of commonly used de novo assembly tools specifically designed for next-generation sequencing data, including SSAKE, VCAKE, Euler-sr, Edena, Velvet, ABySS and SOAPdenovo. Tools were compared using several performance criteria, including N50 length, sequence coverage and assembly accuracy. Various properties of read data, including single-end/paired-end, sequence GC content, depth of coverage and base calling error rates, were investigated for their effects on the performance of different assembly tools. We also compared the computation time and memory usage of these seven tools. Based on the results of our comparison, the relative performance of individual tools is summarized, and tentative guidelines for selecting the optimal assembly tool under different conditions are provided.
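
Two of the quantities named above, N50 length and sequence GC content, are simple to compute. The abstract does not define them, so the following is a minimal sketch that assumes the conventional definitions: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length, and GC content is the fraction of G and C among the unambiguous bases. The function names `n50` and `gc_content` are illustrative and are not taken from the paper.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if it is empty).

    Assumes the conventional definition based on total assembly length,
    not on a reference genome size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # First contig at which the cumulative length reaches half the total.
        if running >= half_total:
            return length
    return 0


def gc_content(sequence):
    """Return the fraction of G/C among A, C, G and T bases (0.0 if none)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else 0.0


if __name__ == "__main__":
    contigs = [2, 3, 4, 5, 6]      # hypothetical contig lengths
    print(n50(contigs))            # 6 + 5 = 11 >= 10, so this prints 5
    print(gc_content("ACGTNN"))    # ambiguous N bases are ignored: 0.5
```

Because this N50 depends on the total length of the assembly itself, it can favour assemblies that drop short contigs; that is one reason it is usually reported alongside sequence coverage and accuracy, as in the comparison described here.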

Year: 2011   PMID: 21636596   PMCID: PMC3137213   DOI: 10.1093/bioinformatics/btr319

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937


