ARACHNE: a whole-genome shotgun assembler.

Serafim Batzoglou, David B Jaffe, Ken Stanley, Jonathan Butler, Sante Gnerre, Evan Mauceli, Bonnie Berger, Jill P Mesirov, Eric S Lander.

Abstract

We describe a new computer system, called ARACHNE, for assembling genome sequence using paired-end whole-genome shotgun reads. ARACHNE has several key features, including an efficient and sensitive procedure for finding read overlaps, a procedure for scoring overlaps that achieves high accuracy by correcting errors before assembly, read merger based on forward-reverse links, and detection of repeat contigs by forward-reverse link inconsistency. To test ARACHNE, we created simulated reads providing approximately 10-fold coverage of the genomes of H. influenzae, S. cerevisiae, and D. melanogaster, as well as human chromosomes 21 and 22. The assemblies of these simulated reads yielded nearly complete coverage of the respective genomes, with a small number of contigs joined into a smaller number of supercontigs (or scaffolds). For example, analysis of the D. melanogaster genome yielded approximately 98% coverage with an N50 contig length of 324 kb and an N50 supercontig length of 5143 kb. The assembly accuracy was high, although not perfect: small errors occurred at a frequency of roughly 1 per 1 Mb (typically, deletion of approximately 1 kb in size), with a very small number of other misassemblies. The assembly was rapid: the Drosophila assembly required only 21 hours on a single 667 MHz processor and used 8.4 Gb of memory.
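The N50 figures quoted above can be illustrated with a short calculation. The sketch below assumes the common convention that N50 is the largest length L such that contigs (or supercontigs) of length at least L together account for at least half of the total assembled length; the abstract does not state the exact definition used, and the function name `n50` and the sample lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Assumed convention (not taken from the abstract): sort lengths from
    longest to shortest and return the length at which the running sum
    first reaches half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in kb, for illustration only.
example_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_contigs))
```

Because N50 weights long sequences heavily, an assembly can report a large N50 while still containing many short contigs, which is why the abstract quotes it alongside genome coverage.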

Year:  2002        PMID: 11779843      PMCID: PMC155255          DOI: 10.1101/gr.208902

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


