Automated sequence preprocessing in a large-scale sequencing environment.

M C Wendl, S Dear, D Hodgson, L Hillier.

Abstract

A software system for transforming fragments from four-color fluorescence-based gel electrophoresis experiments into assembled sequence is described. It has been developed for large-scale processing of all trace data, including shotgun and finishing reads, regardless of clone origin. Design considerations are discussed in detail, as are programming implementation and graphic tools. The importance of input validation, record tracking, and use of base quality values is emphasized. Several quality analysis metrics are proposed and applied to sample results from recently sequenced clones. Such quantities prove to be a valuable aid in evaluating modifications of sequencing protocol. The system is in full production use at both the Genome Sequencing Center and the Sanger Centre, whose combined production is approximately 100,000 sequencing reads per week.
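As a rough illustration of how base quality values can feed a per-read quality metric, the sketch below uses the standard phred convention, Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The function names, the Q >= 20 threshold, and the sample quality values are assumptions made for this example; they are not taken from the paper and do not reproduce the metrics it proposes.

```python
def error_probability(q):
    """Convert a phred quality value to an error probability.

    Phred convention: Q = -10 * log10(P), so P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


def high_quality_count(quals, threshold=20):
    """Count bases at or above a quality threshold (threshold is an assumption)."""
    return sum(1 for q in quals if q >= threshold)


def expected_errors(quals):
    """Sum of per-base error probabilities: the expected number of miscalls."""
    return sum(error_probability(q) for q in quals)


# Hypothetical quality values for one short read (illustrative only).
read_quals = [8, 15, 22, 35, 40, 40, 31, 12]

print("bases with Q >= 20:", high_quality_count(read_quals))
print("expected miscalls:", round(expected_errors(read_quals), 3))
```

Summaries of this kind can be compared across runs when evaluating a change in sequencing protocol, which is the use the abstract describes for its own metrics.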

Year:  1998        PMID: 9750196      PMCID: PMC310779          DOI: 10.1101/gr.8.9.975

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


  14 in total

1.  A standard file format for data from DNA sequencing instruments.

Authors:  S Dear; R Staden
Journal:  DNA Seq       Date:  1992

2.  A trace display and editing program for data from fluorescence based sequencing machines.

Authors:  T Gleeson; L Hillier
Journal:  Nucleic Acids Res       Date:  1991-12-11       Impact factor: 16.971

Review 3.  The Staden sequence analysis package.

Authors:  R Staden
Journal:  Mol Biotechnol       Date:  1996-06       Impact factor: 2.695

Review 4.  The human genome project: past, present, and future.

Authors:  J D Watson
Journal:  Science       Date:  1990-04-06       Impact factor: 47.728

5.  Base-calling of automated sequencer traces using phred. I. Accuracy assessment.

Authors:  B Ewing; L Hillier; M C Wendl; P Green
Journal:  Genome Res       Date:  1998-03       Impact factor: 9.043

6.  Base-calling of automated sequencer traces using phred. II. Error probabilities.

Authors:  B Ewing; P Green
Journal:  Genome Res       Date:  1998-03       Impact factor: 9.043

7.  Experiment files and their application during large-scale sequencing projects.

Authors:  J K Bonfield; R Staden
Journal:  DNA Seq       Date:  1996

8.  Lane tracking software for four-color fluorescence-based electrophoretic gel images.

Authors:  M L Cooper; D R Maffitt; J D Parsons; L Hillier; D J States
Journal:  Genome Res       Date:  1996-11       Impact factor: 9.043

9.  Hopper: software for automating data tracking and flow in DNA sequencing.

Authors:  T M Smith; C Abajian; L Hood
Journal:  Comput Appl Biosci       Date:  1997-04

10.  NIH launches the final push to sequence the genome.

Authors:  E Marshall; E Pennisi
Journal:  Science       Date:  1996-04-12       Impact factor: 47.728

  11 in total

1.  PipeOnline 2.0: automated EST processing and functional data sorting.

Authors:  Patricia Ayoubi; Xiaojing Jin; Saul Leite; Xianghui Liu; Jeson Martajaja; Abdurashid Abduraham; Qiaolan Wan; Wei Yan; Eduardo Misawa; Rolf A Prade
Journal:  Nucleic Acids Res       Date:  2002-11-01       Impact factor: 16.971

Review 2.  The Ensembl core software libraries.

Authors:  Arne Stabenau; Graham McVicker; Craig Melsopp; Glenn Proctor; Michele Clamp; Ewan Birney
Journal:  Genome Res       Date:  2004-05       Impact factor: 9.043

3.  Comparisons among two fertile and three male-sterile mitochondrial genomes of maize.

Authors:  James O Allen; Christiane M Fauron; Patrick Minx; Leah Roark; Swetha Oddiraju; Guan Ning Lin; Louis Meyer; Hui Sun; Kyung Kim; Chunyan Wang; Feiyu Du; Dong Xu; Michael Gibson; Jill Cifrese; Sandra W Clifton; Kathleen J Newton
Journal:  Genetics       Date:  2007-07-29       Impact factor: 4.562

4.  The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics.

Authors:  Lincoln D Stein; Zhirong Bao; Darin Blasiar; Thomas Blumenthal; Michael R Brent; Nansheng Chen; Asif Chinwalla; Laura Clarke; Chris Clee; Avril Coghlan; Alan Coulson; Peter D'Eustachio; David H A Fitch; Lucinda A Fulton; Robert E Fulton; Sam Griffiths-Jones; Todd W Harris; LaDeana W Hillier; Ravi Kamath; Patricia E Kuwabara; Elaine R Mardis; Marco A Marra; Tracie L Miner; Patrick Minx; James C Mullikin; Robert W Plumb; Jane Rogers; Jacqueline E Schein; Marc Sohrmann; John Spieth; Jason E Stajich; C Wei; David Willey; Richard K Wilson; Richard Durbin; Robert H Waterston
Journal:  PLoS Biol       Date:  2003-11-17       Impact factor: 8.029

5.  Sequence and comparative analysis of the maize NB mitochondrial genome.

Authors:  Sandra W Clifton; Patrick Minx; Christiane M-R Fauron; Michael Gibson; James O Allen; Hui Sun; Melissa Thompson; W Brad Barbazuk; Suman Kanuganti; Catherine Tayloe; Louis Meyer; Richard K Wilson; Kathleen J Newton
Journal:  Plant Physiol       Date:  2004-11       Impact factor: 8.340

6.  Genome-wide end-sequenced BAC resources for the NOD/MrkTac and NOD/ShiLtJ mouse genomes.

Authors:  Charles A Steward; Sean Humphray; Bob Plumb; Matthew C Jones; Michael A Quail; Stephen Rice; Tony Cox; Rob Davies; James Bonfield; Thomas M Keane; Michael Nefedov; Pieter J de Jong; Paul Lyons; Linda Wicker; John Todd; Yoshihide Hayashizaki; Omid Gulban; Jayne Danska; Jen Harrow; Tim Hubbard; Jane Rogers; David J Adams
Journal:  Genomics       Date:  2009-11-10       Impact factor: 5.736

7.  The non-obese diabetic mouse sequence, annotation and variation resource: an aid for investigating type 1 diabetes.

Authors:  Charles A Steward; Jose M Gonzalez; Steve Trevanion; Dan Sheppard; Giselle Kerry; James G R Gilbert; Linda S Wicker; Jane Rogers; Jennifer L Harrow
Journal:  Database (Oxford)       Date:  2013-05-31       Impact factor: 3.451

8.  MAGIC-SPP: a database-driven DNA sequence processing package with associated management tools.

Authors:  Chun Liang; Feng Sun; Haiming Wang; Junfeng Qu; Robert M Freeman; Lee H Pratt; Marie-Michèle Cordonnier-Pratt
Journal:  BMC Bioinformatics       Date:  2006-03-07       Impact factor: 3.169

9.  Design and implementation of a generalized laboratory data model.

Authors:  Michael C Wendl; Scott Smith; Craig S Pohl; David J Dooling; Asif T Chinwalla; Kevin Crouse; Todd Hepler; Shin Leong; Lynn Carmichael; Mike Nhan; Benjamin J Oberkfell; Elaine R Mardis; LaDeana W Hillier; Richard K Wilson
Journal:  BMC Bioinformatics       Date:  2007-09-26       Impact factor: 3.169

10.  A novel approach to sequence validating protein expression clones with automated decision making.

Authors:  Elena Taycher; Andreas Rolfs; Yanhui Hu; Dongmei Zuo; Stephanie E Mohr; Janice Williamson; Joshua Labaer
Journal:  BMC Bioinformatics       Date:  2007-06-13       Impact factor: 3.169
