Qunfeng Dong, Matthew D Wilkerson, Volker Brendel.
Abstract
BACKGROUND: Whole genome shotgun sequencing produces increasingly higher coverage of a genome with random sequence reads. Progressive whole genome assembly and eventual finishing sequencing is a process that typically takes several years for large eukaryotic genomes. In the interim, all sequence reads of public sequencing projects are made available in repositories such as the NCBI Trace Archive. For a particular locus, sequencing coverage may be high enough early on to produce a reliable local genome assembly. We have developed software, Tracembler, that facilitates in silico chromosome walking by recursively assembling reads of a selected species from the NCBI Trace Archive, starting with reads that significantly match sequence seeds supplied by the user.
Year: 2007 PMID: 17490482 PMCID: PMC1876249 DOI: 10.1186/1471-2105-8-151
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Schematic overview of Tracembler. Tracembler accepts as input one or more user-supplied query sequences and parameter specifications. The query sequence(s) and associated parameters are submitted to NCBI using the QBLAST URL API [25] (1). Tracembler analyzes the results, and if any new sequences match with an E-value below the user-supplied threshold, these sequences are used as queries in a new BLAST search (2). One round consists of BLAST searches with all acceptable matching sequences from the previous round. This process is repeated recursively until all matching sequences are exhausted, a user-defined maximum number of rounds is reached, or a user-defined maximum total number of BLAST queries is reached. For the final set of sequences, quality scores and mate-pair distance constraints are retrieved from NCBI. These sequences are assembled using CAP3 (3). Finally, novel contigs are compared to the query sequences using BLAST for local alignment and GenomeThreader for spliced alignment (4).
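The recursive search in steps (1) and (2) can be illustrated with a minimal Python sketch. This is not Tracembler's actual code: the function `recursive_search`, its parameter names, and the default values are hypothetical, and the `search` callback is an assumed stand-in for a QBLAST request that returns (read ID, E-value) pairs. The sketch shows only the three stopping conditions described in the caption: no new matches, a maximum number of rounds, and a maximum total number of queries.

```python
from typing import Callable, Iterable, List, Set, Tuple

# Hypothetical callback type: given a query ID, return (hit_id, e_value) pairs.
# In a real tool this would wrap a BLAST request; here it is an assumption.
SearchFn = Callable[[str], Iterable[Tuple[str, float]]]


def recursive_search(
    seeds: Iterable[str],
    search: SearchFn,
    max_evalue: float = 1e-20,   # assumed default, not taken from the paper
    max_rounds: int = 3,         # assumed default
    max_queries: int = 100,      # assumed default
) -> List[str]:
    """Collect read IDs reachable from the seeds by repeated searching."""
    collected: Set[str] = set()
    frontier = list(seeds)       # queries for the current round
    queries_run = 0
    rounds_run = 0

    # Stop when no new matches remain or the round limit is reached.
    while frontier and rounds_run < max_rounds:
        next_frontier: List[str] = []
        for query in frontier:
            # Stop as soon as the total query budget is used up.
            if queries_run >= max_queries:
                return sorted(collected)
            queries_run += 1
            for hit_id, e_value in search(query):
                # Keep only new hits below the E-value threshold.
                if e_value < max_evalue and hit_id not in collected:
                    collected.add(hit_id)
                    next_frontier.append(hit_id)
        frontier = next_frontier
        rounds_run += 1

    return sorted(collected)
```

Passing the search step in as a callback keeps the sketch runnable without network access; the returned read IDs would then be handed to the quality-score retrieval and assembly steps, which are not modeled here.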